Final Examination Answer
1. There are various types of graphical representations used in statistics to visualize data. Three common
types are:
a. Bar Chart: A bar chart represents data with rectangular bars. It's used to show and compare the
frequency or distribution of categorical data. For example, you can create a bar chart to display the
number of cars of different colors in a parking lot.
b. Line Graph: A line graph is used to show trends and changes in data over a continuous range. For
instance, you can use a line graph to illustrate the change in temperature over a week.
2. Skewness measures the asymmetry of a probability distribution. There are various methods to
measure skewness, including the most common one using the sample data:
A computed kurtosis value is said to be leptokurtic when it indicates a distribution with more extreme
values in its tails compared to a normal distribution. This means the distribution has a higher peak
and heavier tails.
A computed kurtosis value is considered platykurtic when it suggests a distribution with fewer extreme
values in its tails compared to a normal distribution. This means the distribution has a flatter
peak and lighter tails.
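The two shape measures above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the moment-based (population) coefficients, which may not be the exact formulas intended in the original answer, and the two data sets are made up for the example. Excess kurtosis is measured relative to a normal distribution, whose excess kurtosis is 0.

```python
def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3 (the normal-distribution value)."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3.0


def classify_kurtosis(excess, tol=1e-9):
    """Label a computed excess kurtosis value."""
    if excess > tol:
        return "leptokurtic"   # heavier tails than a normal distribution
    if excess < -tol:
        return "platykurtic"   # lighter tails than a normal distribution
    return "mesokurtic"        # about the same as a normal distribution


# Hypothetical data sets, chosen only to illustrate the two cases.
flat = [1, 2, 3, 4, 5]
heavy_tailed = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print(skewness(flat), classify_kurtosis(excess_kurtosis(flat)))
print(skewness(heavy_tailed), classify_kurtosis(excess_kurtosis(heavy_tailed)))
```

Sample-corrected versions of these coefficients also exist; the population form is used here only because it is the shortest to write down.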
4. To create a box-and-whisker plot for the given data, you'll first need to organize the data in ascending
order and then calculate the quartiles and other necessary statistics. After that, you can draw the plot
and interpret the results. Here's how you can do it step by step:
40, 46, 49, 54, 55, 55, 60, 60, 64, 68, 75, 76, 77, 79, 84, 89, 90, 90, 92, 94
a) Calculate the median (Q2):
There are 20 data points, so the median is the average of the 10th and 11th values.
Median = (68 + 75) / 2 = 71.5
b) Calculate the first quartile (Q1), which is the median of the lower half of the data (excluding the
overall median):
There are 10 data points in the lower half, so the median is the average of the 5th and 6th
values.
Q1 = (55 + 55) / 2 = 55
c) Calculate the third quartile (Q3), which is the median of the upper half of the data (excluding the
overall median):
There are 10 data points in the upper half, so the median is the average of the 5th and 6th
values.
Q3 = (84 + 89) / 2 = 86.5
Now that you have the necessary statistics, you can create the box-and-whisker plot:
|---------[==========|=========]----|
40        55         71.5      86.5 94
The line inside the box represents the median (Q2) at 71.5.
The box spans the interquartile range (IQR), which goes from Q1 (55) to Q3 (86.5).
The "whiskers" extend from the minimum value (40) to the maximum value (94).
The median (Q2) is 71.5, indicating that 50% of the students scored below 71.5 on the pre-test
exam.
The interquartile range (IQR) goes from Q1 (55) to Q3 (86.5), a width of 31.5, and represents the middle 50%
of the data: about half of the students scored between 55 and 86.5.
The whiskers show the range of the data, with the lowest score being 40 and the highest being
94. There are no outliers in this data, since every score lies within 1.5 × IQR of the box (between 7.75 and 133.75).
The box-and-whisker plot provides a clear visual summary of the distribution and spread of the
data.
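The quartile steps above can be checked with a short Python sketch. It assumes the "median of each half" method described in steps b) and c) and the common 1.5 × IQR rule for flagging outliers; the function name is illustrative, and other quartile conventions (for example, interpolation-based ones) can give slightly different values.

```python
from statistics import median

# Pre-test scores from the worked example, already in ascending order.
scores = [40, 46, 49, 54, 55, 55, 60, 60, 64, 68,
          75, 76, 77, 79, 84, 89, 90, 90, 92, 94]


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # lower half of the data
    upper = values[half + n % 2:]    # upper half; skips the middle value if n is odd
    return min(values), median(lower), median(values), median(upper), max(values)


minimum, q1, q2, q3, maximum = five_number_summary(scores)
iqr = q3 - q1

# 1.5 * IQR fences: points outside them would be flagged as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in scores if x < low_fence or x > high_fence]

print("min, Q1, median, Q3, max:", minimum, q1, q2, q3, maximum)
print("IQR:", iqr, "fences:", low_fence, high_fence, "outliers:", outliers)
```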
5. To calculate the sample standard deviation for the given data, you can follow these steps:
1. Calculate the mean of the 12 data points.
2. Calculate the squared difference between each data point and the mean for all 12 data points.
3. Add up the squared differences.
4. Divide the sum of squared differences by (n-1), where n is the number of data points (12 in this
case) to calculate the sample variance.
5. Take the square root of the sample variance to obtain the sample standard deviation.
Mean = (4 + 5 + 6 + 7 + 8 + 9 + 12 + 14 + 14 + 15 + 16 + 18) / 12 = 128 / 12 ≈ 10.67
Squared differences (computed with the unrounded mean, 128/12):
(4 - 10.67)^2 ≈ 44.44 (5 - 10.67)^2 ≈ 32.11 (6 - 10.67)^2 ≈ 21.78 (7 - 10.67)^2 ≈ 13.44 (8 - 10.67)^2 ≈ 7.11
(9 - 10.67)^2 ≈ 2.78 (12 - 10.67)^2 ≈ 1.78 (14 - 10.67)^2 ≈ 11.11 (14 - 10.67)^2 ≈ 11.11 (15 - 10.67)^2 ≈
18.78 (16 - 10.67)^2 ≈ 28.44 (18 - 10.67)^2 ≈ 53.78
Sum of squared differences:
44.44 + 32.11 + 21.78 + 13.44 + 7.11 + 2.78 + 1.78 + 11.11 + 11.11 + 18.78 + 28.44 + 53.78 ≈ 246.67
Sample Variance (s²) = Sum of squared differences / (n - 1)
Sample Variance (s²) ≈ 246.67 / (12 - 1) = 246.67 / 11 ≈ 22.42 (rounded to two decimal places)
Sample Standard Deviation (s) = √Sample Variance
Sample Standard Deviation (s) ≈ √22.4242 ≈ 4.74 (using the unrounded variance, rounded to two decimal places)
So, the sample standard deviation for the given data is approximately 4.74.
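As a cross-check on the hand calculation, the same steps can be sketched in Python. The data values are the twelve points used in the worked example; the function name is illustrative, and the result is compared against the standard library's `statistics.stdev`, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

# The 12 data points from the worked example.
data = [4, 5, 6, 7, 8, 9, 12, 14, 14, 15, 16, 18]


def sample_std_steps(values):
    """Follow the listed steps and return (mean, sum of squares, variance, std dev)."""
    n = len(values)
    mean = sum(values) / n                                  # mean of the data
    sum_sq = sum((x - mean) ** 2 for x in values)           # sum of squared differences
    variance = sum_sq / (n - 1)                             # sample variance
    return mean, sum_sq, variance, sqrt(variance)           # sample standard deviation


mean, sum_sq, variance, s = sample_std_steps(data)
print(round(mean, 2), round(sum_sq, 2), round(variance, 2), round(s, 2))

# The library routine should agree with the step-by-step result.
print(round(stdev(data), 2))
```

Carrying the unrounded mean through the calculation, as the code does, avoids the small rounding drift that appears when each squared difference is rounded before summing.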
6. Here's how you would conduct a One-Way ANOVA to determine differences within a factor:
1. Collect Data: Collect data from multiple groups or levels under a single factor. For example, you
might be comparing the test scores of students who have received different types of tutoring.
Null Hypothesis (H0): There is no significant difference in the means of the groups.
Alternative Hypothesis (Ha): At least one group mean is different from the others.
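Calculate the sum of squares between groups (SSB), which quantifies the variability between
the group means.
Calculate the sum of squares within groups (SSW), which quantifies the variability within
each group.
Calculate the degrees of freedom (df) for both SSB and SSW: df_B = k - 1 and df_W = N - k,
where k is the number of groups and N is the total number of observations.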
The F-statistic is calculated as the ratio of SSB to SSW, adjusted for the degrees of
freedom.
F = (SSB / df_B) / (SSW / df_W)
Where df_B is the degrees of freedom for between groups and df_W is the degrees of
freedom for within groups.
Using an F-distribution table or statistical software, determine the critical F-value at a
specified significance level (usually 0.05).
If the calculated F-statistic is greater than the critical F-value, you reject the null
hypothesis, indicating that there are statistically significant differences between at least
some of the group means.
If the calculated F-statistic is less than the critical F-value, you fail to reject the null
hypothesis: the data do not show statistically significant differences between the group
means.
If the One-Way ANOVA indicates that there are significant differences between groups,
you can perform post-hoc tests (e.g., Tukey's HSD, Bonferroni, Scheffe) to identify which
specific group pairs have significant differences.
8. Interpretation:
If the null hypothesis is rejected, you can conclude that at least one group mean differs
significantly from the others. If it is not rejected, the data do not provide sufficient
evidence of a difference among the groups.
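The ANOVA procedure above can be sketched in plain Python. The three groups below are hypothetical scores invented for the illustration, and the function name is an assumption; the critical value 5.14 is the F-table entry for (2, 6) degrees of freedom at the 0.05 level, entered by hand rather than computed.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, df_B, df_W, F) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group variability: group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variability: observations around their own group mean.
    ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means))

    df_b = k - 1
    df_w = n_total - k
    f_stat = (ssb / df_b) / (ssw / df_w)
    return ssb, ssw, df_b, df_w, f_stat


# Hypothetical scores for three tutoring types (illustration only).
tutoring_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ssb, ssw, df_b, df_w, f_stat = one_way_anova(tutoring_groups)

f_critical = 5.14  # F-table value for (2, 6) df at alpha = 0.05, looked up by hand
reject_h0 = f_stat > f_critical
print(ssb, ssw, df_b, df_w, f_stat, reject_h0)
```

A rejection here says only that at least one group mean differs; a post-hoc test would still be needed to identify which pairs differ.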
7. To determine if there is enough evidence to reject the null hypothesis, we can perform a hypothesis
test. In this case, the null hypothesis (H0) is that the true mean weight of all residents in Negros is 160
lbs, and the alternative hypothesis (Ha) is that the true mean is different from 160 lbs. We will perform a
two-tailed t-test at the 5% significance level (95% confidence).
4. Find the critical t-values for a two-tailed test with a 95% confidence level and 2.5% in each tail
(α/2 = 0.025). You can look up the t-table or use a t-distribution calculator. With 28 degrees of
freedom and α/2 = 0.025, the critical t-values are approximately ±2.048.
In this case, |3.73| > 2.048, which means that the calculated t-statistic falls in the rejection region. This
implies that you have enough evidence to reject the null hypothesis.
So, at a 95% confidence level, there is enough evidence to conclude that the true mean weight of
residents in Negros is different from 160 lbs based on the sample data.
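The decision rule used above can be sketched as follows. The t-statistic (3.73) and critical value (2.048) are the ones quoted in the answer; the function names are illustrative, and the inputs to the second call (sample mean, standard deviation, and sample size) are hypothetical values included only to show how a one-sample t-statistic is formed.

```python
from math import sqrt


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


def reject_null_two_tailed(t_stat, t_critical):
    """Two-tailed rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


# Values quoted in the answer: t = 3.73, critical t = 2.048 (28 df, alpha/2 = 0.025).
print(reject_null_two_tailed(3.73, 2.048))

# Hypothetical inputs, for illustration only (not the exam's sample data).
t_example = one_sample_t(sample_mean=165.0, hypothesized_mean=160.0,
                         sample_sd=10.0, n=25)
print(t_example)
```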
8. The coefficient of correlation, often denoted as "r," measures the strength and direction of the linear
relationship between two variables. To calculate the coefficient of correlation using simple regression
coefficients, you typically use the formula for Pearson's correlation coefficient (r). Here's how you can do
it:
Let's assume you have two variables, X and Y, and you've already calculated the simple linear regression
coefficients:
1. Calculate the means of X and Y:
X̄ = ΣX / n and Ȳ = ΣY / n
Where ΣX is the sum of all X values, ΣY is the sum of all Y values, and "n" is the number of data points.
2. Calculate the sum of the products of the deviations from the means:
Σ(X - X̄)(Y - Ȳ)
This means, for each pair of X and Y values, calculate the difference between each X value and the mean
of X, and the difference between each Y value and the mean of Y. Then, multiply these differences
together and sum them up for all data points.
3. Calculate the sums of the squared deviations from the means:
Σ(X - X̄)² and Σ(Y - Ȳ)²
Square the difference between each X value and the mean of X and sum them up, and do the same for Y
values.
4. Calculate the coefficient of correlation:
r = Σ(X - X̄)(Y - Ȳ) / √[Σ(X - X̄)² × Σ(Y - Ȳ)²]
In this formula, Σ represents the summation symbol, and √ represents the square root.
5. Once you calculate "r," it will give you a value between -1 and 1. The sign of "r" indicates the
direction of the relationship:
A positive "r" (r > 0) indicates a positive (direct) linear relationship. As X increases, Y also
increases.
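The correlation steps can be sketched in Python as follows. This is a minimal illustration of Pearson's r computed from deviations about the means; the function name and the X and Y values are hypothetical and chosen only to give a moderately positive relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of the products of the deviations from the means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of the squared deviations from the means.
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)

    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired observations (illustration only).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(round(r, 4), direction)
```

The function divides by zero if either variable is constant, since r is undefined in that case; a fuller version would check for it explicitly.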