(ENGDAT2) Exercise 3
22
Dayao, Audrey
Dugang, Cloe
Espina, Anton
Espos, Jazmine
Mapanoo, Darren
1. The National Chemical Plant consumes monthly electric power which is thought to be
related to the average ambient temperature (x1), the number of days in the month (x2),
the average product purity (x3), and the tons of product produced (x4). The past year’s
historical data are available and are presented in the following table:
a. Fit a multiple linear regression model to these data.
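As a minimal sketch of how the fit in part a can be computed, the block below solves the normal equations (X'X)b = X'y in plain Python. The data table is not reproduced in this text, so the rows used here are hypothetical placeholders chosen only to show the mechanics; `fit_ols` and `predict` are illustrative names, not part of the original solution.

```python
def fit_ols(rows, y):
    """Return least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    # Prepend 1.0 to every row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; the regressors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def predict(coef, x):
    """Evaluate the fitted model b0 + b1*x1 + ... + bk*xk at one point x."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))


# HYPOTHETICAL placeholder data (NOT the plant data): y = 1 + 2*x1 + 3*x2 exactly.
demo_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
demo_y = [9, 8, 19, 18, 26]

demo_coef = fit_ols(demo_rows, demo_y)
print("coefficients:", [round(b, 6) for b in demo_coef])
print("prediction at (2, 2):", round(predict(demo_coef, (2, 2)), 6))
```

With the actual table, each row would hold (x1, x2, x3, x4) for one month and `y` the corresponding power consumption; a statistics package would normally be used instead of hand-rolled elimination.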
b. Estimate σ²
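One way to obtain the estimate is from the sums of squares reported in part i: σ² is estimated by the error mean square, SSE / (n - k - 1). The sketch below assumes n = 12 monthly observations (the text says "the past year's" data but does not state n) and k = 4 regressors; `estimate_sigma2` is an illustrative name.

```python
# Sums of squares as reported in part i of this exercise.
SSR = 5600.452812
SST = 6572.916667

# ASSUMPTIONS (not stated explicitly in the text):
# 12 monthly observations and 4 regressors.
N_OBS = 12
N_REGRESSORS = 4


def estimate_sigma2(sst, ssr, n, k):
    """Error mean square: SSE divided by its degrees of freedom (n - k - 1)."""
    sse = sst - ssr
    return sse / (n - k - 1)


sigma2_hat = estimate_sigma2(SST, SSR, N_OBS, N_REGRESSORS)
print(f"estimated sigma^2 (MSE): {sigma2_hat:.4f}")
```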
d. Predict the power consumption for a month in which x1 = 75°F, x2 = 24 days, x3 =
90%, and x4 = 98 tons.
e. Test the significance of regression using 𝛂= 0.05. What is the P-value for this test?
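The test statistic for significance of regression is F0 = MSR / MSE. A rough sketch using the sums of squares reported in part i is given below, again assuming n = 12 observations and k = 4 regressors (neither is stated explicitly in the text). The critical value 4.12 is the tabulated upper 5% point of the F distribution with 4 and 7 degrees of freedom; the exact P-value would come from the upper tail of that distribution and is not computed here.

```python
# Sums of squares as reported in part i of this exercise.
SSR = 5600.452812
SST = 6572.916667

# ASSUMPTIONS (not stated explicitly in the text).
N_OBS = 12
N_REGRESSORS = 4

# Tabulated upper 5% point of F with (4, 7) degrees of freedom.
F_CRITICAL_4_7 = 4.12


def f_statistic(sst, ssr, n, k):
    """F0 = (SSR / k) / (SSE / (n - k - 1))."""
    msr = ssr / k
    mse = (sst - ssr) / (n - k - 1)
    return msr / mse


f0 = f_statistic(SST, SSR, N_OBS, N_REGRESSORS)
regression_significant = f0 > F_CRITICAL_4_7
print(f"F0 = {f0:.3f}, critical value = {F_CRITICAL_4_7}")
print("reject H0" if regression_significant else "fail to reject H0")
```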
f. Use the t-test to assess the contribution of each regressor to the model, using 𝛂 =
0.05. What conclusions can you draw?
𝐻0: β𝑗 = 0
𝐻𝑎: β𝑗 ≠ 0
Since the p-value for x1 (0.03) is less than the significance level of 0.05, reject H0 for x1.
For the remaining regressors, x2, x3, and x4, the p-values (0.103, 0.212, and 0.415,
respectively) are greater than the significance level of 0.05, so we fail to reject H0. In
general, the p-value of 0.5 is greater than α = 0.05, so we fail to reject H0.
There is therefore evidence that the average ambient temperature (x1) affects the
monthly electric power, while there is insufficient evidence that the number of days in
the month (x2), the average product purity (x3), or the tons of product produced (x4)
affects the monthly electric power at α = 0.05.
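The decision rule applied above can be written out as a short sketch: each regressor's p-value, as quoted in the answer, is compared with the significance level. The names `decide` and `decisions` are illustrative only.

```python
ALPHA = 0.05

# p-values for the individual t-tests, as quoted in the answer to part f.
p_values = {"x1": 0.03, "x2": 0.103, "x3": 0.212, "x4": 0.415}


def decide(p_value, alpha=ALPHA):
    """Reject H0: beta_j = 0 when the p-value falls below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


decisions = {name: decide(p) for name, p in p_values.items()}
for name, decision in decisions.items():
    print(f"{name}: p = {p_values[name]} -> {decision}")
```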
Regressor   Lower 95%     Upper 95%
x1           0.09734663    1.41723155
x2          -1.9636464    17.0012143
x3          -1.7954385     6.76159563
x4          -1.7939136     0.8316431
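The intervals listed above can be cross-checked against the t-tests: a two-sided interval that excludes zero corresponds to rejecting H0: βj = 0 at the matching significance level. The sketch below assumes these are 95% confidence intervals on the regression coefficients (the heading for this part is missing from the text) and that they are symmetric, so the midpoint recovers the point estimate.

```python
# (lower, upper) bounds as listed above.
# ASSUMPTION: these are 95% confidence intervals on the coefficients.
intervals = {
    "x1": (0.09734663, 1.41723155),
    "x2": (-1.9636464, 17.0012143),
    "x3": (-1.7954385, 6.76159563),
    "x4": (-1.7939136, 0.8316431),
}


def contains_zero(lower, upper):
    """True when zero is a plausible value for the coefficient."""
    return lower <= 0.0 <= upper


def midpoint(lower, upper):
    """Centre of a symmetric interval, i.e. the implied point estimate."""
    return (lower + upper) / 2.0


zero_inside = {name: contains_zero(*ci) for name, ci in intervals.items()}
for name, ci in intervals.items():
    status = "contains 0" if zero_inside[name] else "excludes 0"
    print(f"{name}: estimate ~ {midpoint(*ci):.4f}, interval {status}")
```

Only the interval for x1 excludes zero, which agrees with x1 being the only regressor with a p-value below 0.05.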
h. Find a 95% prediction interval on the power consumption when x1 = 75, x2 = 24, x3 =
90, and x4 = 98.
i. Calculate R² for this model. Interpret.
SSR = 5600.452812
SST = 6572.916667
R² = SSR / SST = 5600.452812 / 6572.916667 = 0.852049873
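The arithmetic can be reproduced with the short sketch below. An R² of about 0.852 means that roughly 85.2% of the variability in power consumption is explained by the fitted model. The adjusted R² is added as a supplementary check and relies on the assumption of n = 12 observations and k = 4 regressors, which the text does not state explicitly.

```python
# Sums of squares as reported above.
SSR = 5600.452812
SST = 6572.916667

# ASSUMPTIONS (needed only for the adjusted R^2).
N_OBS = 12
N_REGRESSORS = 4


def r_squared(ssr, sst):
    """Proportion of total variability explained by the regression."""
    return ssr / sst


def adjusted_r_squared(ssr, sst, n, k):
    """R^2 penalised for the number of regressors in the model."""
    sse = sst - ssr
    return 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))


r2 = r_squared(SSR, SST)
r2_adj = adjusted_r_squared(SSR, SST, N_OBS, N_REGRESSORS)
print(f"R^2 = {r2:.9f}")
print(f"adjusted R^2 = {r2_adj:.4f}")
```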
j. Plot the residuals versus ŷ. Interpret this plot.
The scatterplot shows that most of the standardized residuals fall in the range of -10 to
+10, with a few points outside that range indicating unusual observations. The points are
still scattered randomly around the zero-residual line, which means that the linear model
is still appropriate for these data.
k. Construct a normal probability plot of the residuals and comment on the normality
assumption.
The points in Figure 2 fall approximately along a straight line rising from left to
right. In a normal probability plot, this pattern indicates that the residuals are
approximately normally distributed, so the normality assumption appears
reasonable.
Hypothesis
Test at α= 0.05
Calculations
Source of variation   Sum of squares   Degrees of freedom   Mean square   Computed F   P-value   F critical
Total                 84.55            19
Analysis
Conclusion
There is sufficient evidence to conclude that distance has an effect on the subject at a level of
significance of α = 0.05.