Chapter 16
Multivariate Analysis I: Multiple Regression Analysis
Learning Objectives
Upon completion of this chapter, you will be able to:
Understand the applications of the multiple regression model
Understand the concept of coefficient of multiple determination, adjusted coefficient of multiple determination, and standard error of the estimate
Understand and use residual analysis for testing the assumptions of multiple regression
Use statistical significance tests for the regression model and coefficients of regression
Test portions of the multiple regression model
Understand the non-linear regression model and the quadratic regression model, and test the statistical significance of the overall quadratic regression model
Understand the concept of model transformation in regression models
Understand the concept of collinearity and the use of variance inflationary factors in multiple regression
Understand the conceptual framework of model building in multiple regression
Example 16.1
A consumer electronics company has adopted an aggressive policy to increase the sales of a newly launched product. The company has invested in advertisements and has also employed salesmen to increase sales rapidly. Table 16.2 presents the sales, the number of salesmen employed, and the advertisement expenditure for 24 randomly selected months. Develop a regression model to assess the impact of advertisement expenditure and the number of salesmen on sales.
Table 16.2: Sales, number of salesmen employed, and advertisement expenditure for 24 randomly selected months of a consumer electronics company
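As a supplementary illustration (not part of the original worked solution), the sketch below shows how such a two-predictor model could be fitted in Python with statsmodels; the file name and the column names sales, salesmen, and advertisement are assumptions, since Table 16.2 is supplied in the companion files.

```python
# A minimal sketch, assuming the Table 16.2 data are available as a CSV file
# with columns "sales", "salesmen", and "advertisement" (all names assumed).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_2.csv")   # hypothetical file name

# Multiple regression of sales on the number of salesmen and advertisement expenditure
model = smf.ols("sales ~ salesmen + advertisement", data=df).fit()

# The summary reports the coefficients with their t statistics and p-values,
# R square, adjusted R square, the standard error, and the overall F statistic.
print(model.summary())
```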
In multiple regression, the coefficient of multiple determination is the proportion of the variation in the dependent variable y that is explained by the combination of the independent (explanatory) variables.
This implies that 73.90% of the variation in sales is explained by the variation in the number of salesmen employed and the variation in the advertisement expenditure.
Adjusted R Square
Adjusted R square is commonly used when a researcher wants to compare two or more regression models having the same dependent variable but different numbers of independent variables.
This indicates that 71.42% of the total variation in sales can be explained by the multiple regression model adjusted for the number of independent variables and sample size.
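As a quick cross-check of this figure, adjusted R square can be computed directly from R square, the sample size n, and the number of independent variables k as adjusted R square = 1 − (1 − R²)(n − 1)/(n − k − 1). A minimal sketch with the values reported above (R² = 0.7390, n = 24, k = 2):

```python
# Adjusted R square from R square, sample size n, and number of predictors k
r2, n, k = 0.7390, 24, 2
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(adj_r2, 4))   # about 0.714, matching the 71.42% quoted above up to rounding
```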
Figures 16.8 & 16.9: Partial regression output from MS Excel and Minitab showing coefficient of multiple determination, adjusted R square, and standard error
Figure 16.18(a): Computation of the F statistic using MS Excel (partial output for Example 16.1)
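As a rough cross-check of the value shown in Figure 16.18(a), the overall F statistic can also be recovered from R square alone using F = (R²/k) / ((1 − R²)/(n − k − 1)). The sketch below uses the rounded R square reported earlier, so the result may differ slightly from the figure.

```python
# Overall F statistic for the regression, recovered from R square
r2, n, k = 0.7390, 24, 2
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
print(round(f_stat, 2))   # about 29.7 with these rounded inputs
```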
The hypotheses for testing the regression coefficient of each independent variable can be set as H0: βj = 0 (the jth independent variable has no significant linear relationship with the dependent variable) against H1: βj ≠ 0.
Figure 16.19(a): Computation of the t statistic using MS Excel (partial output for Example 16.1)
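If the fitted model object from the earlier Python sketch is still in memory, the same t statistics and p-values that the Excel output reports can be read off directly; a minimal continuation of that sketch:

```python
# t statistic and two-tailed p-value for each regression coefficient;
# reject H0: beta_j = 0 when the p-value falls below the chosen level of significance.
print(model.tvalues)
print(model.pvalues)
```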
Figure 16.22: Existence of a non-linear (quadratic) relationship between the dependent and independent variable (β2 is the coefficient of the quadratic term)
Example 16.2
A leading consumer electronics company has 125 retail outlets in the country. The company spent heavily on advertisement in the previous year. It wants to estimate the effect of advertisements on sales. This company has taken a random sample of 21 retail stores from the total population of 125 retail stores. Table 16.5 provides the sales and advertisement expenses (in thousand rupees) of 21 randomly selected retail stores.
Table 16.5: Sales and advertisement expenses of 21 randomly selected retail stores
Fit an appropriate regression model. Predict the sales when advertisement expenditure is Rs 28,000.
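As a supplementary sketch of how the quadratic model could be fitted and used for the Rs 28,000 prediction in Python, assuming the Table 16.5 data are available with columns sales and adv (advertisement, in thousand rupees); the file and column names are assumptions.

```python
# A minimal sketch for the quadratic regression model of Example 16.2, assuming
# Table 16.5 is available as a CSV with columns "sales" and "adv" (names assumed).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_5.csv")   # hypothetical file name
df["adv_sq"] = df["adv"] ** 2        # quadratic term

quad = smf.ols("sales ~ adv + adv_sq", data=df).fit()
print(quad.summary())

# Predicted sales when advertisement expenditure is Rs 28,000, i.e. adv = 28 (thousand)
new = pd.DataFrame({"adv": [28], "adv_sq": [28 ** 2]})
print(quad.predict(new))
```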
Using MS Excel, Minitab, and SPSS for the Quadratic Regression Model
Ch 16 Solved Examples\Excel\Ex 16.2 quadratic1.xls
Ch 16 Solved Examples\Minitab\EX 16.2 QUADRATIC 1.MPJ
Ch 16 Solved Examples\SPSS\Ex 16.2 Quadratic.sav
Ch 16 Solved Examples\SPSS\Ouput Ex 16.2.spv
A Case When the Quadratic Regression Model Is a Better Alternative to the Simple Regression Model
Figure 16.31: Fitted line plot for Example 16.2 (simple regression model) produced using Minitab
Figure 16.33: Fitted line plot for Example 16.2 (quadratic regression model) produced using Minitab
The F statistic is used for testing the significance of the overall quadratic regression model, just as it is in the simple regression model.
Testing the Quadratic Effect of a Quadratic Regression Model
The t statistic is used for testing the significance of the quadratic effect in a quadratic regression model.
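Continuing the Example 16.2 sketch, the quadratic effect can be examined through the t statistic of the squared term, with H0: β2 = 0 (no quadratic effect) against H1: β2 ≠ 0; adv_sq is the assumed name of that term.

```python
# t test of the quadratic effect: H0: beta_2 = 0 versus H1: beta_2 != 0
print(quad.tvalues["adv_sq"], quad.pvalues["adv_sq"])
```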
Regression models are based on the assumption that all the independent (explanatory) variables are numerical in nature. There may, however, be cases where some of the variables are qualitative, generating nominal or ordinal information. To include such variables in multiple regression, they are represented by indicator or dummy variables. Researchers usually assign the codes 0 and 1 to a dummy variable; the assignment of 0 or 1 is arbitrary, and the numbers merely stand for the categories. A particular dummy variable xd is defined as xd = 0 if the observation belongs to the first category and xd = 1 if it belongs to the second category.
Example 16.3
A company wants to test the effect of age and gender on the productivity (in terms of units produced by the employees per month) of its employees. The HR manager has taken a random sample of 15 employees and collected information about their age and gender. Table 16.6 provides data about the productivity, age, and gender of 15 randomly selected employees. Fit a regression model considering productivity as the dependent variable and age and gender as the explanatory variables.
Table 16.6: Data about productivity, age, and gender of 15 randomly selected employees.
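A minimal Python sketch of how the Example 16.3 model could be fitted, assuming the Table 16.6 data are available with columns productivity, age, and gender (labels "Male"/"Female"); all of these names, and the file, are assumptions.

```python
# A minimal sketch for the dummy variable regression model of Example 16.3.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_6.csv")                           # hypothetical file name
df["gender_d"] = df["gender"].map({"Male": 0, "Female": 1})  # arbitrary 0/1 coding

model = smf.ols("productivity ~ age + gender_d", data=df).fit()
print(model.summary())
```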
Using MS Excel, Minitab and SPSS for the Dummy Variable Regression Model
Ch 16 Solved Examples\Excel\Ex 16.3 dummy & interaction.xls
Ch 16 Solved Examples\Minitab\EX 16.3 DUMMY & INTERACTION.MPJ
Ch 16 Solved Examples\SPSS\Ex 16.3.sav
Ch 16 Solved Examples\SPSS\Output Ex 16.3 (Interaction).spv
Example 16.4
A furniture company receives 12 lots of wooden plates. Each lot is examined for defective items by the firm's quality control inspector, whose report is given in Table 16.8.
Taking batch size as the independent variable and the number of defectives as the dependent variable, fit an appropriate regression model and transform the independent variable if required.
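A minimal Python sketch of the square-root transformation of the independent variable for this example, assuming the Table 16.8 data are available with columns batch_size and defectives; the file and column names are assumptions.

```python
# A minimal sketch of the square-root transformation for Example 16.4.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_8.csv")             # hypothetical file name
df["sqrt_batch"] = np.sqrt(df["batch_size"])   # transformed independent variable

model = smf.ols("defectives ~ sqrt_batch", data=df).fit()
print(model.summary())
```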
Using MS Excel, Minitab and SPSS for the Square Root Transformation
Ch 16 Solved Examples\Excel\Ex 16.4 square root transformation.xls
Ch 16 Solved Examples\Minitab\Ex 16.4 transformation.MPJ
Ch 16 Solved Examples\SPSS\Ex 16.4 (Square root).sav
Ch 16 Solved Examples\SPSS\Output Ex 16.4.spv
Logarithm Transformation
The logarithm transformation is often used to overcome violations of the constant error variance (homoscedasticity) assumption and to convert a non-linear model into a linear model.
Example 16.5
The data related to the sales turnover and advertisement expenditure of a company for 15 randomly selected months are given in Table 16.10.
Taking sales as the dependent variable and advertisement as the independent variable, fit a regression line using a log transformation of the variables.
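A minimal Python sketch of a log-log fit for this example, assuming the Table 16.10 data are available with columns sales and adv; the file and column names are assumptions.

```python
# A minimal sketch of the log transformation for Example 16.5.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_10.csv")   # hypothetical file name
df["log_sales"] = np.log(df["sales"])
df["log_adv"] = np.log(df["adv"])

model = smf.ols("log_sales ~ log_adv", data=df).fit()
print(model.summary())
```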
Collinearity
In multiple regression analysis, when two independent variables are correlated, it is referred to as collinearity; when three or more variables are correlated, it is referred to as multicollinearity. Collinearity is measured by the variance inflationary factor (VIF) for each explanatory variable. If an explanatory variable is uncorrelated with the other explanatory variables, its VIF is equal to 1. A VIF greater than 10 indicates a serious multicollinearity problem. Collinearity is not very simple to handle in multiple regression; one straightforward solution is to drop one of the collinear variables from the regression equation.
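For each explanatory variable xj, VIFj = 1/(1 − Rj²), where Rj² is obtained by regressing xj on the remaining explanatory variables. A minimal sketch of how the VIFs could be computed with statsmodels; the file and its contents are assumptions.

```python
# A minimal sketch of VIF computation, assuming the explanatory variables are
# stored in a CSV file (one column per predictor; file name assumed).
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = pd.read_csv("predictors.csv")   # hypothetical file of explanatory variables
X = sm.add_constant(X)              # include an intercept column

# VIF for each explanatory variable (the constant column is skipped)
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=X.columns[1:],
)
print(vif)   # values above about 10 signal a serious multicollinearity problem
```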
Example 16.6
Table 16.13 provides modified data for the consumer electronics company discussed in Example 16.1, with two new variables added: the number of showrooms and showroom age. Fit an appropriate regression model.
Ch 16 Solved Examples\Minitab\EX 16.6 MODEL BUILDING.MPJ
Ch 16 Solved Examples\SPSS\Ex 16.6 (17).sav
Ch 16 Solved Examples\SPSS\Output Ex 16.6 (Model Building).spv
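One simple way to approach the model-building exercise is to fit every subset of the candidate predictors and compare the models by adjusted R square (the statistic suited to comparing models with different numbers of independent variables). A rough sketch under assumed file and column names:

```python
# A rough model-building sketch for Example 16.6: fit every subset of the candidate
# predictors and rank the models by adjusted R square. All names are assumptions.
from itertools import combinations

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("table_16_13.csv")   # hypothetical file name
candidates = ["salesmen", "advertisement", "showrooms", "showroom_age"]

results = []
for r in range(1, len(candidates) + 1):
    for subset in combinations(candidates, r):
        formula = "sales ~ " + " + ".join(subset)
        fit = smf.ols(formula, data=df).fit()
        results.append((fit.rsquared_adj, formula))

# Highest adjusted R square first
for adj_r2, formula in sorted(results, reverse=True):
    print(round(adj_r2, 4), formula)
```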