BAN 602 - Project2
Part 1: Introduction
The intent of this study is to analyze the impact of class tardies in a given week on a student’s grade
point average (GPA). Our dependent variable is student GPA, and the number of class tardies is
our independent variable. This direction makes sense: the number of tardies a student accumulates
each week is more likely to affect GPA than a student’s GPA is to affect the number of
tardies.
Since a student’s GPA is calculated from the grades they receive in class, and earning good grades
is harder for a student who is tardy (late) to class, it is likely that there will be a negative
correlation between these two variables. That is, as student tardies in a given week increase, we
expect to see a decrease in student GPA.
A total of 30 observations were collected. The definitions of the variables, along with the
descriptive statistics, are in the table below.
GPA = β0 + β1(Tardies) + u
Below is a scatter plot with the fitted linear trend line. The trend line is the result of a Simple Linear
Regression and it conforms to the estimate in the table above.
The estimate from the Simple Linear Regression model is shown in the equation below. For this model,
we put a hat (^) on the dependent variable to remind us that this is an estimate from data:
ĜPA = 3.17 - 0.14(Tardies)
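As a minimal sketch of how the intercept and slope of a Simple Linear Regression are obtained, the least-squares formulas can be written out directly. The function name `fit_simple_ols` is hypothetical, and the inputs are placeholder points that lie exactly on the reported line, not the 30 survey observations; they only confirm that the calculation recovers the reported coefficients.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary to estimate a slope")
    slope = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Placeholder inputs, NOT the survey data: points generated from the
# reported equation, used only to check the calculation.
tardies = [0, 2, 4, 6, 8, 10]
gpa = [3.17 - 0.14 * t for t in tardies]

b0, b1 = fit_simple_ols(tardies, gpa)
print(round(b0, 2), round(b1, 2))
```

With the actual survey data in place of the placeholder lists, the same function would be expected to reproduce the estimates in the report.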
The interpretation of the intercept term is that if a student had zero tardies, the student’s
GPA would be 3.17. This value seems reasonable given our data, but we should be cautious, since it is
unlikely that a student’s GPA is determined entirely by tardies to class.
Tardies vs. GPA Regression Analysis
BAN 602 – Dr. Curtis Price
Fall 2021
The estimated impact of tardies on GPA is negative, indicating that the more tardies a student
accumulates in a given week, the lower the student’s predicted GPA. The coefficient on the independent
variable is -0.14, indicating that each additional tardy is associated with a 0.14-point drop in GPA.
The negative sign is what we expected, as stated at the beginning: as students accumulate more
tardies, they lose learning time and miss potential “top of class” assignments. It is difficult to
tell whether the size of the effect is reasonable, but its direction is in line with the anticipated
negative correlation.
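To make the intercept and slope interpretations concrete, the fitted line can be evaluated at a few tardy counts. This is only an illustrative sketch: the function name `predicted_gpa` is hypothetical, and the coefficients are the rounded values from the fitted equation.

```python
# Rounded coefficients taken from the fitted equation in the report.
INTERCEPT = 3.17
SLOPE = -0.14


def predicted_gpa(tardies):
    """Predicted GPA from the fitted line for a given number of weekly tardies."""
    return INTERCEPT + SLOPE * tardies


# Zero tardies returns the intercept; each extra tardy lowers the prediction by 0.14.
for t in (0, 1, 5):
    print(t, round(predicted_gpa(t), 2))
```

Under these rounded coefficients, a student with 5 tardies in a week has a predicted GPA of about 2.47, which is 0.70 points below the zero-tardy prediction.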
Random sampling
The data were not collected by random sampling, which is a problem if we want to make
inferences from these results about a broader population. Because this is a convenience sample,
we should hesitate to assume that the results are typical. First, the data were collected at a
K-12 charter school, so it is unclear whether students at non-charter schools, or students at
particular grade levels, would show similar results. Second, having students self-report
their data, rather than having it verified by an unbiased source, could skew or bias the data.
Ideally, students would present proof of their GPA and tardy records so that the data are
more accurate and reliable.
Although 30 data points were collected, it is concerning that the sample does not cover
a broader range of GPAs, specifically between 0.0 and 2.0: only 8 of the
surveyed students fell in this range. Additionally, few students reported
more than 8 tardies in a given week; only 4 of the surveyed students
did so. We would need to analyze schoolwide GPA and tardy data to know whether the averages
in our sample are reliable and representative.
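The coverage concern can be put in percentage terms using only the counts stated above. The short calculation below is a sketch; the helper name `share` is hypothetical.

```python
SAMPLE_SIZE = 30       # total observations reported
LOW_GPA_COUNT = 8      # students with a GPA between 0.0 and 2.0
HIGH_TARDY_COUNT = 4   # students with more than 8 tardies in a week


def share(count, total):
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


low_gpa_share = share(LOW_GPA_COUNT, SAMPLE_SIZE)
high_tardy_share = share(HIGH_TARDY_COUNT, SAMPLE_SIZE)
print(low_gpa_share, high_tardy_share)  # 26.7 13.3
```

Roughly a quarter of the sample sits in the low-GPA range and about one in eight in the high-tardy range, so the fitted line relies on few observations at those extremes.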
The error term likely contains many factors other than tardies that affect GPA. For
example, grade level, difficulty of classes, and individual teachers’ grading scales would certainly
affect GPA, but are excluded from the data.
Overall, it is hard to take the estimate seriously. We would need more data about
the students in this sample, and we would also need confidence that the independent variable is
not correlated with factors that have been excluded from the analysis. While this estimate does tell us
something about the relationship between GPA and tardies, it is likely that tardies alone are not
driving this relationship. To understand it better, we should include more data
and detail in our analysis. Student data such as grade level and course load difficulty would make sense
to include in future analysis.
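The concern about excluded factors can be illustrated with a small, fully hypothetical example. Every number below is invented for illustration and none comes from the survey: a made-up "difficulty" score rises with tardies and also lowers GPA, so a regression of GPA on tardies alone picks up both effects. The names `ols_slope`, `TRUE_TARDY_EFFECT`, and `DIFFICULTY_EFFECT` are assumptions of this sketch.

```python
def ols_slope(x, y):
    """Least-squares slope from regressing y on x alone."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# HYPOTHETICAL values, chosen only to show the mechanism.
tardies = [0, 1, 2, 3, 4, 5, 6, 7]
difficulty = [1, 1, 2, 2, 3, 3, 4, 4]   # invented course-load score
TRUE_TARDY_EFFECT = -0.05               # invented effect of one tardy
DIFFICULTY_EFFECT = -0.20               # invented effect of one difficulty point

# GPA generated with no noise so the algebra holds exactly.
gpa = [3.5 + TRUE_TARDY_EFFECT * t + DIFFICULTY_EFFECT * d
       for t, d in zip(tardies, difficulty)]

short_slope = ols_slope(tardies, gpa)    # regression that omits difficulty
delta = ols_slope(tardies, difficulty)   # how difficulty moves with tardies

# The short-regression slope equals the tardy effect plus the omitted
# factor's effect scaled by its relationship with tardies.
print(round(short_slope, 3),
      round(TRUE_TARDY_EFFECT + DIFFICULTY_EFFECT * delta, 3))
```

In this invented setup the slope on tardies overstates the size of the tardy effect, which is the kind of distortion that adding grade level and course load difficulty to a future analysis would help detect.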