PM Alternate Project
PM Alternate Project
PM Alternate Project
Jupyter Notebook file: This is a must and will be used for reference while evaluating.
Any assignment found copied/ plagiarized with another person will not be graded and marked as
zero. Please ensure timely submission as a post-deadline assignment will not be accepted.
Problem 1: Linear Regression
You are a part of an investment firm and your work is to do research about these 759 firms. You
are provided with the dataset containing the sales and other attributes of these 759 firms. Predict
the sales of these firms on the bases of the details given in the dataset so as to help your
company in investing consciously. Also, provide them with 5 attributes that are most important.
1. dvcat: factor with levels (estimated impact speeds) 1-9km/h, 10-24, 25-39, 40-54, 55+
2. weight: Observation weights, albeit of uncertain accuracy, designed to account for varying
sampling probabilities. (The inverse probability weighting estimator can be used to demonstrate
causality when the researcher cannot conduct a controlled experiment but has observed data to
model)
3. Survived: factor with levels Survived or not_survived
4. airbag: a factor with levels none or airbag
5. seatbelt: a factor with levels none or belted
6. frontal: a numeric vector; 0 = non-frontal, 1=frontal impact
7. sex: a factor with levels f: Female or m: Male
8. ageOFocc: age of occupant in years
9. yearacc: year of accident
10. yearVeh: Year of model of vehicle; a numeric vector
11. abcat: Did one or more (driver or passenger) airbag(s) deploy? This factor has levels deploy,
nodeploy and unavail
12. occRole: a factor with levels driver or pass: passenger
13. deploy: a numeric vector: 0 if an airbag was unavailable or did not deploy; 1 if one or more
bags deployed.
14. injSeverity: a numeric vector; 0: none, 1: possible injury, 2: no incapacity, 3: incapacity, 4:
killed; 5: unknown, 6: prior death
15. caseid: character, created by pasting together the populations sampling unit, the case
number, and the vehicle number. Within each year, use this to uniquely identify the vehicle.