Assignment 2 Problem
Alvin Technologies, a large software development and project management company located
in Southern India, recruits nearly 2000 engineers with a specialization in Computer Science or
Information Technology from the industry. Its need for manpower has been rising every
year. In the Indian software job market, engineers are plentiful, but the quality of most
of them does not meet the standard set by the top management of Alvin Technologies.
Hence, the recruitment screening process is conducted meticulously to identify the
candidates with the right potential, and after a rigorous five-round interview process, the best
of the lot are extended offer letters.
The HR Head of Alvin, Mr. Suresh KG, observed a trend: roughly 30-40% of the candidates
who were given a joining offer rejected it, either for better opportunities or for
unknown reasons. This was a serious worry for Mr. Suresh, since the cost and effort put in to
zero in on the final candidates was substantial. He was left pondering why so many
individuals decided not to be part of Alvin, despite the firm taking care of major exigencies for
new joinees, such as relocation expense coverage and flexible joining time to avoid notice
period blues. Alvin was considered a good paymaster, and hence Mr. Suresh could not find
any satisfactory explanation for this pattern. One of the junior staff members in the HR
department, Mr. Niraj Babu, approached the HR Head and suggested that he might be able to
throw some light on the matter. Niraj had been enrolled in an HR Analytics certification course
at a premier B-School in India for the past three months. This program was sponsored by Alvin,
and Niraj felt obliged to help his department head with the pressing issue by sharing some
analytic insights that he had picked up in his course. Niraj sought permission to access the
existing data set of roughly 7000 employees who had appeared for interviews up to the offer
letter stage at the firm in the past three years. He first cleaned the existing data of anomalies
such as missing values and outliers. He then fit an ML model to the data to understand the
pattern behind the event of joining / not joining. Lastly, he applied the trained model to a data
set of new candidates who were about to be extended offer letters by the firm.
QUESTIONS
1) Conduct a predictive analysis on the Training data to test which of the independent
variables affect the candidates' chance of joining, and by what odds.
2) Develop an ML model on the Training data using Logistic Regression, and test how
accurately the outcome (joined / not joined) is predicted in terms of precision, recall,
specificity and AUC.
3) Predict which of the recently interviewed candidates in the fresh data (Test data) are
most likely to join, using the ML model found to have the higher accuracy.
Answer
As a rule of thumb, a variable is taken to affect the chance of joining if its odds ratio is above 1.1
or below 0.9.
Notice period has an odds ratio of 1.4, i.e., the odds of a candidate joining are 1.4 times as high if
there is a notice period.
Relocation required has an odds ratio of 51.7, i.e., the odds of a candidate joining are 51.7 times as
high if the candidate is offered relocation.
Joining bonus has an odds ratio of 0.62 (< 0.9), i.e., the odds of a candidate joining are multiplied by
0.62 (about 38% lower) if a joining bonus is given.
The remaining factors have odds ratios between 0.9 and 1.1, so they can be omitted, or it can be
concluded that they do not affect whether a candidate joins.
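To make the interpretation above concrete: in a logistic regression, each odds ratio is the exponential of that variable's coefficient, so an odds ratio of 1 means no effect. The minimal Python sketch below takes the three odds ratios reported above, recovers the implied coefficients and applies the 0.9 to 1.1 rule of thumb. The function name `describe` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
import math

# Odds ratios as reported in the answer above.
reported_odds_ratios = {
    "Notice period": 1.4,
    "Relocation required": 51.7,
    "Joining bonus": 0.62,
}

def describe(odds_ratio, low=0.9, high=1.1):
    """Return (coefficient, % change in odds, effect label) for one odds ratio.

    `low` and `high` encode the 0.9-1.1 rule of thumb used in the text.
    """
    coef = math.log(odds_ratio)        # logistic regression coefficient = ln(OR)
    pct = (odds_ratio - 1.0) * 100.0   # percent change in the odds of joining
    if odds_ratio > high:
        effect = "raises odds"
    elif odds_ratio < low:
        effect = "lowers odds"
    else:
        effect = "negligible"
    return coef, pct, effect

for name, value in reported_odds_ratios.items():
    coef, pct, effect = describe(value)
    print(f"{name}: OR={value}, coef={coef:.3f}, change in odds={pct:+.0f}% ({effect})")
```

Note that an odds ratio describes a change in the odds of joining, not in the probability itself; the two are close only when the probability is small.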
Joining Bonus
Ans 2
Confusion Matrix
ROC Analysis
Rank
Test and Score
Average
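The four measures asked for in Question 2 can all be derived from the true labels, the predicted labels and the predicted scores. The sketch below shows one way to compute them in plain Python. The eight labels and scores are made-up illustrative values, not the results of this assignment, and the 0.5 cut-off is an assumption.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall and specificity from 0/1 labels (1 = joined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # joined, predicted joined
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not joined, predicted not joined
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not joined, predicted joined
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # joined, predicted not joined
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

def auc(y_true, scores):
    """AUC as the share of (joined, not joined) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical example data (NOT from the Alvin data set).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, round(area, 3))
```

Precision, recall and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds.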
Ans 3
Prediction
People most likely to join
Most likely to not join
There is a high chance that almost all candidates will join; the exceptions are 22 people who are
predicted not to join.
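The prediction step can be sketched as follows: a fitted logistic regression turns each candidate's variables into a probability of joining, which is then compared with a cut-off. In this sketch the coefficients are recovered from the odds ratios reported in the first answer, while the intercept, the 0.5 cut-off, the 0/1 coding of the variables and the three candidates are hypothetical, so the printed probabilities are illustrative only.

```python
import math

# Coefficients = ln(odds ratio), using the odds ratios reported in the first answer.
COEFS = {
    "notice_period": math.log(1.4),
    "relocation": math.log(51.7),
    "joining_bonus": math.log(0.62),
}
INTERCEPT = -1.0   # hypothetical: the fitted intercept is not given in the text
THRESHOLD = 0.5    # assumed cut-off for "likely to join"

def join_probability(candidate):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef * x))))."""
    z = INTERCEPT + sum(COEFS[name] * candidate[name] for name in COEFS)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical new candidates; each variable is coded 1 (yes) or 0 (no).
candidates = [
    {"id": "C1", "notice_period": 1, "relocation": 1, "joining_bonus": 0},
    {"id": "C2", "notice_period": 0, "relocation": 0, "joining_bonus": 1},
    {"id": "C3", "notice_period": 1, "relocation": 0, "joining_bonus": 0},
]

predictions = []
for c in candidates:
    p = join_probability(c)
    label = "join" if p >= THRESHOLD else "not join"
    predictions.append((c["id"], label))
    print(f"{c['id']}: P(join)={p:.3f} -> {label}")
```

Sorting candidates by predicted probability, rather than only applying the cut-off, gives the "most likely to join" and "most likely to not join" lists.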