Description
Background & Context
Employee promotion means the advancement of an employee to a higher rank, and it is the aspect of a job that motivates employees the most. It is the ultimate reward for dedication and loyalty towards an organization, and the HR team plays an important role in handling promotion decisions based on ratings and other available attributes.
The HR team at JMD company stored data from last year's promotion cycle, which contains details of all the employees who worked at the company last year and whether or not they were promoted. Every cycle, however, the process gets delayed because so many details are available for each employee that it becomes difficult to compare candidates and decide.
So this time the HR team wants to use the stored data to build a model that will predict whether a person is eligible for promotion.
As a data scientist at JMD company, you need to come up with a model that will help the HR team predict whether a person is eligible for promotion.
Objective
Explore and visualize the dataset.
Build a classification model to predict whether an employee has a higher probability of getting a promotion
Optimize the model using appropriate techniques
Generate a set of insights and recommendations that will help the company
Data Dictionary:
employee_id: Unique ID for the employee
department: Department of employee
region: Region of employment (unordered)
education: Education Level
gender: Gender of Employee
recruitment_channel: Channel of recruitment for employee
no_of_trainings: Number of other trainings completed in the previous year on soft skills, technical skills, etc.
age: Age of Employee
previous_year_rating: Employee Rating for the previous year
length_of_service: Length of service in years
awards_won: 1 if awards were won during the previous year, else 0
avg_training_score: Average score in current training evaluations
is_promoted: (Target) Recommended for promotion
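As a starting point for the exploration step, the data dictionary above can be sketched as a small pandas DataFrame. The brief does not name the data file, so the rows below are made-up placeholder values used only to show the expected columns; with the real data you would load the file instead. The variable names (`features`, `target`, `missing_counts`, and so on) are illustrative assumptions.

```python
import pandas as pd

# Placeholder rows shaped like the data dictionary; every value is invented.
data = pd.DataFrame({
    "employee_id": [1, 2, 3, 4],
    "department": ["Sales", "Technology", "Sales", "HR"],
    "region": ["region_1", "region_2", "region_1", "region_3"],
    "education": ["Bachelors", None, "Masters", "Bachelors"],
    "gender": ["f", "m", "m", "f"],
    "recruitment_channel": ["sourcing", "other", "sourcing", "referred"],
    "no_of_trainings": [1, 2, 1, 3],
    "age": [35, 30, 41, 28],
    "previous_year_rating": [5.0, None, 3.0, 4.0],
    "length_of_service": [8, 1, 10, 3],
    "awards_won": [0, 0, 1, 0],
    "avg_training_score": [49.0, 80.0, 52.0, None],
    "is_promoted": [0, 0, 1, 0],
})

# employee_id is an identifier, not a predictor, so it is dropped with the target.
features = data.drop(columns=["employee_id", "is_promoted"])
target = data["is_promoted"]

# Quick checks that usually come first in EDA.
missing_counts = data.isna().sum()          # missing values per column
promotion_rate = target.mean()              # share of promoted employees
categorical_columns = features.select_dtypes(exclude="number").columns.tolist()

print(missing_counts[missing_counts > 0])
print(f"Promotion rate: {promotion_rate:.2%}")
print("Categorical columns:", categorical_columns)
```

Checking the promotion rate early matters because a heavily imbalanced target affects both the choice of metric and the need for resampling later on.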
Best Practices for Notebook :
The notebook should be well-documented, with inline comments explaining the functionality of code and markdown cells containing comments on the observations and insights.
The notebook should be run from start to finish sequentially before submission.
It is preferable to remove all warnings and errors before submission.
The notebook should be submitted as an HTML file (.html) and NOT as a notebook file (.ipynb)
Best Practices for Presentation :
Like in real-world projects, the ultimate destination of any project or work is generally an executive or decision-making meeting, where you are supposed to present your solution to the business problem, based on the project/work you have done. The purpose of this presentation is to simulate that kind of experience and to draw the attention of your audience (a business leader like CMO, COO, CFO, or CEO) to the key points of your project, which are
Business Overview of the problem and solution approach
Key findings and insights which can drive business decisions
Model overview and performance summary
Business recommendations
Please keep the following points in mind while making the presentation:
Focus on explaining the takeaways in an easy-to-understand manner.
Inclusion of the potential benefits of implementing the solution will give you the edge.
Copying and pasting from the notebook is not a good idea, and it is better to avoid showing code unless it is the focal point of your presentation.
Please submit the presentation in PDF format only.
Submission Guidelines :
There are two parts to the submission:
A well-commented Jupyter notebook [format - .html]
A presentation as you would present to the top management/business leaders [format - .pdf ] (you have to export/save the .pptx file as .pdf)
Any assignment found to be copied or plagiarized from other groups will not be graded and will be awarded zero marks
Please ensure timely submission as any submission post-deadline will not be accepted for evaluation
Submission will not be evaluated if,
it is submitted post-deadline, or,
more than 2 files are submitted
Happy Learning!!
Scoring guide (Rubric) - Employee Promotion Prediction
Criteria
Perform an Exploratory Data Analysis on the data
- Univariate analysis
- Bivariate analysis
- Use appropriate visualizations to identify the patterns and insights
- Any other exploratory deep dive
Illustrate the insights based on EDA
Key meaningful observations on the relationship between variables
Data Pre-processing
Prepare the data for analysis - Missing value treatment, Outlier detection (treat if needed, and explain why or why not), Feature engineering, Prepare data for modeling
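A minimal sketch of the missing-value treatment and encoding part of this step, assuming scikit-learn is used. The column subset, the toy values, and the choice of median and most-frequent imputation are illustrative assumptions, not requirements of the rubric; outlier treatment and feature engineering are not shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy frame with a few columns from the data dictionary; values are invented.
X = pd.DataFrame({
    "department": ["Sales", "HR", "Sales", "HR"],
    "education": ["Bachelors", np.nan, "Masters", "Bachelors"],
    "previous_year_rating": [5.0, np.nan, 3.0, 4.0],
    "avg_training_score": [49.0, 80.0, np.nan, 60.0],
})
numeric_columns = ["previous_year_rating", "avg_training_score"]
categorical_columns = ["department", "education"]

# Numeric columns: fill gaps with the median.
# Categorical columns: fill gaps with the most frequent level, then one-hot encode.
preprocessor = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric_columns),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_columns),
])

X_prepared = preprocessor.fit_transform(X)
print(X_prepared.shape)
```

Keeping imputation inside a transformer, rather than filling values on the full dataset up front, means the fill values are learned from the training split only once the transformer is placed in a modeling pipeline.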
Model building - Logistic Regression
- Make a logistic regression model
- Improve model performance by up and downsampling the data
- Regularize above models, if required
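The following is a rough sketch of the upsampling and downsampling idea with logistic regression, assuming scikit-learn. It runs on synthetic imbalanced data from `make_classification`, not on the promotion dataset, and uses `sklearn.utils.resample` for the resampling; other tools (such as the imbalanced-learn package) are equally valid. Recall is used here only as an example metric. The `C` argument controls the regularization strength.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic stand-in for an imbalanced target (roughly 10% positives).
X, y = make_classification(n_samples=2000, n_features=8,
                           weights=[0.9, 0.1], random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

# Resample the training split only, so the test split keeps the real class mix.
majority = X_train[y_train == 0]
minority = X_train[y_train == 1]

# Upsampling: draw minority rows with replacement until the classes match.
minority_up = resample(minority, replace=True,
                       n_samples=len(majority), random_state=1)
X_up = np.vstack([majority, minority_up])
y_up = np.concatenate([np.zeros(len(majority), dtype=int),
                       np.ones(len(minority_up), dtype=int)])

# Downsampling: keep a random subset of majority rows of minority size.
majority_down = resample(majority, replace=False,
                         n_samples=len(minority), random_state=1)
X_down = np.vstack([majority_down, minority])
y_down = np.concatenate([np.zeros(len(majority_down), dtype=int),
                         np.ones(len(minority), dtype=int)])


def fit_and_score(X_fit, y_fit, C=1.0):
    """Fit an L2-regularized logistic regression and return test recall."""
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_fit, y_fit)
    return recall_score(y_test, model.predict(X_test))


recalls = {
    "original": fit_and_score(X_train, y_train),
    "upsampled": fit_and_score(X_up, y_up),
    "downsampled": fit_and_score(X_down, y_down),
}
print(recalls)
```

Smaller values of `C` mean stronger regularization, so passing, for example, `C=0.1` to `fit_and_score` is one way to try the "regularize if required" step.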
Model building - Bagging and Boosting
- Build Decision tree, random forest, bagging classifier models
- Build XGBoost, AdaBoost, and gradient boosting models
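One possible way to build and compare these models is sketched below, assuming scikit-learn and synthetic data in place of the promotion dataset. XGBoost is left out because it lives in the separate xgboost package; its `XGBClassifier` follows the scikit-learn estimator interface and could be added to the `models` dictionary in the same way. The 3-fold cross-validation and the recall metric are illustrative choices.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for the prepared promotion features.
X, y = make_classification(n_samples=500, n_features=8,
                           weights=[0.85, 0.15], random_state=1)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=1),
    "Bagging": BaggingClassifier(random_state=1),
    "Random forest": RandomForestClassifier(n_estimators=50, random_state=1),
    "AdaBoost": AdaBoostClassifier(random_state=1),
    "Gradient boosting": GradientBoostingClassifier(random_state=1),
}

# Cross-validate each model with default settings and collect score and fit time.
rows = []
for name, model in models.items():
    result = cross_validate(model, X, y, cv=3, scoring="recall")
    rows.append({
        "model": name,
        "mean_recall": result["test_score"].mean(),
        "mean_fit_time_s": result["fit_time"].mean(),
    })

comparison = (pd.DataFrame(rows)
              .sort_values("mean_recall", ascending=False)
              .reset_index(drop=True))
print(comparison)
```

A table like `comparison` gives a simple, defensible basis for picking the three models to tune in the next step.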
Hyperparameter tuning using grid search
- Tune the best 3 models using grid search and provide the reason behind choosing those models
- Use pipelines in hyperparameter tuning
* Please note that XGBoost can take a significantly longer time to run, so if you have time complexity issues, you can avoid tuning XGBoost and tune the next best 3 models
Hyperparameter tuning using random search
- Tune the best 3 models using random search and provide the reason behind choosing those models
- Use pipelines in hyperparameter tuning
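The two tuning criteria above can be sketched with a scikit-learn `Pipeline` wrapped in `GridSearchCV` and `RandomizedSearchCV`. This is an illustration on synthetic data with a single model (a random forest) and a deliberately tiny parameter grid; the model, the grid values, the scaler step, and the recall metric are all assumptions made for the example. Each search is timed with `time.perf_counter` so the two can be compared.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=8,
                           weights=[0.85, 0.15], random_state=1)

# Pipeline steps are tuned through "<step name>__<parameter>" keys.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(random_state=1)),
])
param_grid = {
    "model__n_estimators": [20, 40],
    "model__max_depth": [3, 5, None],
    "model__min_samples_leaf": [1, 5],
}

# Grid search tries every combination: 2 * 3 * 2 = 12 candidates.
grid_search = GridSearchCV(pipeline, param_grid, scoring="recall", cv=3)
start = time.perf_counter()
grid_search.fit(X, y)
grid_seconds = time.perf_counter() - start

# Random search samples only n_iter combinations from the same space.
random_search = RandomizedSearchCV(pipeline, param_grid, n_iter=4,
                                   scoring="recall", cv=3, random_state=1)
start = time.perf_counter()
random_search.fit(X, y)
random_seconds = time.perf_counter() - start

print("Grid:  ", grid_search.best_params_, f"{grid_seconds:.2f}s")
print("Random:", random_search.best_params_, f"{random_seconds:.2f}s")
```

Because the random search evaluates fewer candidates, it usually finishes sooner, at the cost of possibly missing the best combination in the grid; the measured times and best scores are what the comparison in the notebook should comment on.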
Model Performances
- Compare the model performance of all the models
- Comment on the time taken by the grid and randomized search in optimization
Actionable Insights & Recommendations
- Business recommendations and insights
Presentation
- Overall quality
- Structure and flow
- Crispness
- Visual appeal
- All key insights and recommendations covered
Notebook
- Overall quality
- Structure and flow
- Well-commented code
Concepts used:
Imputation
Univariate and Bivariate analysis
Classification algorithms: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, AdaBoost, XGBoost.
If you need an implementation for any of the topics mentioned above, or assignment help on any of its variants, feel free to contact us.