Requirements:
Find two datasets: one for regression and one for classification.
Regression:
Linear regression, polynomial regression (up to degree 3), random forest, SVM
Classification:
Logistic regression, KNN, random forest, SVM
Project Requirements:
No. of rows >= 1000
No. of variables > 2
No. of classes for the dependent variable must be more than 2 for classification
Do K-fold cross-validation for both.
For regression show: R2, Adjusted R2, RMSE, correlation matrix, p-values of independent variables (codes 10)
For classification show: Accuracy, confusion matrix, (Macro recall and precision for multiclass Classification) (codes 10)
Do hyper-parameter tuning using Grid Search
The report should discuss the properties of the datasets, your results, and model performance comparisons, and inferences/conclusions. (10)
Prepare a report to discuss the properties of the datasets, your results, and inferences. (10)
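As a rough illustration of the regression metrics listed above, the sketch below runs 5-fold cross-validation on a linear regression and reports R2, Adjusted R2, and RMSE per fold. The synthetic data from `make_regression`, the fold count, and the helper name `adjusted_r2` are assumptions made for this example and are not part of the assignment.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import KFold


def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Synthetic stand-in data (assumption): 1000 rows, 5 independent variables.
X, y = make_regression(n_samples=1000, n_features=5, noise=10.0, random_state=0)

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []
for train_idx, test_idx in kfold.split(X):
    # Fit on the training folds, score on the held-out fold.
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    r2 = r2_score(y[test_idx], pred)
    fold_scores.append({
        "r2": r2,
        "adj_r2": adjusted_r2(r2, len(test_idx), X.shape[1]),
        "rmse": float(np.sqrt(mean_squared_error(y[test_idx], pred))),
    })

# Average each metric over the folds.
mean_scores = {k: float(np.mean([s[k] for s in fold_scores])) for k in fold_scores[0]}
print(mean_scores)
```

The correlation matrix and the p-values are not covered here. On a pandas DataFrame the former is usually available through `DataFrame.corr()`, while p-values typically come from a statistics package rather than from scikit-learn.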
Here is a solution that fulfills the above requirements:
Import Libraries
>>> import pandas as pd
>>> import numpy as np
>>> import matplotlib.pyplot as plt  # data visualization libraries
>>> import seaborn as sns
>>> %matplotlib inline  # IPython/Jupyter magic; omit in a plain Python script
Load Data
Creating methods to update column values
Applying these methods to the pandas DataFrames to update values
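The original helper methods are not shown, so the following is only a minimal sketch of the general pattern: write small functions that clean a single value, then apply them to a column with `Series.apply`. The toy DataFrame, the column names `quality` and `price`, and the functions `encode_quality` and `parse_price` are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are assumptions.
df = pd.DataFrame({
    "quality": ["low", "medium", "high", "medium"],
    "price": ["1,200", "950", "2,400", "1,100"],
})


def encode_quality(value):
    """Map an ordered text label to an integer class code."""
    return {"low": 0, "medium": 1, "high": 2}[value]


def parse_price(value):
    """Strip thousands separators and convert the text to a float."""
    return float(value.replace(",", ""))


# Apply each helper to its column and overwrite the original values.
df["quality"] = df["quality"].apply(encode_quality)
df["price"] = df["price"].apply(parse_price)
print(df)
```

For a pure lookup such as `encode_quality`, `Series.map` with a dictionary would work as well; `apply` is shown because it also handles helpers with arbitrary logic.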
The next steps implement logistic regression. If you need the complete solution with K-fold cross-validation for logistic regression classification, please contact us here or leave a comment in the comments section below.
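As a rough idea of what a logistic regression step with K-fold cross-validation and Grid Search can look like, here is a minimal sketch. It is not the complete solution mentioned above: the three-class synthetic data from `make_classification`, the 5 folds, and the grid of `C` values are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 1000 rows, 6 variables, 3 classes.
X, y = make_classification(n_samples=1000, n_features=6, n_informative=4,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=0)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Scale the features, then fit the classifier; tune C with Grid Search.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

# Out-of-fold predictions from the best configuration.
y_pred = cross_val_predict(search.best_estimator_, X, y, cv=cv)

accuracy = accuracy_score(y, y_pred)
cm = confusion_matrix(y, y_pred)
macro_precision = precision_score(y, y_pred, average="macro")
macro_recall = recall_score(y, y_pred, average="macro")

print("Best parameters:", search.best_params_)
print("Accuracy:", accuracy)
print("Macro precision:", macro_precision, "Macro recall:", macro_recall)
print(cm)
```

Because the same rows are used both to pick `C` and to report the scores, the numbers are somewhat optimistic. Keeping a separate test split, or nesting the grid search inside an outer cross-validation loop, gives a fairer estimate.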