DESCRIPTION
Dataset: Eopinions.csv
Dataset Description: The dataset has two columns: ‘class’ and ‘text’.
Problem Statement:
Epinions.com is a website where people can post reviews of products and services. It covers a wide variety of topics. For this case study, we downloaded a set of 600 posts about digital cameras and cars and saved them as “Eopinions.csv”.
Tasks to be performed:
Read the file as a pandas data-frame.
Perform Label Encoding on ‘class’ column.
Plot a bar graph to compare the frequencies of both the classes.
Preprocess the ‘text’ column.
Vectorize the text using CountVectorizer.
Split the dataset into two parts, “train.csv” and “test.csv”, containing 80% and 20% of the original data respectively. These are your train and test data. Make sure both parts keep the same class proportions as the original data.
Train a machine learning algorithm for classification and prepare a model (you can choose any appropriate algorithm).
Now test the model on the test data and evaluate its performance by providing the confusion matrix for your model.
Plot ROC Curve.
DATASETS
#Read the file as a pandas data-frame.
#Perform Label Encoding on ‘class’ column.
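A minimal sketch of the first two steps, reading the CSV and label-encoding the ‘class’ column. The helper names `load_reviews` and `encode_labels`, the new `label` column, and the sample rows (including the class names "Camera" and "Auto") are assumptions made for illustration; the real file may use different class names.

```python
import io

import pandas as pd
from sklearn.preprocessing import LabelEncoder


def load_reviews(source):
    """Read the reviews CSV and keep only the 'class' and 'text' columns."""
    df = pd.read_csv(source)
    return df[["class", "text"]].dropna().reset_index(drop=True)


def encode_labels(df):
    """Return a copy of df with an integer 'label' column built from 'class'."""
    encoder = LabelEncoder()
    out = df.copy()
    out["label"] = encoder.fit_transform(out["class"])
    return out, encoder


# Tiny in-memory stand-in for Eopinions.csv (made-up rows, hypothetical class names).
sample_csv = io.StringIO(
    "class,text\n"
    "Camera,Great zoom and sharp pictures\n"
    "Auto,Smooth ride and good mileage\n"
    "Camera,Battery drains too quickly\n"
    "Auto,The engine is noisy on the highway\n"
)

df, encoder = encode_labels(load_reviews(sample_csv))
print(df[["class", "label"]])
print(list(encoder.classes_))  # classes in the order of their integer codes
```

To run this on the real data, pass the file path instead of the in-memory sample, for example `load_reviews("Eopinions.csv")`.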
#Plot a bar graph to compare the frequencies of both the classes.
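One way to draw the class-frequency bar graph is sketched below with matplotlib. The function name `plot_class_frequencies` and the sample labels are hypothetical; with the real data the input would be the ‘class’ column of the data frame.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_class_frequencies(labels):
    """Draw one bar per class showing how many rows carry that label."""
    counts = pd.Series(labels).value_counts().sort_index()
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.bar([str(name) for name in counts.index], counts.values)
    ax.set_xlabel("class")
    ax.set_ylabel("number of reviews")
    ax.set_title("Class frequencies")
    return counts, ax


# Made-up labels standing in for df["class"].
labels = ["Camera", "Auto", "Camera", "Auto", "Camera"]
counts, ax = plot_class_frequencies(labels)
print(counts)
```

Call `plt.show()` afterwards to display the figure, or `ax.figure.savefig(...)` to write it to a file.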
#Vectorize the text using CountVectorizer
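The preprocessing and vectorizing steps could look like the sketch below. The cleaning rules (lowercasing, dropping everything except letters) and the use of `stop_words="english"` are assumptions, since the task does not say which preprocessing is expected; the two documents are made up.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer


def preprocess(text):
    """Lowercase, replace non-letter characters with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Made-up documents standing in for the 'text' column.
docs = [
    "Great zoom, and SHARP pictures!",
    "Smooth ride & good mileage (30 mpg).",
]
cleaned = [preprocess(d) for d in docs]

# Assumption: drop common English stop words while counting tokens.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(cleaned)  # sparse document-term count matrix

print(cleaned)
print(sorted(vectorizer.vocabulary_))  # learned vocabulary
print(X.toarray())
```

Each row of `X` corresponds to one review and each column to one vocabulary term, so `X` can be passed directly to a scikit-learn classifier.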
#Split the dataset into 2 parts namely “train.csv” and “test.csv” having 80% and 20% of the data
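A possible sketch of the 80/20 split uses `train_test_split` with `stratify` so that both files keep the class proportions of the original data. The helper name `split_and_save`, the fixed seed, the temporary output directory, and the synthetic 60/40 data frame are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd
from sklearn.model_selection import train_test_split


def split_and_save(df, out_dir=".", test_size=0.2, seed=42):
    """Stratified split on 'class'; writes train.csv and test.csv into out_dir."""
    train_df, test_df = train_test_split(
        df, test_size=test_size, stratify=df["class"], random_state=seed
    )
    train_df.to_csv(os.path.join(out_dir, "train.csv"), index=False)
    test_df.to_csv(os.path.join(out_dir, "test.csv"), index=False)
    return train_df, test_df


# Synthetic stand-in for the real data: 60 "Camera" rows and 40 "Auto" rows.
df = pd.DataFrame(
    {
        "class": ["Camera"] * 60 + ["Auto"] * 40,
        "text": [f"review number {i}" for i in range(100)],
    }
)

out_dir = tempfile.mkdtemp()  # assumption: write to a scratch directory
train_df, test_df = split_and_save(df, out_dir)

print(len(train_df), len(test_df))
print(train_df["class"].value_counts(normalize=True))
print(test_df["class"].value_counts(normalize=True))
```

Comparing the two `value_counts(normalize=True)` outputs is a quick check that the proportions match.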
#Train your machine learning algorithm for classification and prepare a model
#Now test the model on the Test data and evaluate the Performance by providing Confusion Matrix for your model.
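Training and evaluation might be sketched as follows. The task leaves the algorithm open; Multinomial Naive Bayes is used here only as one common choice for word-count features, and the tiny synthetic corpus is made up. In this sketch the vectorizer is fitted on the training texts only, so that no vocabulary from the test set leaks into the model.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Synthetic corpus: label 1 = camera reviews, label 0 = car reviews (assumed encoding).
camera_texts = [f"camera lens zoom battery photo {i}" for i in range(20)]
car_texts = [f"car engine mileage brakes ride {i}" for i in range(20)]
texts = camera_texts + car_texts
labels = [1] * 20 + [0] * 20

train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Learn the vocabulary from the training texts, then reuse it for the test texts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

model = MultinomialNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)
print("accuracy:", accuracy_score(y_test, y_pred))
```

With the real data, `train_texts` and `test_texts` would be the ‘text’ columns of train.csv and test.csv, and the labels would be the encoded ‘class’ values.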
#Plot ROC Curve
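The ROC curve can be sketched with `roc_curve` and `auc` from scikit-learn. The function name `plot_roc` and the four sample scores are hypothetical; in practice the scores would be the predicted probabilities of the positive class, for example `model.predict_proba(X_test)[:, 1]`.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve


def plot_roc(y_true, scores):
    """Plot the ROC curve for binary labels and positive-class scores."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    roc_auc = auc(fpr, tpr)
    fig, ax = plt.subplots(figsize=(5, 5))
    ax.plot(fpr, tpr, label=f"ROC curve (AUC = {roc_auc:.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_title("ROC curve")
    ax.legend(loc="lower right")
    return roc_auc, ax


# Made-up true labels and positive-class scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
roc_auc, ax = plot_roc(y_true, scores)
print("AUC:", roc_auc)
```

The dashed diagonal marks the performance of random guessing, so a useful classifier should produce a curve above it.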
Need the Python code for the above question.