
Regression, Clustering and Classification assignment help




Task 1: Regression


In this task you are required to apply a machine learning algorithm to the data set houseprice_data.csv, which can be downloaded from the assignment task on Canvas. This data set contains information about house sales in King County, USA. The data has 18 features, such as the number of bedrooms, bathrooms and floors, and a target variable: house price.


Using linear regression (simple or multiple), develop a model to predict the price of a house. After developing the model, you should also analyze the results and discuss the effectiveness of the model, outlining the improvements made while developing it.


Ideas to consider when completing this task:

• Is there a way of visualizing your model? (Possibly with just one or two input/feature variables.)

• How will you assess the effectiveness of the model?

• Include as many features as you can. Does the model improve?

• How could you make further improvements?

• What can you conclude about your model?
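
One way to approach the questions above about assessing the model and adding features is to compare a simple and a multiple regression on the same held-out test data. The sketch below assumes scikit-learn and NumPy, and uses synthetic data in place of houseprice_data.csv; the three feature names and their coefficients are hypothetical and are not taken from the real data set.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for houseprice_data.csv (features and values are hypothetical).
rng = np.random.default_rng(0)
n = 500
living_area = rng.uniform(50, 300, n)
bedrooms = rng.integers(1, 6, n)
age = rng.uniform(0, 80, n)
price = (2000 * living_area + 30000 * bedrooms - 1500 * age
         + rng.normal(0, 30000, n))

X = np.column_stack([living_area, bedrooms, age])
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)


def evaluate(columns):
    """Fit on the chosen feature columns; return (R^2, RMSE) on the test split."""
    model = LinearRegression().fit(X_train[:, columns], y_train)
    pred = model.predict(X_test[:, columns])
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return r2_score(y_test, pred), rmse


r2_simple, rmse_simple = evaluate([0])       # simple regression: one feature
r2_multi, rmse_multi = evaluate([0, 1, 2])   # multiple regression: all features

print(f"simple:   R^2={r2_simple:.3f}  RMSE={rmse_simple:,.0f}")
print(f"multiple: R^2={r2_multi:.3f}  RMSE={rmse_multi:,.0f}")
```

Scoring both models on the same test split keeps the comparison fair. With the real data, the single-feature model is also the easiest one to visualize, as a scatter plot of price against that feature with the fitted line drawn over it.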



Task 2: Clustering


In this task you are required to apply a machine learning algorithm to the data set country_data.csv, which can be downloaded from the assignment task on Canvas. This data set contains information about each country's child mortality, exports, health spending, etc.


Use clustering to investigate this data set. After clustering the data, you should analyze the results and discuss what can be concluded from the clusters.


Ideas to consider when completing this task:


• Is there a way of visualizing the clusters?

• Can you draw any conclusions about the clustering?

• Include as many features as you can. Does the clustering change?

• What advice would you give, in the context of the data, based on the clustering?
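
The brief does not name a clustering algorithm, so the following is only a sketch of one possible approach: k-means on standardised features, with the silhouette score used to compare different numbers of clusters. It assumes scikit-learn and NumPy. The data is synthetic and stands in for country_data.csv; the three columns mimic child mortality, exports and health spending, but every value is made up.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for country_data.csv: three hypothetical groups of
# countries described by (child mortality, exports, health spending).
rng = np.random.default_rng(0)
centres = np.array([[80.0, 30.0, 4.0], [20.0, 30.0, 9.0], [20.0, 70.0, 5.0]])
X = np.vstack(
    [c + rng.normal(0, [6.0, 5.0, 0.8], size=(50, 3)) for c in centres]
)

# The features are on very different scales, so standardise them before
# running a distance-based method such as k-means.
X_scaled = StandardScaler().fit_transform(X)

# Try several cluster counts and record the silhouette score for each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(float(s), 3) for k, s in scores.items()})
print("best k:", best_k)

# Describe each cluster by its mean in the original units to aid interpretation.
final = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_scaled)
for label in range(best_k):
    print(label, X[final.labels_ == label].mean(axis=0).round(1))
```

Reporting the cluster means in the original units, rather than in standardised units, makes it easier to turn the clusters into advice in the context of the data.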



Task 3: Classification & Neural Networks


In this task you are required to apply a variety of machine learning algorithms to the data set nba_rookie_data.csv, which can be downloaded from the assignment task on Canvas. This data set contains NBA rookie performance statistics, with the target variable Target_5Yrs equal to 1 if career length >= 5 years and 0 if career length < 5 years. The classification problem here is to predict whether a player will last 5 years in the NBA.


Apply Logistic Regression and Gaussian Naïve Bayes, and construct Neural Networks. After developing the various models, you should also analyze the results and discuss the effectiveness of the models, outlining the improvements made while developing them, and compare the approaches/algorithms used (strengths and weaknesses).
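
As a rough illustration of how the three model families could be compared on equal terms, the sketch below scores each with 5-fold cross-validated accuracy. It assumes scikit-learn and NumPy. The data is synthetic and stands in for nba_rookie_data.csv: the three features are hypothetical, and `y` only plays the role of Target_5Yrs. The network size and iteration limits are arbitrary starting points, not recommended settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for nba_rookie_data.csv (features are hypothetical).
rng = np.random.default_rng(0)
n = 600
games = rng.uniform(10, 82, n)
minutes = rng.uniform(5, 35, n)
points = 0.4 * minutes + rng.normal(0, 2, n)
score = (0.04 * (games - 46) + 0.10 * (minutes - 20)
         + 0.05 * points - 0.4 + rng.normal(0, 1, n))
y = (score > 0).astype(int)   # plays the role of Target_5Yrs
X = np.column_stack([games, minutes, points])

# Scaling sits inside each pipeline so it is refitted on every training fold.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Gaussian naive Bayes": GaussianNB(),
    "neural network (MLP)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results[name] = float(scores.mean())
    print(f"{name:22s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```

Accuracy alone can mislead if the two classes are unbalanced in the real data, so a confusion matrix or ROC AUC (for example `scoring="roc_auc"`) may be worth reporting alongside it.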


Ideas to consider when completing this task:


• Apply various algorithms to the problem. Caution: use a small number of algorithms and analyse them in depth, rather than applying many superficially and repetitively.

• Is there a way of visualising the model(s)?

• How will you assess the effectiveness of the model(s)?

• Include as many features as you can. Does the model improve?

• Compare the models produced.

• How could you make further improvements?

• What can you conclude about your model?

• How strong is the relationship between the predictor and target variables?
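
For the last question, one simple measure is the correlation between each predictor and the 0/1 target. With a binary target, the Pearson correlation is the point-biserial correlation. The sketch below assumes only NumPy; both predictors are synthetic and hypothetical, one built to carry signal and one built to carry none.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400
minutes = rng.uniform(5, 35, n)                   # hypothetical predictor with signal
jersey = rng.integers(0, 100, n).astype(float)    # hypothetical predictor with none
target = (minutes + rng.normal(0, 8, n) > 20).astype(int)


def correlation_with_target(feature, target):
    """Pearson correlation; with a 0/1 target this is the point-biserial value."""
    return float(np.corrcoef(feature, target)[0, 1])


r_minutes = correlation_with_target(minutes, target)
r_jersey = correlation_with_target(jersey, target)
print(f"minutes vs target: r = {r_minutes:+.2f}")
print(f"jersey  vs target: r = {r_jersey:+.2f}")
```

Correlation only captures linear, one-variable-at-a-time relationships, so a weak value does not rule out a predictor that is useful in combination with others.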





This project can be used as a final-year project, capstone project, personal portfolio project, résumé item or proof of concept.


If you need implementation for the above problem or any of its variants, feel free to contact us.




