Problem Statement
Define a feature extraction, clustering, classification, or regression problem (or a combination of these) for the data set. Whichever type of data analysis problem you choose, you should, at a minimum, apply one basic linear technique and one more advanced technique. For example, you could solve a regression problem using multiple linear regression and neural networks (deep learning). Similarly, you could do feature extraction using PCA and autoencoders. Use graphs and outputs to elaborate, and explain what scope you have chosen for the data set, why you think your method works or does not, and which methods best fit your data.
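One way to set up the "basic linear technique versus more advanced technique" comparison is sketched below for a classification problem. This is only an illustration under stated assumptions: it uses scikit-learn (assumed to be installed), a synthetic stand-in matrix rather than the real spectra, three made-up class labels, and arbitrary settings (10 principal components, one hidden layer of 32 units). The linear baseline is PCA followed by logistic regression; the more advanced model is a small neural network.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 120 samples, 200 variables, 3 classes.
# The real HPLC spectra and their labels are NOT used here.
rng = np.random.default_rng(0)
n_samples, n_vars, n_classes = 120, 200, 3
y = np.repeat(np.arange(n_classes), n_samples // n_classes)
class_profiles = rng.normal(size=(n_classes, n_vars))
X = class_profiles[y] + rng.normal(scale=2.0, size=(n_samples, n_vars))

# Same stratified folds for both models so the comparison is fair.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    # Basic linear technique: PCA for dimension reduction + linear classifier.
    "PCA + logistic regression": make_pipeline(
        StandardScaler(),
        PCA(n_components=10),
        LogisticRegression(max_iter=1000),
    ),
    # More advanced technique: a small feed-forward neural network.
    "Neural network (MLP)": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

# Mean cross-validated accuracy per model.
scores = {
    name: cross_val_score(model, X, y, cv=cv).mean()
    for name, model in models.items()
}
for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Putting the scaler and PCA inside each pipeline means they are refit on every training fold, so the held-out fold never influences the preprocessing. With only 120 samples, this kind of cross-validation is likely a more trustworthy basis for deciding which method fits the data than a single train/test split.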
Data Set Information
The data (HPLCoil.xlsx) cover 120 oil samples analyzed by high performance liquid chromatography (HPLC) with a charged aerosol detector. There are 4001 variables representing the spectrum of the HPLC analysis. The samples are various types and grades of olive oil, along with non-olive vegetable oils and non-olive vegetable oils mixed with olive oil. Feature identification and classification/clustering analyses can be conducted on the data. Note that the HPLC method is aimed at providing a triacylglyceride profile, and triacylglycerides are known to have a distinct pattern for olive oils.
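With 120 samples and 4001 variables, there are far more variables than samples, so a linear dimension reduction such as PCA is a natural first step. The sketch below shows PCA computed through a singular value decomposition using NumPy only. It is a minimal illustration, not the project solution: the matrix is random placeholder data with the same shape as the description above, because the layout of HPLCoil.xlsx is not specified here, and the choice of two components is arbitrary.

```python
import numpy as np

# Placeholder matrix (assumption): same shape as the described data,
# 120 samples x 4001 spectral variables, filled with random values.
rng = np.random.default_rng(0)
n_samples, n_vars = 120, 4001
X = rng.normal(size=(n_samples, n_vars))

# PCA via SVD: center each variable, then decompose.
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

n_components = 2
scores = U[:, :n_components] * s[:n_components]  # sample coordinates
loadings = Vt[:n_components]                     # variable weights per component
explained_ratio = s**2 / np.sum(s**2)            # variance share per component

print(scores.shape, loadings.shape)
print(explained_ratio[:n_components])
```

On real spectra, a scatter plot of the first two columns of `scores` colored by oil type would show whether olive and non-olive samples separate, and the rows of `loadings` would indicate which regions of the chromatogram drive that separation.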
Data source: http://www.models.life.ku.dk/datasets
Solution Output and Screenshot
Get the solution to this project at an affordable price. Please send your query to contact@codersarts.com and we'll share a price quote.