Label Encoder In Machine Learning

In machine learning, non-numeric (categorical) data usually has to be converted into numeric form before a model can use it. This conversion is called encoding. There are several encoding techniques in machine learning, including:

Label Encoder
One hot encoding
Binary Encoding
Hashing
Target Encoding
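
To make the difference between the first two techniques in the list concrete, here is a minimal sketch on a made-up column. The `colors` series and the `color` prefix are assumptions for illustration only and do not come from the dataset used later in this post.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy column, not taken from the article's dataset
colors = pd.Series(["red", "green", "blue", "green"], name="color")

# Label encoding: one integer per category (classes are sorted, so blue=0, green=1, red=2)
label_encoded = LabelEncoder().fit_transform(colors)

# One-hot encoding: one indicator column per category
one_hot = pd.get_dummies(colors, prefix="color")

print(label_encoded)
print(one_hot)
```

Label encoding keeps a single column but implies an order between the categories, while one-hot encoding avoids that implied order at the cost of extra columns.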

Let's go through the label encoder, which replaces each distinct category with an integer:

First, import pandas and the LabelEncoder class from scikit-learn:

import pandas as pd
from sklearn.preprocessing import LabelEncoder

Now fit the label encoder on a data frame column and replace the column with its encoded values:


labelencoder = LabelEncoder()

df['x'] = labelencoder.fit_transform(df['x'])

Here 'x' is the name of the data frame column to encode.
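
The snippet above assumes a data frame `df` already exists. As a self-contained sketch, the version below builds a small hypothetical data frame (the values "low", "medium", and "high" are made up for illustration) and applies the same call:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data frame with a single non-numeric column 'x'
df = pd.DataFrame({"x": ["low", "high", "medium", "low"]})

labelencoder = LabelEncoder()
df["x"] = labelencoder.fit_transform(df["x"])

# Encoded values and the sorted categories the encoder learned
print(df["x"].tolist())
print(labelencoder.classes_)
```

Note that the integers follow the sorted order of the labels ("high" is 0, "low" is 1, "medium" is 2), not any meaningful ranking, so the codes should not be read as an ordering of the categories.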

Example:

import pandas as pd
import numpy as np
# Define the headers since the data does not have any
headers = ["A", "B", "C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N",
           "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "A1"]

# Read in the CSV file and convert "?" to NaN
df = pd.read_csv("http://mlr.cs.umass.edu/ml/machine-learning-databases/autos/imports-85.data",
                 header=None, names=headers, na_values="?")
df.head()

Now we will apply label encoding to column 'E', which contains non-numeric values:

# Convert column 'E' of the pandas data frame to numeric codes using label encoding

from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
df['E'] = labelencoder.fit_transform(df['E'])

And it displays the following result:

In the data frame output above, you can see that column 'E' has been successfully encoded into numeric form.
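
After encoding, it is often useful to check which integer stands for which category and to convert the codes back. The sketch below does this on a small stand-in column. The values "std" and "turbo" are illustrative assumptions, used so the example runs without downloading the dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Illustrative stand-in for a categorical column such as 'E' (assumed values)
df = pd.DataFrame({"E": ["std", "turbo", "std", "std"]})

labelencoder = LabelEncoder()
df["E"] = labelencoder.fit_transform(df["E"])

# Map each original category to its integer code
mapping = {str(label): int(code) for code, label in enumerate(labelencoder.classes_)}
print(mapping)

# Recover the original labels from the encoded column
original = labelencoder.inverse_transform(df["E"])
print(list(original))
```

The `classes_` attribute and `inverse_transform` method are part of scikit-learn's LabelEncoder. Keep the fitted encoder around if you need to decode predictions or encode new data the same way.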

