
INTRODUCTION TO ANN, DNN, CNN, RNN, AND LSTM

Updated: Aug 22, 2022

In this blog, you will be introduced to different types of neural networks and their implementation.


We will begin with the ANN. ANN stands for Artificial Neural Network, and it is the foundation of all the neural networks we will look at further on.


INTRODUCTION TO ANN


ANNs are designed to simulate the human brain algorithmically by learning from data. An ANN is composed of an input layer of input neurons, one or more hidden layers whose neurons learn the important features from the data, and an output layer of output neurons. Here, the input neurons act as the independent variables and the output neurons act as the dependent variables.


To know more about ANNs, you can refer to this post: https://www.codersarts.com/post/artificial-neural-network


IMPLEMENTATION OF ARCHITECTURE OF ANN

Importing the essential libraries to implement the ANN for binary classification.

import numpy as np
import tensorflow as tf
from sklearn.metrics import confusion_matrix
Defining the ANN model
# First define the ANN model
ann = tf.keras.models.Sequential()

# Add the input layer and the first hidden layer
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

# Add the second hidden layer
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))

# Adding the output layer
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

# Compiling the ANN
ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Training the ANN on the training set
# (X_train and y_train are assumed to be prepared beforehand: scaled features and binary labels)
ann.fit(X_train, y_train, batch_size = 32, epochs = 100)

# Making the predictions and evaluating the model

# Predicting the Test set results
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

# Making the Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)

DEEP NEURAL NETWORK (DNN)


DNNs are artificial neural networks with multiple hidden layers. They can perform complex tasks with higher accuracy than simple ANNs. A DNN can also work with images, but not as well as computer-vision-oriented architectures such as CNNs.


IMPLEMENTATION OF DNN

from keras.layers import Input, Flatten, Dense
from keras.models import Model

NUM_CLASSES = 10   # number of output classes (10 for a CIFAR-10-style dataset)

# (32, 32, 3) means the image size is 32x32 and each image has 3 channels (RGB)
input_layer = Input((32,32,3))

# Flatten this input into a vector using a Flatten layer. This results in a vector of length 3,072 (= 32 x 32 x 3),
# since dense layers take a vector as input rather than a multidimensional array.
x = Flatten()(input_layer)

# here we use the Rectified Linear Units (ReLU) activation function.
x = Dense(200, activation = 'relu')(x)
x = Dense(150, activation = 'relu')(x)

# The output layer consists of NUM_CLASSES (here 10) units.
# We use the softmax activation function, which turns the outputs into a probability distribution over the classes.
output_layer = Dense(NUM_CLASSES, activation = 'softmax')(x)

# Preparing the deep learning model from the input and output layers
model = Model(input_layer, output_layer)
model.summary()
    Model: "model_1"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_1 (InputLayer)         (None, 32, 32, 3)         0         
    _________________________________________________________________
    flatten_1 (Flatten)          (None, 3072)              0         
    _________________________________________________________________
    dense_1 (Dense)              (None, 200)               614600    
    _________________________________________________________________
    dense_2 (Dense)              (None, 150)               30150     
    _________________________________________________________________
    dense_3 (Dense)              (None, 10)                1510      
    =================================================================
    Total params: 646,260
    Trainable params: 646,260
    Non-trainable params: 0
    _________________________________________________________________
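
Before training, the data has to be prepared. The model is compiled below with the 'categorical_crossentropy' loss, which expects one-hot encoded labels. Here is a minimal sketch of this preparation step, assuming a CIFAR-10-style dataset of 32x32 RGB images (the original post does not show where the data comes from):

from keras.datasets import cifar10
from keras.utils import to_categorical

# Load a CIFAR-10-style dataset (assumption: this is the kind of data the model is meant for)
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Scale pixel values to [0, 1] and one-hot encode the labels (NUM_CLASSES = 10, as defined above)
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0
y_train = to_categorical(y_train, NUM_CLASSES)
y_test = to_categorical(y_test, NUM_CLASSES)
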
# Compiling and training the DNN on the training set
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
model.fit(X_train, y_train, batch_size = 32, epochs = 100)

CONVOLUTIONAL NEURAL NETWORK (CNN)

CNNs, also known as ConvNets, are similar to the networks described above; the difference is that they assume the input is image-like and use convolutional kernels to extract features from the image. A CNN is a feed-forward network consisting of multiple layers, just like other neural networks, and each layer transforms the output of the previous layer through a differentiable function.

Because the output of a layer is transformed by the next layer through a convolutional kernel (or kernel function), the network is called a convolutional neural network. During training, the network learns the kernels that best respond to specific features of an image. A convolution layer uses many such filters, so it contains many kernels corresponding to many different features.
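
To make the idea of a kernel concrete, here is a small illustrative sketch (my own, not code from the original post) that slides a single 3x3 kernel over a grayscale image with NumPy. A Conv2D layer does the same thing for many kernels at once, with weights learned during training.

import numpy as np

def convolve2d(image, kernel):
    # Slide the kernel over every valid position of the image
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the result
            output[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return output

image = np.random.rand(8, 8)           # a toy 8x8 grayscale "image"
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])   # a simple vertical-edge detector
print(convolve2d(image, edge_kernel).shape)   # (6, 6)
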


The next layer is the pooling layer, which reduces the spatial size (dimensions) of the input image without losing the essential information or features. While pooling is a characteristic feature of CNNs, note that a similar reduction can also be achieved with a strided convolutional kernel, for example a 3 x 3 kernel with stride 2.
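
The CNN code in the next section uses strided convolutions rather than an explicit pooling layer. For comparison, here is a minimal sketch (assuming the same 32x32x3 input; the layer names are my own) of the two ways of halving the spatial size mentioned above:

from keras.layers import Input, Conv2D, MaxPooling2D

inp = Input(shape=(32, 32, 3))

# Option 1: a convolution followed by a pooling layer that halves the spatial size
x1 = Conv2D(filters=10, kernel_size=(3, 3), padding='same')(inp)              # (32, 32, 10)
x1 = MaxPooling2D(pool_size=(2, 2))(x1)                                       # (16, 16, 10)

# Option 2: a strided convolution that halves the spatial size directly (3x3 kernel, stride 2)
x2 = Conv2D(filters=10, kernel_size=(3, 3), strides=2, padding='same')(inp)   # (16, 16, 10)
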


IMPLEMENTATION OF ARCHITECTURE OF CNN

from keras.layers import Input, Flatten, Dense, Conv2D, BatchNormalization, LeakyReLU, Dropout, Activation
from keras.models import Model
from keras.optimizers import Adam
from keras.utils import to_categorical
import keras.backend as K 
input_layer = Input(shape=(32,32,3))

# filters = 10 refers to the use of 10 kernels
# strides = 2 means the output feature map will be half the size of the input
# padding = 'same' pads the input with zeros so that the kernel can extend over the edges of the image
conv_layer_1 = Conv2D(
    filters = 10
    , kernel_size = (4,4)
    , strides = 2
    , padding = 'same'
    )(input_layer)

conv_layer_2 = Conv2D(
    filters = 20
    , kernel_size = (3,3)
    , strides = 2
    , padding = 'same'
    )(conv_layer_1)

flatten_layer = Flatten()(conv_layer_2)

output_layer = Dense(units=10, activation = 'softmax')(flatten_layer)

model = Model(input_layer, output_layer)
opt = Adam(learning_rate=0.0005)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
# x_train and y_train are assumed to be prepared beforehand (labels one-hot encoded with to_categorical)
model.fit(x_train
          , y_train
          , batch_size=32
          , epochs=10
          , shuffle=True
          , validation_data = (x_test, y_test))
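
As a rough sanity check (this calculation is mine, not output reproduced from the original post), the parameter counts of this model can be worked out by hand as kernel height x kernel width x input channels x filters, plus one bias per filter: conv_layer_1 has 4 x 4 x 3 x 10 + 10 = 490 parameters and outputs a 16 x 16 x 10 feature map; conv_layer_2 has 3 x 3 x 10 x 20 + 20 = 1,820 parameters and outputs 8 x 8 x 20; flattening gives a vector of 1,280 values; and the final Dense layer has 1,280 x 10 + 10 = 12,810 parameters, for a total of 15,120 trainable parameters. Calling model.summary() should confirm these figures.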

RECURRENT NEURAL NETWORK (RNN)

RNNs are feedback neural networks: the output of a neuron at one time step is fed back as an input to the same neuron at the next time step. This feedback creates a temporal memory that allows the network to process dynamic input sequences. RNNs are mainly used for sequential data, for example time series, DNA sequences, and speech recognition.
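
The feedback loop can be written out explicitly. The sketch below is my own illustration (not code from the original post) of a simple RNN unrolled over time in NumPy: the same weights are reused at every time step, and the hidden state h carries information from earlier steps forward.

import numpy as np

timesteps, input_dim, hidden_dim = 12, 7, 4

x = np.random.rand(timesteps, input_dim)      # one input sequence
h = np.zeros(hidden_dim)                      # initial hidden state (the "memory")

W_x = np.random.rand(hidden_dim, input_dim)   # input-to-hidden weights
W_h = np.random.rand(hidden_dim, hidden_dim)  # hidden-to-hidden (feedback) weights
b = np.zeros(hidden_dim)

for t in range(timesteps):
    # The previous hidden state is fed back in at every time step
    h = np.tanh(W_x @ x[t] + W_h @ h + b)

print(h.shape)   # (4,) -- the final hidden state summarising the whole sequence
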

IMPLEMENTATION OF ARCHITECTURE OF RNN


from tensorflow import keras
from tensorflow.keras import layers

# DEFINING THE INPUT'S SHAPE (12 time steps, 7 features per step)
x = layers.Input(shape=(12, 7))

# DEFINE RNN CELLS OR TEMPORAL MEMORY FOR THE RNN MODEL TO USE.
cell = layers.SimpleRNNCell(4, activation='tanh') 

# ADDING THE CELLS TO THE MODEL
rnn = layers.RNN(cell)
rnn_output = rnn(x)

# APPLYING A DENSE OUTPUT LAYER TO THE RNN OUTPUT
output = layers.Dense(units=1, activation='sigmoid')(rnn_output)

# BUILD THE MODEL
model = keras.Model(inputs=x, outputs=output)

model.compile(loss="binary_crossentropy", metrics=["accuracy"])

model.summary()
Model: "functional_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 12, 7)]           0         
_________________________________________________________________
rnn_1 (RNN)                  (None, 4)                 48        
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 5         
=================================================================
Total params: 53
Trainable params: 53
Non-trainable params: 0
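
The original post stops at model.summary(). As a hedged usage example, the model could be trained on sequences of shape (samples, 12, 7) with binary labels; the random data below is purely illustrative.

import numpy as np

# Purely illustrative random data: 100 sequences, 12 time steps, 7 features each
X_dummy = np.random.rand(100, 12, 7)
y_dummy = np.random.randint(0, 2, size=(100, 1))

model.fit(X_dummy, y_dummy, batch_size=16, epochs=5)
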

LONG SHORT TERM MEMORY (LSTM)

LSTM is a variation of the RNN in which the network is able to remember patterns for a long duration of time. The long-term memory is called the cell state, and because the output of a neuron is fed back into the same neuron, previous information is stored within it. A forget gate is added to the cell: it decides which information to forget by multiplying the corresponding entries of the cell state by values close to 0, while values close to 1 retain the information.
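
To make the forget-gate update concrete, here is a minimal NumPy sketch of the standard LSTM cell-state update (my own illustration, with random weights, not code from the original post): the forget gate produces values between 0 and 1, entries near 0 erase the corresponding parts of the cell state, and entries near 1 keep them.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy sizes: a 3-dimensional cell state and a 2-dimensional input
c_prev = np.array([0.5, -1.2, 0.8])        # previous cell state (the long-term memory)
h_prev = np.array([0.1, 0.0, -0.3])        # previous hidden state
x_t = np.array([0.7, -0.2])                # current input

concat = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]

W_f = np.random.rand(3, 5); b_f = np.zeros(3)   # forget gate weights (random, for illustration only)
W_i = np.random.rand(3, 5); b_i = np.zeros(3)   # input gate weights
W_c = np.random.rand(3, 5); b_c = np.zeros(3)   # candidate weights

f_t = sigmoid(W_f @ concat + b_f)          # close to 0 -> forget, close to 1 -> keep
i_t = sigmoid(W_i @ concat + b_i)          # how much new information to write
c_tilde = np.tanh(W_c @ concat + b_c)      # candidate new information

c_t = f_t * c_prev + i_t * c_tilde         # updated cell state
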


IMPLEMENTATION OF THE ARCHITECTURE OF THE LSTM

First, import the essential libraries

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
regressor = Sequential()
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units = 1))

regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)
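
Note that the LSTM layer expects three-dimensional input of shape (samples, timesteps, features). Assuming X_train starts out as a 2-D array of sliding windows over a univariate time series (an assumption, since the original post does not show the data preparation), it would be reshaped like this before the call to fit above:

import numpy as np

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
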
If you need an implementation of any of the topics mentioned above, or assignment help on any of their variants, feel free to contact us.
