In this post, we continue exploring web-based applications powered by deep learning models. After previously working on skin disease classification, we now build a handwritten prescription letter recognition application using Flask, OpenCV, and a pre-trained deep learning model. The project uses a model trained on the NIST dataset to recognize letters, and we walk through the code step by step.
1. Project Overview
Handwritten prescription text recognition is a challenging task due to the variability in writing styles. In this project, we aim to build a web-based application that can recognize individual letters from an uploaded image of handwritten prescription text, specifically focusing on doctors' handwriting. By leveraging the capabilities of deep learning, the application will predict letters from an image and display the results in a user-friendly interface.
The project consists of:
A Flask web application to handle the frontend and backend.
TensorFlow/Keras to load and use the pre-trained model.
OpenCV for image processing and extracting individual letters.
HTML and Bootstrap for the frontend.
2. Setting Up the Flask Application
To begin, we create a Flask app that will manage the user interface and handle image uploads. The app will process the image and make predictions using the trained deep learning model.
from flask import Flask, render_template, request
from werkzeug.utils import secure_filename
import os
import cv2
import numpy as np
import tensorflow as tf
app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'uploads'
app.config['ALLOWED_EXTENSIONS'] = {'png', 'jpg', 'jpeg'}
In this setup, we define a few essential configurations:
The UPLOAD_FOLDER stores the uploaded images.
ALLOWED_EXTENSIONS limits the types of files the application can accept (PNG, JPG, JPEG).
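Flask does not create the upload directory on its own, so saving a file into a folder that does not exist fails. A minimal sketch of one way to guard against this, assuming a hypothetical helper named ensure_upload_dir (not part of the original code) that could be called once at startup with app.config['UPLOAD_FOLDER']:

```python
import os
import tempfile


def ensure_upload_dir(path):
    """Create the upload directory if it is missing and return its path."""
    # exist_ok=True makes the call safe to repeat on every startup
    os.makedirs(path, exist_ok=True)
    return path


# Demonstration in a temporary location instead of the real 'uploads' folder
demo_root = tempfile.mkdtemp()
upload_dir = ensure_upload_dir(os.path.join(demo_root, "uploads"))
print(os.path.isdir(upload_dir))
```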
3. Loading the Pre-Trained Model
The application uses a pre-trained deep learning model (NIST_MODEL_1.h5), trained on the NIST dataset, to recognize handwritten letters.
model_path = 'NIST_MODEL_1.h5'
model = tf.keras.models.load_model(model_path)
We load the model with TensorFlow's load_model function. The model has already been trained to recognize individual letters and predicts a class index for each input image.
label_letter_dict = {i: chr(ord('a') + i) for i in range(26)}  # {0: 'a', 1: 'b', ..., 25: 'z'}
A dictionary (label_letter_dict) is used to map the model’s output (class numbers) to corresponding letters.
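To make the mapping concrete, the sketch below applies it to a made-up score vector. The fake_output array is an illustrative stand-in for what model.predict might return for one image; it is not real model output.

```python
import numpy as np

label_letter_dict = {i: chr(ord('a') + i) for i in range(26)}

# Hypothetical scores for a single image: shape (1, 26), one score per class
fake_output = np.zeros((1, 26), dtype="float32")
fake_output[0, 3] = 0.9   # pretend the model favours class 3
fake_output[0, 14] = 0.1

# argmax picks the highest-scoring class, which the dictionary turns into a letter
class_index = int(np.argmax(fake_output, axis=1)[0])
letter = label_letter_dict[class_index]
print(class_index, letter)  # 3 d
```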
Allowed Files
def allowed_file(filename):
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in app.config['ALLOWED_EXTENSIONS']
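A quick way to see what this check accepts and rejects is to run it on a few names. The sketch below is a standalone copy that uses a module-level set instead of app.config so it runs without Flask; the sample filenames are made up.

```python
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}


def allowed_file(filename):
    # Accept only names whose final extension is in the allowed set
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS


for name in ['rx.png', 'scan.JPG', 'notes.txt', 'noextension', 'archive.png.exe']:
    print(name, allowed_file(name))
```

The check looks only at the filename, so it does not verify that the file actually contains image data.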
4. Processing the Image
To extract letters from an uploaded image, we first convert it into a format suitable for prediction. We use OpenCV to process the image and isolate individual letters.
def split_letters(image):
    # Convert to grayscale, then binarize with Otsu's method so the ink becomes white on black
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, threshold = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Each external contour is treated as one candidate letter
    contours, _ = cv2.findContours(threshold, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    letter_images = []
    for contour in contours:
        (x, y, w, h) = cv2.boundingRect(contour)
        roi = threshold[y:y + h, x:x + w]  # crop the letter from the binary image
        letter_images.append(roi)
    return letter_images
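The order in which cv2.findContours returns contours is not guaranteed to match reading order, and small specks of noise also produce contours. One possible refinement, sketched here on plain (x, y, w, h) tuples like those returned by cv2.boundingRect, drops tiny boxes and sorts the rest left to right. The helper name and the min_size default are illustrative assumptions, not values from the original project.

```python
def order_letter_boxes(boxes, min_size=3):
    """Drop boxes smaller than min_size in either dimension, then sort by x."""
    kept = [b for b in boxes if b[2] >= min_size and b[3] >= min_size]
    # Sorting on the left edge gives left-to-right order for a single text line
    return sorted(kept, key=lambda b: b[0])


# Made-up bounding boxes: three letters and one 1x1 speck of noise
boxes = [(120, 10, 20, 30), (5, 12, 18, 28), (60, 40, 1, 1), (62, 11, 22, 29)]
print(order_letter_boxes(boxes))
# [(5, 12, 18, 28), (62, 11, 22, 29), (120, 10, 20, 30)]
```

Sorting by x alone only suits a single line of text; multi-line images would need the boxes grouped into rows first.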
5. Predicting Letters
Once the letters are extracted, each letter image is resized to 32x32 pixels (the input size expected by the model), and predictions are made for each letter.
def predict_letters(image_path):
    # Load the upload in OpenCV's BGR order, which split_letters expects
    img = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if img is None:
        return []  # the file could not be decoded as an image
    letter_images = split_letters(img)
    predictions = []
    for letter_img in letter_images:
        # Resize each crop to the 32x32 model input and reshape it to (1, 32, 32, 1)
        letter_img = cv2.resize(letter_img, (32, 32), interpolation=cv2.INTER_AREA)
        letter_img = np.expand_dims(letter_img, axis=0)
        letter_img = np.expand_dims(letter_img, axis=-1)
        letter_img = np.array(letter_img, dtype="float32")
        pred = np.argmax(model.predict(letter_img), axis=1)
        predictions.append(label_letter_dict[pred[0]])
    return predictions
In the predict_letters function:
The image is loaded and split into individual letter crops.
Each extracted letter is individually resized and passed through the model.
The prediction is converted into the corresponding letter using label_letter_dict.
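The two np.expand_dims calls are what turn a 2-D letter crop into the 4-D batch that is passed to model.predict. A small sketch with a blank array in place of a real crop shows the shape at each step; the (1, 32, 32, 1) layout assumes the model takes single-channel 32x32 inputs, as the code above implies.

```python
import numpy as np

# Stand-in for one resized letter crop: 32x32, single channel, 8-bit
letter_img = np.zeros((32, 32), dtype="uint8")
print(letter_img.shape)                        # (32, 32)

batch = np.expand_dims(letter_img, axis=0)     # add the batch axis
print(batch.shape)                             # (1, 32, 32)

batch = np.expand_dims(batch, axis=-1)         # add the channel axis
print(batch.shape)                             # (1, 32, 32, 1)

batch = batch.astype("float32")                # cast pixel values to float
print(batch.dtype)                             # float32
```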
@app.route('/')
def index():
    return render_template('index.html')
6. Creating the User Interface
We create two HTML templates: index.html for the homepage and result.html for displaying the prediction results.
index.html: This file contains a form that allows users to upload an image.
<form method="POST" action="{{ url_for('predict') }}" enctype="multipart/form-data">
    <div class="custom-file">
        <input type="file" class="custom-file-input" id="fileInput" name="file" accept=".png, .jpg, .jpeg">
        <label class="custom-file-label" for="fileInput">Choose image file</label>
    </div>
    <button type="submit" class="btn btn-primary mt-4">Predict</button>
</form>
result.html: This file displays the predicted letters on the result page.
<h2 class="text-center">Predicted Letters:</h2>
{% for letter in predictions %}
    <span class="badge badge-primary letter-badge">{{ letter }}</span>
{% endfor %}
7. Handling File Upload and Prediction in Flask
The /predict route handles the image upload, prediction, and result display.
@app.route('/predict', methods=['POST'])
def predict():
    # Reject requests that carry no file part at all
    if 'file' not in request.files:
        return render_template('index.html', error='No file uploaded')
    file = request.files['file']
    # Browsers submit an empty filename when no file was chosen
    if file.filename == '':
        return render_template('index.html', error='No file selected')
    if file and allowed_file(file.filename):
        filename = secure_filename(file.filename)
        file_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(file_path)
        predictions = predict_letters(file_path)
        return render_template('result.html', image_path=file_path, predictions=predictions)
    return render_template('index.html', error='Invalid file format')
This code handles:
File validation: Ensuring the uploaded file is valid.
File saving: Storing the image in the upload folder.
Prediction: Calling the prediction function and displaying results.
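Because the saved path is built from the client's filename, two uploads with the same name overwrite each other. One way around this, sketched below with a hypothetical unique_upload_name helper (an assumption, not part of the original app), is to keep only the extension and generate the rest of the name.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}


def unique_upload_name(original_name):
    """Return a random filename that keeps the original (allowed) extension."""
    ext = os.path.splitext(original_name)[1].lower().lstrip('.')
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError('unsupported file type: %r' % original_name)
    # uuid4().hex is 32 hexadecimal characters
    return '%s.%s' % (uuid.uuid4().hex, ext)


first = unique_upload_name('prescription.JPG')
second = unique_upload_name('prescription.JPG')
print(first != second, first.endswith('.jpg'))
```

The returned name could then be joined with the upload folder in place of the client-supplied filename.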
8. Running the Application
Finally, we run the Flask application:
if __name__ == '__main__':
    app.run(debug=True)
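debug=True turns on Flask's reloader and interactive debugger, which is convenient locally but should not be exposed on a public server. A minimal sketch of one way to control it from the environment, assuming a hypothetical APP_DEBUG variable and helper name that are not part of the original project:

```python
import os


def debug_enabled(environ=os.environ):
    """Return True only when APP_DEBUG is set to a truthy value."""
    return environ.get('APP_DEBUG', '').lower() in {'1', 'true', 'yes'}


print(debug_enabled({'APP_DEBUG': '1'}))   # True
print(debug_enabled({}))                   # False
```

With this helper in place, the last line of the script could read app.run(debug=debug_enabled()).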
Complete Code
from flask import Flask, render_template, request
from werkzeug.utils import secure_filename
import os
import cv2
import numpy as np
import tensorflow as tf
app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'uploads'
app.config['ALLOWED_EXTENSIONS'] = {'png', 'jpg', 'jpeg'}
# Load the handwritten letter recognition model
model_path = 'NIST_MODEL_1.h5'
model = tf.keras.models.load_model(model_path)
# Mapping for label to letter conversion
label_letter_dict = {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e',
5: 'f', 6: 'g', 7: 'h', 8:'i', 9: 'j', 10:'k', 11: 'l',
12: 'm', 13: 'n', 14: 'o', 15: 'p', 16: 'q',
17: 'r', 18: 's', 19:'t', 20: 'u', 21: 'v', 22: 'w',
23: 'x', 24: 'y', 25: 'z'}
def allowed_file(filename):
    return '.' in filename and \
        filename.rsplit('.', 1)[1].lower() in app.config['ALLOWED_EXTENSIONS']

def split_letters(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, threshold = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(threshold, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    letter_images = []
    for contour in contours:
        (x, y, w, h) = cv2.boundingRect(contour)
        roi = threshold[y:y + h, x:x + w]
        letter_images.append(roi)
    return letter_images

def predict_letters(image_path):
    # Load the upload in OpenCV's BGR order, which split_letters expects
    img = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if img is None:
        return []  # the file could not be decoded as an image
    letter_images = split_letters(img)
    predictions = []
    for letter_img in letter_images:
        # Resize each crop to the 32x32 model input and reshape it to (1, 32, 32, 1)
        letter_img = cv2.resize(letter_img, (32, 32), interpolation=cv2.INTER_AREA)
        letter_img = np.expand_dims(letter_img, axis=0)
        letter_img = np.expand_dims(letter_img, axis=-1)
        letter_img = np.array(letter_img, dtype="float32")
        pred = np.argmax(model.predict(letter_img), axis=1)
        pred_letter = label_letter_dict[pred[0]]
        predictions.append(pred_letter)
    return predictions

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
    if 'file' not in request.files:
        return render_template('index.html', error='No file uploaded')
    file = request.files['file']
    if file.filename == '':
        return render_template('index.html', error='No file selected')
    if file and allowed_file(file.filename):
        filename = secure_filename(file.filename)
        file_path = os.path.join(app.config['UPLOAD_FOLDER'], filename)
        file.save(file_path)
        predictions = predict_letters(file_path)
        return render_template('result.html', image_path=file_path, predictions=predictions)
    return render_template('index.html', error='Invalid file format')

if __name__ == '__main__':
    app.run(debug=True)
index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Doctor Handwriting Recognition</title>
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css">
    <link rel="stylesheet" href="{{ url_for('static', filename='css/style.css') }}">
</head>
<body>
    <div class="container mt-5">
        <h1 class="text-center mb-4">Doctor Handwriting Recognition</h1>
        {% if error %}
        <div class="alert alert-danger text-center">
            {{ error }}
        </div>
        {% endif %}
        <form method="POST" action="{{ url_for('predict') }}" enctype="multipart/form-data">
            <div class="custom-file">
                <input type="file" class="custom-file-input" id="fileInput" name="file" accept=".png, .jpg, .jpeg">
                <label class="custom-file-label" for="fileInput">Choose image file</label>
            </div>
            <div class="text-center mt-4">
                <button type="submit" class="btn btn-primary">Predict</button>
            </div>
        </form>
    </div>
    <script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.5.3/dist/umd/popper.min.js"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
</body>
</html>
result.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Doctor Handwriting Recognition - Result</title>
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/css/bootstrap.min.css">
    <link rel="stylesheet" href="{{ url_for('static', filename='css/style.css') }}">
</head>
<body>
    <div class="container mt-5">
        <h1 class="text-center mb-4">Doctor Handwriting Recognition - Result</h1>
        <h3 class="text-center mt-4">Predicted Letters:</h3>
        <div class="text-center">
            <h2>
                {% for letter in predictions %}
                <span class="badge badge-primary letter-badge">{{ letter }}</span>
                {% endfor %}
            </h2>
        </div>
    </div>
    <script src="https://code.jquery.com/jquery-3.5.1.slim.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.5.3/dist/umd/popper.min.js"></script>
    <script src="https://stackpath.bootstrapcdn.com/bootstrap/4.5.2/js/bootstrap.min.js"></script>
</body>
</html>
Conclusion
In this post, we built a web application that recognizes handwritten letters in prescription images using a deep learning model. Flask handles the backend, OpenCV processes the images, and TensorFlow makes the predictions, showing how these tools combine into a simple interface for a complex task.