Building a Code Generator Web Application with Flask and GPT

Automation is key to enhancing productivity, especially in the field of software development. Imagine having a web application where you can input a coding problem and get a code solution generated instantly. In this blog, we will walk through building such a web application using Flask and a pre-trained language model, GPT-Neo.

Introduction

The goal of this project is to develop a web application that takes a coding problem as input and returns a generated code solution. We’ll be using Flask, a lightweight web framework in Python, along with GPT-Neo, a powerful language model that can generate text based on the given input. The application will provide a simple and intuitive interface for users to enter their coding problems and receive generated solutions.


What You Will Learn

By the end of this tutorial, you will learn:

  • How to set up a Flask web application.

  • How to use the GPT-Neo model to generate code solutions from textual descriptions.

  • How to build a user-friendly web interface for inputting problems and displaying solutions.


Prerequisites

Before you begin, ensure you have the following:

  • Basic knowledge of Python programming.

  • Flask installed in your environment (pip install flask).

  • PyTorch and Hugging Face's Transformers installed (pip install torch transformers).


Understanding the Code

Let's go through the provided code in detail, explaining each part and its role in the overall application.


1. Importing Required Libraries

import torch
from markupsafe import escape
from flask import Flask, request, render_template
from transformers import AutoTokenizer, AutoModelForCausalLM

Here, we import the essential libraries:

  • torch: For handling the tensor operations and working with the GPT-Neo model.

  • markupsafe.escape: To sanitize the output and prevent any HTML injection attacks when displaying generated text.

  • flask: For creating the web application and handling HTTP requests.

  • transformers: Provides tools for loading pre-trained models and tokenizers from Hugging Face's model hub.


2. Initializing the Flask Application

app = Flask(__name__)

This line initializes the Flask application, which serves as the backbone of our web service, handling routes and rendering HTML templates.


3. Loading the GPT-Neo Model and Tokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("0xsuid/simba-125M")
model.resize_token_embeddings(len(tokenizer))

Here, we load the GPT-Neo model and its associated tokenizer:

  • AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M"): Loads the tokenizer for the GPT-Neo model with 125 million parameters.

  • tokenizer.pad_token = tokenizer.eos_token: Sets the end-of-sequence (EOS) token as the padding token, since the GPT-Neo tokenizer does not define a dedicated padding token by default.

  • AutoModelForCausalLM.from_pretrained("0xsuid/simba-125M"): Loads a custom fine-tuned GPT-Neo model named simba-125M from the Hugging Face model hub.

  • model.resize_token_embeddings(len(tokenizer)): Adjusts the model's token embedding matrix to match the tokenizer's vocabulary size.

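Before moving on, it is worth doing a quick sanity check that the tokenizer and model loaded correctly. The snippet below is a minimal sketch (the sample string is arbitrary): it round-trips a string through the tokenizer, confirms the padding token now matches the EOS token, and prints the model's context window, which should be 2048 for a GPT-Neo 125M variant.

# Optional sanity check after loading (the sample string is arbitrary)
sample = "print('hello world')"
token_ids = tokenizer.encode(sample)               # text -> token IDs
print(token_ids)                                   # a short list of integers
print(tokenizer.decode(token_ids))                 # token IDs -> original text
print(tokenizer.pad_token == tokenizer.eos_token)  # True, thanks to the assignment above
print(model.config.max_position_embeddings)        # the model's context window (expected 2048)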

4. Formatting the Input Text

def format_input(input_problem):
    answer_type = "\nUse Standard Input format\n"
    formatted_input = "\nQUESTION:\n" + input_problem + "\n" + answer_type + "\nANSWER:\n"
    return formatted_input

The format_input function prepares the input problem text by adding specific prompts around it. This formatting helps guide the language model to generate the correct type of output:

  • \nQUESTION:\n: Introduces the problem statement.

  • \nUse Standard Input format\n: Instructs the model to use a standard format in the generated answer.

  • \nANSWER:\n: Marks the beginning of the expected output.

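To make the prompt structure concrete, here is what format_input produces for a small example problem (the problem text itself is just an illustration):

print(format_input("Read two integers from standard input and print their sum."))

# Prints:
#
# QUESTION:
# Read two integers from standard input and print their sum.
#
# Use Standard Input format
#
# ANSWER: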

5. Generating Predictions

def get_prediction(input_problem):
    formatted_input = format_input(input_problem)
    encoded_text = tokenizer.encode(formatted_input, truncation=True)
    input_ids = torch.LongTensor(encoded_text).unsqueeze(0)
    output_ids = model.generate(
        input_ids,
        num_beams=5,
        early_stopping=True,
        max_length=2048  # GPT-Neo's context window; counts prompt and generated tokens
    )

    prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    prediction = prediction.split("ANSWER:\n")[1]
    return escape(prediction)

The get_prediction function is responsible for generating the text output from the model:

  • The input problem is first formatted using the format_input function.

  • The formatted text is tokenized into input IDs using the tokenizer.

  • torch.LongTensor(encoded_text).unsqueeze(0): Converts the tokenized input into a tensor and adds a batch dimension.

  • model.generate(...): Generates the output sequence using beam search (num_beams=5) to find the best possible solution.

  • tokenizer.decode(...): Converts the generated output tensor back into human-readable text.

  • prediction.split("ANSWER:\n")[1]: Extracts the generated code after the ANSWER: tag.

  • escape(prediction): Escapes the prediction output to prevent HTML injection.

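Note that max_length counts the prompt tokens as well as the generated ones, which is why the call above simply caps the whole sequence at the model's 2048-token context window. If you would rather budget only the newly generated tokens, model.generate also accepts max_new_tokens; the variant below is a sketch of that approach (the budget of 512 tokens is an arbitrary choice, not part of the original code):

# Alternative: cap only the newly generated tokens (512 is an arbitrary budget)
output_ids = model.generate(
    input_ids,
    num_beams=5,
    early_stopping=True,
    max_new_tokens=512,
    pad_token_id=tokenizer.eos_token_id,  # avoids the "pad token not set" warning during generation
)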

6. Setting Up Flask Routes

a) Root Route for Displaying the Form

@app.route('/', methods=['GET'])
def root():
    return render_template('index.html')

This route renders the index.html template when the user accesses the root URL. The template provides the form where users can input their coding problem.


b) Prediction Route for Handling Form Submissions

@app.route('/', methods=['POST'])
def predict():
    coding_problem = request.form.get('coding_problem')
    if coding_problem:
        prediction = get_prediction(coding_problem)
        return render_template('index.html', generatedAnswer=prediction, question=coding_problem)
    # Nothing was submitted: render the blank form again
    return render_template('index.html')

This route handles form submissions:

  • The coding problem entered by the user is retrieved from the form using request.form.get('coding_problem').

  • If a non-empty problem was submitted, the get_prediction function is called to generate the code solution.

  • The index.html template is then re-rendered with the original problem and the generated solution; if the field is missing or empty, the blank form is rendered again instead of returning an error.

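You can exercise this route without opening a browser by using Flask's built-in test client. The snippet below is a minimal sketch meant to be run from the same module (the sample problem text is arbitrary, and generation on CPU may take a little while):

# Exercise the POST route with Flask's test client (sample problem is arbitrary)
with app.test_client() as client:
    response = client.post('/', data={'coding_problem': 'Print the first 10 Fibonacci numbers.'})
    print(response.status_code)                   # expect 200
    print(response.get_data(as_text=True)[:200])  # start of the rendered HTML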

7. Running the Flask Application

if __name__ == '__main__':
    app.run()

This block ensures that the Flask application runs when the script is executed directly.
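
During development you may want Flask's auto-reloader and interactive debugger, plus an explicit host and port. These are standard app.run() options; the specific values below are just an example, not part of the original code:

if __name__ == '__main__':
    # debug=True enables auto-reload and the interactive debugger; avoid it in production
    app.run(host='127.0.0.1', port=5000, debug=True)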


The User Interface

The index.html file provides a simple interface where users can input a coding problem and view the generated solution.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Code Generator</title>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
    <div class="container text-center">
        <h1 class="display-1">Code Generator</h1>
        <div class="row d-flex justify-content-center">
            <p class="col">
                Streamline your coding experience effortlessly by harnessing the power of our Text to Code generator.
            </p>
        </div>
        <form action="" method="post" class="mt-4">
            <div class="row d-flex">
                <div class="col">
                    <textarea name="coding_problem" class="form-control" id="coding_problem" cols="30" rows="10" placeholder="Enter Coding Problem">{{ question }}</textarea>
                </div>
                <div class="col">
                    <textarea name="answer" id="answer" class="form-control" cols="30" rows="10" placeholder="Answer" disabled readonly>{{ generatedAnswer }}</textarea>
                </div>
            </div>
            <div class="row d-flex justify-content-center">
                <div class="col mt-4 mx-auto">
                    <button type="submit" class="btn btn-dark btn-lg">Submit</button>
                </div>
            </div>
        </form>
    </div>
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>

This HTML template is structured as follows:

  • Title and Description: A brief description of the application's purpose.

  • Form Layout: The form contains two text areas:

    • One for inputting the coding problem.

    • The other for displaying the generated answer.

  • Submit Button: A button to submit the coding problem for processing.

  • Bootstrap: Used for styling the interface, providing a clean and responsive design.


Running the Application

To run the application, save the app.py and index.html files in the same directory. Then, start the Flask server by running:

python app.py

Once the server is running, open your web browser and navigate to http://127.0.0.1:5000/ to access the web app.
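
If you prefer to query the endpoint programmatically instead of through the form, a short script with the requests library (pip install requests) also works. This is a sketch that assumes the server is running locally on the default port; the sample problem is arbitrary:

import requests

# Post a sample coding problem to the locally running app
resp = requests.post(
    "http://127.0.0.1:5000/",
    data={"coding_problem": "Read a list of integers and print the largest one."},
)
print(resp.status_code)   # expect 200
print(resp.text[:300])    # start of the rendered HTML containing the generated answer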


Complete Code

app.py

import torch
from markupsafe import escape
from flask import Flask, request, render_template
from transformers import AutoTokenizer, AutoModelForCausalLM

app = Flask(__name__)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("0xsuid/simba-125M")
model.resize_token_embeddings(len(tokenizer))

def format_input(input_problem):
    answer_type = "\nUse Standard Input format\n"
    formatted_input = "\nQUESTION:\n" + input_problem + "\n" + answer_type + "\nANSWER:\n"
    return formatted_input

# Get a prediction
def get_prediction(input_problem):
    formatted_input = format_input(input_problem)
    encoded_text = tokenizer.encode(formatted_input, truncation=True)
    input_ids = torch.LongTensor(encoded_text).unsqueeze(0)
    output_ids = model.generate(
        input_ids,
        num_beams=5,
        early_stopping=True,
        max_length=2048  # GPT-Neo's context window; counts prompt and generated tokens
    )

    prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    prediction = prediction.split("ANSWER:\n")[1]
    return escape(prediction)

@app.route('/', methods=['GET'])
def root():
    return render_template('index.html')


@app.route('/', methods=['POST'])
def predict():
    coding_problem = request.form.get('coding_problem')
    if coding_problem:
        prediction = get_prediction(coding_problem)
        return render_template('index.html', generatedAnswer=prediction, question=coding_problem)
    # Nothing was submitted: render the blank form again
    return render_template('index.html')


if __name__ == '__main__':
    app.run()

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Code Generator</title>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-GLhlTQ8iRABdZLl6O3oVMWSktQOp6b7In1Zl3/Jr59b6EGGoI1aFkw7cmDA6j6gD" crossorigin="anonymous">
</head>
<body>
    <div class="container text-center">
        <h1 class="display-1">Code Generator</h1>
        <div class="row d-flex justify-content-center">
            <p class="col">
                Streamline your coding experience effortlessly by harnessing the power of our Text to Code generator.
            </p>
        </div>
        <form action="" method="post" class="mt-4">
            <div class="row d-flex">
                <div class="col">
                    <textarea name="coding_problem" class="form-control" id="coding_problem" cols="30" rows="10" placeholder="Enter Coding Problem">{{ question }}</textarea>
                </div>
                <div class="col">
                    <textarea name="answer" id="answer" class="form-control" cols="30" rows="10" placeholder="Answer" disabled readonly>{{ generatedAnswer }}</textarea>
                </div>
            </div>
            <div class="row d-flex justify-content-center">
                <div class="col mt-4 mx-auto">
                    <button type="submit" class="btn btn-dark btn-lg">Submit</button>
                </div>
            </div>
        </form>
    </div>
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0-alpha1/dist/js/bootstrap.bundle.min.js" integrity="sha384-w76AqPfDkMBDXo30jS1Sgez6pr3x5MlQ1ZAGC+nuZB+EYdgRZgiwxhTBTkF7CXvN" crossorigin="anonymous"></script>
</body>
</html>

In this tutorial, we built a code generator web application using Flask and GPT-Neo. The application showcases how AI can generate code solutions from textual descriptions, streamlining the coding process for developers.



For any assistance with building a Code Generator Web Application using Flask and GPT, feel free to contact us.

