In today's fast-paced world, information spreads at the speed of light, which makes it both a blessing and a curse. With the vast amount of data available, separating fact from fiction can be challenging. Whether it's news articles, social media posts, or casual conversations, misinformation can spread rapidly. This is where our project comes into play. Today, I'll guide you through building a powerful fact checker using OpenAI's GPT-3.5-turbo. This step-by-step tutorial will help you create a functional tool and deepen your understanding of natural language processing and AI. The guide is designed to be beginner-friendly, so even if you're new to coding, you'll be able to follow along and build your very own fact checker!
Why Build a Fact Checker?
Before we jump into the technical details, let's take a moment to understand the significance of a fact-checking tool. The internet is teeming with information, some of it accurate and some misleading. Misinformation can have serious consequences, from influencing public opinion to affecting individual decision-making. Fact-checking is the process of verifying the factual accuracy of information before it's accepted or published. This practice is essential for maintaining credibility and trustworthiness in journalism, academic research, and everyday communications.
How the Fact Checker App Works
A fact checker will take a user-provided statement and determine whether it is a fact or a myth. If it is a fact, the tool will cite reliable sources for verification. If it is a myth, it will provide evidence to debunk it. This project will utilize OpenAI's GPT-3.5-turbo model to analyze the statements and return comprehensive responses.
Setting Up the Environment
Before writing any code, we need to ensure that our environment is set up correctly. First, make sure you have Python installed on your system. Then, we'll install the necessary packages. You can do this by running the following command:
pip install openai panel
This command installs the OpenAI and Panel libraries, which we will use for interacting with GPT-3.5-turbo and building our interactive dashboard, respectively.
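One note on versions: this tutorial uses the pre-1.0 interface of the openai Python package (openai.ChatCompletion), which was removed in version 1.0 of the SDK. If pip pulls a newer release for you, pin the older one:

pip install "openai<1.0" panel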
If you are using Google Colab, there is no local setup needed. Simply open a notebook, install the libraries with `!pip install openai panel`, and start writing code.
Importing Required Libraries
With our environment set up, let's import the libraries we'll need. OpenAI provides access to the GPT-3.5-turbo model, and Panel will help us create an interactive web application.
import openai
import panel as pn
import os
Setting Up the OpenAI API Key
To use OpenAI's services, you need an API key, which you can obtain from the OpenAI website. Once you have your API key, set it up in your script as follows:
openai.api_key = 'YOUR_OPENAI_API_KEY'
Replace 'YOUR_OPENAI_API_KEY' with your actual API key. This key allows you to access OpenAI's powerful language models.
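Hard-coding the key is fine for a quick experiment, but a safer habit is to read it from an environment variable, which is also where the os import from earlier comes in. A minimal sketch, assuming you have exported OPENAI_API_KEY in your shell:

# Read the API key from the environment instead of hard-coding it into the script
openai.api_key = os.getenv("OPENAI_API_KEY")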
Creating the Completion Function
Next, we'll define a function get_completion that interacts with the GPT-3.5-turbo model. This function takes a prompt and returns a response, which will be crucial for our fact checker.
def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,  # Degree of randomness in the model's output
    )
    return response.choices[0].message["content"]
This function sends a user input to the GPT-3.5-turbo model and retrieves the response, ensuring that our fact checker has the necessary output to analyze statements.
Now, let's go through it line by line.
Defining the Function
def get_completion(prompt, model="gpt-3.5-turbo"):
def is a keyword used to define a function in Python.
get_completion is the name of our function.
prompt is a parameter that the function takes. This is the text input we want the AI to respond to.
model="gpt-3.5-turbo" sets a default value for the model parameter. This means if you don't specify a model when calling the function, it will use "gpt-3.5-turbo" by default.
Preparing the Messages
messages = [{"role": "user", "content": prompt}]
messages is a list that contains one dictionary (a set of key-value pairs).
The dictionary has two keys: role and content.
role is set to "user", indicating that this message is coming from the user.
content is set to the value of prompt, which is the text input we want the AI to respond to.
Calling the OpenAI API
response = openai.ChatCompletion.create(
    model=model,
    messages=messages,
    temperature=0,  # Degree of randomness in the model's output
)
response is a variable that stores the output from the OpenAI API.
openai.ChatCompletion.create is a method provided by the OpenAI library to interact with the GPT-3.5-turbo model.
model=model specifies which model to use (in this case, "gpt-3.5-turbo").
messages=messages sends the prepared messages to the API.
temperature=0 controls the randomness of the output. A temperature of 0 means the model's output will be very deterministic and less random.
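If you are curious, you can see the effect of temperature by calling the API directly with a higher value: with temperature=0 the same prompt returns essentially the same answer every run, while higher values vary more. A small sketch (the prompt here is just an illustration):

# Same API call with more randomness: the answer will differ from run to run
creative = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Describe the internet in one sentence."}],
    temperature=0.9,
)
print(creative.choices[0].message["content"])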
Returning the Result
return response.choices[0].message["content"]
return sends back the result of the function.
response.choices[0].message["content"] accesses the content of the first choice in the response. The API might return multiple choices, but we are interested in the first one.
response.choices[0] gets the first choice.
.message["content"] extracts the text content of that choice.
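With the function defined, it's worth a quick sanity check before building the UI. A minimal test, assuming your API key is already set (the model's exact wording will vary):

# Quick test of get_completion with a sample claim
print(get_completion("Fact or myth: humans use only 10% of their brains."))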
Collecting User Input
We'll define a function collect_messages to handle user input. This function formats the user's statement into a prompt for the GPT-3.5-turbo model and displays the result in the dashboard.
pn.extension()

def collect_messages(_=None, debug=False):
    user_input = inp.value_input
    if debug: print(f"Input = {user_input}")
    if user_input == "":
        return
    inp.value = ''
    prompt = f"""
    Please check whether the provided statement is a fact or a myth.
    If the statement is a fact, then cite and provide some reliable sources where the user can verify it.
    If the statement is a myth, then give evidence (if any) to debunk it.
    Provide the solution in the following format:
    **Status**: Fact or Myth \n
    **Clarification**: \n
    **Sources**: Reliable sources, if any. \n
    If you are not able to analyze whether the statement is a fact or a myth, then politely apologize.
    You will only analyze whether a statement is a fact (true) or a myth (false). You will not perform any operation other than this.
    If the user does not provide a statement that is appropriate for fact-checking, then politely ask for the correct type of input.
    The statement is delimited with triple backticks.
    Statement: ```{user_input}```
    """
    global context
    response = get_completion(prompt)
    context.append({'role': 'Fact checker', 'content': f"{response}"})
    panels.append(
        pn.Row('Statement:', pn.pane.Markdown(user_input, width=600)))
    panels.append(
        pn.Row('Checker:', pn.pane.Markdown(response, width=600, style={'background-color': "#fff1e6"})))
    return pn.Column(*panels)
This function processes the user's input, generates a prompt for the GPT-3.5-turbo model, and updates the dashboard with the response.
Let's break down the collect_messages function line by line.
Initializing Panel
pn.extension()
pn.extension() initializes Panel, a library we use to create interactive web applications. It ensures that all necessary components are loaded and ready to use.
Defining the Function
def collect_messages(_=None, debug=False):
def is a keyword used to define a function in Python.
collect_messages is the name of our function.
_=None is a throwaway parameter that absorbs the button-click event Panel passes in when we bind this function to the button later; we never use its value.
debug=False is a parameter that, when set to True, will enable debug messages. By default, it is False.
Getting User Input
user_input = inp.value_input
user_input is a variable that stores the text entered by the user in an input field (the inp widget, which we create with Panel when we assemble the dashboard below).
Debugging (Optional)
if debug: print(f"Input = {user_input}")
This line checks if debugging is enabled (debug is True).
If it is, it prints the user input to the console for debugging purposes.
Checking for Empty Input
if user_input == "":
    return
This checks if the user input is empty.
If the input is empty, the function stops (returns) and does nothing else.
Clearing the Input Field
inp.value = ''
This line clears the input field after the user submits their input, making it ready for the next input.
Creating the Prompt
prompt = f"""
Please check whether the provided statement is a fact or a myth.
If the statement is a fact, then cite and provide some reliable sources where the user can verify it.
If the statement is a myth, then give evidence (if any) to debunk it.
Provide the solution in the following format:
**Status**: Fact or Myth \n
**Clarification**: \n
**Sources**: Reliable sources, if any. \n
If you are not able to analyze whether the statement is a fact or a myth, then politely apologize.
You will only analyze whether a statement is a fact (true) or a myth (false). You will not perform any operation other than this.
If the user does not provide a statement that is appropriate for fact-checking, then politely ask for the correct type of input.
The statement is delimited with triple backticks.
Statement: ```{user_input}```
"""
This creates a prompt string that is sent to the AI model.
The prompt asks the model to determine if the user's statement is a fact or a myth and provide relevant information.
The user's statement is inserted into the prompt where {user_input} is.
Why the Prompt Matters
Crafting an effective prompt is crucial when working with large language models (LLMs). The prompt sets the context and guides the model's response, ensuring it aligns with the user's expectations. A well-crafted prompt can significantly improve the quality of the output, making it more accurate and relevant, without any retraining of the model. This practice is often called prompt engineering, and it gives you fine-grained control over the model's behaviour, tailoring it to a specific task like ours.
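To see how sensitive the output is to the prompt, try varying the format section. For instance, constraining the status field to a fixed vocabulary, a hypothetical tweak that is not part of the app above, tends to make the response easier to parse:

# Hypothetical variation: pin the first line to a fixed vocabulary so it's easy to parse
statement = "The Great Wall of China is visible from space with the naked eye."
stricter_prompt = f"""
Classify the statement delimited by triple backticks as exactly one of: Fact, Myth, Unverifiable.
Put that single word on the first line, then give a short clarification with sources if any.
```{statement}```
"""
print(get_completion(stricter_prompt))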
Global Context
global context
This declares that we are using the context variable, which is defined outside the function and can be accessed globally within the script.
Getting the Response from the AI
response = get_completion(prompt)
response stores the AI's reply, obtained by passing our prompt to the get_completion function we defined earlier.
Updating the Context
context.append({'role': 'Fact checker', 'content': f"{response}"})
This adds the AI's response to the context list.
context keeps track of the conversation history.
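One thing worth noticing: in this design, context only records the history for your own reference. get_completion sends each statement on its own, so the model never sees earlier turns, and the 'Fact checker' role is just our internal label (the API itself only accepts "system", "user", and "assistant" roles). If you wanted genuinely multi-turn behaviour, you could accumulate API-style messages and send the whole list instead. A minimal sketch of that alternative, not part of the app above:

def get_completion_from_messages(messages, model="gpt-3.5-turbo"):
    # Send the full conversation history, not just the latest prompt
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]

# Roles must be "system", "user", or "assistant" for the API
history = [{"role": "system", "content": "You are a fact checker."}]
history.append({"role": "user", "content": "Is it true that lightning never strikes the same place twice?"})
reply = get_completion_from_messages(history)
history.append({"role": "assistant", "content": reply})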
Displaying the User's Input
panels.append(
    pn.Row('Statement:', pn.pane.Markdown(user_input, width=600)))
This adds a new row to the panels list.
The row displays the user's statement in Markdown format.
Displaying the AI's Response
panels.append(
    pn.Row('Checker:', pn.pane.Markdown(response, width=600, style={'background-color': "#fff1e6"})))
This adds another row to the panels list.
The row displays the AI's response in Markdown format with a specific background color for better readability.
Returning the Updated Panels
return pn.Column(*panels)
This returns a Column layout containing all the rows in panels.
The pn.Column function takes multiple rows (elements in panels) and arranges them vertically.
Displaying the Results
We need a way to display the results of the fact-checking process. We'll create the input field and button widgets, a global list panels to hold the display elements, and a context list to store the conversation history. Then, we'll bind the button to collect_messages and assemble the dashboard layout.
inp = pn.widgets.TextInput(placeholder='Enter a statement to check…')  # Text field read by collect_messages
button_conversation = pn.widgets.Button(name="Check!")  # Clicking this triggers a fact check
panels = []  # Collect display elements
context = [{'role': 'Fact checker', 'content': "You are a Fact checker."}]
interactive_conversation = pn.bind(collect_messages, button_conversation)
dashboard = pn.Column(
    inp,
    pn.Row(button_conversation),
    pn.panel(interactive_conversation, loading_indicator=True, height=300),
)
dashboard
The code above creates the input field and the button (their labels are up to you), binds the collect_messages function to the button so that each click processes the current input, and assembles the dashboard layout with the conversation displayed below the controls.
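If you are working in a Jupyter notebook, the bare dashboard line at the end renders the app inline. In a standalone script that line does nothing visible; instead, you can mark the layout as servable and launch it with Panel's CLI (assuming you saved the script as fact_checker.py):

# Expose the layout to Panel's server, then run: panel serve fact_checker.py
dashboard.servable()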
To give you a better sense of how the fact checker app works in real time, we've prepared a demo video. In it, you'll see the app take a sentence as input and return a result indicating whether that sentence is a fact or a myth.
Conclusion
We have successfully built a fact checker using OpenAI's GPT-3.5-turbo model. This tool helps you verify statements by determining whether they are facts or myths and by providing reliable sources or evidence for clarification. By following this step-by-step tutorial, you've not only created a functional application but also gained valuable insight into how natural language processing and AI can be harnessed for real-world applications.
If you have any questions or need further assistance, feel free to reach out.
Happy coding!