How To Upload A Document To ChatGPT Using The OpenAI API

If you are wondering how to upload a document to ChatGPT through the OpenAI API, you’re either a very adventurous person or a developer. For the developers out there: this guide is aimed mainly at beginners and teaches how to interact with the API, so feel free to skip the parts you already know.


What You Should Know Before You Begin

The API is paid

The OpenAI API is paid, but at least as of right now, new accounts get 5 USD of free credit, so you can play around with it to your heart’s content. Once your initial credit is gone, you need to add a payment method and pay for what you use. You can see the prices here: https://openai.com/pricing

The API can be interacted with only through TEXT

You cannot just upload a PDF / docx file. You need to convert it to plain text first, and then send that text to the GPT model. Fortunately, there are lots of text converters for the most common file types, so this shouldn’t be a big issue.
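
As a rough illustration of the conversion step, here is a minimal sketch that pulls the plain text out of a .docx file using only Python's standard library. It relies on the fact that a .docx file is a ZIP archive whose main text lives in word/document.xml. The name docx_to_text is made up for this example, and the sketch ignores headers, footnotes and table layout, so for real documents (and for PDFs) a dedicated converter is probably the better choice.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace Word uses for paragraphs (w:p) and text runs (w:t)
W_NS = '{http://schemas.openxmlformats.org/wordprocessingml/2006/main}'

def docx_to_text(docx_file):
    """Return the body text of a .docx file, one paragraph per line.

    docx_file can be a path or an open binary file object.
    (Hypothetical helper for illustration, not part of any library.)
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read('word/document.xml')

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(W_NS + 'p'):
        # A paragraph is stored as several runs; join their text back together
        pieces = [node.text for node in paragraph.iter(W_NS + 't') if node.text]
        paragraphs.append(''.join(pieces))
    return '\n'.join(paragraphs)
```

You could then write the returned string to a .txt file and use that file in the steps that follow.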

The API has a context limit

ChatGPT itself has a context limit as well, of course, and it is usually not a problem for regular conversations. However, if you plan on uploading big documents, you may go over the context limit, and the API will reject the request. OpenAI has different GPT models with different context limits, so choose the one that best suits your needs. The biggest context limit you can have right now is GPT-4’s 32k tokens (roughly equivalent to 25,000 words). Note: This limit covers both your input and the output of the model.
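
To get a feel for whether a document will fit, here is a small sketch that estimates the token count from the word count. It uses the same rule of thumb as the figures above (about 4 tokens for every 3 words), so treat the result as a ballpark number only. The function names and the default values (a 4096-token limit and 500 tokens reserved for the answer) are assumptions for this example; for exact counts, OpenAI's tiktoken package tokenizes text the way the models do.

```python
def estimate_tokens(text):
    """Very rough token estimate: about 4 tokens per 3 words, rounded up."""
    word_count = len(text.split())
    return -(-word_count * 4 // 3)  # ceiling division without importing math

def fits_in_context(text, context_limit=4096, reserved_for_output=500):
    """Check that the text plus the expected answer stays under the limit.

    context_limit and reserved_for_output are assumed example values;
    look up the real limit for the model you use.
    """
    return estimate_tokens(text) + reserved_for_output <= context_limit
```

For example, fits_in_context(file_content) would tell you before sending anything whether a 3,000-word file is already too big for a 4k model once room for the answer is set aside.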

With that said, how do you upload documents to the OpenAI API?

Step 1: Install Python and PyCharm

The language we are going to use to upload a document to ChatGPT through the OpenAI API is Python. It is one of the most popular languages and it is heavily used in the machine learning field. Even if you’re a total beginner, it shouldn’t be that hard for you to get started.

Download and install Python:

You can use this link to do so: https://www.python.org/downloads/. Download the file and install it. If you need specific help for your operating system, follow this guide: https://wiki.python.org/moin/BeginnersGuide/Download

Download and install PyCharm Community Edition

You can download it from the JetBrains website. PyCharm is a great development tool that the big boys use, and it has a free Community Edition which will fit our needs just fine.

PyCharm is the software we will use to write our code (kind of like Notepad or any other text editor, but with special powers that make it perfect for writing Python code).

Step 2: Create a PyCharm Project

  1. Open PyCharm after you install it
  2. Click on New Project
  3. Call it something like ‘OpenAIUploadDocument’ or whatever you like
  4. Copy and paste the code below into the file called ‘main.py’:
import openai
import re

openai.api_key = 'YOUR-API-KEY'

def read_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

def ask_question(prompt, question):
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[
            {'role':'system', 'content':'You are a helpful assistant. The first prompt will be a long text, '
                                        'and any messages that you get after it will be regarding that text. Please answer any '
                                        'questions and requests having in mind the first prompt.'},
            {'role':'user', 'content': prompt},
            {'role':'user', 'content': question}
        ]
    )
    return response.choices[0].message['content']

# Read your file's content
file_content = read_file('FILENAME.txt')

# Ask a question about the file's content
question = input('What is your question?: ')
response = ask_question(file_content, question)
formatted_response = re.sub(r'(\. )', '.\n', response)

print(formatted_response)
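
The script above is written for versions of the openai package before 1.0. From version 1.0 on, openai.ChatCompletion no longer exists and requests go through a client object instead. The sketch below shows roughly how ask_question could look with the newer package. The helper build_messages is a name made up for this example, the system prompt is shortened, and 'YOUR-API-KEY' is the same placeholder as above.

```python
def build_messages(document_text, question):
    """Build the same kind of message list the script above sends."""
    return [
        {'role': 'system', 'content': 'You are a helpful assistant. The first prompt will be a long text, '
                                      'and any messages after it will be about that text.'},
        {'role': 'user', 'content': document_text},
        {'role': 'user', 'content': question},
    ]

def ask_question(document_text, question, model='gpt-3.5-turbo'):
    # Imported here so the helper above can be used without the package installed
    from openai import OpenAI  # needs openai 1.0 or later

    client = OpenAI(api_key='YOUR-API-KEY')
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(document_text, question),
    )
    # In the 1.x package the reply text is an attribute, not a dictionary key
    return response.choices[0].message.content
```

The rest of the script (reading the file, asking for the question, printing the answer) can stay as it is.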

Step 3: Install required packages and update the project with your data

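  1. You will see that openai at the top is highlighted in red. Hover over it for a few seconds and a popup should appear. Click on ‘Install package openai’. Note that the code above uses openai.ChatCompletion, which exists only in versions of the openai package before 1.0. If running the script fails with an error saying ChatCompletion is no longer supported, install version 0.28 instead (pip install openai==0.28).
  2. On line 4, you will see openai.api_key = ‘YOUR-API-KEY’. Replace YOUR-API-KEY with your actual API key. You can create a new one from here: https://platform.openai.com/account/api-keys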
  3. Create a TXT file with the name ‘FILENAME.txt’ in the same folder where you created your project. Right click your project folder, click on New -> File and name it FILENAME.txt.
    You can alternatively create a file with a different name, but you would need to update this line (line 24):
    file_content = read_file('FILENAME.txt')
    Change FILENAME.txt to your actual filename.
  4. Right click somewhere inside the editor and click on Run ‘main’
  5. A terminal will open at the bottom. It will say: ‘What is your question?:’. Type in your question regarding your data, and wait for the chatbot to respond.
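
Pasting the key straight into main.py works, but it is easy to share the file by accident with the key still in it. One possible alternative, sketched below, is to read the key from an environment variable. The helper name load_api_key is made up for this example, and OPENAI_API_KEY is assumed as the variable name; you would have to set it yourself in your operating system or in PyCharm's run configuration.

```python
import os

def load_api_key(variable_name='OPENAI_API_KEY'):
    """Read the API key from an environment variable instead of the source file."""
    api_key = os.environ.get(variable_name, '').strip()
    if not api_key:
        # Fail early with a readable message rather than a confusing API error later
        raise RuntimeError(f'Set the {variable_name} environment variable first')
    return api_key
```

With this in place, line 4 of main.py could become openai.api_key = load_api_key().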

This is pretty much it. This is a very simple implementation, and if there is interest I can create something a bit more complex, but it will get you started. You can make the process iterative and ask multiple questions, dynamically enter filenames / paths, etc. There are lots of possible improvements and cool things you can do from this point.
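
As one example of such an improvement, here is a minimal sketch of a loop that keeps asking questions about the same document until you enter a blank line. The name question_loop and its parameters are made up for this example; ask stands for any function that takes the document text and a question and returns the answer, such as ask_question from main.py.

```python
def question_loop(document_text, ask, read_input=input, show=print):
    """Ask questions about one document until the user enters a blank line.

    read_input and show default to input() and print(); they are parameters
    only so the loop can be tried out without a real terminal or API call.
    Returns the number of questions that were asked.
    """
    asked = 0
    while True:
        question = read_input('What is your question? (blank to quit): ').strip()
        if not question:
            break
        show(ask(document_text, question))
        asked += 1
    return asked
```

In main.py you would call question_loop(file_content, ask_question) in place of the single question at the end. Keep in mind that every question sends the whole document again, so each one uses up the full number of tokens, and the model does not see your earlier questions.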

Important Notes On Uploading Documents To ChatGPT using OpenAI API:

How long can the uploaded text be?

The text, together with the length of the output you expect, shouldn’t be more than the total context that the model you’re using allows. You can see the context limits for the models here: https://openai.com/pricing

We are using the default gpt-3.5-turbo here, so the context is around 4k tokens, which is about 3,000 words. You can go way above that with a different model: currently up to 32k tokens, but for that you need to use the pricey GPT-4.

Is my data safe when I use the API?

By default, OpenAI will not use your conversations through the API to train the model. However, they still keep them in the system for 30 days for abuse monitoring. With that in mind, make sure you do not send any sensitive data. From the OpenAI documentation, I found out that you can come to an agreement with OpenAI directly where they do not store any data at all, but that is only possible after a direct conversation with their sales team. So if you’re a company, you still have options. To understand how ChatGPT (the online chatbot itself) stores and handles data, check out my comprehensive article here: Does Chat GPT Save Data?

Conclusion: Why use the OpenAI API over ChatGPT to upload documents?

OpenAI gives you a simple way of using its API to communicate with different chatbots. However, it is much easier to use the ChatGPT chatbot itself, especially after the introduction of the Code Interpreter. Check my comprehensive article on the topic here: How To Upload A Document To ChatGPT. Even so, there are two main reasons to choose the API: data privacy and context limit.

OpenAI API is more private

By default, the conversations you have with the chatbot will not be used to train the model if you use the API. Additionally, if you are a company, you can negotiate a solution with OpenAI that doesn’t even store the data for the default 30 days.

OpenAI API has a bigger context limit option

Depending on the model you choose to upload the document to, you can have a context limit of up to a whopping 32k tokens. From what I could gather from around the web, it seems like the online chat versions, ChatGPT and ChatGPT Plus, have a context limit of around 4k tokens. A significant improvement I would say, and one that gives you many more options on the files you can use.