In this post, I talk about how I built an open-source Custom Content AI Chatbot with Upstash, Next.js, LangChain, and Fly.io. Upstash helped me schedule model training, enforce generous rate limiting, and cache OpenAI API responses.
Setting up Upstash Redis
Once you have created an Upstash account and are logged in, go to the Redis tab and create a database.
After you have created your database, go to the Details tab. Scroll down until you find the Connect your database section, then copy the contents and save them somewhere safe.
Also, scroll down until you find the REST API section and select the .env button. Copy the contents and save them somewhere safe.
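The .env button should give you something along these lines (the variable names match what the Upstash console exports; the values below are placeholders):

```
UPSTASH_REDIS_REST_URL="https://<your-database>.upstash.io"
UPSTASH_REDIS_REST_TOKEN="<your-rest-token>"
```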
Setting up Upstash QStash
Once logged in, go to the QStash tab and obtain the QSTASH_URL, QSTASH_TOKEN, QSTASH_CURRENT_SIGNING_KEY, and QSTASH_NEXT_SIGNING_KEY values. Copy them and save them somewhere safe.
Setting up the project
To set up, just clone the app repo and follow this tutorial to learn everything that's in it. To clone the project, run:
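The repository URL below is a placeholder; substitute the actual repo:

```bash
# Clone the project (replace <repo-url> with the actual repository URL)
git clone <repo-url> custom-content-ai-chatbot
cd custom-content-ai-chatbot
```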
Once you have cloned the repo, create a .env file and add the items we saved in the sections above.
It should look something like this:
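Here's a sketch of what that .env might contain, combining the Redis and QStash values from earlier with the OpenAI key and the ADMIN_KEY secret used later in this post (all values are placeholders):

```
UPSTASH_REDIS_REST_URL="https://<your-database>.upstash.io"
UPSTASH_REDIS_REST_TOKEN="<your-rest-token>"
QSTASH_URL="<your-qstash-url>"
QSTASH_TOKEN="<your-qstash-token>"
QSTASH_CURRENT_SIGNING_KEY="<your-current-signing-key>"
QSTASH_NEXT_SIGNING_KEY="<your-next-signing-key>"
OPENAI_API_KEY="<your-openai-api-key>"
ADMIN_KEY="<your-admin-secret>"
```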
After these steps, you should be able to start the local environment using the following command:
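Assuming the project uses npm (it may use yarn or pnpm instead):

```bash
# Install dependencies and start the Next.js dev server
npm install
npm run dev
```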
Repository Structure
This is the main folder structure for the project. The files called out below are discussed further in this post; they deal with managing the vector store, creating API routes for chatting with an AI trained on your custom content (with response caching), and scheduling the model training process.
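An approximate sketch of the layout, reconstructed from the files this post touches (your checkout will contain more):

```
custom-content-ai-chatbot/
├── pages/
│   └── api/
│       ├── chat.js    # chat endpoint: CORS, rate limiting, caching, streaming
│       └── train.js   # background training endpoint invoked by QStash
├── Dockerfile
├── fly.toml
├── .dockerignore
└── .env
```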
High-Level Data Flow and Operations
This is a high-level view of how the data flows and the operations that take place 👇🏻
When a user asks a question via the chatbot, the user's IP is checked against the rate limit; the response, if not already cached in Upstash Redis, is fetched from the OpenAI API (then cached) and streamed to the user
When an admin requests training of the existing model on a given set of URLs, Upstash QStash makes a POST request to the serverless endpoint after a given delay to fetch the content at those URLs and update the model (in the background)
Set Up Chat and Train API Routes in Next.js
In this section, we cover how we've set up two routes: pages/api/chat.js, which enables cross-origin requests, rate limits the chat API calls, caches and streams responses to users, and exposes a method to schedule content training on particular URLs; and pages/api/train.js, which solely performs training on the given URLs, in the background.
1. Enable CORS
Using the cors package, we've enabled CORS in the application so the chatbot can be used in multiple places, say as a bot on your website. As soon as the API route is initialized, we run the cors setup as below 👇🏻
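A minimal sketch of that setup, following the standard pattern for using the cors package inside a Next.js API route (the rest of the handler is elided):

```js
// pages/api/chat.js
import Cors from 'cors'

// Initialize the CORS middleware, allowing cross-origin calls to the chat API
const cors = Cors({ methods: ['GET', 'POST', 'OPTIONS'] })

// Helper to run an Express-style middleware inside a Next.js API route
function runMiddleware(req, res, fn) {
  return new Promise((resolve, reject) => {
    fn(req, res, (result) => {
      if (result instanceof Error) return reject(result)
      return resolve(result)
    })
  })
}

export default async function handler(req, res) {
  // Run the CORS middleware before any other processing
  await runMiddleware(req, res, cors)
  // ... rate limiting, caching, and the OpenAI call follow
}
```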
2. Schedule Content Training Request(s) on given URLs
With Upstash QStash, one can create APIs that are fire-and-forget. You don't need to actively wait for the main function to finish before returning a response; instead, the work happens in the background (optionally, after a given delay). It's like a cron job, except it runs once per request rather than at regularly scheduled intervals.
In the same chat API route, we accept a request carrying an admin-key header. If it matches the server-side secret (ADMIN_KEY), we schedule content training on the set of URLs passed in the request body after some delay (here, 10s). Once the delay elapses, the training request is made to a given endpoint (here: https://custom-content-ai-chatbot.fly.dev/api/train)
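A sketch of what that scheduling call might look like with the @upstash/qstash client; the request body shape and the 202 response are assumptions:

```js
import { Client } from '@upstash/qstash'

const qstash = new Client({ token: process.env.QSTASH_TOKEN })

// Inside the chat handler: only the admin may schedule training
if (req.headers['admin-key'] === process.env.ADMIN_KEY) {
  await qstash.publishJSON({
    url: 'https://custom-content-ai-chatbot.fly.dev/api/train',
    body: { urls: req.body.urls }, // URLs to fetch and index
    delay: 10, // seconds QStash waits before delivering the POST
  })
  return res.status(202).json({ message: 'Training scheduled.' })
}
```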
Now, let's dive into what's in the train API route (pages/api/train.js) 👇🏻
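A sketch of what such a route could look like; the train helper and the rate-limit key prefix are assumptions, not the repo's exact code:

```js
// pages/api/train.js
import { verifySignature } from '@upstash/qstash/nextjs'
import { Redis } from '@upstash/redis'
import { train } from '@/lib/train' // assumed helper that fetches URLs and updates the vector store

const redis = Redis.fromEnv()

async function handler(req, res) {
  const { urls } = req.body
  // 2. Fetch the URL content and add it to the existing vector store
  await train(urls)
  // 3. Clear cached responses, keeping the rate-limit keys intact
  const keys = await redis.keys('*')
  const stale = keys.filter((key) => !key.startsWith('@upstash/ratelimit'))
  if (stale.length) {
    // Delete the stale cache entries in a single Redis transaction
    const tx = redis.multi()
    stale.forEach((key) => tx.del(key))
    await tx.exec()
  }
  return res.status(200).json({ message: 'Trained on the given URLs.' })
}

// 1. verifySignature checks the Upstash-Signature header against the raw body,
// so Next.js's default body parsing must be disabled
export default verifySignature(handler)

export const config = { api: { bodyParser: false } }
```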
In the code above, we’re performing three critical actions:
Perform incoming request verification using QStash's verifySignature method. Under the hood, this looks for the Upstash-Signature header and verifies it against the raw body received.
Call the train function, which takes care of fetching the URL content and adding it to the existing vector store (and saving it).
Clear the cached responses in Upstash Redis, filtering out the keys that belong to the rate-limiting implementation, via a Redis transaction.
3. Rate Limiting
To implement rate limiting, we use the Upstash Redis database client and a rate-limiter library called @upstash/ratelimit.
Rate limiting is what let me make the service totally free and public, which in turn let me showcase the benefits of the system, i.e. the chat responses. Literally anyone can ask 30 questions a day via the website: we enforce a limit of 30 questions per day using the requester's IP address as the key.
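A sketch of that setup; the choice of fixedWindow below is an assumption, as the post doesn't say which algorithm the repo uses:

```js
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'

// Allow each IP at most 30 requests per day
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.fixedWindow(30, '1 d'),
})

// Inside the chat handler: identify the caller by IP
const ip = req.headers['x-forwarded-for'] ?? req.socket.remoteAddress
const { success } = await ratelimit.limit(String(ip))
if (!success) {
  return res.status(429).json({ message: 'Rate limit exceeded, try again tomorrow.' })
}
```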
4. Load the saved indexed vector store and ask OpenAI for responses
With all the checks done, we now head into the main work: calling the OpenAI API with our custom content and sending the response to the user. To simplify things, we'll break this into further parts:
4.1: Retrieving the Saved Vector Store
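A sketch of loading a vector store previously saved to disk; HNSWLib is one store LangChain supports, and the directory name here is an assumption:

```js
import { HNSWLib } from 'langchain/vectorstores/hnswlib'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'

// Load the index that the training step saved to disk
const vectorStore = await HNSWLib.load('vector-store', new OpenAIEmbeddings())

// Later: fetch the chunks most relevant to the user's question
const relevantDocs = await vectorStore.similaritySearch(userQuery, 4)
```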
4.2: Adding Prompt Guidelines to the User Queries
Using LangChain's PromptTemplate, we wrap the user query with instructions on how, and in what manner, the AI should answer the question:
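A sketch of such a template; the exact guideline wording below is an assumption:

```js
import { PromptTemplate } from 'langchain/prompts'

// Guidelines prepended to every user question
const template = PromptTemplate.fromTemplate(
  `Answer the question using only the provided context.
If the answer is not in the context, say you don't know rather than guessing.
Keep the answer concise.

Question: {question}`
)

const prompt = await template.format({ question: userQuery })
```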
4.3: Stream and Cache Responses
To cache responses with Upstash Redis, we'll make use of LangChain's UpstashRedisCache. We pass the existing Redis instance as the client, and hand the cache to the ChatOpenAI wrapper so responses are cached once they've been delivered:
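A sketch of the wiring; the callback body is an assumption about how the route streams tokens back:

```js
import { ChatOpenAI } from 'langchain/chat_models/openai'
import { UpstashRedisCache } from 'langchain/cache/upstash_redis'
import { Redis } from '@upstash/redis'

// Reuse the Upstash Redis instance as LangChain's cache backend
const cache = new UpstashRedisCache({ client: Redis.fromEnv() })

const model = new ChatOpenAI({
  cache, // store responses so repeat questions skip the OpenAI call
  streaming: true,
  callbacks: [
    {
      handleLLMNewToken(token) {
        // Forward each token to the HTTP response as it's generated
        res.write(token)
      },
    },
  ],
})

await model.predict(prompt)
res.end()
```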
That was a lot of learning! You’re all done now.
Deploy to Fly.io
The repository comes with a baked-in setup for Fly.io, specifically:
Dockerfile
fly.toml
.dockerignore
Deploying requires an account on Fly.io. Once you have an account, you can create and deploy the app by running the following commands in the root folder of your project:
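Assuming the flyctl CLI is installed and you're logged in:

```bash
# Create the Fly.io app (it picks up the baked-in fly.toml)
fly launch
# Set the secrets from your .env, e.g.:
# fly secrets set OPENAI_API_KEY=... UPSTASH_REDIS_REST_URL=...
# Deploy
fly deploy
```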
Now we are done with the deployment! Yes, that was all.
Conclusion
In conclusion, this project provided valuable experience in implementing OpenAI response caching, rate limiting, and scheduled API requests to train the model, all while using a service that scales with your needs, i.e. Upstash.