Efficient Article Summarization with QStash: Handling API Rate Limits and Parallel Processing
In this article, we'll build an application to summarise hundreds of online articles at once. To create these summaries, we'll use QStash's LLM integration to call an Upstash-hosted LLM. This not only allows us to bypass platform-specific function execution limits but also massively reduces our billed function execution duration.
You'll learn how to work around API rate limits, which could otherwise be a problem when making many calls in parallel. The result will be hundreds of neatly summarised online articles created at the same time, ready for you to read or further process.
Motivation
Almost all publicly available APIs have a rate limit applied to them, a maximum amount of requests you can make in a certain time frame. And, of course, depending on the API, hitting those limits is usually relatively easy. For example, Twitter is known for having very restrictive API rate limits, even for expensive premium tiers of their API.
If you depend on a rate-limited API for your service, you're forced to implement some kind of workaround (i.e. throttling) that leads to a more complex codebase.
With Upstash QStash, a message scheduler for the serverless environment, we don't need to worry about throttling mechanisms under high API load. Our API requests are automatically retried when hitting our rate limits to make sure every request gets processed.
Prerequisites
To follow along, you'll need:
- A basic understanding of Python and Django.
- An Upstash account to obtain your QStash token and Redis URL.
- A Vercel account to deploy the web application.
Project Overview
The project consists of two main components:
-
A Django web application that receives article summaries and saves them to our Redis database. We'll deploy this application to Vercel.
-
A Python script that sends articles to our Upstash hosted model for summarization using QStash's LLM API support. The script will iterate over 1000 articles stored in Redis, send each one to our model for summarization, and save the summaries back in Redis. We'll use QStash's queue system to handle the parallel processing of these tasks.
If we want to use one of OpenAI's models, we can still use QStash to handle the rate limits. What we need to do is create another endpoint in our Django application, call it from the Python script using QStash, call the OpenAI model to create the summary, and return the value of the x-ratelimit-reset-requests
header in the Retry-After
header to QStash to handle the rate limits.
Thankfully, when we use an Upstash-hosted model, and the rate limits are exceeded, QStash automatically schedules the retry of publishing or enqueuing chat completion tasks depending on the reset time of the rate limits. This way, we don't need to worry about handling the rate limits ourselves.
Project Setup
Install Necessary Packages
Install QStash Python SDK, Upstash Redis, Django, and Python-dotenv using pip:
QStash Python SDK is used to interact with QStash services, upstash-redis is used to communicate with our database, django is used to create the web application, and python-dotenv is used to load environment variables from a .env
file.
To use a Redis database, create a free account on Upstash and get your Redis URL. Follow the instructions in the Upstash Redis documentation to create one.
Create a Django Project
First, we need to set up a new Django project. Navigate where you'd like this project to live and run:
Configure Django Settings
In our settings.py
, we'll add summarizer
to INSTALLED_APPS
and set APPEND_SLASH
to False
. Also, add .vercel.app
and 127.0.0.1
to ALLOWED_HOSTS
to allow requests from Vercel and local development:
Add QStash configurations and other environment variables to a .env
file in the project root:
Load the environment variables into the project's settings.py
:
Finally add the following line to the wsgi.py
file to expose the application to Vercel:
Implementation
1. Creating a Django View to Use as a Callback URL
We'll create a Django view to use as our callback URL. This view will handle the summary data sent by QStash and save it in our Redis database. We will use the upstash_redis
package to interact with our Redis database. We will also add the csrf_exempt
decorator to the view to allow POST requests without CSRF tokens.
First, we decode the base64-encoded data, extract the summary, and save it to Redis using the article ID as the key.
2. Adding the URL Pattern for the Callback View
We will add the URL pattern for the callback view to the summarizer/urls.py
file of the summarizer
app:
3. Update the Project's URL Configuration
We will include the URL pattern for the summarizer
app in the project's article_summarizer/urls.py
file:
4. Deploy the Django Application
We will use Vercel to deploy our application. Before deploying, we need to create a vercel.json
file in the project root with the following configuration:
Then we will create a requirements file to specify the dependencies. We will run the following command to generate the requirements.txt
file:
We are now ready to deploy!
To easily deploy our app, we can create a GitHub repository and push our Django project to it. Then, create a new project on Vercel and connect it to our GitHub repository. After that, Vercel will handle the deployment process for us. After the deployment is complete, we will get a deployment URL that we can use as the callback URL and we need to set our environment variables in our project’s Settings -> Environment Variables. After we set our variables we will redeploy from the Deployments tab.
5. Creating the Queue and Sending Summarization Requests
We'll create a queue with parallelism set to 2, meaning two summarization tasks can run concurrently. Then, we'll iterate over 1000 articles stored in Redis, sending each one to our model for summarization. We'll also set the callback URL to our deployed Django application with the article ID as a query parameter.
Conclusion
And that's it! We now have an app that can summarize hundreds of web articles reliably and quickly using parallelism and automatic retries upon hitting our rate limits. By the way, I included a bonus for you: Use this article summary app to summarize any article and send the summary straight to your email inbox.
For more details, you can explore the Upstash QStash documentation. You can find the complete source code for this project on the GitHub repository. For any questions or feedback, feel free to reach out to me on LinkedIn.