Building an Auto-Generated News Quiz
On dr.dk, the website of the Danish Broadcasting Corporation (DR), there is a weekly quiz. It's a short quiz of 7 questions based on that week's news. As a daily reader, I always found the quizzes too short and too easy, and I wanted more of them, more frequently. And since it's no fun doing a quiz you wrote yourself, I built an automatically generated daily quiz. Try it here. It's in Danish, sorry. Here is how I built it:
Defining the Scope
DR's weekly quiz is often based on images or video clips, which are vastly more complex to process than plain text, so I decided to go with a purely text-based quiz for my MVP.
The original quiz has only 3 options, but I decided to go with 4 options to increase the difficulty.
After having a look at their RSS feeds to determine approximately how many articles they were writing per day, it seemed appropriate to make the quiz 10 questions long every day.
That leaves us with the following feature set:
Feature Scope
- Generate 4 quizzes every day based on the news in the last 24 hours in 4 categories.
- Each quiz must contain at least 3 questions, depending on how many articles are available.
- Each question must have 1 correct answer and 3 wrong answers.
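The feature scope above can be written down as a small data model plus a validation step. The sketch below is only an illustration in TypeScript: the type names (`Question`, `Quiz`), the field names, and the `validateQuiz` helper are assumptions made for this example, not the app's actual code.

```typescript
// Hypothetical data model for one generated quiz; all names are illustrative.
interface Question {
  text: string;
  correctAnswer: string;
  wrongAnswers: string[]; // expected to hold exactly 3 entries
}

interface Quiz {
  category: string;
  date: string; // ISO date string (YYYY-MM-DD)
  questions: Question[];
}

const MIN_QUESTIONS = 3;
const WRONG_ANSWERS_PER_QUESTION = 3;

// Returns a list of rule violations; an empty list means the quiz is in scope.
function validateQuiz(quiz: Quiz): string[] {
  const problems: string[] = [];
  if (quiz.questions.length < MIN_QUESTIONS) {
    problems.push(
      `quiz has ${quiz.questions.length} questions, needs at least ${MIN_QUESTIONS}`
    );
  }
  quiz.questions.forEach((q, i) => {
    if (q.wrongAnswers.length !== WRONG_ANSWERS_PER_QUESTION) {
      problems.push(
        `question ${i + 1} needs exactly ${WRONG_ANSWERS_PER_QUESTION} wrong answers`
      );
    }
    // The 4 options shown to the reader should all be distinct.
    const options = [q.correctAnswer, ...q.wrongAnswers];
    if (new Set(options).size !== options.length) {
      problems.push(`question ${i + 1} has duplicate options`);
    }
  });
  return problems;
}
```

Returning a list of problems rather than throwing makes it easy to log why an LLM-generated quiz was rejected before anything is stored.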
Architecting the App
I started out by writing a short step-by-step battle plan of the daily quiz-generation process, to get an overview before narrowing down the tech stack.
Battle Plan
- Gather the news articles from the last 24 hours from dr.dk.
- Filter interesting facts from the articles with an LLM.
- Generate the quiz questions and answers from those facts, also using an LLM.
- Store the quiz in a database.
- Get the quiz data from the database and render it on the front-end.
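The first four steps of the battle plan can be sketched as one small pipeline. This is a hedged outline rather than the real implementation: the `Article` and `QuizQuestion` shapes, the `PipelineSteps` interface, and both function names are hypothetical, and the scraping, LLM, and database calls are injected as plain functions instead of being shown. The last step, rendering on the front-end, is left out.

```typescript
// Hypothetical shapes; the real app's types are not shown in this post.
interface Article {
  title: string;
  body: string;
  publishedAt: Date;
}

interface QuizQuestion {
  text: string;
  correctAnswer: string;
  wrongAnswers: string[];
}

// Each step is injected so the flow can be read (and tested) without
// a scraper, an LLM, or a database being available.
interface PipelineSteps {
  fetchArticles: () => Promise<Article[]>;
  extractFacts: (article: Article) => Promise<string[]>;
  generateQuestions: (facts: string[]) => Promise<QuizQuestion[]>;
  storeQuiz: (questions: QuizQuestion[]) => Promise<void>;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Step 1 helper: keep only articles published in the 24 hours before `now`.
function publishedInLast24Hours(articles: Article[], now: Date): Article[] {
  return articles.filter((a) => {
    const age = now.getTime() - a.publishedAt.getTime();
    return age >= 0 && age <= DAY_MS;
  });
}

// Steps 1 to 4: gather, extract facts, generate questions, store.
async function runDailyPipeline(
  steps: PipelineSteps,
  now: Date
): Promise<QuizQuestion[]> {
  const recent = publishedInLast24Hours(await steps.fetchArticles(), now);
  const facts: string[] = [];
  for (const article of recent) {
    facts.push(...(await steps.extractFacts(article)));
  }
  const questions = await steps.generateQuestions(facts);
  await steps.storeQuiz(questions);
  return questions;
}
```

Passing the steps in as functions keeps the orchestration independent of the tech stack, so any single piece can be swapped without touching the flow.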
Choosing a Tech Stack
Then it was time to choose a tech stack. I knew I was gonna build the front-end with SvelteKit, simply because that's the front-end framework I have been using the most recently. I also wanted to try out Supabase for the database as an alternative to Firebase. And I always try to make my apps more engaging by adding some animations with Rive. Here is the tech stack I came up with:
Tech Stack
- JS framework: SvelteKit
- Animations: Rive
- Hosting: Vercel
- Database: Supabase
- Scraping: Puppeteer
- LLM: gpt-4o
- Cloud compute: Google Cloud
Problems Encountered
The following are the main problems I encountered while building this app:
Vercel Serverless Functions Maximum Dependency Size
Vercel's maximum dependency size for serverless functions didn't allow Puppeteer to be installed. Even after upgrading to Vercel Pro, the limit is still 50 MB.
One option would be to host on a VPS like DigitalOcean instead, but since Puppeteer requires 2 GB of RAM to run, renting a whole VPS that only does a few seconds of work per day is rather expensive.
Another option would be to use a headless browser service like Browserless, but it's even more expensive.
A third option I found was this repo, but I didn't want to rely on it for long-term use, and I came across some users reporting that they were being detected as bots while using it.
In the end, I decided to go with Google Cloud Run, which has a much higher maximum dependency size.
Quality of gpt-4o-mini
I first generated a bunch of quizzes with gpt-4o-mini, which cost me around USD 0.01 per day, but the quality, especially of the answers, was not good enough: it was often very obvious which answer was the correct one. I first tried to solve it with prompt engineering, but ultimately decided to go with gpt-4o, which brought the cost up to around USD 0.15 per day.
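To put the two daily figures in perspective, here is a quick back-of-the-envelope calculation. It assumes the quiz runs once a day for a full year at the approximate costs quoted above; the `yearlyCostUsd` helper is just an illustration.

```typescript
// Daily costs are the approximate figures quoted above; everything else is
// simple arithmetic under the assumption of one run per day, all year.
const DAILY_COST_USD = { "gpt-4o-mini": 0.01, "gpt-4o": 0.15 };

function yearlyCostUsd(dailyCostUsd: number, days: number = 365): number {
  // Round to whole cents to avoid floating-point noise.
  return Math.round(dailyCostUsd * days * 100) / 100;
}

const miniPerYear = yearlyCostUsd(DAILY_COST_USD["gpt-4o-mini"]);
const fullPerYear = yearlyCostUsd(DAILY_COST_USD["gpt-4o"]);
console.log(miniPerYear, fullPerYear);
```

With these figures, gpt-4o comes to roughly USD 55 per year versus under USD 4 for gpt-4o-mini, so the upgrade costs about 15 times more but stays small in absolute terms.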
Stripe not Supporting Taiwan
I am currently based in Taiwan, and I planned to monetize the app. I built my subscription system around Stripe, only to find out that Stripe doesn't support payouts to businesses in Taiwan, with no easy workarounds. Oh well, I will keep all the quizzes free for now.