Cost of Service

I'm being extremely transparent. It's a core value of mine. Hopefully you can learn from our decisions.

Costs to run UIUC.chat as of April 2025

We're becoming a production Illinois service. Let's take a look at our costs before we launch to campus (a full campus advertising campaign is anticipated in September 2025).

Core Services

LLM inference is the most expensive part of the app, but we pass that on to the user with a "BYO API Keys" model. Our cost: $0.

Frontend: $30/mo

Hosted on Vercel. We had to upgrade to the Pro tier for greater usage. We're doing 600k function invocations, dominated by our "polling" during document uploads. I'm working to cut the large amount of unnecessary polling.
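One common way to cut polling volume is to back off between status checks instead of polling at a fixed interval. The sketch below only counts requests to show the difference. The 2-second starting interval, the 60-second cap, the 10-minute upload duration, and the function names are all illustrative assumptions, not values taken from our frontend.

```python
def fixed_poll_count(duration_s: float, interval_s: float) -> int:
    """Number of polls issued at a constant interval over duration_s."""
    return int(duration_s // interval_s)


def backoff_poll_count(duration_s: float, first_s: float = 2.0,
                       factor: float = 2.0, max_s: float = 60.0) -> int:
    """Number of polls when each wait grows by `factor`, capped at max_s."""
    elapsed, wait, polls = 0.0, first_s, 0
    while elapsed + wait <= duration_s:
        elapsed += wait
        polls += 1
        wait = min(wait * factor, max_s)  # exponential growth up to the cap
    return polls


upload_s = 600  # assumed: a single 10-minute ingest job
fixed = fixed_poll_count(upload_s, interval_s=2.0)
backoff = backoff_poll_count(upload_s)
print(fixed, backoff)  # 300 13
```

Under these assumed numbers, backing off turns 300 status checks into 13 for the same upload, at the price of noticing completion up to a minute late.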

Backend: $67/mo

We host our Python backend and a few supporting services on Railway.

Beam.cloud: $15/mo

Beam.cloud runs our document ingest queue, and a few supporting functions for AI Tool use.

Databases

Postgres on Supabase: $49/mo (trending upwards)

We upgraded from the "Small" to the "Medium" instance in February 2025. Still, it seems a little undersized for our needs and occasionally locks up under heavy load.

I’m falling out of love with Supabase. (1) It “locks up” under heavy load, e.g. a user exporting their files while another user adds tons of new file uploads. (2) Using their (optional) SDK creates vendor lock-in. (3) The pricing is good, better than most, but only on par with AWS Aurora RDS. I’d use managed AWS RDS in the future, or self-hosted vanilla Postgres + PgBouncer.

Vector DB: $329/mo (with credits, trending down)

Qdrant is hosted on an AWS EC2 i3en.xlarge with all data stored in memory, which is not the most cost-effective setup. We pay for it with AWS credits supplied to the Center for AI Innovation.

We're going to move this somewhere more cost-effective.

Redis

Purchased via Redis Cloud on AWS Marketplace, just a flat rate $5/mo. Using AWS credits supplied to the Center for AI Innovation.

AWS S3 ($25/mo)

This ranges from $10-$30/mo, depending on egress costs.

Newsletter: $16/mo

Mailgun + Ghost (self-hosted) power news.uiuc.chat. Mailgun is the only supported provider for Ghost. We pay Mailgun a base of $15/mo plus usage, averaging $16/mo.

Nomic Atlas $100/mo

A fantastic startup creating visualizations of embedding spaces. We use this to (1) visualize all the documents a user has uploaded and (2) visualize all the conversations in each chatbot. Both have great filtering, search, clustering, and hierarchical topic labeling. It's pretty great. They give us $100/mo education pricing.

Total costs

| Service | $/mo | Notes |
| --- | --- | --- |
| Frontend | $30 | |
| Backend | $82 | |
| Databases | $439 | $329 is Qdrant, which we're moving somewhere cheaper. |
| Supporting services | $116 | Mailgun + Nomic Atlas. |
| Total | $667 | Soon to be $367 w/ cheaper Qdrant. Largest costs are covered by AWS credits. |
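As a quick sanity check, the category figures in the table above can be added up in a few lines. This is only arithmetic over numbers already quoted on this page. The variable names are illustrative, and only the Backend and Supporting services rows are broken into line items here.

```python
# Monthly figures (USD) copied from the April 2025 table.
category_costs = {
    "Frontend": 30,
    "Backend": 82,
    "Databases": 439,
    "Supporting services": 116,
}

# Line items quoted in the sections above for two of the categories.
backend_items = {"Railway": 67, "Beam.cloud": 15}
supporting_items = {"Mailgun newsletter": 16, "Nomic Atlas": 100}

# Each itemized category should match its row in the table.
assert sum(backend_items.values()) == category_costs["Backend"]
assert sum(supporting_items.values()) == category_costs["Supporting services"]

total = sum(category_costs.values())
print(f"Total: ${total}/mo")  # Total: $667/mo
```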

Costs when we were first starting (as of July 2024):

Language modeling

Our biggest cost by far is OpenAI, especially GPT-4. However, in most cases we pass this cost directly to the consumer with a "BYO API Keys" model. Soon we'll support a "BYO base_url" option so users can self-host or use alternative hosting providers (like Anyscale or Fireworks) to host their LLM.

Backend

Railway

Railway.app hosts our Python Flask backend. You pay per second of CPU and Memory usage. Our cost is dominated by memory usage, not CPU.

As of January 2024, our web crawling service is a separate Railway deployment. It costs $1-2/mo during idle periods for background memory usage. It's too early to tell the long-term cost of web scraping, but it should be minimal. I deployed it to Railway instead of serverless functions like Lambda because the Chrome browser is too large for Vercel's serverless functions. It is workable on Lambda, but my Illinois AWS account is blocked from that service.

Recent average $70/mo

Supabase

All data is stored in Supabase. It's replicated into other purpose-specific databases, like vectors in Qdrant and metadata in Redis. But Supabase is our "Source of Truth". Supabase is $25/mo for our tier.

Qdrant Vector Store

Our vector embeddings for RAG are stored in Qdrant, the best of the vector databases (used by OpenAI and Azure). It's "self-hosted" on AWS. It's an EC2 instance with the most memory per dollar, t4g.xlarge with 16GB of RAM, and a gp3 disk with increased IOPS for faster retrieval (it really helps). The disk is 60 GiB, 12,000 IOPS and 250 MB/s throughput. IOPS are important, throughput is not (because it's small text files).

This is our most expensive service since high-memory EC2 instances are expensive. $100/mo for the EC2 and $50/mo for the faster storage.

S3 document storage

S3 stores user-uploaded content that's not text, like PDFs, Word, PowerPoint, Video, etc. That way, when a user wants to click a citation to see the source document, we can show them the full source as it was given to us.

Currently this costs about $10/mo in storage + data egress fees.

Beam Serverless functions

We run highly scalable jobs, primarily document ingest, on Beam.cloud. It's wonderfully cheap and reliable. Highly recommend. Steady-state average of $5/mo so far.

Frontend

The frontend is React on Next.js, hosted on Vercel. We're still comfortably inside the free tier. If our usage increases a lot, we could pay $20/mo for everything we need.

Services

Sentry.io for error and latency monitoring. Free tier.
Posthog.com for usage monitoring and custom logs. Free tier... mostly.
Nomic for maps of embedding spaces. Educational free tier.
- As of August 2024 we started an Educational enterprise tier at $99/mo.
GitBook for public documentation. Free tier.
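For a rough sense of scale, the monthly figures quoted in this July 2024 section can be tallied as below. This is plain arithmetic over the numbers above, not a separately reported total. It leaves out the $1-2/mo idle crawler cost (it is unclear whether the $70 Railway average already includes it) and the Nomic tier that only started in August 2024. The free-tier services are counted as $0.

```python
# Monthly figures (USD) quoted in the July 2024 section.
july_2024_costs = {
    "Railway backend (recent average)": 70,
    "Supabase": 25,
    "Qdrant EC2 instance": 100,
    "Qdrant gp3 storage": 50,
    "S3 storage + egress": 10,
    "Beam serverless functions": 5,
    "Vercel frontend (free tier)": 0,
    "Sentry, Posthog, Nomic, GitBook (free tiers)": 0,
}

july_total = sum(july_2024_costs.values())
qdrant_total = (july_2024_costs["Qdrant EC2 instance"]
                + july_2024_costs["Qdrant gp3 storage"])

print(f"Qdrant: ${qdrant_total}/mo")  # Qdrant: $150/mo
print(f"Total: ${july_total}/mo")     # Total: $260/mo
```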
