Cost of Service

Core Services

Language modeling

Our biggest cost by far is OpenAI, especially GPT-4. However, in most cases we pass this cost directly to the consumer with a "BYO API Keys" model. Soon we'll support "BYO base_url" option users can self-host or use alternative hosting providers (like Anyscale or Fireworks) to host their LLM.

Backend

Railway

Railway.app hosts our Python Flask backend. You pay per second of CPU and Memory usage. Our cost is dominated by memory usage, not CPU.

As of January 2024, our web crawling service is a separate Railway deployment. It costs $1-2/mo during idle periods for background memory usage. Too early to tell the long-term cost of web scraping, but it should be minimal. I deployed it to Railway instead of serverless functions like Lambda because the Chrome browser is too large for Vercel's serverless. It is workable on Lambda, but my Illinois AWS account is blocked from that service.

Recent average $70/mo

Supabase

All data is stored in Supabase. It's replicated into other purpose-specific database, like Vectors in Qdrant and metadata in Redis. But Supabase is our "Source of Truth". Supabase is $25/mo for our tier.

Qdrant Vector Store

Our vector embeddings for RAG are stored in Qdrant, the best of the vector databases (used by OpenAI and Azure). It's "self-hosted" on AWS. It's an EC2 instance with the most memory per dollar, t4g.xlarge with 16GB of RAM, and a gp3 disk with increased IOPS for faster retrieval (it really helps). The disk is 60 GiB, 12,000 IOPS and 250 MB/s throughput. IOPS are important, throughput is not (because it's small text files).

This is our most expensive service since high-memory EC2 instances are expensive. $100/mo for the EC2 and $50/mo for the faster storage.

S3 document storage

S3 stores user-uploaded content that's not text, like PDFs, Word, PowerPoint, Video, etc. That way, when a user wants to click a citation to see the source document, we can show them the full source as it was given to us.

Currently this cost about $10/mo in storage + data egress fees.

Beam Serverless functions

We run highly scalable jobs, primarily document ingest, on Beam.cloud. It's wonderfully cheap and reliable. Highly recommend. Steady-state average of $5/mo so far.

Frontend

The frontend is React on Next.js, hosted on Vercel. We're still comfortably inside the free tier. If our usage increases a lot, we could pay $20/mo for everything we need.

Services

  • Sentry.io for error and latency monitoring. Free tier.

  • Posthog.com for usage monitoring and custom logs. Free tier... mostly.

  • Nomic for maps of embedding spaces. Educational free tier.

    • As of August 2024 we started an Educational enterprise tier at $99/mo.

  • GitBook for public documentation. Free tier.

Total costs

Last updated