System Architecture
The key priority of this architecture is developer velocity.
Vercel + Railway + Supabase + Beam has been a fantastic combo.
Everything runs in Docker. Vercel is the one exception, but we also have a Docker version.
Full stack frontend: React + Next.js
Backend: Python Flask
Used only for Python-specific features, such as advanced retrieval methods and Nomic document maps.
All other backend operations live in Next.js.
Databases
SQL: Postgres
Object storage: S3 / MinIO
Vector DB: Qdrant
Metadata: Redis - required for every page load
Required stateless services:
Document ingest queue (to handle spiky workloads without overwhelming our DBs): Python-RQ
User Auth: Keycloak (user data stored in Postgres)
Optional stateless add-ons:
LLM Serving: Ollama and vLLM
Web Crawling: Crawlee
Semantic Maps of documents and conversation history: Nomic Atlas
Optional stateful add-ons:
Tool use: N8N workflow builder
Error monitoring: Sentry
Google Analytics clone: PostHog
We use N8N as a user-friendly GUI for defining custom tools. This way, any user can give their chatbot custom tools, which the LLM invokes automatically when it decides they are appropriate.
User submits prompt
Determine whether tools should be invoked; if so, execute them and store the outputs.
Embed the user prompt with an LLM embedding model.
Retrieve the most relevant documents from the vector DB.
Robust prompt engineering to:
add as many documents as possible to the context window,
retain as much of the conversation history as possible,
include tool outputs and images,
include our user-configurable prompt engineering features (tutor mode, document references).
Send the final prompt-engineered message to the LLM and stream the result.
During streaming, replace LLM citations with proper links using a state machine. For example, [doc 1, page 3] is replaced with https://s3.link-to-document.pdf?page=3
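The citation-replacement step can be sketched as a small two-state machine that works on streamed chunks, so a citation split across chunk boundaries is still recognized. This is a minimal illustration, not the project's actual implementation: the class name CitationLinker, the helper link_citations, the doc_urls mapping from document number to URL, and the 40-character buffer limit are all assumptions made for the example. Only the citation shape [doc N, page M] and the ?page= link format come from the example above.

```python
import re
from typing import Dict, Iterable, Iterator

# Citation shape taken from the example "[doc 1, page 3]".
CITATION = re.compile(r"\[doc (\d+), page (\d+)\]")
# Assumed upper bound: longer bracketed text is passed through unchanged.
MAX_CITATION_LEN = 40


class CitationLinker:
    """Hypothetical two-state machine.

    TEXT state:    characters are passed straight through.
    BRACKET state: characters are buffered until "]" closes the bracket.
    """

    def __init__(self, doc_urls: Dict[int, str]) -> None:
        self.doc_urls = doc_urls
        self.buffer = ""  # non-empty only while in the BRACKET state

    def feed(self, chunk: str) -> str:
        """Consume one streamed chunk and return the text that is safe to emit."""
        out = []
        for ch in chunk:
            if not self.buffer:  # TEXT state
                if ch == "[":
                    self.buffer = ch
                else:
                    out.append(ch)
                continue
            if ch == "[":  # nested bracket: release the old buffer, start over
                out.append(self.buffer)
                self.buffer = ch
                continue
            self.buffer += ch
            if ch == "]":
                out.append(self._resolve(self.buffer))
                self.buffer = ""
            elif len(self.buffer) > MAX_CITATION_LEN:
                out.append(self.buffer)  # too long to be a citation
                self.buffer = ""
        return "".join(out)

    def flush(self) -> str:
        """Return any unterminated bracket text at the end of the stream."""
        rest, self.buffer = self.buffer, ""
        return rest

    def _resolve(self, text: str) -> str:
        match = CITATION.fullmatch(text)
        if match is None:
            return text  # ordinary bracketed text, not a citation
        url = self.doc_urls.get(int(match.group(1)))
        if url is None:
            return text  # unknown document number: leave the citation as-is
        return f"{url}?page={match.group(2)}"


def link_citations(chunks: Iterable[str], doc_urls: Dict[int, str]) -> Iterator[str]:
    """Wrap a stream of LLM chunks, yielding text with citations replaced."""
    linker = CitationLinker(doc_urls)
    for chunk in chunks:
        piece = linker.feed(chunk)
        if piece:
            yield piece
    tail = linker.flush()
    if tail:
        yield tail
```

Feeding the chunks "See [do", "c 1, pa", "ge 3] and [note]." with doc 1 mapped to https://s3.link-to-document.pdf yields "See https://s3.link-to-document.pdf?page=3 and [note]." once joined. Buffering only while inside a bracket keeps added latency small: plain text is forwarded immediately, and only a possible citation is held back until it closes.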
Simplify to a single Docker Compose script.
Postgres: Main or "top-level" storage; contains pointers to all other DBs and additional metadata.
MinIO: File storage (pdf/docx/mp4)
Redis/Valkey: User and project metadata; fast retrieval is needed for every page load.
Qdrant: Vector DB for document embeddings.