Codebase location: /zweistein-dev/zweistein-dev/
Zweistein is a multi-service AI platform comprising a React admin panel, a NestJS API server, and two Python microservices (Ingestion Worker and Query Engine). Together they power knowledge-based AI agents, file processing pipelines, vector-search retrieval, multi-model LLM orchestration, and a real-time chat interface.
1. Architecture Overview
graph TB
subgraph "Tier 1 — Frontend"
ADMIN["Admin Panel<br/>(React + Vite + TS)<br/>Port 5173"]
end
subgraph "Tier 2 — API Server"
SERVER["NestJS Server<br/>Port 3000"]
SWAGGER["Swagger Docs<br/>/ai/docs"]
WEBSOCKET["Socket.IO<br/>WebSocket Gateway"]
end
subgraph "Tier 3 — Python Services"
QE["Query Engine<br/>(FastAPI, Port 8000)"]
IW["Ingestion Worker<br/>(Redis Consumer)"]
end
subgraph "Data Stores"
PG["PostgreSQL"]
QDRANT["Qdrant<br/>(Vector DB)"]
REDIS["Redis Streams<br/>(Job Queue)"]
GCS["Google Cloud Storage<br/>(File Storage)"]
end
subgraph "External AI Services"
OPENAI["OpenAI<br/>(GPT-4o/5.x, Whisper,<br/>Embeddings)"]
ANTHROPIC["Anthropic<br/>(Claude Opus 4.6)"]
GOOGLE["Google<br/>(Gemini 2.x/3.x)"]
PERPLEXITY["Perplexity<br/>(Sonar)"]
GROQ["Groq<br/>(Llama 3.3)"]
ELEVENLABS["ElevenLabs<br/>(TTS)"]
FAL["fal.ai<br/>(Image Gen)"]
end
ADMIN -->|"HTTP /ai/api/*"| SERVER
ADMIN -->|"WebSocket"| WEBSOCKET
SERVER -->|"HTTP Proxy"| QE
SERVER -->|"xadd jobs"| REDIS
REDIS -->|"xreadgroup"| IW
IW -->|"Notifications"| REDIS
REDIS -->|"xreadgroup notifications"| SERVER
SERVER --> PG
SERVER --> GCS
IW --> GCS
IW --> QDRANT
IW --> OPENAI
QE --> QDRANT
QE --> OPENAI
QE --> ANTHROPIC
QE --> GOOGLE
QE --> PERPLEXITY
QE --> GROQ
QE --> ELEVENLABS
QE --> FAL
How the tiers connect:
| Connection | Protocol | Purpose |
|---|---|---|
| Admin --> Server | HTTP REST + WebSocket (Socket.IO) | All UI operations, real-time updates |
| Server --> Query Engine | HTTP (internal, PYTHON_SERVER_URL) | LLM queries, agent calls, chat, image search |
| Server --> Redis Streams | xadd to stream:zweistein | Dispatch file/URL processing jobs |
| Redis --> Ingestion Worker | xreadgroup consumer groups | Workers pick up and process jobs |
| Ingestion Worker --> Redis | xadd to stream:zweistein:notifications | Notify server when processing completes |
| Server <-- Redis Notifications | xreadgroup on notification stream | Server receives completion events, pushes to UI via WebSocket |
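
The job hand-off between the server and the Ingestion Worker can be sketched as follows. This is a minimal sketch, assuming the field names shown in the Data Processing Pipeline diagram (`type`, `conceptId`, `entityId`, `cloudFilePath`, `filename`); `build_file_job` is a hypothetical helper for illustration, not a function from the codebase. Redis Streams entries are flat field/value maps, so every value is stringified before dispatch.

```python
from typing import Dict

# Stream names as listed in the connection table.
JOB_STREAM = "stream:zweistein"
NOTIFICATION_STREAM = "stream:zweistein:notifications"


def build_file_job(concept_id: str, entity_id: str,
                   cloud_file_path: str, filename: str) -> Dict[str, str]:
    """Build the flat field map for a "file" job (hypothetical helper)."""
    fields = {
        "type": "file",
        "conceptId": concept_id,
        "entityId": entity_id,
        "cloudFilePath": cloud_file_path,
        "filename": filename,
    }
    # Stream entries hold field/value pairs, so coerce every value to str.
    return {key: str(value) for key, value in fields.items()}


job = build_file_job("concept-1", "file-42",
                     "tenant-a/concept-1/report.pdf", "report.pdf")
# With redis-py, a producer would then call: client.xadd(JOB_STREAM, job)
# and a worker would read it back with client.xreadgroup(...).
print(job["type"], job["filename"])
```

Keeping the payload builder separate from the Redis call makes the job shape easy to unit-test without a running Redis instance.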
2. Admin Panel
Path: admin/
Purpose
React-based frontend for managing AI bots, knowledge concepts, agents, agentic apps, controls (visual components), conversations, and collections. Serves as the primary UI for the Zweistein AI platform.
Tech Stack
| Technology | Version | Purpose |
|---|---|---|
| React | 18.3 | UI framework |
| Vite | 5.3 | Build tool and dev server |
| TypeScript | 5.2 | Type safety |
| Tailwind CSS | 3.4 | Styling |
| Socket.IO Client | 4.7 | Real-time communication |
| Zustand | 4.5 | State management |
| SWR | 2.2 | Data fetching / caching |
| React Router | 6 | Client-side routing |
| React Query | 3.39 | Server state management |
| Storybook | 8.2 | Component development |
Key Features
- Monaco Editor (@monaco-editor/react): In-browser code editor for controls and agent configuration
- Deepgram Audio (@deepgram/sdk): Real-time audio transcription
- Voice Activity Detection (@ricky0123/vad-react): Detect when user is speaking
- TipTap (@tiptap/*): Rich text editor with mention support
- ReactFlow (reactflow): Visual node-based agent graph editor
- Module Federation (@originjs/vite-plugin-federation): Exposes ./sdk for embedding in other apps
- Stripe / Paddle: Payment integration for subscriptions
- Auth0 (@auth0/auth0-react): Authentication and authorization
- PostHog (posthog-js): Product analytics
Page Structure
admin/src/pages/
agent-threads/ # Agent execution threads and history
agentic-apps/ # Agentic app builder and management
agents/ # AI agent configuration
atoms/ # Atomic UI component demos
blinkbot/ # BlinkBot chatbot interface
bots/ # Bot configuration and deployment
internal-tests/ # Internal testing tools
knowledge-base/ # Knowledge concept management (files, URLs, conversations)
plans/ # Subscription plan management
layout.tsx # Main layout wrapper
payment-success.tsx # Payment success callback
payment-cancel.tsx # Payment cancel callback
Entry Points
| Entry | HTML File | Route Pattern | Purpose |
|---|---|---|---|
| Main app | index.html | /ai/* | Admin dashboard |
| Chat widget | chat.html | /ai/chat* | Embeddable chat interface |
| Public view | public.html | /ai/public* | Public-facing chatbot pages |
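
Because `/ai/*` also matches the chat and public routes, the order in which patterns are checked matters. The sketch below illustrates one way to resolve a path to its entry file; it assumes the route patterns are simple prefixes and that the most specific pattern wins, which the table does not state explicitly. `resolve_entry` is a hypothetical name.

```python
from typing import Optional

# Most specific prefixes first; "/ai/" would otherwise shadow the other two.
ENTRY_POINTS = [
    ("/ai/chat", "chat.html"),
    ("/ai/public", "public.html"),
    ("/ai/", "index.html"),
]


def resolve_entry(path: str) -> Optional[str]:
    """Return the HTML entry file for a request path, or None if unmatched."""
    for prefix, html_file in ENTRY_POINTS:
        if path.startswith(prefix):
            return html_file
    return None


for sample in ("/ai/chat/widget", "/ai/public/bot-1", "/ai/agents", "/healthz"):
    print(sample, "->", resolve_entry(sample))
```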
Build & Dev
# Development
cd admin && yarn dev # Starts Vite dev server on port 5173
# Production build
cd admin && yarn build # TypeScript check + Vite build
# Storybook
cd admin && yarn storybook # Component explorer on port 6006
3. NestJS Server
Path: server/
Module Architecture
graph TB
subgraph "Core Infrastructure"
APP["AppModule"]
AUTHZ["AuthzModule<br/>(Auth0 JWT, RBAC)"]
CONFIG["ConfigModule"]
TYPEORM["TypeORM<br/>(PostgreSQL)"]
EVENTS["EventEmitterModule"]
REDIS_MOD["RedisStreamsModule<br/>(Job Queue)"]
end
subgraph "Entity Management (CRUD)"
CONCEPTS["ConceptsModule"]
BOTS["BotsModule"]
AGENTS["AgentsModule"]
INTERVIEWS["InterviewsModule"]
AGENTIC_APP["AgenticAppModule"]
EXT_USER["ExternalUserModule"]
EXT_USER_GRP["ExternalUserGroupModule"]
end
subgraph "Chat System"
CONVERSATIONS["ConversationsModule"]
MESSAGES["MessagesModule"]
CHATBOT["ChatbotModule"]
REALTIME["RealtimeCollaborationModule<br/>(Socket.IO)"]
FAVOURITES["FavouritesModule"]
MSG_FEEDBACK["MessagesFeedbackModule"]
end
subgraph "AI & Processing"
QUERY["QueryModule<br/>(Proxy to Python)"]
DATA_PROC["DataProcessingModule"]
CRAWLER["CrawlerModule"]
VOICE["VoiceModule"]
MCP["McpServerModule"]
CONTROLS["ControlsModule<br/>(AI-Generated UI)"]
QUICK_ACT["QuickActionsModule"]
GEN_ANY["GenerateAnythingModule"]
end
subgraph "Storage & Files"
FILES["FilesModule"]
CLOUD["CloudStorageModule<br/>(GCS / Azure)"]
TENANT_STORE["TenantStorageModule"]
end
subgraph "Billing & Subscriptions"
BILLING["BillingModule<br/>(Stripe)"]
SUBS["SubscriptionsModule"]
WALLET["WalletModule"]
USAGE["UsageModule"]
PLANS["PlansModule"]
end
subgraph "Other"
DASHBOARD["DashboardModule"]
COLLECTION["CollectionModule"]
FAVORITES["FavoritesModule"]
PICASSO["PicassoModule"]
MAILBOX["MailboxModule"]
TENANT_TOOL["TenantToolConfigModule"]
EXT_AUTH["ExternalAuthModule"]
INT_TESTS["InternalTestsModule"]
end
APP --> AUTHZ
APP --> CONFIG
APP --> TYPEORM
APP --> EVENTS
APP --> REDIS_MOD
APP --> CONCEPTS
APP --> BOTS
APP --> AGENTS
APP --> FILES
APP --> CHATBOT
APP --> QUERY
APP --> DATA_PROC
APP --> CONTROLS
APP --> BILLING
APP --> WALLET
APP --> SUBS
CHATBOT --> CONVERSATIONS
CHATBOT --> MESSAGES
CHATBOT --> REALTIME
FILES --> CLOUD
FILES --> TENANT_STORE
FILES --> REDIS_MOD
QUERY --> CONFIG
DATA_PROC --> REDIS_MOD
BILLING --> SUBS
Key Modules Table
| Module | Path | Purpose |
|---|---|---|
| AuthzModule | server/src/authz/ | Auth0 JWT validation, RBAC guards, tenant scoping |
| FilesModule | server/src/files/ | File upload/download, GCS integration, job dispatch |
| QueryModule | server/src/query/ | Proxies queries to Python Query Engine |
| ChatbotModule | server/src/chat/chatbot/ | Orchestrates chat: SSE streaming from Python, message persistence |
| ConversationsModule | server/src/chat/conversations/ | CRUD for conversation threads |
| MessagesModule | server/src/chat/messages/ | CRUD for chat messages |
| RealtimeCollaborationModule | server/src/chat/realtime-collaboration/ | Socket.IO gateway for real-time updates |
| ControlsModule | server/src/controls/ | AI-generated React components (Controls) |
| DataProcessingModule | server/src/data-processing/ | Event-driven file processing pipeline management |
| RedisStreamsModule | server/src/redis-streams/ | Redis Streams producer/consumer for job queue |
| CrawlerModule | server/src/crawler/ | Web crawling and link extraction |
| BillingModule | server/src/billing/ | Stripe integration, model token pricing |
| SubscriptionsModule | server/src/subscriptions/ | Subscription plan management |
| WalletModule | server/src/wallet/ | Credit-based usage wallet (balance, transactions) |
| UsageModule | server/src/quotas/ | Quota tracking and enforcement |
| DashboardModule | server/src/dashboard/ | Dashboard analytics (recently used, popular items) |
| VoiceModule | server/src/voice/ | Voice input/output (STT/TTS) |
| McpServerModule | server/src/mcp-servers/ | Model Context Protocol server management |
| CollectionModule | server/src/collection/ | Collections of concepts and items |
| ConceptsModule | server/src/crud-entities/concepts/ | Knowledge concepts (knowledge bases) |
| BotsModule | server/src/crud-entities/bots/ | Bot configuration and deployment |
| AgentsModule | server/src/crud-entities/agents/ | Agent definitions and configurations |
| AgenticAppModule | server/src/crud-entities/agentic-apps/ | Agentic application definitions |
| QuickActionsModule | server/src/quickactions/ | Quick media actions (transcription, summarization) |
| GenerateAnythingModule | server/src/generate-anything/ | Versatile content generation |
| PicassoModule | server/src/picasso/ | Integration with Blinkin Studio (Picasso) |
| MailboxModule | server/src/mailbox/ | Email ingestion via Mailgun |
| CloudStorageModule | server/src/cloudstorage/ | Abstraction layer for GCS / Azure Blob |
| TenantStorageModule | server/src/tenantstorage/ | Tenant-scoped file storage |
| ExternalAuthModule | server/src/external-auth/ | External user authentication settings |
| TenantToolConfigModule | server/src/tenant-tool-config/ | Per-tenant tool configurations |
API Endpoints Table
| Method | Route | Module | Description |
|---|---|---|---|
| Files | | | |
| GET | /api/files/:conceptId/:group? | Files | List files for a concept |
| GET | /api/files/with-full-urls/:conceptId/:group? | Files | List files with signed URLs |
| POST | /api/files/:conceptId/:group | Files | Upload files |
| POST | /api/files/:conceptId/:group/uploadFromUrl | Files | Add file from URL |
| PUT | /api/files/:conceptId/:group/:fileId | Files | Replace a file |
| PUT | /api/files/:conceptId/:group/:fileId/text | Files | Replace file text content |
| POST | /api/files/:conceptId/:group/:fileId/regenerate | Files | Regenerate file processing |
| DELETE | /api/files/:conceptId/:group/:fileId | Files | Delete a file |
| POST | /api/files/getUrl/:fileId | Files | Get signed download URL |
| GET | /api/files/getFile/:fileId | Files | Stream file content |
| GET | /api/files/getFile/:fileId/text | Files | Get text transcript of file |
| Query | | | |
| POST | /api/query/:conceptId | Query | Query a knowledge concept |
| POST | /api/query/chat3/:conceptId | Query | Chat with a concept (proxied to Python) |
| POST | /api/query/ask-about-image | Query | Ask a question about an image |
| Controls | | | |
| POST | /api/controls | Controls | Create a control |
| GET | /api/controls | Controls | List all controls |
| GET | /api/controls/:id | Controls | Get control by ID |
| PUT | /api/controls/:id | Controls | Update a control |
| DELETE | /api/controls/:id | Controls | Delete a control |
| POST | /api/controls/generate | Controls | AI-generate a React control |
| POST | /api/controls/fix-errors | Controls | AI-fix code errors |
| POST | /api/controls/edit-with-ai | Controls | AI-edit code |
| POST | /api/controls/regenerate-json | Controls | Regenerate sample JSON |
| POST | /api/controls/regenerate-schema/:id | Controls | Regenerate JSON schema |
| POST | /api/controls/duplicate/:id | Controls | Duplicate a control |
| Wallet | | | |
| GET | /api/wallet | Wallet | Get wallet balance |
| GET | /api/wallet/usage-history | Wallet | Get usage event history |
| GET | /api/wallet/transaction-history | Wallet | Get transaction history |
| Billing | | | |
| Various | /api/billing/* | Billing | Stripe checkout, webhooks, pricing |
| Subscriptions | | | |
| Various | /api/subscriptions/* | Subscriptions | Plan management, tenant subscriptions |
| Dashboard | | | |
| Various | /api/dashboard/* | Dashboard | Recently used items, popular in org |
| Collection | | | |
| Various | /api/collection/* | Collection | Collection CRUD |
| Crawler | | | |
| Various | /api/crawler/* | Crawler | Web crawling |
| Auth | | | |
| Various | /api/auth/* | Authz | Login, token refresh, user info |
| Health | | | |
| GET | /healthz | Core | Health check |
Key File Paths
server/src/
main.ts # Bootstrap: port 3000, Swagger at /ai/docs
app.module.ts # Root module importing all feature modules
authz/ # Auth0 JWT strategy, RBAC guards, decorators
files/
files.controller.ts # File upload/download REST endpoints
files.service.ts # File business logic, GCS interactions
file.entity.ts # TypeORM entity for files
chat/
chatbot/chatbot.ts # Main chatbot controller + module
chatbot/llm.service.ts # LLM orchestration for chat
chatbot/proxy.service.ts # Proxy to Python SSE streams
conversations/ # Conversation entity and CRUD
messages/ # Message entity and CRUD
realtime-collaboration/ # Socket.IO broadcast service
query/
query.controller.ts # Proxy to Python Query Engine
proxy.service.ts # HTTP proxy helper
redis-streams/
redis-streams.module.ts # Dynamic module for Redis Streams
jobs.service.ts # Job producer (xadd)
jobs-listener.service.ts # Notification consumer (xreadgroup)
controls/
controls.controller.ts # CRUD + AI generation endpoints
controls.entity.ts # Control entity (React component code, JSON schema)
llm/llm.service.ts # LLM calls for code generation
wallet/
wallet.controller.ts # Balance and usage history
wallet.service.ts # Wallet business logic
entities/ # Wallet, UsageEvent, UsageTransaction, PriceTag
billing/
billing.controller.ts # Stripe integration
entity/ # Billing entity, model-token pricing
crud-entities/
concepts/ # Knowledge concept CRUD
bots/ # Bot CRUD
agents/ # Agent CRUD
agentic-apps/ # Agentic app CRUD
interviews/ # Interview CRUD
external-user/ # External user management
external-user-group/ # External user group management
base/base.entity.ts # Base entity with tenant scoping
4. Python Ingestion Worker (Detailed)
Path: python_server/ingestion_worker/
Purpose
Consumes file/URL/conversation processing jobs from Redis Streams, extracts text content from various file types, generates vector embeddings via OpenAI, and indexes them into Google File Search (via Google GenAI) for retrieval. It also handles image explanation and audio transcription through external services.
Processing Pipeline
graph LR
subgraph "Job Source"
REDIS_STREAM["Redis Stream<br/>stream:zweistein"]
end
subgraph "Worker Core"
RW["RedisWorker<br/>(Consumer Group)"]
MP["MessageProcessor"]
end
subgraph "Indexers"
DOC["DocumentIndexer<br/>(PDFs, text, office docs)"]
SCRAPER["ScraperIndexer<br/>(Websites)"]
CONV["ConversationIndexer<br/>(Chat transcripts)"]
IMG["ImageIndexer<br/>(Images → OCR)"]
GFS["GoogleFileSearchIndexer<br/>(Google GenAI File Search)"]
end
subgraph "External Services"
WHISPER["Whisper<br/>(Audio transcription)"]
IMAGE_API["IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / Image description)"]
AUDIO_API["AUDIO_TRANSCRIBER_ENDPOINT<br/>(Remote audio transcription)"]
QE_TRANSCRIBE["Query Engine<br/>/quick-actions/media/transcribe"]
end
subgraph "Storage"
GCS_OUT["Google Cloud Storage<br/>(Transcripts)"]
QDRANT_OUT["Google File Search<br/>(Vector Index)"]
end
REDIS_STREAM -->|"xreadgroup"| RW
RW -->|"route by type"| MP
MP -->|"file"| DOC
MP -->|"file (youtube)"| QE_TRANSCRIBE
MP -->|"file (website)"| SCRAPER
MP -->|"url"| SCRAPER
MP -->|"conversation"| CONV
MP -->|"file (image)"| IMG
DOC -->|"video/audio"| WHISPER
DOC -->|"documents"| GFS
IMG -->|"explain"| IMAGE_API
IMG --> GFS
SCRAPER --> GFS
CONV --> DOC
QE_TRANSCRIBE --> GFS
GFS -->|"index text"| QDRANT_OUT
DOC -->|"upload transcript"| GCS_OUT
MP -->|"upload transcript"| GCS_OUT
Job Types
| Job Type | Source | Processing |
|---|---|---|
| file | File uploaded via UI | Route by file type: document, image, video/audio, YouTube, website |
| file.text-replaced | User edits file text | Delete old index, re-index with new text |
| file.deleted | File deleted via UI | Remove from vector index |
| concept.deleted | Concept deleted | Delete entire Google File Search store |
| url | URL added to concept | Crawl and scrape website content, index |
| conversation | Chat conversation indexed | Summarize and index conversation content |
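
Routing by job type maps naturally onto a dispatch table. The following is a minimal sketch of that idea, not the actual `MessageProcessor` implementation: the handler functions are hypothetical stand-ins that only return a description of the action listed in the table.

```python
from typing import Callable, Dict

# Hypothetical handlers; real ones would call the indexers.
def route_file(job: dict) -> str:
    return "route by file type"

def reindex_text(job: dict) -> str:
    return "delete old index, re-index new text"

def remove_file(job: dict) -> str:
    return "remove from vector index"

def delete_store(job: dict) -> str:
    return "delete File Search store"

def crawl_url(job: dict) -> str:
    return "crawl, scrape, index"

def index_conversation(job: dict) -> str:
    return "summarize and index"

# Job type strings are taken from the table above.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "file": route_file,
    "file.text-replaced": reindex_text,
    "file.deleted": remove_file,
    "concept.deleted": delete_store,
    "url": crawl_url,
    "conversation": index_conversation,
}


def process(job: dict) -> str:
    """Look up the handler for job["type"] and run it."""
    job_type = job.get("type", "")
    handler = HANDLERS.get(job_type)
    if handler is None:
        raise ValueError(f"unsupported job type: {job_type!r}")
    return handler(job)


print(process({"type": "file.deleted", "entityId": "file-42"}))
```

Rejecting unknown types explicitly keeps a malformed message from being silently acknowledged and dropped.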
File Type Handling
| File Type | Detection | Processing Pipeline |
|---|---|---|
| PDF | filetype.guess() → not video/audio/image | DocumentIndexer.index_document_file() → parse → chunk → GFS index |
| Text/Markdown | .md or .txt extension | DocumentIndexer.index_document_file() → chunk → GFS index |
| Image | MIME image/* | ImageIndexer.index_image() → IMAGE_EXPLAINER_ENDPOINT → GFS index |
| Video | MIME video/* (.mp4, .avi, .webm) | DocumentIndexer.index_file_containing_audio() → Whisper transcription → GFS index |
| Audio | MIME audio/* (.mp3, .wav) | DocumentIndexer.index_file_containing_audio() → Whisper transcription → GFS index |
| YouTube | group == "youtube" | SSE call to Query Engine /quick-actions/media/transcribe → GFS index |
| Website | group == "websites" | spider_rs scraping → readabilipy HTML cleanup → markdown → GFS index |
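
The detection rules in the table can be approximated with a small routing function. This is a sketch under stated assumptions: it uses the standard library's extension-based `mimetypes` module as a stand-in for the content-based `filetype.guess()` the worker uses, and the pipeline labels it returns are hypothetical names, not identifiers from the codebase.

```python
import mimetypes
from pathlib import Path


def choose_pipeline(filename: str, group: str = "") -> str:
    """Pick a processing pipeline label for an uploaded file."""
    # Group-based routes take priority over MIME detection.
    if group == "youtube":
        return "youtube-transcribe"
    if group == "websites":
        return "website-scrape"
    # Plain text and Markdown are recognized by extension.
    if Path(filename).suffix.lower() in {".md", ".txt"}:
        return "document"
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or ""
    if mime.startswith("image/"):
        return "image"
    if mime.startswith(("video/", "audio/")):
        return "audio-transcription"
    # Everything else (PDFs, office documents) falls through to the
    # document indexer.
    return "document"


for name, grp in [("report.pdf", ""), ("photo.png", ""),
                  ("clip.mp4", ""), ("talk", "youtube")]:
    print(name, grp or "-", "->", choose_pipeline(name, grp))
```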
External Service Endpoints
| Service | Setting | Default | Purpose |
|---|---|---|---|
| Image Explainer | IMAGE_EXPLAINER_ENDPOINT | https://ocr.akjo.tech/explain-image | OCR and image description |
| Audio Transcriber | AUDIO_TRANSCRIBER_ENDPOINT | https://ocr.akjo.tech/transcribe-audio | Remote audio transcription |
| Query Engine | QUERY_ENGINE_URL | http://queryengine-service | YouTube transcription via SSE |
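
These settings follow the usual pattern of environment variables with built-in defaults. The worker's `settings.py` uses Pydantic; the sketch below shows the same idea with only the standard library, so `WorkerSettings` and `load_settings` are hypothetical names and the structure is an assumption rather than a copy of the real file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults as listed in the table above.
DEFAULTS = {
    "IMAGE_EXPLAINER_ENDPOINT": "https://ocr.akjo.tech/explain-image",
    "AUDIO_TRANSCRIBER_ENDPOINT": "https://ocr.akjo.tech/transcribe-audio",
    "QUERY_ENGINE_URL": "http://queryengine-service",
}


@dataclass(frozen=True)
class WorkerSettings:
    image_explainer_endpoint: str
    audio_transcriber_endpoint: str
    query_engine_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> WorkerSettings:
    """Read endpoints from env, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return WorkerSettings(
        image_explainer_endpoint=env.get(
            "IMAGE_EXPLAINER_ENDPOINT", DEFAULTS["IMAGE_EXPLAINER_ENDPOINT"]),
        audio_transcriber_endpoint=env.get(
            "AUDIO_TRANSCRIBER_ENDPOINT", DEFAULTS["AUDIO_TRANSCRIBER_ENDPOINT"]),
        query_engine_url=env.get(
            "QUERY_ENGINE_URL", DEFAULTS["QUERY_ENGINE_URL"]),
    )


# Local development override: point the worker at a Query Engine on port 8000.
local = load_settings({"QUERY_ENGINE_URL": "http://localhost:8000"})
print(local.query_engine_url)
```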
Key File Paths
python_server/ingestion_worker/
main.py # Entry point: initializes embeddings, starts RedisWorker
settings.py # Pydantic settings (env vars)
redis_worker.py # Redis Streams consumer group loop
message_processor.py # Routes jobs to appropriate indexer
file_downloader.py # Download files from URLs
cloud_file_downloader.py # Download files from GCS
google_file_search_service.py # Google GenAI File Search client
cloud_storage/
gcp_cloud.py # Google Cloud Storage provider
indexers/
__init__.py # Exports all indexer classes
document.py # DocumentIndexer (PDFs, text, audio)
scraper.py # ScraperIndexer (web crawling)
conversation.py # ConversationIndexer
image_indexer.py # ImageIndexer (OCR)
google_file_search_indexer.py # GoogleFileSearchIndexer
text_indexer.py # Base text indexing
conversations/
anonymizer.py # PII anonymization (Presidio)
chat_retriever.py # Chat history retrieval
recorded_audio_transcriber.py # Audio transcription
smart_summarizer.py # AI-powered conversation summarization
video/
splitter.py # Video scene detection (scenedetect)
5. Python Query Engine (Detailed)
Path: python_server/query_engine/
Purpose
The LLM-powered intelligence core of the platform. Provides query answering, multi-agent orchestration, deep research, image generation/search, web crawling, content generation, and voice transcription. Runs as a FastAPI service on port 8000.
Agent Architecture
graph TB
subgraph "Entry Points"
AGENT_API["/agents/call<br/>(Main Agent Entry)"]
CHAT_API["/chat<br/>(Knowledge Mode)"]
DR_API["/deep-research<br/>(Deep Research)"]
end
subgraph "Dispatcher Layer"
DISPATCH["Dispatcher Step<br/>(GPT-5.2)<br/>Routes to right agent"]
end
subgraph "Agent Types"
SMART["Smart Agent<br/>(Single-turn tool use)"]
SUPERVISOR["Supervisor Agent<br/>(Claude Opus 4.6)<br/>Multi-step planning"]
DEEP_RESEARCH["Deep Research Agent<br/>(Multi-provider)"]
ADR["Agentic Data Retrieval<br/>(Multi-query RAG)"]
PDF_FILL["PDF Filler Agent<br/>(Form auto-fill)"]
AGENTIC_APPS["Agentic Apps<br/>(Task Dispatcher)"]
end
subgraph "Supervisor Sub-Steps"
GOAL["Goal Analyzer"]
PLANNER["Planner Step"]
TASK_SEL["Task Selector"]
INFO_GATHER["Information Gathering"]
VALIDATION["Validation Step"]
REPLAN["Replanner"]
USER_HELP["Request Help from User"]
end
subgraph "Tools"
ZWEISTEIN_TOOL["Zweistein RAG<br/>(Vector Search)"]
FILE_READER["File Reader"]
IMAGE_GEN["Image Generator<br/>(fal.ai / GPT)"]
IMAGE_EXPLAIN["Image Explainer"]
WEB_SEARCH["Web Search<br/>(Tavily, Exa)"]
VIDEO_ANALYZE["Video Analyzer"]
TTS["Text-to-Speech<br/>(ElevenLabs)"]
URL_LOADER["URL Loader"]
EMAIL["Email Sender<br/>(Mailgun)"]
DEEP_RESEARCH_TOOL["Deep Research Tool"]
DOC_OCR["Document OCR<br/>(Google Document AI)"]
CONTROL_TOOLS["Dynamic Control Tools"]
MCP_TOOLS["MCP Server Tools"]
end
subgraph "Deep Research Providers"
DR_GEMINI["Gemini Deep Research"]
DR_CLAUDE["Claude Opus Deep Research"]
DR_GPT5["GPT-5 Deep Research"]
DR_SONAR["Perplexity Sonar"]
DR_KIMI["Kimi K2.5"]
DR_MULTI["Multi-step (default)<br/>Breadth + Depth search"]
end
AGENT_API --> DISPATCH
DISPATCH --> SMART
DISPATCH --> SUPERVISOR
DISPATCH --> ADR
DISPATCH --> PDF_FILL
DISPATCH --> AGENTIC_APPS
SUPERVISOR --> GOAL
SUPERVISOR --> PLANNER
SUPERVISOR --> TASK_SEL
SUPERVISOR --> INFO_GATHER
SUPERVISOR --> VALIDATION
SUPERVISOR --> REPLAN
SUPERVISOR --> USER_HELP
SMART --> ZWEISTEIN_TOOL
SMART --> FILE_READER
SMART --> IMAGE_GEN
SMART --> WEB_SEARCH
SMART --> VIDEO_ANALYZE
SMART --> TTS
SMART --> URL_LOADER
SMART --> EMAIL
SMART --> DOC_OCR
SMART --> CONTROL_TOOLS
SMART --> MCP_TOOLS
CHAT_API --> ZWEISTEIN_TOOL
DR_API --> DR_GEMINI
DR_API --> DR_CLAUDE
DR_API --> DR_GPT5
DR_API --> DR_SONAR
DR_API --> DR_KIMI
DR_API --> DR_MULTI
API Endpoints Table
| Method | Route | Module | Description |
|---|---|---|---|
| GET | /healthz | Core | Health check |
| Blinks | | | |
| POST | /blinks/generate | Blinks | Generate a "Blink" (structured content piece) from AI |
| Query | | | |
| POST | /query/ | Query | Query a concept's knowledge base (RAG) |
| Images | | | |
| POST | /images/find-images-simple | Images | Find or generate images |
| POST | /images/query-image | Images | Ask question about an image (OCR) |
| Chat | | | |
| POST | /chat/ | Chat | Non-streaming chat with RAG |
| POST | /chat/stream | Chat | SSE streaming chat (Gemini 3.1 Pro + FileSearch) |
| Agents | | | |
| POST | /agents/call | Agents | Main agent invocation (Smart, Supervisor, or custom) |
| POST | /agents/call-supervisor | Agents | Direct supervisor agent call |
| POST | /agents/call-data-retrieval | Agents | Agentic data retrieval |
| POST | /agents/call-pdf-filler | Agents | PDF auto-fill agent |
| POST | /agents/call-agentic-app | Agents | Agentic app execution |
| Crawler | | | |
| POST | /crawler/fetch-links | Crawler | Crawl website and extract links (optional smart scraping) |
| Deep Research | | | |
| POST | /deep-research | Deep Research | Multi-provider deep research (SSE streaming) |
| Quick Actions | | | |
| POST | /quick-actions/media/query | Quick Actions | Query a media file with AI |
| POST | /quick-actions/media/transcribe | Quick Actions | Transcribe audio/video (SSE streaming) |
| POST | /quick-actions/youtube/summarize | Quick Actions | Summarize a YouTube video |
| Generate Anything | | | |
| POST | /generate-anything/general | Generate Anything | General content generation (SSE streaming) |
| POST | /generate-anything/gemini | Generate Anything | Gemini-powered generation (SSE streaming) |
| Voice | | | |
| POST | /voice/transcribe | Voice | Audio file transcription (Blinkin inference + GPT-4o-mini cleanup) |
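
A caller such as the NestJS proxy needs to turn an agent kind into one of the agent routes listed above. The sketch below shows one way to do that; the keys of `AGENT_ROUTES` (`"default"`, `"supervisor"`, and so on) are hypothetical labels chosen for illustration, and only the route paths come from the table.

```python
# Route paths from the endpoint table; the dictionary keys are assumptions.
AGENT_ROUTES = {
    "default": "/agents/call",
    "supervisor": "/agents/call-supervisor",
    "data-retrieval": "/agents/call-data-retrieval",
    "pdf-filler": "/agents/call-pdf-filler",
    "agentic-app": "/agents/call-agentic-app",
}


def agent_url(base_url: str, kind: str = "default") -> str:
    """Join the Query Engine base URL with the route for an agent kind."""
    if kind not in AGENT_ROUTES:
        raise ValueError(f"unknown agent kind: {kind!r}")
    # Strip a trailing slash so the result never contains "//agents".
    return base_url.rstrip("/") + AGENT_ROUTES[kind]


print(agent_url("http://localhost:8000/", "supervisor"))
```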
Key File Paths
python_server/query_engine/
main.py # FastAPI app entry point
settings.py # Pydantic settings (LLM models, API keys)
api/v1/
router.py # Main router aggregating all endpoint routers
endpoints/
agents.py # Agent invocation endpoints (largest file)
blinks.py # Blink generation
chat.py # Chat endpoints (streaming and non-streaming)
query.py # RAG query endpoint
images.py # Image search and generation
crawler.py # Web crawling
deep_research.py # Deep research (multi-provider)
quick_actions.py # Media query/transcription
generate_anything.py # Content generation
voice_transcription.py # Voice transcription with cleaning
agents/
chat_models.py # LLM provider factory (OpenAI, Anthropic, Google, Groq, etc.)
state.py # LangGraph agent state definition
agent_with_tools.py # Generic agent-with-tools graph builder
smart_agent/
smart_agent.py # Single-turn smart agent
bosch_smart_agent.py # Custom Bosch variant
supervisor/
replanner.py # Supervisor replanning logic
deep_research/
provider_router.py # Routes to correct deep research provider
provider_gemini.py # Gemini deep research
provider_claude.py # Claude deep research
provider_openai.py # GPT-5 deep research
provider_sonar.py # Perplexity Sonar deep research
provider_kimi.py # Kimi K2.5 deep research
deep_research.py # Original multi-step implementation
agentic_data_retrieval/
graph.py # Agentic data retrieval LangGraph
validator.py # Response validation
zweistein_retriever.py # Vector retrieval integration
agentic_apps/
task_dispatcher.py # Planned task dispatcher
userintheloop.py # User-in-the-loop step
pdf_filler/
pdf_filler.py # PDF form auto-fill
semantic_understanding.py # Semantic field matching
steps/
dispatcher.py # Dispatcher step (routes to right agent)
supervisor.py # Supervisor orchestration steps
planner.py # Planning step
reflector.py # Self-reflection step
answer_improver.py # Answer improvement step
chatbot.py # Chatbot step
memory.py # Zettelkasten memory system
tools.py # Tool binding utilities
helper.py # Helper utilities
supervisor_steps/
goal_analyzer.py # Goal analysis
planner.py # Supervisor planning
success_criteria_analyzer.py # Success evaluation
human_conversation_interface.py # Human-in-the-loop
tools/
zweistein.py # Zweistein RAG tool
file_reader.py # File reading tool
image_generator.py # fal.ai image generation
image_generator_gpt.py # GPT image generation
image_explainer.py # Image explanation tool
tavily_search.py # Tavily web search
exa_search.py # Exa web search
video_analyzer.py # Video analysis tool
new_video_analyzer.py # Updated video analyzer
text_to_speech_elevenlabs.py # ElevenLabs TTS
url_loader.py # URL content loader
email_sender.py # Mailgun email sending
deep_research.py # Deep research as a tool
document_ocr_tool.py # Google Document AI OCR
pdf_filler.py # PDF filler as tool
youtube.py # YouTube tools
control_loader.py # Load control definitions
dynamic_control_tools.py # Runtime control tools
get_tools_from_agent_definition.py # Tool resolver from config
gcp_cloud.py # GCS upload utilities
ovh_vlm.py # OVH vision-language model
zweistein/
blink_generator.py # Blink content generation logic
image_search.py # Image search and generation
retrievers/
zweistein_retriever.py # Core RAG retriever
chat.py # Chat retriever with context
tools.py # Retriever tools
common/
file_downloader.py # File download utility
error_translator.py # Error message translation
usage/
check_quota.py # Quota checking
enforce_quota.py # Quota enforcement
simple_token_tracker.py # Token usage tracking
wallet_client.py # Wallet API client
wallet_integration.py # Wallet reporting integration
token_use.py # Token usage callback
db/ # Database utilities
6. LLM Provider Map
Model Configuration Table
| Setting | Model | Provider | Purpose |
|---|---|---|---|
| LLM_SIMPLE | gpt-4o-mini | OpenAI | Fast, cheap tasks: query rewriting, transcript cleaning, simple classification |
| LLM_ADVANCED | gpt-4o | OpenAI | Default LLM for smart agents, chat, content generation |
| LLM_DISPATCHER | gpt-5.2 | OpenAI | Agent dispatcher: decides which agent/tool to invoke |
| LLM_SUPERVISOR | claude_opus46 (Claude Opus 4.6) | Anthropic | Supervisor agent: multi-step planning and orchestration |
| GEMINI_25_PRO | gemini-2.5-pro | Google | Deep research (Gemini provider), general reasoning |
| GEMINI_25_FLASH | gemini-2.5-flash | Google | Memory generation (zettelkasten entries) |
| GEMINI_3_PRO | gemini-3-pro-preview | Google | Advanced Google tasks |
| GEMINI_31_PRO | gemini-3.1-pro-preview | Google | Streaming chat with FileSearch grounding (Knowledge Mode) |
| GEMINI_3_FLASH | gemini-3-flash-preview | Google | Fast Google tasks |
| GEMINI_20_FLASH | gemini-2.0-flash | Google | Legacy quick tasks |
| - | llama-3.3-70b-versatile | Groq | Fast open-source inference |
| - | deepseek-r1-distill-llama-70b | Groq | Reasoning model via Groq |
| - | Perplexity Sonar | Perplexity | Web-grounded deep research |
| - | Kimi K2.5 | NVIDIA NIM | Deep research (Kimi provider) |
| - | OVH GPT OSS 120B | OVH | Alternative open-source model |
| - | ElevenLabs | ElevenLabs | Text-to-speech |
| - | fal.ai (Flux) | fal.ai | Image generation |
| EMBEDDING_MODEL | text-embedding-3-large | OpenAI | Vector embeddings for RAG |
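
For quick lookups, the named settings in the table can be held in a plain mapping. This is an illustrative sketch only: the setting names and model identifiers are copied from the table, while `MODEL_CONFIG` and `settings_for_provider` are hypothetical names and do not reflect how `settings.py` is actually organized.

```python
from typing import Dict, List, Tuple

# setting name -> (model identifier, provider), as listed in the table.
MODEL_CONFIG: Dict[str, Tuple[str, str]] = {
    "LLM_SIMPLE": ("gpt-4o-mini", "OpenAI"),
    "LLM_ADVANCED": ("gpt-4o", "OpenAI"),
    "LLM_DISPATCHER": ("gpt-5.2", "OpenAI"),
    "LLM_SUPERVISOR": ("claude_opus46", "Anthropic"),
    "GEMINI_25_PRO": ("gemini-2.5-pro", "Google"),
    "GEMINI_25_FLASH": ("gemini-2.5-flash", "Google"),
    "GEMINI_3_PRO": ("gemini-3-pro-preview", "Google"),
    "GEMINI_31_PRO": ("gemini-3.1-pro-preview", "Google"),
    "GEMINI_3_FLASH": ("gemini-3-flash-preview", "Google"),
    "GEMINI_20_FLASH": ("gemini-2.0-flash", "Google"),
    "EMBEDDING_MODEL": ("text-embedding-3-large", "OpenAI"),
}


def settings_for_provider(provider: str) -> List[str]:
    """Return the sorted setting names that resolve to a given provider."""
    return sorted(name for name, (_, prov) in MODEL_CONFIG.items()
                  if prov == provider)


print(MODEL_CONFIG["LLM_DISPATCHER"][0])
print(settings_for_provider("OpenAI"))
```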
LLM Usage Diagram
graph LR
subgraph "User Request"
REQ["Incoming Query"]
end
subgraph "Routing (GPT-5.2)"
DISPATCH["Dispatcher<br/>gpt-5.2"]
end
subgraph "Execution Agents"
SMART_AGENT["Smart Agent<br/>gpt-4o"]
SUPERVISOR_AGENT["Supervisor<br/>Claude Opus 4.6"]
KNOWLEDGE["Knowledge Mode<br/>Gemini 3.1 Pro"]
end
subgraph "Supporting Tasks"
REWRITE["Query Rewriting<br/>gpt-4o-mini"]
MEMORY["Memory Generation<br/>Gemini 2.5 Flash"]
CLEAN["Transcript Cleaning<br/>gpt-4o-mini"]
EMBED["Embeddings<br/>text-embedding-3-large"]
end
subgraph "Deep Research"
DR_GEMINI["Gemini 2.5 Pro"]
DR_CLAUDE["Claude Opus"]
DR_GPT5["GPT-5"]
DR_SONAR["Perplexity Sonar"]
DR_KIMI["Kimi K2.5"]
end
subgraph "Media & Generation"
TTS_MODEL["ElevenLabs<br/>Text-to-Speech"]
IMG_MODEL["fal.ai / GPT<br/>Image Generation"]
end
REQ --> DISPATCH
DISPATCH -->|"simple query"| SMART_AGENT
DISPATCH -->|"complex/multi-step"| SUPERVISOR_AGENT
REQ -->|"knowledge mode"| KNOWLEDGE
SMART_AGENT --> REWRITE
SMART_AGENT --> EMBED
SUPERVISOR_AGENT --> EMBED
KNOWLEDGE --> EMBED
KNOWLEDGE --> MEMORY
REQ -->|"deep research"| DR_GEMINI
REQ -->|"deep research"| DR_CLAUDE
REQ -->|"deep research"| DR_GPT5
REQ -->|"deep research"| DR_SONAR
REQ -->|"deep research"| DR_KIMI
SMART_AGENT --> TTS_MODEL
SMART_AGENT --> IMG_MODEL
SMART_AGENT --> CLEAN
7. Data Processing Pipeline
sequenceDiagram
participant User
participant Admin as Admin Panel
participant Server as NestJS Server
participant GCS as Google Cloud Storage
participant Redis as Redis Streams
participant Worker as Ingestion Worker
participant GFS as Google File Search
participant WS as WebSocket
User->>Admin: Upload file via drag & drop
Admin->>Server: POST /api/files/:conceptId/:group<br/>(multipart form data)
Server->>GCS: Upload original file<br/>(tenant-scoped path)
GCS-->>Server: Cloud file path
Server->>Server: Create FileEntity in PostgreSQL<br/>(status: "pending")
Server->>Redis: XADD stream:zweistein<br/>{type: "file", conceptId, entityId,<br/>cloudFilePath, filename}
Server-->>Admin: 200 OK (file entity)
Admin->>WS: Subscribe to file status updates
Note over Redis,Worker: Consumer Group Processing
Redis->>Worker: XREADGROUP (picks up job)
Worker->>Redis: Send "processing" notification
Redis->>Server: Notification received
Server->>WS: Push "processing" status to UI
WS-->>Admin: File status: "processing"
alt PDF / Text Document
Worker->>GCS: Download file via signed URL
Worker->>Worker: Parse document<br/>(PyMuPDF, text extraction)
Worker->>GFS: Index text chunks<br/>(Google File Search)
else Image
Worker->>GCS: Download file
Worker->>Worker: Call IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / description)
Worker->>GFS: Index image description
else Video / Audio
Worker->>GCS: Download file
Worker->>Worker: Whisper transcription<br/>(local model)
Worker->>GFS: Index transcript
Worker->>GCS: Upload .md transcript
else YouTube
Worker->>Worker: SSE call to Query Engine<br/>/quick-actions/media/transcribe
Worker->>GFS: Index transcript
Worker->>GCS: Upload .md transcript
else Website URL
Worker->>Worker: spider_rs scraping<br/>+ readabilipy cleanup
Worker->>GFS: Index markdown content
Worker->>GCS: Upload .md content
end
Worker->>Redis: XADD notification stream<br/>{phase: "done", payload}
Redis->>Server: Notification: processing complete
Server->>Server: Update FileEntity<br/>(status: "done",<br/>transcript path if applicable)
Server->>WS: Push "done" status to UI
WS-->>Admin: File status: "done" (green check)
Note over User,Admin: File is now searchable via RAG
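The status updates in this sequence behave like a small state machine on the file entity. The sketch below models it under an assumption the diagram only implies: that a file moves strictly from "pending" to "processing" to "done". Any error or retry phases are not shown in the diagram and are therefore left out; `apply_notification` is a hypothetical name.

```python
from typing import Dict, Set

# Allowed transitions, inferred from the sequence diagram (assumption).
ALLOWED: Dict[str, Set[str]] = {
    "pending": {"processing"},
    "processing": {"done"},
    "done": set(),
}


def apply_notification(status: str, phase: str) -> str:
    """Return the new file status after a worker notification."""
    if phase not in ALLOWED.get(status, set()):
        raise ValueError(f"invalid transition: {status!r} -> {phase!r}")
    return phase


status = "pending"
for phase in ("processing", "done"):
    status = apply_notification(status, phase)
    print("file status:", status)
```

Validating transitions on the server side guards against out-of-order or duplicate notifications, which consumer groups can deliver after a worker restart.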
8. Vector Search Architecture
sequenceDiagram
participant User
participant Admin as Admin Panel
participant Server as NestJS Server
participant QE as Query Engine (Python)
participant GFS as Google File Search (Gemini FileSearch)
participant LLM as LLM (Gemini 3.1 Pro)
User->>Admin: Ask a question
Admin->>Server: POST /api/chat or /api/agents/call<br/>{messages, conceptId, agentConfig}
Server->>QE: Proxy to Python<br/>POST /chat/stream or /agents/call
alt Knowledge Mode (Direct RAG)
QE->>GFS: gfs_service.chat_stream()<br/>{concept_id, messages,<br/>system_instruction, model}
GFS->>GFS: Retrieve relevant chunks<br/>from concept's vector store
GFS->>LLM: Grounded generation<br/>(Gemini 3.1 Pro + FileSearch)
LLM-->>QE: SSE stream (tokens + citations)
else Agent Mode
QE->>QE: Dispatcher step (GPT-5.2)<br/>→ Select agent type
QE->>QE: Agent executes with tools
QE->>GFS: Zweistein RAG tool<br/>→ vector search
GFS-->>QE: Retrieved context
QE->>LLM: Generate response<br/>with retrieved context
LLM-->>QE: SSE stream
end
QE-->>Server: SSE event stream<br/>(tokens, citations, usage, state)
Server-->>Admin: Forward SSE stream
Admin-->>User: Render streaming response<br/>with inline citations
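The Query Engine answers with a Server-Sent Events stream that the server forwards unchanged. The sketch below parses the standard SSE wire format (`event:` and `data:` fields, blank line as delimiter) into `(event, data)` pairs. It is a generic parser, not the project's client code, and the event names used in any example input (such as `token` or `usage`) are assumptions, since the document lists the event categories but not their exact names.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event = "message"          # default event name defined by the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue           # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
```

A caller would typically `json.loads` the data of each pair and branch on the event name to append tokens, attach citations, or record usage.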
Vector Storage Details
- Embedding model: text-embedding-3-large (OpenAI, 3072 dimensions)
- Primary search: Google GenAI File Search (per-concept vector stores)
- Collection naming: Each concept gets its own File Search store, identified by conceptId
- Index types: DocumentFiles (text documents), ImageFiles (image descriptions)
- Retrieval: Gemini's built-in FileSearch grounding retrieves relevant chunks automatically during generation
9. Database & Storage
PostgreSQL Entities
| Entity | Path | Purpose |
| --- | --- | --- |
| FileEntity | server/src/files/file.entity.ts | Uploaded files metadata (path, status, concept, group) |
| ConversationEntity | server/src/chat/conversations/conversations.entity.ts | Chat conversation threads |
| MessageEntity | server/src/chat/messages/message.entity.ts | Individual chat messages |
| MessageFeedbackEntity | server/src/chat/message-feedback/message-feedback.entity.ts | User feedback on messages |
| BotEntity | server/src/crud-entities/bots.entity.ts | Bot configurations |
| AgentEntity | server/src/crud-entities/agents.entity.ts | Agent definitions |
| AgenticAppEntity | server/src/crud-entities/agentic-app.entity.ts | Agentic app definitions |
| ControlEntity | server/src/controls/controls.entity.ts | AI-generated React components |
| CollectionEntity | server/src/collection/collections.entity.ts | Knowledge collections |
| CollectionItemEntity | server/src/collection/collection_items.entity.ts | Items within collections |
| FavoriteEntity | server/src/favorite/favorite.entity.ts | User favorites |
| SubscriptionPlanEntity | server/src/subscriptions/entities/subscription-plan.entity.ts | Available subscription plans |
| TenantSubscriptionEntity | server/src/subscriptions/entities/tenant-subscription.entity.ts | Active tenant subscriptions |
| WalletEntity | server/src/wallet/entities/wallet.entity.ts | Tenant credit wallet (grant + paid balance) |
| UsageEventEntity | server/src/wallet/entities/usage-event.entity.ts | LLM usage events (model, tokens, cost) |
| UsageTransactionEntity | server/src/wallet/entities/usage-transaction.entity.ts | Balance deduction transactions |
| PriceTagEntity | server/src/wallet/entities/price-tag.entity.ts | Per-model token pricing |
| BillingEntity | server/src/billing/entity/billing.entity.ts | Billing records |
| ModelTokenPricingEntity | server/src/billing/entity/model-token-pricing.entity.ts | Model pricing configuration |
| PlanEntity | server/src/quotas/entities/plan.entity.ts | Usage plans |
| QuotaDefinitionEntity | server/src/quotas/entities/quota-definition.entity.ts | Quota limits |
| UsageRecordEntity | server/src/quotas/entities/usage-record.entity.ts | Usage counting records |
| UsageCounterEntity | server/src/quotas/entities/usage-counter.entity.ts | Usage counters |
| RecentlyUsedEntity | server/src/dashboard/recently-used.entity.ts | Recently accessed items |
| PopularInOrgLogsEntity | server/src/dashboard/popular-in-org-logs.entity.ts | Popular items analytics |
| TenantExternalSettingsEntity | server/src/external-auth/tenant-external-settings.entity.ts | External auth settings |
| ExternalUserAuditEntity | server/src/external-auth/audit/external-user-audit.entity.ts | External user audit log |
| ExternalUserEntity | server/src/crud-entities/external-user/external-user.entity.ts | External users |
| ExternalUserGroupEntity | server/src/crud-entities/external-user-group/external-user-group.entity.ts | External user groups |
| TenantToolConfigEntity | server/src/crud-entities/tenant-tool-config.entity.ts | Per-tenant tool settings |
| BaseEntity | server/src/crud-entities/base/base.entity.ts | Base entity with tenantId scoping |
Database connections:
- Primary: PostgreSQL via TypeORM (DB_HOST, DB_PORT, DB_NAME) — all Zweistein entities
- Secondary (Studio): PostgreSQL via TypeORM (STUDIO_DB_*) — read-only connection to Blinkin Studio DB
Google File Search (Vector Storage)
| Aspect | Detail |
| --- | --- |
| Service | Google GenAI File Search API |
| Store-per-concept | Each concept gets its own vector store |
| Embedding | text-embedding-3-large (OpenAI, used during ingestion) |
| Retrieval | Gemini FileSearch grounding (during query) |
| Text chunking | Automatic by Google File Search |
| File types indexed | Text, PDFs, images (as descriptions), audio/video (as transcripts), websites |
Redis Streams (Job Queue)
| Stream | Consumer Group | Purpose |
| --- | --- | --- |
| stream:zweistein | group:zweistein | File/URL processing job queue |
| stream:zweistein:notifications | group:zweistein:notifications | Job completion notifications back to server |
Message format (job):
{
"type": "file|file.text-replaced|file.deleted|url|conversation|concept.deleted",
"conceptId": "uuid",
"entityId": "uuid",
"cloudFilePath": "tenant/concept/filename",
"filename": "original-name.pdf",
"group": "documents|youtube|websites",
"metadata": { "url": "..." }
}
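Redis stream entries are flat field/value maps, so a job like the one above has to be flattened before it is passed to XADD. The following sketch validates the `type` and `group` values listed in the format and JSON-encodes any non-string value such as `metadata`. The function name `build_job_fields` is hypothetical, and JSON-encoding nested values is an assumption; the document does not state how the server serializes them.

```python
import json
from typing import Dict

JOB_TYPES = {
    "file", "file.text-replaced", "file.deleted",
    "url", "conversation", "concept.deleted",
}
GROUPS = {"documents", "youtube", "websites"}


def build_job_fields(job: dict) -> Dict[str, str]:
    """Validate a job and flatten it into string fields suitable for XADD."""
    if job.get("type") not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job.get('type')!r}")
    group = job.get("group")
    if group is not None and group not in GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    fields: Dict[str, str] = {}
    for key, value in job.items():
        if value is None:
            continue  # omit unset optional fields instead of sending "null"
        # Assumption: nested values (e.g. metadata) travel as JSON strings.
        fields[key] = value if isinstance(value, str) else json.dumps(value)
    return fields
```

With redis-py, the result could then be sent with `client.xadd("stream:zweistein", fields)`.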
Message format (notification):
{
"original_message": "{...json...}",
"phase": "processing|done|error",
"payload": "{...optional json...}"
}
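On the receiving side, `original_message` and `payload` arrive as JSON strings inside the stream entry and need a second decoding step. The sketch below shows one way to unpack a notification, assuming the three fields above; `parse_notification` is a hypothetical helper, and it accepts both `bytes` and `str` because redis-py returns bytes unless the client is created with `decode_responses=True`.

```python
import json

PHASES = {"processing", "done", "error"}


def _text(value):
    """Decode bytes returned by Redis; leave str values untouched."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def parse_notification(fields: dict) -> dict:
    """Turn a raw notification entry into a dict with decoded JSON parts."""
    decoded = {_text(k): _text(v) for k, v in fields.items()}
    phase = decoded.get("phase")
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    payload = decoded.get("payload")
    return {
        "job": json.loads(decoded["original_message"]),
        "phase": phase,
        # payload is optional, so an absent or empty value becomes None
        "payload": json.loads(payload) if payload else None,
    }
```

The `job` part carries the original `entityId`, which is what lets the server find the FileEntity whose status it needs to update.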
Google Cloud Storage (File Storage)
| Aspect | Detail |
| --- | --- |
| Bucket | GCS_BUCKET_NAME (e.g., blinkin-ai-dev-storage) |
| Structure | {tenantId}/{conceptId}/{filename} |
| Access | Signed URLs (1-hour expiry for downloads) |
| Transcripts | Stored as .{random}.md alongside original file |
| Key file | GCS_KEY_FILE (service account JSON) |
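The tenant-scoped layout and the one-hour download window can be sketched as two small helpers. Both function names are hypothetical. `signed_download_url` expects a `google.cloud.storage.Blob` (or any object exposing the same `generate_signed_url` method) so that the snippet itself has no external dependency; whether the project uses V4 signing is an assumption.

```python
from datetime import timedelta

# Download links expire after one hour, as stated in the table above.
DOWNLOAD_URL_TTL = timedelta(hours=1)


def object_path(tenant_id: str, concept_id: str, filename: str) -> str:
    """Build the {tenantId}/{conceptId}/{filename} object name."""
    parts = (tenant_id, concept_id, filename)
    for part in parts:
        # Reject empty segments and separators so one tenant's path
        # cannot be made to point into another tenant's prefix.
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return "/".join(parts)


def signed_download_url(blob) -> str:
    """Create a time-limited GET URL for a Blob-like object."""
    return blob.generate_signed_url(
        version="v4",                 # assumption: V4 signing
        expiration=DOWNLOAD_URL_TTL,
        method="GET",
    )
```

In practice the blob would come from something like `storage.Client().bucket(bucket_name).blob(object_path(...))`, with the bucket name read from GCS_BUCKET_NAME.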
10. External Dependencies Table
| Service | Purpose | Config Variable(s) |
| --- | --- | --- |
| OpenAI | LLM (GPT-4o, GPT-5.x), embeddings, Whisper | OPENAI_API_KEY |
| Anthropic | Claude Opus 4.6 (supervisor agent) | ANTHROPIC_API_KEY |
| Google GenAI | Gemini models, File Search, Document AI | GOOGLE_API_KEY |
| Groq | Fast inference (Llama 3.3, DeepSeek) | GROQ_API_KEY |
| Perplexity | Web-grounded deep research (Sonar) | PERPLEXITY_API_KEY |
| ElevenLabs | Text-to-speech | ELEVENLABS_API_KEY_PART1 |
| fal.ai | Image generation (Flux) | FAL_KEY |
| Pexels | Stock image search | PEXELS_API_KEY |
| Tavily | Web search for agents | TAVILY_API_KEY |
| Exa | Semantic web search | EXA_API_KEY |
| NVIDIA NIM | Kimi K2.5 model | NVIDIA_API_KEY |
| OVH Cloud | Open-source LLM (GPT-OSS 120B) | OVH_API_KEY |
| PostgreSQL | Primary relational database | DB_HOST, DB_PORT, DB_NAME, DB_USERNAME, DB_PASSWORD |
| Redis | Job queue (Streams), caching | REDIS_HOST, REDIS_PORT, REDIS_PASS |
| Google Cloud Storage | File storage (uploads, transcripts) | GCS_KEY_FILE, GCS_BUCKET_NAME |
| Azure Blob Storage | Alternative file storage | PICASSO_BLOB_URL |
| Auth0 | Authentication and authorization | AUTH0_DOMAIN, AUTH0_CLIENT_ID, AUTH0_AUDIENCE, AUTH0_ISSUER_URL |
| Stripe | Payment processing, subscriptions | STRIPE_API_KEY, STRIPE_WEBHOOK_SECRET, STRIPE_PUBLIC_KEY |
| Mailgun | Email ingestion and sending | MAILGUN_API_KEY, MAILGUN_DOMAIN, MAILGUN_FROM_EMAIL |
| Google OAuth | Google sign-in | GOOGLE_CLIENT_ID |
| LangSmith | LLM tracing and debugging | LANGCHAIN_ENDPOINT, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT |
| PostHog | Product analytics (frontend) | Configured in admin app |
| Blinkin Inference | Custom OCR, audio transcription | IMAGE_EXPLAINER_ENDPOINT, AUDIO_TRANSCRIBER_ENDPOINT |
| Blinkin Studio (Picasso) | Visual content creation | PICASSO_URL, PICASSO_API_URL, PICASSO_APP_URL |
| Blinkin Houston | Internal tooling | HOUSTON_URL |
| C3 (Chatwoot) | Customer messaging | C3_DOMAIN, C3_AGENT_ACCOUNT_ID, C3_AGENT_TOKEN |
| Canvas | Learning management integration | CANVAS_DOMAIN, CANVAS_TOKEN |
| Google Document AI | Advanced PDF/document OCR | DOCUMENT_AI_PROJECT_ID, DOCUMENT_AI_LOCATION, DOCUMENT_AI_PROCESSOR_ID |
| Bosch Gemini | Custom Bosch Gemini endpoint | BOSCH_GEMINI_API_KEY, BOSCH_GEMINI_BASE_URL |
11. Key Environment Variables Table
NestJS Server
| Variable | Purpose | Example |
| --- | --- | --- |
| GLOBAL_PREFIX | API route prefix | /ai |
| DB_HOST | PostgreSQL host | 10.100.10.3 |
| DB_PORT | PostgreSQL port | 5432 |
| DB_NAME | Database name | zweistein_dev |
| DB_USERNAME | Database user | (secret) |
| DB_PASSWORD | Database password | (secret) |
| STUDIO_DB_HOST | Studio DB host (read-only) | - |
| STUDIO_DB_PORT | Studio DB port | 5432 |
| STUDIO_DB_NAME | Studio database name | - |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| REDIS_JOB_STREAM | Job stream name | stream:zweistein |
| REDIS_JOB_CONSUMER_GROUP | Server's consumer group | group:zweistein |
| REDIS_JOB_NOTIFICATION_STREAM | Notification stream | stream:zweistein:notifications |
| REDIS_JOB_NOTIFICATION_CONSUMER_GROUP | Notification consumer group | group:zweistein:notifications |
| SERVER_INSTANCE_ID | Unique server instance ID | server-instance-1 |
| SERVER_JWT_SECRET | JWT signing secret | (secret) |
| PYTHON_SERVER_URL | Query Engine URL | http://queryengine-service |
| IMAGE_EXPLAINER_URL | Image OCR service | https://ocr.blinkin.io/explain-image |
| STORAGE_TYPE | Cloud storage provider | gcp |
| GCS_BUCKET_NAME | GCS bucket name | blinkin-ai-dev-storage |
| AUTH0_DOMAIN | Auth0 domain | dev-w248kl0wxwpsp7q3.eu.auth0.com |
| AUTH0_CLIENT_ID | Auth0 client ID | - |
| AUTH0_AUDIENCE | Auth0 audience | - |
| AUTH0_ISSUER_URL | Auth0 issuer | - |
| STRIPE_API_KEY | Stripe secret key | (secret) |
| STRIPE_WEBHOOK_SECRET | Stripe webhook secret | (secret) |
| GOOGLE_CLIENT_ID | Google OAuth client ID | - |
Ingestion Worker
| Variable | Purpose | Example |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key | (secret) |
| EMBEDDING_MODEL | Embedding model name | text-embedding-3-large |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| TASKS_STREAM | Job stream | stream:zweistein |
| TASKS_GROUP | Consumer group | group:zweistein |
| NOTIFICATION_STREAM | Notification stream | stream:zweistein:notifications |
| PROCESSOR_ID | Unique worker ID | zweistein_processor_1 |
| HOSTNAME | Worker hostname (overrides PROCESSOR_ID) | - |
| GCS_KEY_FILE | GCS service account key file | - |
| GCS_BUCKET_NAME | GCS bucket name | blinkin-ai-dev-storage |
| IMAGE_EXPLAINER_ENDPOINT | OCR endpoint | https://ocr.akjo.tech/explain-image |
| AUDIO_TRANSCRIBER_ENDPOINT | Audio transcription endpoint | https://ocr.akjo.tech/transcribe-audio |
| CONVERSATION_SUMMARIZER_MODEL | Model for conversation summaries | gpt-4o |
| QUERY_ENGINE_URL | Query Engine for YouTube transcription | http://queryengine-service |
| GOOGLE_API_KEY | Google API key (File Search) | (secret) |
| C3_DOMAIN | C3 / Chatwoot domain | - |
| CANVAS_DOMAIN | Canvas LMS domain | - |
| CANVAS_TOKEN | Canvas API token | (secret) |
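The note that HOSTNAME overrides PROCESSOR_ID determines the consumer name each worker registers in the consumer group. A minimal sketch of that precedence, assuming the variables listed above: `worker_settings` is a hypothetical helper, and the fallback values are simply the Example column, not confirmed defaults from the worker's code.

```python
from typing import Mapping


def worker_settings(env: Mapping[str, str]) -> dict:
    """Resolve stream names and the consumer name from environment values."""
    return {
        "stream": env.get("TASKS_STREAM", "stream:zweistein"),
        "group": env.get("TASKS_GROUP", "group:zweistein"),
        "notification_stream": env.get(
            "NOTIFICATION_STREAM", "stream:zweistein:notifications"
        ),
        # HOSTNAME wins over PROCESSOR_ID; the last value is an assumed fallback.
        "consumer": env.get("HOSTNAME")
        or env.get("PROCESSOR_ID")
        or "zweistein_processor_1",
    }
```

Calling `worker_settings(os.environ)` at startup would be the typical use. Preferring HOSTNAME fits a StatefulSet deployment, where every pod gets a stable hostname with an ordinal suffix, so each replica ends up with a distinct and repeatable consumer name without per-pod configuration.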
Query Engine
| Variable | Purpose | Example |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key | (secret) |
| ANTHROPIC_API_KEY | Anthropic API key | (secret) |
| GOOGLE_API_KEY | Google API key | (secret) |
| GROQ_API_KEY | Groq API key | (secret) |
| PERPLEXITY_API_KEY | Perplexity API key | (secret) |
| TAVILY_API_KEY | Tavily search API key | (secret) |
| EXA_API_KEY | Exa search API key | (secret) |
| FAL_KEY | fal.ai API key | (secret) |
| PEXELS_API_KEY | Pexels stock photo key | (secret) |
| ELEVENLABS_API_KEY_PART1 | ElevenLabs TTS key | (secret) |
| EMBEDDING_MODEL | Embedding model | text-embedding-3-large |
| LLM_SIMPLE | Simple/cheap model | gpt-4o-mini |
| LLM_ADVANCED | Advanced model | gpt-4o |
| LLM_DISPATCHER | Dispatcher model | gpt-5.2 |
| LLM_SUPERVISOR | Supervisor model | claude_opus46 |
| GEMINI_25_PRO | Gemini 2.5 Pro model name | gemini-2.5-pro |
| GEMINI_25_FLASH | Gemini 2.5 Flash model name | gemini-2.5-flash |
| GEMINI_3_PRO | Gemini 3 Pro model name | gemini-3-pro-preview |
| GEMINI_31_PRO | Gemini 3.1 Pro model name | gemini-3.1-pro-preview |
| GEMINI_3_FLASH | Gemini 3 Flash model name | gemini-3-flash-preview |
| GEMINI_20_FLASH | Gemini 2.0 Flash model name | gemini-2.0-flash |
| UVICORN_PORT | FastAPI server port | 8000 |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| GCS_KEY_FILE | GCS service account key | - |
| GCS_BUCKET_NAME | GCS bucket | blinkin-ai-dev-storage |
| QUOTA_SERVICE_URL | Quota checking service URL | - |
| INTERNAL_API_KEY | Internal wallet API key | - |
| MAILGUN_DOMAIN | Mailgun domain | - |
| MAILGUN_API_KEY | Mailgun API key | (secret) |
| MAILGUN_FROM_EMAIL | Sender email address | - |
| LANGCHAIN_TRACING_V2 | Enable LangSmith tracing | true |
| LANGCHAIN_ENDPOINT | LangSmith endpoint | https://eu.api.smith.langchain.com |
| LANGCHAIN_API_KEY | LangSmith API key | (secret) |
| LANGCHAIN_PROJECT | LangSmith project name | dev |
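The four LLM_* variables assign a model to a role rather than naming a provider directly. One way such a lookup might be written is sketched below. `MODEL_ROLES` and `resolve_model` are hypothetical names, and the fallback values are copied from the Example column as an assumption; the Query Engine's real defaults are not documented here.

```python
from typing import Mapping

# Role -> (environment variable, assumed fallback from the Example column).
MODEL_ROLES = {
    "simple": ("LLM_SIMPLE", "gpt-4o-mini"),
    "advanced": ("LLM_ADVANCED", "gpt-4o"),
    "dispatcher": ("LLM_DISPATCHER", "gpt-5.2"),
    "supervisor": ("LLM_SUPERVISOR", "claude_opus46"),
}


def resolve_model(role: str, env: Mapping[str, str]) -> str:
    """Return the model configured for a role, or the assumed fallback."""
    try:
        variable, fallback = MODEL_ROLES[role]
    except KeyError:
        raise ValueError(f"unknown model role: {role!r}") from None
    # An empty string is treated the same as an unset variable.
    return env.get(variable) or fallback
```

Keeping the role names stable and swapping only the environment values lets a deployment change, for example, the dispatcher model without touching agent code.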
12. Development & Deployment
Local Development Commands
Admin Panel:
cd admin
yarn install # Install dependencies
yarn dev # Start Vite dev server (https://localhost:5173)
yarn build # Production build
yarn storybook # Start Storybook (http://localhost:6006)
NestJS Server:
cd server
yarn install # Install dependencies
yarn start:dev # Start with watch mode (http://localhost:3000)
yarn start:debug # Start with debug + watch
yarn build # Production build
yarn start:prod # Start production (node dist/main)
yarn migration:generate --name=MigrationName # Generate TypeORM migration
yarn migration:run # Run pending migrations
yarn migration:rollback # Rollback last migration
yarn test # Run unit tests
yarn test:e2e # Run end-to-end tests
Python Ingestion Worker:
cd python_server/ingestion_worker
poetry install # Install dependencies
python main.py # Start the Redis consumer worker
# or
./start.sh # Production start script
Python Query Engine:
cd python_server/query_engine
poetry install # Install dependencies
python main.py # Start FastAPI on port 8000
# or
uvicorn main:app --host 0.0.0.0 --port 8000 --reload # Development with reload
# or
./start_dev.sh # Development start script
Docker Images
| Service | Image Path |
| --- | --- |
| Admin | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-admin |
| Server | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-server |
| Query Engine | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-query-engine |
| Ingestion Worker | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-data-ingestion-worker |
Helm Chart Structure
helm/blinkin-ai/
  Chart.yaml                          # Helm chart metadata
  values.dev.yaml                     # Development environment values
  values.uat.yaml                     # UAT environment values
  values.prod.yaml                    # Production environment values
  templates/
    # Admin Panel
    admin.deployment.yaml             # Admin Deployment
    admin.nginx.configmap.yaml        # Nginx config for serving admin SPA
    admin.service.yaml                # Admin Service (ClusterIP)
    # NestJS Server
    server.deployment.yaml            # Server Deployment
    server.configmap.yaml             # Server environment ConfigMap
    server.service.yaml               # Server Service (ClusterIP)
    # Query Engine
    queryengine.deployment.yaml       # Query Engine Deployment
    queryengine.configmap.yaml        # Query Engine environment ConfigMap
    queryengine.service.yaml          # Query Engine Service (ClusterIP)
    # Ingestion Worker
    ingestionworker.statefulset.yaml  # Ingestion Worker StatefulSet (3 replicas)
    ingestionworker.configmap.yaml    # Ingestion Worker environment ConfigMap
    # Cluster / Ingress
    cluster.ingress.yaml              # GKE Ingress with managed certificate
    cluster.managedcertificate.yaml   # Google-managed TLS certificate
    cluster.frontendconfig.yaml       # Frontend config (HTTP→HTTPS redirect)
Deployment Topology
| Service | Type | Replicas (dev) | Notes |
| --- | --- | --- | --- |
| Admin | Deployment | 1 | Nginx serving static React build |
| Server | Deployment | 1 | NestJS on port 3000 |
| Query Engine | Deployment | 1 | FastAPI on port 8000 |
| Ingestion Worker | StatefulSet | 3 | Redis consumer group (parallel processing) |
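Parallel processing works because all Ingestion Worker replicas read from the same consumer group: Redis hands each stream entry to exactly one consumer in the group. The sketch below shows a single iteration of such a consumer, using the redis-py method names `xreadgroup`, `xadd`, and `xack`. The function `process_one`, the `settings` keys, and the decision to acknowledge a job even after a failure are assumptions made for illustration, not the worker's actual implementation. It also assumes a client created with `decode_responses=True`, so that fields are plain strings.

```python
import json


def process_one(client, settings: dict, handler, block_ms: int = 5000) -> bool:
    """Read at most one job, report its phases, and acknowledge it.

    Returns True if a job was read, False if the read timed out.
    """
    response = client.xreadgroup(
        groupname=settings["group"],
        consumername=settings["consumer"],
        streams={settings["stream"]: ">"},  # ">" = entries never delivered before
        count=1,
        block=block_ms,
    )
    if not response:
        return False
    _stream, entries = response[0]
    entry_id, fields = entries[0]
    original = json.dumps(fields)

    def notify(phase, payload=None):
        message = {"original_message": original, "phase": phase}
        if payload is not None:
            message["payload"] = json.dumps(payload)
        client.xadd(settings["notification_stream"], message)

    notify("processing")
    try:
        result = handler(fields)
    except Exception as exc:  # report the failure instead of crashing the loop
        notify("error", {"message": str(exc)})
    else:
        notify("done", result)
    # Assumption: acknowledge in both cases so a failing job is not redelivered.
    client.xack(settings["stream"], settings["group"], entry_id)
    return True
```

A worker would call this in a loop with a `redis.Redis` client. Because the group tracks delivery per consumer, adding replicas increases throughput without any coordination between the workers themselves.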
Kubernetes Secrets
| Secret Name | Keys | Purpose |
| --- | --- | --- |
| postgres-secrets | DB_USERNAME, DB_PASSWORD | PostgreSQL credentials |
| redis-secrets | REDIS_PASS | Redis password |
| query-engine-secrets | OPENAI_API_KEY, PERPLEXITY_API_KEY, FAL_KEY, PEXELS_API_KEY, TAVILY_API_KEY, GROQ_API_KEY, EXA_API_KEY, ANTHROPIC_API_KEY | API keys for query engine |
| google-api-secrets | GOOGLE_API_KEY | Google API key |
| gcs-keyfile-dev | gcs-key.json | GCS service account key file |
| canvas-secrets | CANVAS_TOKEN | Canvas LMS token |
Environments
| Environment | Domain | Helm Values |
| --- | --- | --- |
| Development | app-dev.blinkin.io | values.dev.yaml |
| UAT | app-uat.blinkin.io | values.uat.yaml |
| Production | app.blinkin.io | values.prod.yaml |