Zweistein: AI Platform Documentation

Codebase location: /zweistein-dev/zweistein-dev/

Zweistein is a multi-service AI platform comprising a React admin panel, a NestJS API server, and two Python microservices (Ingestion Worker and Query Engine). Together they power knowledge-based AI agents, file processing pipelines, vector-search retrieval, multi-model LLM orchestration, and a real-time chat interface.

1. Architecture Overview

graph TB
    subgraph "Tier 1 — Frontend"
        ADMIN["Admin Panel<br/>(React + Vite + TS)<br/>Port 5173"]
    end

    subgraph "Tier 2 — API Server"
        SERVER["NestJS Server<br/>Port 3000"]
        SWAGGER["Swagger Docs<br/>/ai/docs"]
        WEBSOCKET["Socket.IO<br/>WebSocket Gateway"]
    end

    subgraph "Tier 3 — Python Services"
        QE["Query Engine<br/>(FastAPI, Port 8000)"]
        IW["Ingestion Worker<br/>(Redis Consumer)"]
    end

    subgraph "Data Stores"
        PG["PostgreSQL"]
        QDRANT["Qdrant<br/>(Vector DB)"]
        REDIS["Redis Streams<br/>(Job Queue)"]
        GCS["Google Cloud Storage<br/>(File Storage)"]
    end

    subgraph "External AI Services"
        OPENAI["OpenAI<br/>(GPT-4o/5.x, Whisper,<br/>Embeddings)"]
        ANTHROPIC["Anthropic<br/>(Claude Opus 4.6)"]
        GOOGLE["Google<br/>(Gemini 2.x/3.x)"]
        PERPLEXITY["Perplexity<br/>(Sonar)"]
        GROQ["Groq<br/>(Llama 3.3)"]
        ELEVENLABS["ElevenLabs<br/>(TTS)"]
        FAL["fal.ai<br/>(Image Gen)"]
    end

    ADMIN -->|"HTTP /ai/api/*"| SERVER
    ADMIN -->|"WebSocket"| WEBSOCKET
    SERVER -->|"HTTP Proxy"| QE
    SERVER -->|"xadd jobs"| REDIS
    REDIS -->|"xreadgroup"| IW
    IW -->|"Notifications"| REDIS
    REDIS -->|"xreadgroup notifications"| SERVER
    SERVER --> PG
    SERVER --> GCS
    IW --> GCS
    IW --> QDRANT
    IW --> OPENAI
    QE --> QDRANT
    QE --> OPENAI
    QE --> ANTHROPIC
    QE --> GOOGLE
    QE --> PERPLEXITY
    QE --> GROQ
    QE --> ELEVENLABS
    QE --> FAL

How the tiers connect:

Connection	Protocol	Purpose
Admin --> Server	HTTP REST + WebSocket (Socket.IO)	All UI operations, real-time updates
Server --> Query Engine	HTTP (internal, `PYTHON_SERVER_URL`)	LLM queries, agent calls, chat, image search
Server --> Redis Streams	`xadd` to `stream:zweistein`	Dispatch file/URL processing jobs
Redis --> Ingestion Worker	`xreadgroup` consumer groups	Workers pick up and process jobs
Ingestion Worker --> Redis	`xadd` to `stream:zweistein:notifications`	Notify server when processing completes
Redis --> Server (notifications)	`xreadgroup` on `stream:zweistein:notifications`	Server receives completion events and pushes them to the UI via WebSocket
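
To make the job dispatch row in the table above concrete, here is a minimal sketch of the payload the server might add to `stream:zweistein`. The field names are taken from the `XADD` call in the data processing sequence diagram; the helper name `build_file_job` and the flat, all-string layout are assumptions for illustration, not the server's actual code.

```python
from typing import Dict

# Stream name as listed in the connection table.
JOB_STREAM = "stream:zweistein"


def build_file_job(concept_id: str, entity_id: str,
                   cloud_file_path: str, filename: str) -> Dict[str, str]:
    """Build the field map for a file-processing job (hypothetical helper).

    Redis stream entries are flat field/value pairs, so every value is
    kept as a string here. Whether the real server sends flat fields or
    a JSON-encoded blob is not stated in this document.
    """
    return {
        "type": "file",
        "conceptId": concept_id,
        "entityId": entity_id,
        "cloudFilePath": cloud_file_path,
        "filename": filename,
    }


job = build_file_job("concept-1", "file-1",
                     "tenant-a/concept-1/report.pdf", "report.pdf")
# With redis-py, dispatching this job would look roughly like:
#   redis.Redis().xadd(JOB_STREAM, job)
```

Keeping the payload builder separate from the Redis call makes the job shape easy to check without a running Redis instance.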

2. Admin Panel

Path: admin/

Purpose

React-based frontend for managing AI bots, knowledge concepts, agents, agentic apps, controls (visual components), conversations, and collections. Serves as the primary UI for the Zweistein AI platform.

Tech Stack

Technology	Version	Purpose
React	18.3	UI framework
Vite	5.3	Build tool and dev server
TypeScript	5.2	Type safety
Tailwind CSS	3.4	Styling
Socket.IO Client	4.7	Real-time communication
Zustand	4.5	State management
SWR	2.2	Data fetching / caching
React Router	6	Client-side routing
React Query	3.39	Server state management
Storybook	8.2	Component development

Key Features

Monaco Editor (@monaco-editor/react) — In-browser code editor for controls and agent configuration
Deepgram Audio (@deepgram/sdk) — Real-time audio transcription
Voice Activity Detection (@ricky0123/vad-react) — Detect when user is speaking
TipTap (@tiptap/*) — Rich text editor with mention support
ReactFlow (reactflow) — Visual node-based agent graph editor
Module Federation (@originjs/vite-plugin-federation) — Exposes ./sdk for embedding in other apps
Stripe / Paddle — Payment integration for subscriptions
Auth0 (@auth0/auth0-react) — Authentication and authorization
PostHog (posthog-js) — Product analytics

Page Structure

admin/src/pages/
  agent-threads/      # Agent execution threads and history
  agentic-apps/       # Agentic app builder and management
  agents/             # AI agent configuration
  atoms/              # Atomic UI component demos
  blinkbot/           # BlinkBot chatbot interface
  bots/               # Bot configuration and deployment
  internal-tests/     # Internal testing tools
  knowledge-base/     # Knowledge concept management (files, URLs, conversations)
  plans/              # Subscription plan management
  layout.tsx          # Main layout wrapper
  payment-success.tsx # Payment success callback
  payment-cancel.tsx  # Payment cancel callback

Entry Points

Entry	HTML File	Route Pattern	Purpose
Main app	`index.html`	`/ai/*`	Admin dashboard
Chat widget	`chat.html`	`/ai/chat*`	Embeddable chat interface
Public view	`public.html`	`/ai/public*`	Public-facing chatbot pages
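
The route patterns in the table above overlap (`/ai/chat*` and `/ai/public*` also match `/ai/*`), so the more specific patterns presumably need to be checked first. The sketch below shows one way to resolve a request path to an HTML entry file. The function name `resolve_entry` and the first-match ordering are assumptions; the actual routing configuration is not shown in this document.

```python
from typing import Optional

# Most specific prefixes first; "/ai/" is the catch-all for the main app.
ENTRY_POINTS = [
    ("/ai/chat", "chat.html"),
    ("/ai/public", "public.html"),
    ("/ai/", "index.html"),
]


def resolve_entry(path: str) -> Optional[str]:
    """Return the HTML entry file for a request path, or None if no pattern matches."""
    for prefix, html_file in ENTRY_POINTS:
        if path.startswith(prefix):
            return html_file
    return None
```

For example, `resolve_entry("/ai/chat/abc")` selects the chat widget, while `resolve_entry("/ai/agents")` falls through to the admin dashboard.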

Build & Dev

# Development
cd admin && yarn dev          # Starts Vite dev server on port 5173

# Production build
cd admin && yarn build        # TypeScript check + Vite build

# Storybook
cd admin && yarn storybook    # Component explorer on port 6006

3. NestJS Server

Path: server/

Module Architecture

graph TB
    subgraph "Core Infrastructure"
        APP["AppModule"]
        AUTHZ["AuthzModule<br/>(Auth0 JWT, RBAC)"]
        CONFIG["ConfigModule"]
        TYPEORM["TypeORM<br/>(PostgreSQL)"]
        EVENTS["EventEmitterModule"]
        REDIS_MOD["RedisStreamsModule<br/>(Job Queue)"]
    end

    subgraph "Entity Management (CRUD)"
        CONCEPTS["ConceptsModule"]
        BOTS["BotsModule"]
        AGENTS["AgentsModule"]
        INTERVIEWS["InterviewsModule"]
        AGENTIC_APP["AgenticAppModule"]
        EXT_USER["ExternalUserModule"]
        EXT_USER_GRP["ExternalUserGroupModule"]
    end

    subgraph "Chat System"
        CONVERSATIONS["ConversationsModule"]
        MESSAGES["MessagesModule"]
        CHATBOT["ChatbotModule"]
        REALTIME["RealtimeCollaborationModule<br/>(Socket.IO)"]
        FAVOURITES["FavouritesModule"]
        MSG_FEEDBACK["MessagesFeedbackModule"]
    end

    subgraph "AI & Processing"
        QUERY["QueryModule<br/>(Proxy to Python)"]
        DATA_PROC["DataProcessingModule"]
        CRAWLER["CrawlerModule"]
        VOICE["VoiceModule"]
        MCP["McpServerModule"]
        CONTROLS["ControlsModule<br/>(AI-Generated UI)"]
        QUICK_ACT["QuickActionsModule"]
        GEN_ANY["GenerateAnythingModule"]
    end

    subgraph "Storage & Files"
        FILES["FilesModule"]
        CLOUD["CloudStorageModule<br/>(GCS / Azure)"]
        TENANT_STORE["TenantStorageModule"]
    end

    subgraph "Billing & Subscriptions"
        BILLING["BillingModule<br/>(Stripe)"]
        SUBS["SubscriptionsModule"]
        WALLET["WalletModule"]
        USAGE["UsageModule"]
        PLANS["PlansModule"]
    end

    subgraph "Other"
        DASHBOARD["DashboardModule"]
        COLLECTION["CollectionModule"]
        FAVORITES["FavoritesModule"]
        PICASSO["PicassoModule"]
        MAILBOX["MailboxModule"]
        TENANT_TOOL["TenantToolConfigModule"]
        EXT_AUTH["ExternalAuthModule"]
        INT_TESTS["InternalTestsModule"]
    end

    APP --> AUTHZ
    APP --> CONFIG
    APP --> TYPEORM
    APP --> EVENTS
    APP --> REDIS_MOD
    APP --> CONCEPTS
    APP --> BOTS
    APP --> AGENTS
    APP --> FILES
    APP --> CHATBOT
    APP --> QUERY
    APP --> DATA_PROC
    APP --> CONTROLS
    APP --> BILLING
    APP --> WALLET
    APP --> SUBS

    CHATBOT --> CONVERSATIONS
    CHATBOT --> MESSAGES
    CHATBOT --> REALTIME
    FILES --> CLOUD
    FILES --> TENANT_STORE
    FILES --> REDIS_MOD
    QUERY --> CONFIG
    DATA_PROC --> REDIS_MOD
    BILLING --> SUBS

Key Modules Table

Module	Path	Purpose
AuthzModule	`server/src/authz/`	Auth0 JWT validation, RBAC guards, tenant scoping
FilesModule	`server/src/files/`	File upload/download, GCS integration, job dispatch
QueryModule	`server/src/query/`	Proxies queries to Python Query Engine
ChatbotModule	`server/src/chat/chatbot/`	Orchestrates chat: SSE streaming from Python, message persistence
ConversationsModule	`server/src/chat/conversations/`	CRUD for conversation threads
MessagesModule	`server/src/chat/messages/`	CRUD for chat messages
RealtimeCollaborationModule	`server/src/chat/realtime-collaboration/`	Socket.IO gateway for real-time updates
ControlsModule	`server/src/controls/`	AI-generated React components (Controls)
DataProcessingModule	`server/src/data-processing/`	Event-driven file processing pipeline management
RedisStreamsModule	`server/src/redis-streams/`	Redis Streams producer/consumer for job queue
CrawlerModule	`server/src/crawler/`	Web crawling and link extraction
BillingModule	`server/src/billing/`	Stripe integration, model token pricing
SubscriptionsModule	`server/src/subscriptions/`	Subscription plan management
WalletModule	`server/src/wallet/`	Credit-based usage wallet (balance, transactions)
UsageModule	`server/src/quotas/`	Quota tracking and enforcement
DashboardModule	`server/src/dashboard/`	Dashboard analytics (recently used, popular items)
VoiceModule	`server/src/voice/`	Voice input/output (STT/TTS)
McpServerModule	`server/src/mcp-servers/`	Model Context Protocol server management
CollectionModule	`server/src/collection/`	Collections of concepts and items
ConceptsModule	`server/src/crud-entities/concepts/`	Knowledge concepts (knowledge bases)
BotsModule	`server/src/crud-entities/bots/`	Bot configuration and deployment
AgentsModule	`server/src/crud-entities/agents/`	Agent definitions and configurations
AgenticAppModule	`server/src/crud-entities/agentic-apps/`	Agentic application definitions
QuickActionsModule	`server/src/quickactions/`	Quick media actions (transcription, summarization)
GenerateAnythingModule	`server/src/generate-anything/`	Versatile content generation
PicassoModule	`server/src/picasso/`	Integration with Blinkin Studio (Picasso)
MailboxModule	`server/src/mailbox/`	Email ingestion via Mailgun
CloudStorageModule	`server/src/cloudstorage/`	Abstraction layer for GCS / Azure Blob
TenantStorageModule	`server/src/tenantstorage/`	Tenant-scoped file storage
ExternalAuthModule	`server/src/external-auth/`	External user authentication settings
TenantToolConfigModule	`server/src/tenant-tool-config/`	Per-tenant tool configurations

API Endpoints Table

Method	Route	Module	Description
Files
`GET`	`/api/files/:conceptId/:group?`	Files	List files for a concept
`GET`	`/api/files/with-full-urls/:conceptId/:group?`	Files	List files with signed URLs
`POST`	`/api/files/:conceptId/:group`	Files	Upload files
`POST`	`/api/files/:conceptId/:group/uploadFromUrl`	Files	Add file from URL
`PUT`	`/api/files/:conceptId/:group/:fileId`	Files	Replace a file
`PUT`	`/api/files/:conceptId/:group/:fileId/text`	Files	Replace file text content
`POST`	`/api/files/:conceptId/:group/:fileId/regenerate`	Files	Regenerate file processing
`DELETE`	`/api/files/:conceptId/:group/:fileId`	Files	Delete a file
`POST`	`/api/files/getUrl/:fileId`	Files	Get signed download URL
`GET`	`/api/files/getFile/:fileId`	Files	Stream file content
`GET`	`/api/files/getFile/:fileId/text`	Files	Get text transcript of file
Query
`POST`	`/api/query/:conceptId`	Query	Query a knowledge concept
`POST`	`/api/query/chat3/:conceptId`	Query	Chat with a concept (proxied to Python)
`POST`	`/api/query/ask-about-image`	Query	Ask a question about an image
Controls
`POST`	`/api/controls`	Controls	Create a control
`GET`	`/api/controls`	Controls	List all controls
`GET`	`/api/controls/:id`	Controls	Get control by ID
`PUT`	`/api/controls/:id`	Controls	Update a control
`DELETE`	`/api/controls/:id`	Controls	Delete a control
`POST`	`/api/controls/generate`	Controls	AI-generate a React control
`POST`	`/api/controls/fix-errors`	Controls	AI-fix code errors
`POST`	`/api/controls/edit-with-ai`	Controls	AI-edit code
`POST`	`/api/controls/regenerate-json`	Controls	Regenerate sample JSON
`POST`	`/api/controls/regenerate-schema/:id`	Controls	Regenerate JSON schema
`POST`	`/api/controls/duplicate/:id`	Controls	Duplicate a control
Wallet
`GET`	`/api/wallet`	Wallet	Get wallet balance
`GET`	`/api/wallet/usage-history`	Wallet	Get usage event history
`GET`	`/api/wallet/transaction-history`	Wallet	Get transaction history
Billing
Various	`/api/billing/*`	Billing	Stripe checkout, webhooks, pricing
Subscriptions
Various	`/api/subscriptions/*`	Subscriptions	Plan management, tenant subscriptions
Dashboard
Various	`/api/dashboard/*`	Dashboard	Recently used items, popular in org
Collection
Various	`/api/collection/*`	Collection	Collection CRUD
Crawler
Various	`/api/crawler/*`	Crawler	Web crawling
Auth
Various	`/api/auth/*`	Authz	Login, token refresh, user info
Health
`GET`	`/healthz`	Core	Health check
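
Several file routes above take an optional trailing `:group?` segment. As a small illustration, the helper below builds the list-files URL with or without a group. The name `files_list_url` is hypothetical, and the base URL is an assumption: the architecture diagram shows the admin panel calling `/ai/api/*`, so a deployment may add a prefix in front of `/api`.

```python
from typing import Optional
from urllib.parse import quote


def files_list_url(base_url: str, concept_id: str,
                   group: Optional[str] = None) -> str:
    """Build the URL for GET /api/files/:conceptId/:group? (group is optional)."""
    # Percent-encode each path segment so spaces or slashes in IDs stay safe.
    path = f"/api/files/{quote(concept_id, safe='')}"
    if group is not None:
        path += f"/{quote(group, safe='')}"
    return base_url.rstrip("/") + path
```

Calling `files_list_url("http://localhost:3000", "abc")` yields the concept-wide listing; passing `group="my docs"` appends an encoded group segment.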

Key File Paths

server/src/
  main.ts                    # Bootstrap: port 3000, Swagger at /ai/docs
  app.module.ts              # Root module importing all feature modules
  authz/                     # Auth0 JWT strategy, RBAC guards, decorators
  files/
    files.controller.ts      # File upload/download REST endpoints
    files.service.ts         # File business logic, GCS interactions
    file.entity.ts           # TypeORM entity for files
  chat/
    chatbot/chatbot.ts       # Main chatbot controller + module
    chatbot/llm.service.ts   # LLM orchestration for chat
    chatbot/proxy.service.ts # Proxy to Python SSE streams
    conversations/           # Conversation entity and CRUD
    messages/                # Message entity and CRUD
    realtime-collaboration/  # Socket.IO broadcast service
  query/
    query.controller.ts      # Proxy to Python Query Engine
    proxy.service.ts         # HTTP proxy helper
  redis-streams/
    redis-streams.module.ts  # Dynamic module for Redis Streams
    jobs.service.ts          # Job producer (xadd)
    jobs-listener.service.ts # Notification consumer (xreadgroup)
  controls/
    controls.controller.ts   # CRUD + AI generation endpoints
    controls.entity.ts       # Control entity (React component code, JSON schema)
    llm/llm.service.ts       # LLM calls for code generation
  wallet/
    wallet.controller.ts     # Balance and usage history
    wallet.service.ts        # Wallet business logic
    entities/                # Wallet, UsageEvent, UsageTransaction, PriceTag
  billing/
    billing.controller.ts    # Stripe integration
    entity/                  # Billing entity, model-token pricing
  crud-entities/
    concepts/                # Knowledge concept CRUD
    bots/                    # Bot CRUD
    agents/                  # Agent CRUD
    agentic-apps/            # Agentic app CRUD
    interviews/              # Interview CRUD
    external-user/           # External user management
    external-user-group/     # External user group management
    base/base.entity.ts      # Base entity with tenant scoping

4. Python Ingestion Worker (Detailed)

Path: python_server/ingestion_worker/

Purpose

Consumes file/URL/conversation processing jobs from Redis Streams, extracts text content from various file types, generates vector embeddings via OpenAI, and indexes them into Google File Search (via Google GenAI) for retrieval. It also handles image explanation and audio transcription through external services.

Processing Pipeline

graph LR
    subgraph "Job Source"
        REDIS_STREAM["Redis Stream<br/>stream:zweistein"]
    end

    subgraph "Worker Core"
        RW["RedisWorker<br/>(Consumer Group)"]
        MP["MessageProcessor"]
    end

    subgraph "Indexers"
        DOC["DocumentIndexer<br/>(PDFs, text, office docs)"]
        SCRAPER["ScraperIndexer<br/>(Websites)"]
        CONV["ConversationIndexer<br/>(Chat transcripts)"]
        IMG["ImageIndexer<br/>(Images → OCR)"]
        GFS["GoogleFileSearchIndexer<br/>(Google GenAI File Search)"]
    end

    subgraph "External Services"
        WHISPER["Whisper<br/>(Audio transcription)"]
        IMAGE_API["IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / Image description)"]
        AUDIO_API["AUDIO_TRANSCRIBER_ENDPOINT<br/>(Remote audio transcription)"]
        QE_TRANSCRIBE["Query Engine<br/>/quick-actions/media/transcribe"]
    end

    subgraph "Storage"
        GCS_OUT["Google Cloud Storage<br/>(Transcripts)"]
        QDRANT_OUT["Google File Search<br/>(Vector Index)"]
    end

    REDIS_STREAM -->|"xreadgroup"| RW
    RW -->|"route by type"| MP
    MP -->|"file"| DOC
    MP -->|"file (youtube)"| QE_TRANSCRIBE
    MP -->|"file (website)"| SCRAPER
    MP -->|"url"| SCRAPER
    MP -->|"conversation"| CONV
    MP -->|"file (image)"| IMG

    DOC -->|"video/audio"| WHISPER
    DOC -->|"documents"| GFS
    IMG -->|"explain"| IMAGE_API
    IMG --> GFS
    SCRAPER --> GFS
    CONV --> DOC
    QE_TRANSCRIBE --> GFS

    GFS -->|"index text"| QDRANT_OUT
    DOC -->|"upload transcript"| GCS_OUT
    MP -->|"upload transcript"| GCS_OUT
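
The consumer-group step at the start of this pipeline can be sketched as a batch handler that decides which message IDs to acknowledge. The batch shape mirrors what redis-py returns from `xreadgroup()`. The function name `process_batch` and the policy of acknowledging only successfully handled messages are assumptions; this document does not describe the worker's actual retry behaviour.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Message = Tuple[str, Dict[str, str]]       # (message_id, fields)
Batch = Sequence[Tuple[str, Sequence[Message]]]  # [(stream_name, [messages])]


def process_batch(batch: Batch,
                  handler: Callable[[Dict[str, str]], None]) -> List[str]:
    """Run handler on each message and return the IDs that are safe to XACK.

    Messages whose handler raises are not acknowledged, so they remain in
    the consumer group's pending list and can be retried or claimed later.
    """
    acked: List[str] = []
    for _stream_name, messages in batch:
        for message_id, fields in messages:
            try:
                handler(fields)
            except Exception:
                continue  # leave the message pending
            acked.append(message_id)
    return acked

# With redis-py, the surrounding loop would look roughly like:
#   batch = r.xreadgroup(group, consumer, {"stream:zweistein": ">"}, count=10, block=5000)
#   ids = process_batch(batch, handle_job)
#   if ids:
#       r.xack("stream:zweistein", group, *ids)
```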

Job Types

Job Type	Source	Processing
`file`	File uploaded via UI	Route by file type: document, image, video/audio, YouTube, website
`file.text-replaced`	User edits file text	Delete old index, re-index with new text
`file.deleted`	File deleted via UI	Remove from vector index
`concept.deleted`	Concept deleted	Delete entire Google File Search store
`url`	URL added to concept	Crawl and scrape website content, index
`conversation`	Chat conversation indexed	Summarize and index conversation content
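
The job-type table above amounts to a lookup from the `type` field to a sequence of processing steps. A minimal sketch of that routing follows. The step strings paraphrase the table, while `JOB_STEPS` and `plan_job` are hypothetical names rather than the `MessageProcessor` API.

```python
from typing import Dict, List

# Job type -> ordered processing steps, paraphrased from the table above.
JOB_STEPS: Dict[str, List[str]] = {
    "file": ["route by file type", "index"],
    "file.text-replaced": ["delete old index", "re-index new text"],
    "file.deleted": ["remove from vector index"],
    "concept.deleted": ["delete Google File Search store"],
    "url": ["crawl and scrape", "index"],
    "conversation": ["summarize", "index"],
}


def plan_job(job: Dict[str, str]) -> List[str]:
    """Return the processing steps for a job, rejecting unknown types."""
    job_type = job.get("type")
    if job_type not in JOB_STEPS:
        raise ValueError(f"unknown job type: {job_type!r}")
    return list(JOB_STEPS[job_type])
```

Failing loudly on an unknown type is a design choice of this sketch; a real worker might instead log and skip the message.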

File Type Handling

File Type	Detection	Processing Pipeline
PDF	`filetype.guess()` → not video/audio/image	`DocumentIndexer.index_document_file()` → parse → chunk → GFS index
Text/Markdown	`.md` or `.txt` extension	`DocumentIndexer.index_document_file()` → chunk → GFS index
Image	MIME `image/*`	`ImageIndexer.index_image()` → IMAGE_EXPLAINER_ENDPOINT → GFS index
Video	MIME `video/*` (.mp4, .avi, .webm)	`DocumentIndexer.index_file_containing_audio()` → Whisper transcription → GFS index
Audio	MIME `audio/*` (.mp3, .wav)	`DocumentIndexer.index_file_containing_audio()` → Whisper transcription → GFS index
YouTube	`group == "youtube"`	SSE call to Query Engine `/quick-actions/media/transcribe` → GFS index
Website	`group == "websites"`	`spider_rs` scraping → `readabilipy` HTML cleanup → markdown → GFS index
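
The detection rules in the table above can be sketched as a single classification function. Two assumptions are worth flagging: the order in which the checks run is a guess, and the standard-library `mimetypes` module (which looks only at the file extension) stands in for `filetype.guess()`, which inspects file contents.

```python
import mimetypes
from typing import Optional


def classify(filename: str, group: Optional[str] = None) -> str:
    """Pick a processing pipeline name for an uploaded file (illustrative only)."""
    # Group-based routing is checked first (assumed precedence).
    if group == "youtube":
        return "youtube"
    if group == "websites":
        return "website"
    # Plain text and Markdown are detected by extension.
    if filename.lower().endswith((".md", ".txt")):
        return "text"
    # Extension-based MIME guess; the real worker sniffs file contents instead.
    mime, _encoding = mimetypes.guess_type(filename)
    mime = mime or ""
    for prefix, kind in (("image/", "image"), ("video/", "video"), ("audio/", "audio")):
        if mime.startswith(prefix):
            return kind
    # Everything else (for example PDFs) goes through the document pipeline.
    return "document"
```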

External Service Endpoints

Service	Setting	Default	Purpose
Image Explainer	`IMAGE_EXPLAINER_ENDPOINT`	`https://ocr.akjo.tech/explain-image`	OCR and image description
Audio Transcriber	`AUDIO_TRANSCRIBER_ENDPOINT`	`https://ocr.akjo.tech/transcribe-audio`	Remote audio transcription
Query Engine	`QUERY_ENGINE_URL`	`http://queryengine-service`	YouTube transcription via SSE
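
The settings and defaults above could be loaded as shown in the sketch below. The worker's `settings.py` is described as using Pydantic settings; this version uses only the standard library as a stand-in, and the names `ServiceEndpoints` and `load_endpoints` are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping

# Defaults copied from the table above.
DEFAULTS = {
    "IMAGE_EXPLAINER_ENDPOINT": "https://ocr.akjo.tech/explain-image",
    "AUDIO_TRANSCRIBER_ENDPOINT": "https://ocr.akjo.tech/transcribe-audio",
    "QUERY_ENGINE_URL": "http://queryengine-service",
}


@dataclass(frozen=True)
class ServiceEndpoints:
    image_explainer: str
    audio_transcriber: str
    query_engine: str


def load_endpoints(env: Mapping[str, str]) -> ServiceEndpoints:
    """Read endpoint settings from env, falling back to the documented defaults."""
    def get(name: str) -> str:
        # Empty strings are treated as unset (an assumption of this sketch).
        return env.get(name) or DEFAULTS[name]

    return ServiceEndpoints(
        image_explainer=get("IMAGE_EXPLAINER_ENDPOINT"),
        audio_transcriber=get("AUDIO_TRANSCRIBER_ENDPOINT"),
        query_engine=get("QUERY_ENGINE_URL"),
    )
```

In a running process this would typically be called as `load_endpoints(os.environ)`; passing a plain mapping keeps it easy to test.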

Key File Paths

python_server/ingestion_worker/
  main.py                 # Entry point: initializes embeddings, starts RedisWorker
  settings.py             # Pydantic settings (env vars)
  redis_worker.py         # Redis Streams consumer group loop
  message_processor.py    # Routes jobs to appropriate indexer
  file_downloader.py      # Download files from URLs
  cloud_file_downloader.py # Download files from GCS
  google_file_search_service.py # Google GenAI File Search client
  cloud_storage/
    gcp_cloud.py          # Google Cloud Storage provider
  indexers/
    __init__.py           # Exports all indexer classes
    document.py           # DocumentIndexer (PDFs, text, audio)
    scraper.py            # ScraperIndexer (web crawling)
    conversation.py       # ConversationIndexer
    image_indexer.py      # ImageIndexer (OCR)
    google_file_search_indexer.py # GoogleFileSearchIndexer
    text_indexer.py       # Base text indexing
    conversations/
      anonymizer.py       # PII anonymization (Presidio)
      chat_retriever.py   # Chat history retrieval
      recorded_audio_transcriber.py # Audio transcription
      smart_summarizer.py # AI-powered conversation summarization
  video/
    splitter.py           # Video scene detection (scenedetect)

5. Python Query Engine (Detailed)

Path: python_server/query_engine/

Purpose

The LLM-powered intelligence core of the platform. Provides query answering, multi-agent orchestration, deep research, image generation/search, web crawling, content generation, and voice transcription. Runs as a FastAPI service on port 8000.

Agent Architecture

graph TB
    subgraph "Entry Points"
        AGENT_API["/agents/call<br/>(Main Agent Entry)"]
        CHAT_API["/chat<br/>(Knowledge Mode)"]
        DR_API["/deep-research<br/>(Deep Research)"]
    end

    subgraph "Dispatcher Layer"
        DISPATCH["Dispatcher Step<br/>(GPT-5.2)<br/>Routes to the right agent"]
    end

    subgraph "Agent Types"
        SMART["Smart Agent<br/>(Single-turn tool use)"]
        SUPERVISOR["Supervisor Agent<br/>(Claude Opus 4.6)<br/>Multi-step planning"]
        DEEP_RESEARCH["Deep Research Agent<br/>(Multi-provider)"]
        ADR["Agentic Data Retrieval<br/>(Multi-query RAG)"]
        PDF_FILL["PDF Filler Agent<br/>(Form auto-fill)"]
        AGENTIC_APPS["Agentic Apps<br/>(Task Dispatcher)"]
    end

    subgraph "Supervisor Sub-Steps"
        GOAL["Goal Analyzer"]
        PLANNER["Planner Step"]
        TASK_SEL["Task Selector"]
        INFO_GATHER["Information Gathering"]
        VALIDATION["Validation Step"]
        REPLAN["Replanner"]
        USER_HELP["Request Help from User"]
    end

    subgraph "Tools"
        ZWEISTEIN_TOOL["Zweistein RAG<br/>(Vector Search)"]
        FILE_READER["File Reader"]
        IMAGE_GEN["Image Generator<br/>(fal.ai / GPT)"]
        IMAGE_EXPLAIN["Image Explainer"]
        WEB_SEARCH["Web Search<br/>(Tavily, Exa)"]
        VIDEO_ANALYZE["Video Analyzer"]
        TTS["Text-to-Speech<br/>(ElevenLabs)"]
        URL_LOADER["URL Loader"]
        EMAIL["Email Sender<br/>(Mailgun)"]
        DEEP_RESEARCH_TOOL["Deep Research Tool"]
        DOC_OCR["Document OCR<br/>(Google Document AI)"]
        CONTROL_TOOLS["Dynamic Control Tools"]
        MCP_TOOLS["MCP Server Tools"]
    end

    subgraph "Deep Research Providers"
        DR_GEMINI["Gemini Deep Research"]
        DR_CLAUDE["Claude Opus Deep Research"]
        DR_GPT5["GPT-5 Deep Research"]
        DR_SONAR["Perplexity Sonar"]
        DR_KIMI["Kimi K2.5"]
        DR_MULTI["Multi-step (default)<br/>Breadth + Depth search"]
    end

    AGENT_API --> DISPATCH
    DISPATCH --> SMART
    DISPATCH --> SUPERVISOR
    DISPATCH --> ADR
    DISPATCH --> PDF_FILL
    DISPATCH --> AGENTIC_APPS

    SUPERVISOR --> GOAL
    SUPERVISOR --> PLANNER
    SUPERVISOR --> TASK_SEL
    SUPERVISOR --> INFO_GATHER
    SUPERVISOR --> VALIDATION
    SUPERVISOR --> REPLAN
    SUPERVISOR --> USER_HELP

    SMART --> ZWEISTEIN_TOOL
    SMART --> FILE_READER
    SMART --> IMAGE_GEN
    SMART --> WEB_SEARCH
    SMART --> VIDEO_ANALYZE
    SMART --> TTS
    SMART --> URL_LOADER
    SMART --> EMAIL
    SMART --> DOC_OCR
    SMART --> CONTROL_TOOLS
    SMART --> MCP_TOOLS

    CHAT_API --> ZWEISTEIN_TOOL

    DR_API --> DR_GEMINI
    DR_API --> DR_CLAUDE
    DR_API --> DR_GPT5
    DR_API --> DR_SONAR
    DR_API --> DR_KIMI
    DR_API --> DR_MULTI
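
The deep research branch of the diagram routes a request to one of several providers, with the multi-step implementation marked as the default. A minimal sketch of that selection follows. The provider keys (`"gemini"`, `"claude"`, and so on), the function name `select_provider`, and the fallback to the default for unknown names are all assumptions; the real logic lives in `provider_router.py` and is not reproduced here.

```python
from typing import Optional

# Hypothetical request keys mapped to the providers named in the diagram.
PROVIDERS = {
    "gemini": "Gemini Deep Research",
    "claude": "Claude Opus Deep Research",
    "gpt5": "GPT-5 Deep Research",
    "sonar": "Perplexity Sonar",
    "kimi": "Kimi K2.5",
}
DEFAULT_PROVIDER = "Multi-step (Breadth + Depth search)"


def select_provider(name: Optional[str]) -> str:
    """Return the provider label for a requested name, or the multi-step default."""
    if not name:
        return DEFAULT_PROVIDER
    return PROVIDERS.get(name.strip().lower(), DEFAULT_PROVIDER)
```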

API Endpoints Table

Method	Route	Module	Description
`GET`	`/healthz`	Core	Health check
Blinks
`POST`	`/blinks/generate`	Blinks	Generate a "Blink" (structured content piece) from AI
Query
`POST`	`/query/`	Query	Query a concept's knowledge base (RAG)
Images
`POST`	`/images/find-images-simple`	Images	Find or generate images
`POST`	`/images/query-image`	Images	Ask a question about an image (OCR)
Chat
`POST`	`/chat/`	Chat	Non-streaming chat with RAG
`POST`	`/chat/stream`	Chat	SSE streaming chat (Gemini 3.1 Pro + FileSearch)
Agents
`POST`	`/agents/call`	Agents	Main agent invocation (Smart, Supervisor, or custom)
`POST`	`/agents/call-supervisor`	Agents	Direct supervisor agent call
`POST`	`/agents/call-data-retrieval`	Agents	Agentic data retrieval
`POST`	`/agents/call-pdf-filler`	Agents	PDF auto-fill agent
`POST`	`/agents/call-agentic-app`	Agents	Agentic app execution
Crawler
`POST`	`/crawler/fetch-links`	Crawler	Crawl website and extract links (optional smart scraping)
Deep Research
`POST`	`/deep-research`	Deep Research	Multi-provider deep research (SSE streaming)
Quick Actions
`POST`	`/quick-actions/media/query`	Quick Actions	Query a media file with AI
`POST`	`/quick-actions/media/transcribe`	Quick Actions	Transcribe audio/video (SSE streaming)
`POST`	`/quick-actions/youtube/summarize`	Quick Actions	Summarize a YouTube video
Generate Anything
`POST`	`/generate-anything/general`	Generate Anything	General content generation (SSE streaming)
`POST`	`/generate-anything/gemini`	Generate Anything	Gemini-powered generation (SSE streaming)
Voice
`POST`	`/voice/transcribe`	Voice	Audio file transcription (Blinkin inference + GPT-4o-mini cleanup)

Key File Paths

python_server/query_engine/
  main.py                      # FastAPI app entry point
  settings.py                  # Pydantic settings (LLM models, API keys)
  api/v1/
    router.py                  # Main router aggregating all endpoint routers
    endpoints/
      agents.py                # Agent invocation endpoints (largest file)
      blinks.py                # Blink generation
      chat.py                  # Chat endpoints (streaming and non-streaming)
      query.py                 # RAG query endpoint
      images.py                # Image search and generation
      crawler.py               # Web crawling
      deep_research.py         # Deep research (multi-provider)
      quick_actions.py         # Media query/transcription
      generate_anything.py     # Content generation
      voice_transcription.py   # Voice transcription with cleaning
  agents/
    chat_models.py             # LLM provider factory (OpenAI, Anthropic, Google, Groq, etc.)
    state.py                   # LangGraph agent state definition
    agent_with_tools.py        # Generic agent-with-tools graph builder
    smart_agent/
      smart_agent.py           # Single-turn smart agent
      bosch_smart_agent.py     # Custom Bosch variant
    supervisor/
      replanner.py             # Supervisor replanning logic
    deep_research/
      provider_router.py       # Routes to correct deep research provider
      provider_gemini.py       # Gemini deep research
      provider_claude.py       # Claude deep research
      provider_openai.py       # GPT-5 deep research
      provider_sonar.py        # Perplexity Sonar deep research
      provider_kimi.py         # Kimi K2.5 deep research
      deep_research.py         # Original multi-step implementation
    agentic_data_retrieval/
      graph.py                 # Agentic data retrieval LangGraph
      validator.py             # Response validation
      zweistein_retriever.py   # Vector retrieval integration
    agentic_apps/
      task_dispatcher.py       # Planned task dispatcher
      userintheloop.py         # User-in-the-loop step
    pdf_filler/
      pdf_filler.py            # PDF form auto-fill
      semantic_understanding.py # Semantic field matching
    steps/
      dispatcher.py            # Dispatcher step (routes to the right agent)
      supervisor.py            # Supervisor orchestration steps
      planner.py               # Planning step
      reflector.py             # Self-reflection step
      answer_improver.py       # Answer improvement step
      chatbot.py               # Chatbot step
      memory.py                # Zettelkasten memory system
      tools.py                 # Tool binding utilities
      helper.py                # Helper utilities
      supervisor_steps/
        goal_analyzer.py       # Goal analysis
        planner.py             # Supervisor planning
        success_criteria_analyzer.py # Success evaluation
        human_conversation_interface.py # Human-in-the-loop
    tools/
      zweistein.py             # Zweistein RAG tool
      file_reader.py           # File reading tool
      image_generator.py       # fal.ai image generation
      image_generator_gpt.py   # GPT image generation
      image_explainer.py       # Image explanation tool
      tavily_search.py         # Tavily web search
      exa_search.py            # Exa web search
      video_analyzer.py        # Video analysis tool
      new_video_analyzer.py    # Updated video analyzer
      text_to_speech_elevenlabs.py # ElevenLabs TTS
      url_loader.py            # URL content loader
      email_sender.py          # Mailgun email sending
      deep_research.py         # Deep research as a tool
      document_ocr_tool.py     # Google Document AI OCR
      pdf_filler.py            # PDF filler as tool
      youtube.py               # YouTube tools
      control_loader.py        # Load control definitions
      dynamic_control_tools.py # Runtime control tools
      get_tools_from_agent_definition.py # Tool resolver from config
      gcp_cloud.py             # GCS upload utilities
      ovh_vlm.py               # OVH vision-language model
  zweistein/
    blink_generator.py         # Blink content generation logic
    image_search.py            # Image search and generation
    retrievers/
      zweistein_retriever.py   # Core RAG retriever
      chat.py                  # Chat retriever with context
    tools.py                   # Retriever tools
  common/
    file_downloader.py         # File download utility
    error_translator.py        # Error message translation
  usage/
    check_quota.py             # Quota checking
    enforce_quota.py           # Quota enforcement
    simple_token_tracker.py    # Token usage tracking
    wallet_client.py           # Wallet API client
    wallet_integration.py      # Wallet reporting integration
    token_use.py               # Token usage callback
  db/                          # Database utilities

6. LLM Provider Map

Model Configuration Table

Setting	Model	Provider	Purpose
`LLM_SIMPLE`	`gpt-4o-mini`	OpenAI	Fast, cheap tasks: query rewriting, transcript cleaning, simple classification
`LLM_ADVANCED`	`gpt-4o`	OpenAI	Default LLM for smart agents, chat, content generation
`LLM_DISPATCHER`	`gpt-5.2`	OpenAI	Agent dispatcher: decides which agent/tool to invoke
`LLM_SUPERVISOR`	`claude_opus46` (Claude Opus 4.6)	Anthropic	Supervisor agent: multi-step planning and orchestration
`GEMINI_25_PRO`	`gemini-2.5-pro`	Google	Deep research (Gemini provider), general reasoning
`GEMINI_25_FLASH`	`gemini-2.5-flash`	Google	Memory generation (zettelkasten entries)
`GEMINI_3_PRO`	`gemini-3-pro-preview`	Google	Advanced Google tasks
`GEMINI_31_PRO`	`gemini-3.1-pro-preview`	Google	Streaming chat with FileSearch grounding (Knowledge Mode)
`GEMINI_3_FLASH`	`gemini-3-flash-preview`	Google	Fast Google tasks
`GEMINI_20_FLASH`	`gemini-2.0-flash`	Google	Legacy quick tasks
—	`llama-3.3-70b-versatile`	Groq	Fast open-source inference
—	`deepseek-r1-distill-llama-70b`	Groq	Reasoning model via Groq
—	Perplexity Sonar	Perplexity	Web-grounded deep research
—	Kimi K2.5	NVIDIA NIM	Deep research (Kimi provider)
—	OVH GPT OSS 120B	OVH	Alternative open-source model
—	ElevenLabs	ElevenLabs	Text-to-speech
—	fal.ai (Flux)	fal.ai	Image generation
`EMBEDDING_MODEL`	`text-embedding-3-large`	OpenAI	Vector embeddings for RAG

LLM Usage Diagram

graph LR
    subgraph "User Request"
        REQ["Incoming Query"]
    end

    subgraph "Routing (GPT-5.2)"
        DISPATCH["Dispatcher<br/>gpt-5.2"]
    end

    subgraph "Execution Agents"
        SMART_AGENT["Smart Agent<br/>gpt-4o"]
        SUPERVISOR_AGENT["Supervisor<br/>Claude Opus 4.6"]
        KNOWLEDGE["Knowledge Mode<br/>Gemini 3.1 Pro"]
    end

    subgraph "Supporting Tasks"
        REWRITE["Query Rewriting<br/>gpt-4o-mini"]
        MEMORY["Memory Generation<br/>Gemini 2.5 Flash"]
        CLEAN["Transcript Cleaning<br/>gpt-4o-mini"]
        EMBED["Embeddings<br/>text-embedding-3-large"]
    end

    subgraph "Deep Research"
        DR_GEMINI["Gemini 2.5 Pro"]
        DR_CLAUDE["Claude Opus"]
        DR_GPT5["GPT-5"]
        DR_SONAR["Perplexity Sonar"]
        DR_KIMI["Kimi K2.5"]
    end

    subgraph "Media & Generation"
        TTS_MODEL["ElevenLabs<br/>Text-to-Speech"]
        IMG_MODEL["fal.ai / GPT<br/>Image Generation"]
    end

    REQ --> DISPATCH
    DISPATCH -->|"simple query"| SMART_AGENT
    DISPATCH -->|"complex/multi-step"| SUPERVISOR_AGENT
    REQ -->|"knowledge mode"| KNOWLEDGE

    SMART_AGENT --> REWRITE
    SMART_AGENT --> EMBED
    SUPERVISOR_AGENT --> EMBED
    KNOWLEDGE --> EMBED
    KNOWLEDGE --> MEMORY

    REQ -->|"deep research"| DR_GEMINI
    REQ -->|"deep research"| DR_CLAUDE
    REQ -->|"deep research"| DR_GPT5
    REQ -->|"deep research"| DR_SONAR
    REQ -->|"deep research"| DR_KIMI

    SMART_AGENT --> TTS_MODEL
    SMART_AGENT --> IMG_MODEL
    SMART_AGENT --> CLEAN
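
The edges of the diagram can be sketched as a small routing function. This is a minimal illustration and not the Query Engine's actual code: `Route`, `route_request`, and the `mode` and `is_complex` parameters are hypothetical names, and per the diagram it is the dispatcher model (gpt-5.2) that decides between simple and complex queries, not a boolean flag.

```python
from enum import Enum


class Route(Enum):
    SMART_AGENT = "smart_agent"      # gpt-4o in the diagram
    SUPERVISOR = "supervisor"        # Claude Opus 4.6 in the diagram
    KNOWLEDGE = "knowledge"          # Gemini 3.1 Pro in the diagram
    DEEP_RESEARCH = "deep_research"  # provider is chosen separately


def route_request(mode: str, is_complex: bool = False) -> Route:
    """Mirror the diagram's edges.

    `mode` and `is_complex` are hypothetical inputs standing in for the
    request type and the dispatcher's verdict.
    """
    if mode == "knowledge":
        return Route.KNOWLEDGE       # goes straight to Knowledge Mode
    if mode == "deep_research":
        return Route.DEEP_RESEARCH   # goes straight to a research provider
    # Everything else passes through the dispatcher.
    return Route.SUPERVISOR if is_complex else Route.SMART_AGENT
```

Knowledge Mode and deep research bypass the dispatcher in this sketch because the diagram draws those edges directly from the incoming query.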

7. Data Processing Pipeline¶

sequenceDiagram
    participant User
    participant Admin as Admin Panel
    participant Server as NestJS Server
    participant GCS as Google Cloud Storage
    participant Redis as Redis Streams
    participant Worker as Ingestion Worker
    participant GFS as Google File Search
    participant WS as WebSocket

    User->>Admin: Upload file via drag & drop
    Admin->>Server: POST /api/files/:conceptId/:group<br/>(multipart form data)

    Server->>GCS: Upload original file<br/>(tenant-scoped path)
    GCS-->>Server: Cloud file path

    Server->>Server: Create FileEntity in PostgreSQL<br/>(status: "pending")

    Server->>Redis: XADD stream:zweistein<br/>{type: "file", conceptId, entityId,<br/>cloudFilePath, filename}

    Server-->>Admin: 200 OK (file entity)
    Admin->>WS: Subscribe to file status updates

    Note over Redis,Worker: Consumer Group Processing

    Redis->>Worker: XREADGROUP (picks up job)
    Worker->>Redis: Send "processing" notification
    Redis->>Server: Notification received
    Server->>WS: Push "processing" status to UI
    WS-->>Admin: File status: "processing"

    alt PDF / Text Document
        Worker->>GCS: Download file via signed URL
        Worker->>Worker: Parse document<br/>(PyMuPDF, text extraction)
        Worker->>GFS: Index text chunks<br/>(Google File Search)
    else Image
        Worker->>GCS: Download file
        Worker->>Worker: Call IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / description)
        Worker->>GFS: Index image description
    else Video / Audio
        Worker->>GCS: Download file
        Worker->>Worker: Whisper transcription<br/>(local model)
        Worker->>GFS: Index transcript
        Worker->>GCS: Upload .md transcript
    else YouTube
        Worker->>Server: SSE to /quick-actions/media/transcribe
        Worker->>GFS: Index transcript
        Worker->>GCS: Upload .md transcript
    else Website URL
        Worker->>Worker: spider_rs scraping<br/>+ readabilipy cleanup
        Worker->>GFS: Index markdown content
        Worker->>GCS: Upload .md content
    end

    Worker->>Redis: XADD notification stream<br/>{phase: "done", payload}

    Redis->>Server: Notification: processing complete
    Server->>Server: Update FileEntity<br/>(status: "done",<br/>transcript path if applicable)
    Server->>WS: Push "done" status to UI
    WS-->>Admin: File status: "done" (green check)

    Note over User,Admin: File is now searchable via RAG
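
The per-type branches in the sequence diagram can be summarized as a lookup table. The sketch below is illustrative only: the keys, step labels, and the `plan_ingestion` helper are hypothetical and do not come from the worker's source.

```python
# Steps per content kind, transcribed from the alt/else branches of the
# sequence diagram. Keys and step labels are illustrative names.
INGESTION_STEPS: dict[str, list[str]] = {
    "document": ["download", "parse", "index"],
    "image": ["download", "describe", "index"],
    "media": ["download", "transcribe", "index", "upload_transcript"],
    "youtube": ["transcribe_remote", "index", "upload_transcript"],
    "website": ["scrape", "index", "upload_markdown"],
}


def plan_ingestion(kind: str) -> list[str]:
    """Return the ordered steps for a content kind.

    Every plan is bracketed by the "processing" and "done" notifications
    that the worker publishes to the notification stream.
    """
    if kind not in INGESTION_STEPS:
        raise ValueError(f"unsupported content kind: {kind!r}")
    return ["notify:processing", *INGESTION_STEPS[kind], "notify:done"]
```

For example, `plan_ingestion("image")` yields the download, describe, and index steps between the two notifications.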

8. Vector Search Architecture¶

sequenceDiagram
    participant User
    participant Admin as Admin Panel
    participant Server as NestJS Server
    participant QE as Query Engine (Python)
    participant GFS as Google File Search (Gemini FileSearch)
    participant LLM as LLM (Gemini 3.1 Pro)

    User->>Admin: Ask a question
    Admin->>Server: POST /api/chat or /api/agents/call<br/>{messages, conceptId, agentConfig}

    Server->>QE: Proxy to Python<br/>POST /chat/stream or /agents/call

    alt Knowledge Mode (Direct RAG)
        QE->>GFS: gfs_service.chat_stream()<br/>{concept_id, messages,<br/>system_instruction, model}
        GFS->>GFS: Retrieve relevant chunks<br/>from concept's vector store
        GFS->>LLM: Grounded generation<br/>(Gemini 3.1 Pro + FileSearch)
        LLM-->>QE: SSE stream (tokens + citations)
    else Agent Mode
        QE->>QE: Dispatcher step (GPT-5.2)<br/>→ Select agent type
        QE->>QE: Agent executes with tools
        QE->>GFS: Zweistein RAG tool<br/>→ vector search
        GFS-->>QE: Retrieved context
        QE->>LLM: Generate response<br/>with retrieved context
        LLM-->>QE: SSE stream
    end

    QE-->>Server: SSE event stream<br/>(tokens, citations, usage, state)
    Server-->>Admin: Forward SSE stream
    Admin-->>User: Render streaming response<br/>with inline citations

Vector Storage Details¶

Embedding model: text-embedding-3-large (OpenAI, 3072 dimensions)
Primary search: Google GenAI File Search (per-concept vector stores)
Collection naming: Each concept gets its own File Search store, identified by conceptId
Index types: DocumentFiles (text documents), ImageFiles (image descriptions)
Retrieval: Gemini's built-in FileSearch grounding retrieves relevant chunks automatically during generation

9. Database & Storage¶

PostgreSQL Entities¶

Entity	Path	Purpose
`FileEntity`	`server/src/files/file.entity.ts`	Uploaded files metadata (path, status, concept, group)
`ConversationEntity`	`server/src/chat/conversations/conversations.entity.ts`	Chat conversation threads
`MessageEntity`	`server/src/chat/messages/message.entity.ts`	Individual chat messages
`MessageFeedbackEntity`	`server/src/chat/message-feedback/message-feedback.entity.ts`	User feedback on messages
`BotEntity`	`server/src/crud-entities/bots.entity.ts`	Bot configurations
`AgentEntity`	`server/src/crud-entities/agents.entity.ts`	Agent definitions
`AgenticAppEntity`	`server/src/crud-entities/agentic-app.entity.ts`	Agentic app definitions
`ControlEntity`	`server/src/controls/controls.entity.ts`	AI-generated React components
`CollectionEntity`	`server/src/collection/collections.entity.ts`	Knowledge collections
`CollectionItemEntity`	`server/src/collection/collection_items.entity.ts`	Items within collections
`FavoriteEntity`	`server/src/favorite/favorite.entity.ts`	User favorites
`SubscriptionPlanEntity`	`server/src/subscriptions/entities/subscription-plan.entity.ts`	Available subscription plans
`TenantSubscriptionEntity`	`server/src/subscriptions/entities/tenant-subscription.entity.ts`	Active tenant subscriptions
`WalletEntity`	`server/src/wallet/entities/wallet.entity.ts`	Tenant credit wallet (grant + paid balance)
`UsageEventEntity`	`server/src/wallet/entities/usage-event.entity.ts`	LLM usage events (model, tokens, cost)
`UsageTransactionEntity`	`server/src/wallet/entities/usage-transaction.entity.ts`	Balance deduction transactions
`PriceTagEntity`	`server/src/wallet/entities/price-tag.entity.ts`	Per-model token pricing
`BillingEntity`	`server/src/billing/entity/billing.entity.ts`	Billing records
`ModelTokenPricingEntity`	`server/src/billing/entity/model-token-pricing.entity.ts`	Model pricing configuration
`PlanEntity`	`server/src/quotas/entities/plan.entity.ts`	Usage plans
`QuotaDefinitionEntity`	`server/src/quotas/entities/quota-definition.entity.ts`	Quota limits
`UsageRecordEntity`	`server/src/quotas/entities/usage-record.entity.ts`	Usage counting records
`UsageCounterEntity`	`server/src/quotas/entities/usage-counter.entity.ts`	Usage counters
`RecentlyUsedEntity`	`server/src/dashboard/recently-used.entity.ts`	Recently accessed items
`PopularInOrgLogsEntity`	`server/src/dashboard/popular-in-org-logs.entity.ts`	Popular items analytics
`TenantExternalSettingsEntity`	`server/src/external-auth/tenant-external-settings.entity.ts`	External auth settings
`ExternalUserAuditEntity`	`server/src/external-auth/audit/external-user-audit.entity.ts`	External user audit log
`ExternalUserEntity`	`server/src/crud-entities/external-user/external-user.entity.ts`	External users
`ExternalUserGroupEntity`	`server/src/crud-entities/external-user-group/external-user-group.entity.ts`	External user groups
`TenantToolConfigEntity`	`server/src/crud-entities/tenant-tool-config.entity.ts`	Per-tenant tool settings
`BaseEntity`	`server/src/crud-entities/base/base.entity.ts`	Base entity with `tenantId` scoping

Database connections:

Primary: PostgreSQL via TypeORM (DB_HOST, DB_PORT, DB_NAME); holds all Zweistein entities
Secondary (Studio): PostgreSQL via TypeORM (STUDIO_DB_*); read-only connection to the Blinkin Studio DB

Google File Search (Vector Storage)¶

Aspect	Detail
Service	Google GenAI File Search API
Store-per-concept	Each concept gets its own vector store
Embedding	`text-embedding-3-large` (OpenAI, used during ingestion)
Retrieval	Gemini FileSearch grounding (during query)
Text chunking	Automatic by Google File Search
File types indexed	Text, PDFs, images (as descriptions), audio/video (as transcripts), websites

Redis Streams (Job Queue)¶

Stream	Consumer Group	Purpose
`stream:zweistein`	`group:zweistein`	File/URL processing job queue
`stream:zweistein:notifications`	`group:zweistein:notifications`	Job completion notifications back to server
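
A worker that consumes the job stream and reports back on the notification stream might look like the sketch below. It is a hedged illustration, not the Ingestion Worker's code: `process_once`, the `handler` callback, and the shape of the error payload are assumptions. The `xreadgroup`, `xadd`, and `xack` calls follow the redis-py client, and `client` is assumed to behave like `redis.Redis(decode_responses=True)` with the default list-shaped reply.

```python
import json
from typing import Any, Callable, Optional

JOB_STREAM = "stream:zweistein"
JOB_GROUP = "group:zweistein"
NOTIFICATION_STREAM = "stream:zweistein:notifications"


def process_once(client: Any, consumer: str,
                 handler: Callable[[dict], Optional[dict]]) -> int:
    """Read at most one new job, run `handler`, notify, and acknowledge.

    Returns the number of jobs handled (0 or 1).
    """
    # ">" requests entries not yet delivered to any consumer in the group.
    reply = client.xreadgroup(JOB_GROUP, consumer, {JOB_STREAM: ">"},
                              count=1, block=5000)
    handled = 0
    for _stream, entries in reply or []:
        for entry_id, fields in entries:
            original = json.dumps(fields)
            client.xadd(NOTIFICATION_STREAM,
                        {"original_message": original, "phase": "processing"})
            try:
                payload = handler(fields)
                note = {"original_message": original, "phase": "done"}
                if payload is not None:
                    note["payload"] = json.dumps(payload)
            except Exception as exc:
                # Assumed error shape: report the failure and keep running.
                note = {"original_message": original, "phase": "error",
                        "payload": json.dumps({"error": str(exc)})}
            client.xadd(NOTIFICATION_STREAM, note)
            # Acknowledge so the entry leaves the group's pending list.
            client.xack(JOB_STREAM, JOB_GROUP, entry_id)
            handled += 1
    return handled
```

Acknowledging after an error is a design choice of this sketch; a worker that wants failed jobs redelivered would skip the `xack` on failure.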

Message format (job):

{
  "type": "file|file.text-replaced|file.deleted|url|conversation|concept.deleted",
  "conceptId": "uuid",
  "entityId": "uuid",
  "cloudFilePath": "tenant/concept/filename",
  "filename": "original-name.pdf",
  "group": "documents|youtube|websites",
  "metadata": { "url": "..." }
}
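
Redis Stream entries are flat field/value pairs, so a producer has to serialize the nested `metadata` object before calling `XADD`. The sketch below shows one way to build and validate the fields. `build_job_fields` is a hypothetical helper, and JSON-encoding `metadata` is an assumption about how the server stores it.

```python
import json
from typing import Optional

JOB_TYPES = {"file", "file.text-replaced", "file.deleted",
             "url", "conversation", "concept.deleted"}
GROUPS = {"documents", "youtube", "websites"}


def build_job_fields(job_type: str, concept_id: str, entity_id: str,
                     cloud_file_path: str = "", filename: str = "",
                     group: str = "documents",
                     metadata: Optional[dict] = None) -> dict[str, str]:
    """Return flat string fields suitable for an XADD call."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type!r}")
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    return {
        "type": job_type,
        "conceptId": concept_id,
        "entityId": entity_id,
        "cloudFilePath": cloud_file_path,
        "filename": filename,
        "group": group,
        # Assumption: nested metadata travels as a JSON string.
        "metadata": json.dumps(metadata or {}),
    }
```

With redis-py, the result could be passed as `client.xadd("stream:zweistein", fields)`.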

Message format (notification):

{
  "original_message": "{...json...}",
  "phase": "processing|done|error",
  "payload": "{...optional json...}"
}
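
On the receiving side, both `original_message` and `payload` arrive as JSON strings and need a second decoding step. A minimal parsing sketch follows; the `Notification` dataclass and `parse_notification` are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional

PHASES = {"processing", "done", "error"}


@dataclass
class Notification:
    job: dict                # decoded original job message
    phase: str               # "processing", "done", or "error"
    payload: Optional[dict]  # decoded payload, if one was sent


def parse_notification(fields: dict) -> Notification:
    """Decode one entry read from the notification stream."""
    phase = fields["phase"]
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    raw_payload = fields.get("payload")
    return Notification(
        job=json.loads(fields["original_message"]),
        phase=phase,
        # The payload is optional, so a missing or empty value maps to None.
        payload=json.loads(raw_payload) if raw_payload else None,
    )
```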

Google Cloud Storage (File Storage)¶

Aspect	Detail
Bucket	`GCS_BUCKET_NAME` (e.g., `blinkin-ai-dev-storage`)
Structure	`{tenantId}/{conceptId}/{filename}`
Access	Signed URLs (1-hour expiry for downloads)
Transcripts	Stored as `.{random}.md` alongside original file
Key file	`GCS_KEY_FILE` (service account JSON)
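
The path structure and the one-hour signed URL can be sketched with the `google-cloud-storage` client as follows. `object_path` and `signed_download_url` are hypothetical helpers, and rejecting `/` inside a segment is an assumption of this sketch, not a documented rule.

```python
from datetime import timedelta


def object_path(tenant_id: str, concept_id: str, filename: str) -> str:
    """Build the `{tenantId}/{conceptId}/{filename}` object name."""
    for part in (tenant_id, concept_id, filename):
        # Assumption: segments must be non-empty and must not contain "/".
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return f"{tenant_id}/{concept_id}/{filename}"


def signed_download_url(key_file: str, bucket_name: str, path: str) -> str:
    """Return a V4 signed GET URL that expires after one hour."""
    # Imported lazily so object_path() works without the GCS client installed.
    from google.cloud import storage

    client = storage.Client.from_service_account_json(key_file)
    blob = client.bucket(bucket_name).blob(path)
    return blob.generate_signed_url(
        version="v4", expiration=timedelta(hours=1), method="GET"
    )
```

Here `key_file` and `bucket_name` would come from `GCS_KEY_FILE` and `GCS_BUCKET_NAME`.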

10. External Dependencies Table¶

Service	Purpose	Config Variable(s)
OpenAI	LLM (GPT-4o, GPT-5.x), embeddings, Whisper	`OPENAI_API_KEY`
Anthropic	Claude Opus 4.6 (supervisor agent)	`ANTHROPIC_API_KEY`
Google GenAI	Gemini models, File Search, Document AI	`GOOGLE_API_KEY`
Groq	Fast inference (Llama 3.3, DeepSeek)	`GROQ_API_KEY`
Perplexity	Web-grounded deep research (Sonar)	`PERPLEXITY_API_KEY`
ElevenLabs	Text-to-speech	`ELEVENLABS_API_KEY_PART1`
fal.ai	Image generation (Flux)	`FAL_KEY`
Pexels	Stock image search	`PEXELS_API_KEY`
Tavily	Web search for agents	`TAVILY_API_KEY`
Exa	Semantic web search	`EXA_API_KEY`
NVIDIA NIM	Kimi K2.5 model	`NVIDIA_API_KEY`
OVH Cloud	Open-source LLM (GPT-OSS 120B)	`OVH_API_KEY`
PostgreSQL	Primary relational database	`DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USERNAME`, `DB_PASSWORD`
Redis	Job queue (Streams), caching	`REDIS_HOST`, `REDIS_PORT`, `REDIS_PASS`
Google Cloud Storage	File storage (uploads, transcripts)	`GCS_KEY_FILE`, `GCS_BUCKET_NAME`
Azure Blob Storage	Alternative file storage	`PICASSO_BLOB_URL`
Auth0	Authentication and authorization	`AUTH0_DOMAIN`, `AUTH0_CLIENT_ID`, `AUTH0_AUDIENCE`, `AUTH0_ISSUER_URL`
Stripe	Payment processing, subscriptions	`STRIPE_API_KEY`, `STRIPE_WEBHOOK_SECRET`, `STRIPE_PUBLIC_KEY`
Mailgun	Email ingestion and sending	`MAILGUN_API_KEY`, `MAILGUN_DOMAIN`, `MAILGUN_FROM_EMAIL`
Google OAuth	Google sign-in	`GOOGLE_CLIENT_ID`
LangSmith	LLM tracing and debugging	`LANGCHAIN_ENDPOINT`, `LANGCHAIN_API_KEY`, `LANGCHAIN_PROJECT`
PostHog	Product analytics (frontend)	Configured in admin app
Blinkin Inference	Custom OCR, audio transcription	`IMAGE_EXPLAINER_ENDPOINT`, `AUDIO_TRANSCRIBER_ENDPOINT`
Blinkin Studio (Picasso)	Visual content creation	`PICASSO_URL`, `PICASSO_API_URL`, `PICASSO_APP_URL`
Blinkin Houston	Internal tooling	`HOUSTON_URL`
C3 (Chatwoot)	Customer messaging	`C3_DOMAIN`, `C3_AGENT_ACCOUNT_ID`, `C3_AGENT_TOKEN`
Canvas	Learning management integration	`CANVAS_DOMAIN`, `CANVAS_TOKEN`
Google Document AI	Advanced PDF/document OCR	`DOCUMENT_AI_PROJECT_ID`, `DOCUMENT_AI_LOCATION`, `DOCUMENT_AI_PROCESSOR_ID`
Bosch Gemini	Custom Bosch Gemini endpoint	`BOSCH_GEMINI_API_KEY`, `BOSCH_GEMINI_BASE_URL`

11. Key Environment Variables Table¶

NestJS Server¶

Variable	Purpose	Example
`GLOBAL_PREFIX`	API route prefix	`/ai`
`DB_HOST`	PostgreSQL host	`10.100.10.3`
`DB_PORT`	PostgreSQL port	`5432`
`DB_NAME`	Database name	`zweistein_dev`
`DB_USERNAME`	Database user	(secret)
`DB_PASSWORD`	Database password	(secret)
`STUDIO_DB_HOST`	Studio DB host (read-only)	—
`STUDIO_DB_PORT`	Studio DB port	`5432`
`STUDIO_DB_NAME`	Studio database name	—
`REDIS_HOST`	Redis host	`redis-master.redis.svc.cluster.local`
`REDIS_PORT`	Redis port	`6379`
`REDIS_PASS`	Redis password	(secret)
`REDIS_JOB_STREAM`	Job stream name	`stream:zweistein`
`REDIS_JOB_CONSUMER_GROUP`	Job stream consumer group	`group:zweistein`
`REDIS_JOB_NOTIFICATION_STREAM`	Notification stream	`stream:zweistein:notifications`
`REDIS_JOB_NOTIFICATION_CONSUMER_GROUP`	Notification consumer group	`group:zweistein:notifications`
`SERVER_INSTANCE_ID`	Unique server instance ID	`server-instance-1`
`SERVER_JWT_SECRET`	JWT signing secret	(secret)
`PYTHON_SERVER_URL`	Query Engine URL	`http://queryengine-service`
`IMAGE_EXPLAINER_URL`	Image OCR service	`https://ocr.blinkin.io/explain-image`
`STORAGE_TYPE`	Cloud storage provider	`gcp`
`GCS_BUCKET_NAME`	GCS bucket name	`blinkin-ai-dev-storage`
`AUTH0_DOMAIN`	Auth0 domain	`dev-w248kl0wxwpsp7q3.eu.auth0.com`
`AUTH0_CLIENT_ID`	Auth0 client ID	—
`AUTH0_AUDIENCE`	Auth0 audience	—
`AUTH0_ISSUER_URL`	Auth0 issuer	—
`STRIPE_API_KEY`	Stripe secret key	(secret)
`STRIPE_WEBHOOK_SECRET`	Stripe webhook secret	(secret)
`GOOGLE_CLIENT_ID`	Google OAuth client ID	—

Ingestion Worker¶

Variable	Purpose	Example
`OPENAI_API_KEY`	OpenAI API key	(secret)
`EMBEDDING_MODEL`	Embedding model name	`text-embedding-3-large`
`REDIS_HOST`	Redis host	`redis-master.redis.svc.cluster.local`
`REDIS_PORT`	Redis port	`6379`
`REDIS_PASS`	Redis password	(secret)
`TASKS_STREAM`	Job stream	`stream:zweistein`
`TASKS_GROUP`	Consumer group	`group:zweistein`
`NOTIFICATION_STREAM`	Notification stream	`stream:zweistein:notifications`
`PROCESSOR_ID`	Unique worker ID	`zweistein_processor_1`
`HOSTNAME`	Worker hostname (overrides PROCESSOR_ID)	—
`GCS_KEY_FILE`	GCS service account key file	—
`GCS_BUCKET_NAME`	GCS bucket name	`blinkin-ai-dev-storage`
`IMAGE_EXPLAINER_ENDPOINT`	OCR endpoint	`https://ocr.akjo.tech/explain-image`
`AUDIO_TRANSCRIBER_ENDPOINT`	Audio transcription endpoint	`https://ocr.akjo.tech/transcribe-audio`
`CONVERSATION_SUMMARIZER_MODEL`	Model for conversation summaries	`gpt-4o`
`QUERY_ENGINE_URL`	Query Engine for YouTube transcription	`http://queryengine-service`
`GOOGLE_API_KEY`	Google API key (File Search)	(secret)
`C3_DOMAIN`	C3 / Chatwoot domain	—
`CANVAS_DOMAIN`	Canvas LMS domain	—
`CANVAS_TOKEN`	Canvas API token	(secret)
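
Because `HOSTNAME` overrides `PROCESSOR_ID`, each worker can derive a distinct consumer name for the Redis consumer group. A minimal sketch of that precedence is shown below. `consumer_name` is a hypothetical helper, and falling back to the example value `zweistein_processor_1` is an assumption.

```python
import os
from typing import Mapping


def consumer_name(env: Mapping[str, str] = os.environ) -> str:
    """Resolve the consumer name: HOSTNAME first, then PROCESSOR_ID."""
    # Assumed fallback: the example value from the table above.
    return env.get("HOSTNAME") or env.get("PROCESSOR_ID") or "zweistein_processor_1"
```

In a Kubernetes StatefulSet each pod's hostname is the StatefulSet name plus an ordinal, so the three replicas would receive distinct, stable consumer names without extra configuration.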

Query Engine¶

Variable	Purpose	Example
`OPENAI_API_KEY`	OpenAI API key	(secret)
`ANTHROPIC_API_KEY`	Anthropic API key	(secret)
`GOOGLE_API_KEY`	Google API key	(secret)
`GROQ_API_KEY`	Groq API key	(secret)
`PERPLEXITY_API_KEY`	Perplexity API key	(secret)
`TAVILY_API_KEY`	Tavily search API key	(secret)
`EXA_API_KEY`	Exa search API key	(secret)
`FAL_KEY`	fal.ai API key	(secret)
`PEXELS_API_KEY`	Pexels stock photo key	(secret)
`ELEVENLABS_API_KEY_PART1`	ElevenLabs TTS key	(secret)
`EMBEDDING_MODEL`	Embedding model	`text-embedding-3-large`
`LLM_SIMPLE`	Simple/cheap model	`gpt-4o-mini`
`LLM_ADVANCED`	Advanced model	`gpt-4o`
`LLM_DISPATCHER`	Dispatcher model	`gpt-5.2`
`LLM_SUPERVISOR`	Supervisor model	`claude_opus46`
`GEMINI_25_PRO`	Gemini 2.5 Pro model name	`gemini-2.5-pro`
`GEMINI_25_FLASH`	Gemini 2.5 Flash model name	`gemini-2.5-flash`
`GEMINI_3_PRO`	Gemini 3 Pro model name	`gemini-3-pro-preview`
`GEMINI_31_PRO`	Gemini 3.1 Pro model name	`gemini-3.1-pro-preview`
`GEMINI_3_FLASH`	Gemini 3 Flash model name	`gemini-3-flash-preview`
`GEMINI_20_FLASH`	Gemini 2.0 Flash model name	`gemini-2.0-flash`
`UVICORN_PORT`	FastAPI server port	`8000`
`REDIS_HOST`	Redis host	`redis-master.redis.svc.cluster.local`
`REDIS_PORT`	Redis port	`6379`
`REDIS_PASS`	Redis password	(secret)
`GCS_KEY_FILE`	GCS service account key	—
`GCS_BUCKET_NAME`	GCS bucket	`blinkin-ai-dev-storage`
`QUOTA_SERVICE_URL`	Quota checking service URL	—
`INTERNAL_API_KEY`	Internal wallet API key	—
`MAILGUN_DOMAIN`	Mailgun domain	—
`MAILGUN_API_KEY`	Mailgun API key	(secret)
`MAILGUN_FROM_EMAIL`	Sender email address	—
`LANGCHAIN_TRACING_V2`	Enable LangSmith tracing	`true`
`LANGCHAIN_ENDPOINT`	LangSmith endpoint	`https://eu.api.smith.langchain.com`
`LANGCHAIN_API_KEY`	LangSmith API key	(secret)
`LANGCHAIN_PROJECT`	LangSmith project name	`dev`
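
A startup check that fails fast on missing secrets can make misconfiguration easier to spot. The sketch below is an illustration under stated assumptions: `QueryEngineSettings`, `load_settings`, and the `REQUIRED` tuple are hypothetical, which variables are mandatory is a guess, and the defaults simply reuse the example values from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: these keys are treated as mandatory in this sketch.
REQUIRED = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


@dataclass(frozen=True)
class QueryEngineSettings:
    llm_simple: str
    llm_advanced: str
    llm_dispatcher: str
    llm_supervisor: str
    embedding_model: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> QueryEngineSettings:
    """Read model names and the port from the environment."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return QueryEngineSettings(
        llm_simple=env.get("LLM_SIMPLE", "gpt-4o-mini"),
        llm_advanced=env.get("LLM_ADVANCED", "gpt-4o"),
        llm_dispatcher=env.get("LLM_DISPATCHER", "gpt-5.2"),
        llm_supervisor=env.get("LLM_SUPERVISOR", "claude_opus46"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-large"),
        port=int(env.get("UVICORN_PORT", "8000")),
    )
```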

12. Development & Deployment¶

Local Development Commands¶

Admin Panel:

cd admin
yarn install          # Install dependencies
yarn dev              # Start Vite dev server (https://localhost:5173)
yarn build            # Production build
yarn storybook        # Start Storybook (http://localhost:6006)

NestJS Server:

cd server
yarn install          # Install dependencies
yarn start:dev        # Start with watch mode (http://localhost:3000)
yarn start:debug      # Start with debug + watch
yarn build            # Production build
yarn start:prod       # Start production (node dist/main)
yarn migration:generate --name=MigrationName  # Generate TypeORM migration
yarn migration:run    # Run pending migrations
yarn migration:rollback # Roll back the last migration
yarn test             # Run unit tests
yarn test:e2e         # Run end-to-end tests

Python Ingestion Worker:

cd python_server/ingestion_worker
poetry install        # Install dependencies
python main.py        # Start the Redis consumer worker
# or
./start.sh            # Production start script

Python Query Engine:

cd python_server/query_engine
poetry install        # Install dependencies
python main.py        # Start FastAPI on port 8000
# or
uvicorn main:app --host 0.0.0.0 --port 8000 --reload  # Development with reload
# or
./start_dev.sh        # Development start script

Docker Images¶

Service	Image Path
Admin	`europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-admin`
Server	`europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-server`
Query Engine	`europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-query-engine`
Ingestion Worker	`europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-data-ingestion-worker`

Helm Chart Structure¶

helm/blinkin-ai/
  Chart.yaml                         # Helm chart metadata
  values.dev.yaml                    # Development environment values
  values.uat.yaml                    # UAT environment values
  values.prod.yaml                   # Production environment values
  templates/
    # Admin Panel
    admin.deployment.yaml            # Admin Deployment
    admin.nginx.configmap.yaml       # Nginx config for serving admin SPA
    admin.service.yaml               # Admin Service (ClusterIP)

    # NestJS Server
    server.deployment.yaml           # Server Deployment
    server.configmap.yaml            # Server environment ConfigMap
    server.service.yaml              # Server Service (ClusterIP)

    # Query Engine
    queryengine.deployment.yaml      # Query Engine Deployment
    queryengine.configmap.yaml       # Query Engine environment ConfigMap
    queryengine.service.yaml         # Query Engine Service (ClusterIP)

    # Ingestion Worker
    ingestionworker.statefulset.yaml # Ingestion Worker StatefulSet (3 replicas)
    ingestionworker.configmap.yaml   # Ingestion Worker environment ConfigMap

    # Cluster / Ingress
    cluster.ingress.yaml             # GKE Ingress with managed certificate
    cluster.managedcertificate.yaml  # Google-managed TLS certificate
    cluster.frontendconfig.yaml      # Frontend config (HTTP→HTTPS redirect)

Deployment Topology¶

Service	Type	Replicas (dev)	Notes
Admin	Deployment	1	Nginx serving static React build
Server	Deployment	1	NestJS on port 3000
Query Engine	Deployment	1	FastAPI on port 8000
Ingestion Worker	StatefulSet	3	Redis consumer group (parallel processing)

Kubernetes Secrets¶

Secret Name	Keys	Purpose
`postgres-secrets`	`DB_USERNAME`, `DB_PASSWORD`	PostgreSQL credentials
`redis-secrets`	`REDIS_PASS`	Redis password
`query-engine-secrets`	`OPENAI_API_KEY`, `PERPLEXITY_API_KEY`, `FAL_KEY`, `PEXELS_API_KEY`, `TAVILY_API_KEY`, `GROQ_API_KEY`, `EXA_API_KEY`, `ANTHROPIC_API_KEY`	API keys for query engine
`google-api-secrets`	`GOOGLE_API_KEY`	Google API key
`gcs-keyfile-dev`	`gcs-key.json`	GCS service account key file
`canvas-secrets`	`CANVAS_TOKEN`	Canvas LMS token

Environments¶

Environment	Domain	Helm Values
Development	`app-dev.blinkin.io`	`values.dev.yaml`
UAT	`app-uat.blinkin.io`	`values.uat.yaml`
Production	`app.blinkin.io`	`values.prod.yaml`