Zweistein — AI Platform Documentation

Codebase location: /zweistein-dev/zweistein-dev/

Zweistein is a multi-service AI platform comprising a React admin panel, a NestJS API server, and two Python microservices (Ingestion Worker and Query Engine). Together they power knowledge-based AI agents, file processing pipelines, vector-search retrieval, multi-model LLM orchestration, and a real-time chat interface.


1. Architecture Overview

graph TB
    subgraph "Tier 1 — Frontend"
        ADMIN["Admin Panel<br/>(React + Vite + TS)<br/>Port 5173"]
    end

    subgraph "Tier 2 — API Server"
        SERVER["NestJS Server<br/>Port 3000"]
        SWAGGER["Swagger Docs<br/>/ai/docs"]
        WEBSOCKET["Socket.IO<br/>WebSocket Gateway"]
    end

    subgraph "Tier 3 — Python Services"
        QE["Query Engine<br/>(FastAPI, Port 8000)"]
        IW["Ingestion Worker<br/>(Redis Consumer)"]
    end

    subgraph "Data Stores"
        PG["PostgreSQL"]
        QDRANT["Qdrant<br/>(Vector DB)"]
        REDIS["Redis Streams<br/>(Job Queue)"]
        GCS["Google Cloud Storage<br/>(File Storage)"]
    end

    subgraph "External AI Services"
        OPENAI["OpenAI<br/>(GPT-4o/5.x, Whisper,<br/>Embeddings)"]
        ANTHROPIC["Anthropic<br/>(Claude Opus 4.6)"]
        GOOGLE["Google<br/>(Gemini 2.x/3.x)"]
        PERPLEXITY["Perplexity<br/>(Sonar)"]
        GROQ["Groq<br/>(Llama 3.3)"]
        ELEVENLABS["ElevenLabs<br/>(TTS)"]
        FAL["fal.ai<br/>(Image Gen)"]
    end

    ADMIN -->|"HTTP /ai/api/*"| SERVER
    ADMIN -->|"WebSocket"| WEBSOCKET
    SERVER -->|"HTTP Proxy"| QE
    SERVER -->|"xadd jobs"| REDIS
    REDIS -->|"xreadgroup"| IW
    IW -->|"Notifications"| REDIS
    REDIS -->|"xreadgroup notifications"| SERVER
    SERVER --> PG
    SERVER --> GCS
    IW --> GCS
    IW --> QDRANT
    IW --> OPENAI
    QE --> QDRANT
    QE --> OPENAI
    QE --> ANTHROPIC
    QE --> GOOGLE
    QE --> PERPLEXITY
    QE --> GROQ
    QE --> ELEVENLABS
    QE --> FAL

How the tiers connect:

| Connection | Protocol | Purpose |
| --- | --- | --- |
| Admin --> Server | HTTP REST + WebSocket (Socket.IO) | All UI operations, real-time updates |
| Server --> Query Engine | HTTP (internal, PYTHON_SERVER_URL) | LLM queries, agent calls, chat, image search |
| Server --> Redis Streams | xadd to stream:zweistein | Dispatch file/URL processing jobs |
| Redis --> Ingestion Worker | xreadgroup consumer groups | Workers pick up and process jobs |
| Ingestion Worker --> Redis | xadd to stream:zweistein:notifications | Notify server when processing completes |
| Server <-- Redis Notifications | xreadgroup on notification stream | Server receives completion events, pushes to UI via WebSocket |
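
The job-dispatch leg of this table can be sketched in Python as follows. This is a minimal illustration, not the server's actual code (the real producer lives in the NestJS server). The stream name comes from the table above and the field names from the XADD call in the Data Processing Pipeline diagram (section 7); `build_file_job`, `dispatch_job`, `StreamClient`, and `FakeRedis` are hypothetical names. `dispatch_job` only assumes a client exposing a redis-py style `xadd(name, fields)` method.

```python
from typing import Protocol

JOB_STREAM = "stream:zweistein"  # stream name from the connection table


class StreamClient(Protocol):
    """Anything with a redis-py style xadd(name, fields) method."""

    def xadd(self, name: str, fields: dict) -> str: ...


def build_file_job(concept_id: str, entity_id: str,
                   cloud_file_path: str, filename: str) -> dict[str, str]:
    """Build the flat field map for a "file" job (stream entries are flat maps)."""
    return {
        "type": "file",
        "conceptId": concept_id,
        "entityId": entity_id,
        "cloudFilePath": cloud_file_path,
        "filename": filename,
    }


def dispatch_job(client: StreamClient, job: dict[str, str]) -> str:
    """Append the job to the stream and return the entry ID."""
    return client.xadd(JOB_STREAM, job)


class FakeRedis:
    """In-memory stand-in (hypothetical) so the sketch runs without Redis."""

    def __init__(self) -> None:
        self.streams: dict[str, list[tuple[str, dict]]] = {}

    def xadd(self, name: str, fields: dict) -> str:
        entries = self.streams.setdefault(name, [])
        entry_id = f"{len(entries) + 1}-0"  # simplified ID, not a real timestamp
        entries.append((entry_id, dict(fields)))
        return entry_id


fake = FakeRedis()
job_id = dispatch_job(
    fake, build_file_job("c1", "e1", "tenant/c1/report.pdf", "report.pdf")
)
```

With a real deployment, a `redis.Redis` client could be passed in place of `FakeRedis`, since it provides the same `xadd` call.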

2. Admin Panel

Path: admin/

Purpose

React-based frontend for managing AI bots, knowledge concepts, agents, agentic apps, controls (visual components), conversations, and collections. Serves as the primary UI for the Zweistein AI platform.

Tech Stack

| Technology | Version | Purpose |
| --- | --- | --- |
| React | 18.3 | UI framework |
| Vite | 5.3 | Build tool and dev server |
| TypeScript | 5.2 | Type safety |
| Tailwind CSS | 3.4 | Styling |
| Socket.IO Client | 4.7 | Real-time communication |
| Zustand | 4.5 | State management |
| SWR | 2.2 | Data fetching / caching |
| React Router | 6 | Client-side routing |
| React Query | 3.39 | Server state management |
| Storybook | 8.2 | Component development |

Key Features

  • Monaco Editor (@monaco-editor/react) — In-browser code editor for controls and agent configuration
  • Deepgram Audio (@deepgram/sdk) — Real-time audio transcription
  • Voice Activity Detection (@ricky0123/vad-react) — Detect when user is speaking
  • TipTap (@tiptap/*) — Rich text editor with mention support
  • ReactFlow (reactflow) — Visual node-based agent graph editor
  • Module Federation (@originjs/vite-plugin-federation) — Exposes ./sdk for embedding in other apps
  • Stripe / Paddle — Payment integration for subscriptions
  • Auth0 (@auth0/auth0-react) — Authentication and authorization
  • PostHog (posthog-js) — Product analytics

Page Structure

admin/src/pages/
  agent-threads/      # Agent execution threads and history
  agentic-apps/       # Agentic app builder and management
  agents/             # AI agent configuration
  atoms/              # Atomic UI component demos
  blinkbot/           # BlinkBot chatbot interface
  bots/               # Bot configuration and deployment
  internal-tests/     # Internal testing tools
  knowledge-base/     # Knowledge concept management (files, URLs, conversations)
  plans/              # Subscription plan management
  layout.tsx          # Main layout wrapper
  payment-success.tsx # Payment success callback
  payment-cancel.tsx  # Payment cancel callback

Entry Points

| Entry | HTML File | Route Pattern | Purpose |
| --- | --- | --- | --- |
| Main app | index.html | /ai/* | Admin dashboard |
| Chat widget | chat.html | /ai/chat* | Embeddable chat interface |
| Public view | public.html | /ai/public* | Public-facing chatbot pages |
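
Because `/ai/*` also matches the chat and public routes, the order in which the patterns are tried matters. The sketch below shows one way to resolve a request path to its HTML file. It assumes that the more specific patterns take precedence (the source does not state the actual precedence), and `ENTRY_POINTS` and `resolve_entry` are hypothetical names.

```python
from fnmatch import fnmatchcase
from typing import Optional

# (route pattern, HTML file) pairs from the table, most specific first so
# that /ai/chat* and /ai/public* are tried before the catch-all /ai/*.
ENTRY_POINTS = [
    ("/ai/chat*", "chat.html"),
    ("/ai/public*", "public.html"),
    ("/ai/*", "index.html"),
]


def resolve_entry(path: str) -> Optional[str]:
    """Return the HTML entry file for a request path, or None if nothing matches."""
    for pattern, html_file in ENTRY_POINTS:
        if fnmatchcase(path, pattern):
            return html_file
    return None
```

For example, `resolve_entry("/ai/chat/abc")` selects `chat.html`, while `resolve_entry("/ai/bots")` falls through to `index.html`.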

Build & Dev

# Development
cd admin && yarn dev          # Starts Vite dev server on port 5173

# Production build
cd admin && yarn build        # TypeScript check + Vite build

# Storybook
cd admin && yarn storybook    # Component explorer on port 6006

3. NestJS Server

Path: server/

Module Architecture

graph TB
    subgraph "Core Infrastructure"
        APP["AppModule"]
        AUTHZ["AuthzModule<br/>(Auth0 JWT, RBAC)"]
        CONFIG["ConfigModule"]
        TYPEORM["TypeORM<br/>(PostgreSQL)"]
        EVENTS["EventEmitterModule"]
        REDIS_MOD["RedisStreamsModule<br/>(Job Queue)"]
    end

    subgraph "Entity Management (CRUD)"
        CONCEPTS["ConceptsModule"]
        BOTS["BotsModule"]
        AGENTS["AgentsModule"]
        INTERVIEWS["InterviewsModule"]
        AGENTIC_APP["AgenticAppModule"]
        EXT_USER["ExternalUserModule"]
        EXT_USER_GRP["ExternalUserGroupModule"]
    end

    subgraph "Chat System"
        CONVERSATIONS["ConversationsModule"]
        MESSAGES["MessagesModule"]
        CHATBOT["ChatbotModule"]
        REALTIME["RealtimeCollaborationModule<br/>(Socket.IO)"]
        FAVOURITES["FavouritesModule"]
        MSG_FEEDBACK["MessagesFeedbackModule"]
    end

    subgraph "AI & Processing"
        QUERY["QueryModule<br/>(Proxy to Python)"]
        DATA_PROC["DataProcessingModule"]
        CRAWLER["CrawlerModule"]
        VOICE["VoiceModule"]
        MCP["McpServerModule"]
        CONTROLS["ControlsModule<br/>(AI-Generated UI)"]
        QUICK_ACT["QuickActionsModule"]
        GEN_ANY["GenerateAnythingModule"]
    end

    subgraph "Storage & Files"
        FILES["FilesModule"]
        CLOUD["CloudStorageModule<br/>(GCS / Azure)"]
        TENANT_STORE["TenantStorageModule"]
    end

    subgraph "Billing & Subscriptions"
        BILLING["BillingModule<br/>(Stripe)"]
        SUBS["SubscriptionsModule"]
        WALLET["WalletModule"]
        USAGE["UsageModule"]
        PLANS["PlansModule"]
    end

    subgraph "Other"
        DASHBOARD["DashboardModule"]
        COLLECTION["CollectionModule"]
        FAVORITES["FavoritesModule"]
        PICASSO["PicassoModule"]
        MAILBOX["MailboxModule"]
        TENANT_TOOL["TenantToolConfigModule"]
        EXT_AUTH["ExternalAuthModule"]
        INT_TESTS["InternalTestsModule"]
    end

    APP --> AUTHZ
    APP --> CONFIG
    APP --> TYPEORM
    APP --> EVENTS
    APP --> REDIS_MOD
    APP --> CONCEPTS
    APP --> BOTS
    APP --> AGENTS
    APP --> FILES
    APP --> CHATBOT
    APP --> QUERY
    APP --> DATA_PROC
    APP --> CONTROLS
    APP --> BILLING
    APP --> WALLET
    APP --> SUBS

    CHATBOT --> CONVERSATIONS
    CHATBOT --> MESSAGES
    CHATBOT --> REALTIME
    FILES --> CLOUD
    FILES --> TENANT_STORE
    FILES --> REDIS_MOD
    QUERY --> CONFIG
    DATA_PROC --> REDIS_MOD
    BILLING --> SUBS

Key Modules Table

| Module | Path | Purpose |
| --- | --- | --- |
| AuthzModule | server/src/authz/ | Auth0 JWT validation, RBAC guards, tenant scoping |
| FilesModule | server/src/files/ | File upload/download, GCS integration, job dispatch |
| QueryModule | server/src/query/ | Proxies queries to Python Query Engine |
| ChatbotModule | server/src/chat/chatbot/ | Orchestrates chat: SSE streaming from Python, message persistence |
| ConversationsModule | server/src/chat/conversations/ | CRUD for conversation threads |
| MessagesModule | server/src/chat/messages/ | CRUD for chat messages |
| RealtimeCollaborationModule | server/src/chat/realtime-collaboration/ | Socket.IO gateway for real-time updates |
| ControlsModule | server/src/controls/ | AI-generated React components (Controls) |
| DataProcessingModule | server/src/data-processing/ | Event-driven file processing pipeline management |
| RedisStreamsModule | server/src/redis-streams/ | Redis Streams producer/consumer for job queue |
| CrawlerModule | server/src/crawler/ | Web crawling and link extraction |
| BillingModule | server/src/billing/ | Stripe integration, model token pricing |
| SubscriptionsModule | server/src/subscriptions/ | Subscription plan management |
| WalletModule | server/src/wallet/ | Credit-based usage wallet (balance, transactions) |
| UsageModule | server/src/quotas/ | Quota tracking and enforcement |
| DashboardModule | server/src/dashboard/ | Dashboard analytics (recently used, popular items) |
| VoiceModule | server/src/voice/ | Voice input/output (STT/TTS) |
| McpServerModule | server/src/mcp-servers/ | Model Context Protocol server management |
| CollectionModule | server/src/collection/ | Collections of concepts and items |
| ConceptsModule | server/src/crud-entities/concepts/ | Knowledge concepts (knowledge bases) |
| BotsModule | server/src/crud-entities/bots/ | Bot configuration and deployment |
| AgentsModule | server/src/crud-entities/agents/ | Agent definitions and configurations |
| AgenticAppModule | server/src/crud-entities/agentic-apps/ | Agentic application definitions |
| QuickActionsModule | server/src/quickactions/ | Quick media actions (transcription, summarization) |
| GenerateAnythingModule | server/src/generate-anything/ | Versatile content generation |
| PicassoModule | server/src/picasso/ | Integration with Blinkin Studio (Picasso) |
| MailboxModule | server/src/mailbox/ | Email ingestion via Mailgun |
| CloudStorageModule | server/src/cloudstorage/ | Abstraction layer for GCS / Azure Blob |
| TenantStorageModule | server/src/tenantstorage/ | Tenant-scoped file storage |
| ExternalAuthModule | server/src/external-auth/ | External user authentication settings |
| TenantToolConfigModule | server/src/tenant-tool-config/ | Per-tenant tool configurations |

API Endpoints Table

| Method | Route | Module | Description |
| --- | --- | --- | --- |
| Files | | | |
| GET | /api/files/:conceptId/:group? | Files | List files for a concept |
| GET | /api/files/with-full-urls/:conceptId/:group? | Files | List files with signed URLs |
| POST | /api/files/:conceptId/:group | Files | Upload files |
| POST | /api/files/:conceptId/:group/uploadFromUrl | Files | Add file from URL |
| PUT | /api/files/:conceptId/:group/:fileId | Files | Replace a file |
| PUT | /api/files/:conceptId/:group/:fileId/text | Files | Replace file text content |
| POST | /api/files/:conceptId/:group/:fileId/regenerate | Files | Regenerate file processing |
| DELETE | /api/files/:conceptId/:group/:fileId | Files | Delete a file |
| POST | /api/files/getUrl/:fileId | Files | Get signed download URL |
| GET | /api/files/getFile/:fileId | Files | Stream file content |
| GET | /api/files/getFile/:fileId/text | Files | Get text transcript of file |
| Query | | | |
| POST | /api/query/:conceptId | Query | Query a knowledge concept |
| POST | /api/query/chat3/:conceptId | Query | Chat with a concept (proxied to Python) |
| POST | /api/query/ask-about-image | Query | Ask a question about an image |
| Controls | | | |
| POST | /api/controls | Controls | Create a control |
| GET | /api/controls | Controls | List all controls |
| GET | /api/controls/:id | Controls | Get control by ID |
| PUT | /api/controls/:id | Controls | Update a control |
| DELETE | /api/controls/:id | Controls | Delete a control |
| POST | /api/controls/generate | Controls | AI-generate a React control |
| POST | /api/controls/fix-errors | Controls | AI-fix code errors |
| POST | /api/controls/edit-with-ai | Controls | AI-edit code |
| POST | /api/controls/regenerate-json | Controls | Regenerate sample JSON |
| POST | /api/controls/regenerate-schema/:id | Controls | Regenerate JSON schema |
| POST | /api/controls/duplicate/:id | Controls | Duplicate a control |
| Wallet | | | |
| GET | /api/wallet | Wallet | Get wallet balance |
| GET | /api/wallet/usage-history | Wallet | Get usage event history |
| GET | /api/wallet/transaction-history | Wallet | Get transaction history |
| Billing | | | |
| Various | /api/billing/* | Billing | Stripe checkout, webhooks, pricing |
| Subscriptions | | | |
| Various | /api/subscriptions/* | Subscriptions | Plan management, tenant subscriptions |
| Dashboard | | | |
| Various | /api/dashboard/* | Dashboard | Recently used items, popular in org |
| Collection | | | |
| Various | /api/collection/* | Collection | Collection CRUD |
| Crawler | | | |
| Various | /api/crawler/* | Crawler | Web crawling |
| Auth | | | |
| Various | /api/auth/* | Authz | Login, token refresh, user info |
| Health | | | |
| GET | /healthz | Core | Health check |
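
Several routes above use `:name` path parameters, and a trailing `?` (as in `:group?`) marks a parameter as optional. The helper below is a small sketch of how a client might expand such a template into a concrete path. `fill_route` is a hypothetical name, and percent-encoding each value is an assumption about what a client would want, not something the source specifies.

```python
import re
from urllib.parse import quote


def fill_route(template: str, **params: str) -> str:
    """Substitute :name segments; a trailing '?' marks a segment as optional."""
    parts = []
    for segment in template.strip("/").split("/"):
        match = re.fullmatch(r":(\w+)(\?)?", segment)
        if match is None:
            parts.append(segment)  # literal segment such as "api" or "files"
            continue
        name, optional = match.group(1), match.group(2)
        if name in params:
            parts.append(quote(params[name], safe=""))
        elif not optional:
            raise KeyError(f"missing required route parameter: {name}")
        # an optional parameter that was not supplied is simply dropped
    return "/" + "/".join(parts)


list_all = fill_route("/api/files/:conceptId/:group?", conceptId="c1")
list_group = fill_route("/api/files/:conceptId/:group?", conceptId="c1", group="websites")
```

Here `list_all` omits the optional group segment, while `list_group` includes it.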

Key File Paths

server/src/
  main.ts                    # Bootstrap: port 3000, Swagger at /ai/docs
  app.module.ts              # Root module importing all feature modules
  authz/                     # Auth0 JWT strategy, RBAC guards, decorators
  files/
    files.controller.ts      # File upload/download REST endpoints
    files.service.ts         # File business logic, GCS interactions
    file.entity.ts           # TypeORM entity for files
  chat/
    chatbot/chatbot.ts       # Main chatbot controller + module
    chatbot/llm.service.ts   # LLM orchestration for chat
    chatbot/proxy.service.ts # Proxy to Python SSE streams
    conversations/           # Conversation entity and CRUD
    messages/                # Message entity and CRUD
    realtime-collaboration/  # Socket.IO broadcast service
  query/
    query.controller.ts      # Proxy to Python Query Engine
    proxy.service.ts         # HTTP proxy helper
  redis-streams/
    redis-streams.module.ts  # Dynamic module for Redis Streams
    jobs.service.ts          # Job producer (xadd)
    jobs-listener.service.ts # Notification consumer (xreadgroup)
  controls/
    controls.controller.ts   # CRUD + AI generation endpoints
    controls.entity.ts       # Control entity (React component code, JSON schema)
    llm/llm.service.ts       # LLM calls for code generation
  wallet/
    wallet.controller.ts     # Balance and usage history
    wallet.service.ts        # Wallet business logic
    entities/                # Wallet, UsageEvent, UsageTransaction, PriceTag
  billing/
    billing.controller.ts    # Stripe integration
    entity/                  # Billing entity, model-token pricing
  crud-entities/
    concepts/                # Knowledge concept CRUD
    bots/                    # Bot CRUD
    agents/                  # Agent CRUD
    agentic-apps/            # Agentic app CRUD
    interviews/              # Interview CRUD
    external-user/           # External user management
    external-user-group/     # External user group management
    base/base.entity.ts      # Base entity with tenant scoping

4. Python Ingestion Worker (Detailed)

Path: python_server/ingestion_worker/

Purpose

Consumes file/URL/conversation processing jobs from Redis Streams, extracts text content from various file types, generates vector embeddings via OpenAI, and indexes them into Google File Search (via Google GenAI) for retrieval. It also handles image explanation and audio transcription through external services.

Processing Pipeline

graph LR
    subgraph "Job Source"
        REDIS_STREAM["Redis Stream<br/>stream:zweistein"]
    end

    subgraph "Worker Core"
        RW["RedisWorker<br/>(Consumer Group)"]
        MP["MessageProcessor"]
    end

    subgraph "Indexers"
        DOC["DocumentIndexer<br/>(PDFs, text, office docs)"]
        SCRAPER["ScraperIndexer<br/>(Websites)"]
        CONV["ConversationIndexer<br/>(Chat transcripts)"]
        IMG["ImageIndexer<br/>(Images → OCR)"]
        GFS["GoogleFileSearchIndexer<br/>(Google GenAI File Search)"]
    end

    subgraph "External Services"
        WHISPER["Whisper<br/>(Audio transcription)"]
        IMAGE_API["IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / Image description)"]
        AUDIO_API["AUDIO_TRANSCRIBER_ENDPOINT<br/>(Remote audio transcription)"]
        QE_TRANSCRIBE["Query Engine<br/>/quick-actions/media/transcribe"]
    end

    subgraph "Storage"
        GCS_OUT["Google Cloud Storage<br/>(Transcripts)"]
        QDRANT_OUT["Google File Search<br/>(Vector Index)"]
    end

    REDIS_STREAM -->|"xreadgroup"| RW
    RW -->|"route by type"| MP
    MP -->|"file"| DOC
    MP -->|"file (youtube)"| QE_TRANSCRIBE
    MP -->|"file (website)"| SCRAPER
    MP -->|"url"| SCRAPER
    MP -->|"conversation"| CONV
    MP -->|"file (image)"| IMG

    DOC -->|"video/audio"| WHISPER
    DOC -->|"documents"| GFS
    IMG -->|"explain"| IMAGE_API
    IMG --> GFS
    SCRAPER --> GFS
    CONV --> DOC
    QE_TRANSCRIBE --> GFS

    GFS -->|"index text"| QDRANT_OUT
    DOC -->|"upload transcript"| GCS_OUT
    MP -->|"upload transcript"| GCS_OUT

Job Types

| Job Type | Source | Processing |
| --- | --- | --- |
| file | File uploaded via UI | Route by file type: document, image, video/audio, YouTube, website |
| file.text-replaced | User edits file text | Delete old index, re-index with new text |
| file.deleted | File deleted via UI | Remove from vector index |
| concept.deleted | Concept deleted | Delete entire Google File Search store |
| url | URL added to concept | Crawl and scrape website content, index |
| conversation | Chat conversation indexed | Summarize and index conversation content |
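
The routing described in this table can be pictured as a consumer-group read followed by a lookup on the job's `type` field. The sketch below is illustrative only: the action names, the group and consumer names, `FakeRedis`, and the choice to acknowledge each entry right after routing are all assumptions, not the worker's actual behavior. The fake client mimics the shape of redis-py's `xreadgroup` and `xack` calls so the example runs without a Redis server.

```python
JOB_STREAM = "stream:zweistein"

# Job type -> action label, paraphrasing the Processing column above.
ACTIONS = {
    "file": "route_by_file_type",
    "file.text-replaced": "reindex_text",
    "file.deleted": "remove_from_index",
    "concept.deleted": "delete_store",
    "url": "scrape_and_index",
    "conversation": "summarize_and_index",
}


def route_job(job: dict) -> str:
    """Map a job's type field to an action label; reject unknown types."""
    job_type = job.get("type")
    if job_type not in ACTIONS:
        raise ValueError(f"unknown job type: {job_type!r}")
    return ACTIONS[job_type]


def consume_batch(client, group: str, consumer: str) -> list[tuple[str, str]]:
    """Read one batch through the consumer group, route each job, then ack it."""
    handled = []
    for _stream, entries in client.xreadgroup(group, consumer, {JOB_STREAM: ">"}, count=10):
        for entry_id, fields in entries:
            handled.append((entry_id, route_job(fields)))
            client.xack(JOB_STREAM, group, entry_id)
    return handled


class FakeRedis:
    """In-memory stand-in (hypothetical) with redis-py shaped return values."""

    def __init__(self, entries):
        self.pending = list(entries)
        self.acked = []

    def xreadgroup(self, groupname, consumername, streams, count=None):
        n = count or len(self.pending)
        batch, self.pending = self.pending[:n], self.pending[n:]
        return [[name, batch] for name in streams] if batch else []

    def xack(self, name, groupname, *ids):
        self.acked.extend(ids)
        return len(ids)


fake = FakeRedis([("1-0", {"type": "file"}), ("2-0", {"type": "concept.deleted"})])
result = consume_batch(fake, "workers", "worker-1")
```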

File Type Handling

| File Type | Detection | Processing Pipeline |
| --- | --- | --- |
| PDF | filetype.guess() → not video/audio/image | DocumentIndexer.index_document_file() → parse → chunk → GFS index |
| Text/Markdown | .md or .txt extension | DocumentIndexer.index_document_file() → chunk → GFS index |
| Image | MIME image/* | ImageIndexer.index_image() → IMAGE_EXPLAINER_ENDPOINT → GFS index |
| Video | MIME video/* (.mp4, .avi, .webm) | DocumentIndexer.index_file_containing_audio() → Whisper transcription → GFS index |
| Audio | MIME audio/* (.mp3, .wav) | DocumentIndexer.index_file_containing_audio() → Whisper transcription → GFS index |
| YouTube | group == "youtube" | SSE call to Query Engine /quick-actions/media/transcribe → GFS index |
| Website | group == "websites" | spider_rs scraping → readabilipy HTML cleanup → markdown → GFS index |
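
The detection rules in this table amount to a small decision function. The following sketch takes the MIME type as an argument instead of calling `filetype.guess()`, so that it runs without third-party packages. `choose_pipeline`, the returned labels, and the order of the checks are assumptions made for illustration; the source does not state the order in which the worker applies them.

```python
from pathlib import PurePosixPath


def choose_pipeline(filename: str, mime: str, group: str = "") -> str:
    """Pick a processing pipeline label from group, MIME type, and extension."""
    # Group-based sources are checked first because they are not real uploads.
    if group == "youtube":
        return "youtube_transcribe"
    if group == "websites":
        return "website_scrape"
    if mime.startswith("image/"):
        return "image"
    if mime.startswith(("video/", "audio/")):
        return "audio_transcription"
    if PurePosixPath(filename).suffix.lower() in {".md", ".txt"}:
        return "text"
    # Everything else (for example PDFs) goes through the document indexer.
    return "document"
```

For instance, `choose_pipeline("talk.mp4", "video/mp4")` selects the audio transcription path, and `choose_pipeline("report.pdf", "application/pdf")` falls back to the document path.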

External Service Endpoints

| Service | Setting | Default | Purpose |
| --- | --- | --- | --- |
| Image Explainer | IMAGE_EXPLAINER_ENDPOINT | https://ocr.akjo.tech/explain-image | OCR and image description |
| Audio Transcriber | AUDIO_TRANSCRIBER_ENDPOINT | https://ocr.akjo.tech/transcribe-audio | Remote audio transcription |
| Query Engine | QUERY_ENGINE_URL | http://queryengine-service | YouTube transcription via SSE |
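
Each setting above can be overridden through an environment variable of the same name. The worker itself uses Pydantic settings (see `settings.py` below); the sketch here swaps in a plain standard-library dataclass so it runs without extra packages. `WorkerSettings` and `_env` are hypothetical names, and the defaults are copied from the table.

```python
import os
from dataclasses import dataclass, field


def _env(name: str, default: str):
    # default_factory defers the lookup until a settings object is created,
    # so changes to the environment are picked up by later instances.
    return field(default_factory=lambda: os.environ.get(name, default))


@dataclass(frozen=True)
class WorkerSettings:
    """Stand-in for the worker's Pydantic settings (illustrative only)."""

    image_explainer_endpoint: str = _env(
        "IMAGE_EXPLAINER_ENDPOINT", "https://ocr.akjo.tech/explain-image"
    )
    audio_transcriber_endpoint: str = _env(
        "AUDIO_TRANSCRIBER_ENDPOINT", "https://ocr.akjo.tech/transcribe-audio"
    )
    query_engine_url: str = _env("QUERY_ENGINE_URL", "http://queryengine-service")


settings = WorkerSettings()
```

Setting `QUERY_ENGINE_URL` before constructing `WorkerSettings()` would point the sketch at a different Query Engine instance, for example a local one during development.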

Key File Paths

python_server/ingestion_worker/
  main.py                 # Entry point: initializes embeddings, starts RedisWorker
  settings.py             # Pydantic settings (env vars)
  redis_worker.py         # Redis Streams consumer group loop
  message_processor.py    # Routes jobs to appropriate indexer
  file_downloader.py      # Download files from URLs
  cloud_file_downloader.py # Download files from GCS
  google_file_search_service.py # Google GenAI File Search client
  cloud_storage/
    gcp_cloud.py          # Google Cloud Storage provider
  indexers/
    __init__.py           # Exports all indexer classes
    document.py           # DocumentIndexer (PDFs, text, audio)
    scraper.py            # ScraperIndexer (web crawling)
    conversation.py       # ConversationIndexer
    image_indexer.py      # ImageIndexer (OCR)
    google_file_search_indexer.py # GoogleFileSearchIndexer
    text_indexer.py       # Base text indexing
    conversations/
      anonymizer.py       # PII anonymization (Presidio)
      chat_retriever.py   # Chat history retrieval
      recorded_audio_transcriber.py # Audio transcription
      smart_summarizer.py # AI-powered conversation summarization
  video/
    splitter.py           # Video scene detection (scenedetect)

5. Python Query Engine (Detailed)

Path: python_server/query_engine/

Purpose

The LLM-powered intelligence core of the platform. Provides query answering, multi-agent orchestration, deep research, image generation/search, web crawling, content generation, and voice transcription. Runs as a FastAPI service on port 8000.

Agent Architecture

graph TB
    subgraph "Entry Points"
        AGENT_API["/agents/call<br/>(Main Agent Entry)"]
        CHAT_API["/chat<br/>(Knowledge Mode)"]
        DR_API["/deep-research<br/>(Deep Research)"]
    end

    subgraph "Dispatcher Layer"
        DISPATCH["Dispatcher Step<br/>(GPT-5.2)<br/>Routes to right agent"]
    end

    subgraph "Agent Types"
        SMART["Smart Agent<br/>(Single-turn tool use)"]
        SUPERVISOR["Supervisor Agent<br/>(Claude Opus 4.6)<br/>Multi-step planning"]
        DEEP_RESEARCH["Deep Research Agent<br/>(Multi-provider)"]
        ADR["Agentic Data Retrieval<br/>(Multi-query RAG)"]
        PDF_FILL["PDF Filler Agent<br/>(Form auto-fill)"]
        AGENTIC_APPS["Agentic Apps<br/>(Task Dispatcher)"]
    end

    subgraph "Supervisor Sub-Steps"
        GOAL["Goal Analyzer"]
        PLANNER["Planner Step"]
        TASK_SEL["Task Selector"]
        INFO_GATHER["Information Gathering"]
        VALIDATION["Validation Step"]
        REPLAN["Replanner"]
        USER_HELP["Request Help from User"]
    end

    subgraph "Tools"
        ZWEISTEIN_TOOL["Zweistein RAG<br/>(Vector Search)"]
        FILE_READER["File Reader"]
        IMAGE_GEN["Image Generator<br/>(fal.ai / GPT)"]
        IMAGE_EXPLAIN["Image Explainer"]
        WEB_SEARCH["Web Search<br/>(Tavily, Exa)"]
        VIDEO_ANALYZE["Video Analyzer"]
        TTS["Text-to-Speech<br/>(ElevenLabs)"]
        URL_LOADER["URL Loader"]
        EMAIL["Email Sender<br/>(Mailgun)"]
        DEEP_RESEARCH_TOOL["Deep Research Tool"]
        DOC_OCR["Document OCR<br/>(Google Document AI)"]
        CONTROL_TOOLS["Dynamic Control Tools"]
        MCP_TOOLS["MCP Server Tools"]
    end

    subgraph "Deep Research Providers"
        DR_GEMINI["Gemini Deep Research"]
        DR_CLAUDE["Claude Opus Deep Research"]
        DR_GPT5["GPT-5 Deep Research"]
        DR_SONAR["Perplexity Sonar"]
        DR_KIMI["Kimi K2.5"]
        DR_MULTI["Multi-step (default)<br/>Breadth + Depth search"]
    end

    AGENT_API --> DISPATCH
    DISPATCH --> SMART
    DISPATCH --> SUPERVISOR
    DISPATCH --> ADR
    DISPATCH --> PDF_FILL
    DISPATCH --> AGENTIC_APPS

    SUPERVISOR --> GOAL
    SUPERVISOR --> PLANNER
    SUPERVISOR --> TASK_SEL
    SUPERVISOR --> INFO_GATHER
    SUPERVISOR --> VALIDATION
    SUPERVISOR --> REPLAN
    SUPERVISOR --> USER_HELP

    SMART --> ZWEISTEIN_TOOL
    SMART --> FILE_READER
    SMART --> IMAGE_GEN
    SMART --> WEB_SEARCH
    SMART --> VIDEO_ANALYZE
    SMART --> TTS
    SMART --> URL_LOADER
    SMART --> EMAIL
    SMART --> DOC_OCR
    SMART --> CONTROL_TOOLS
    SMART --> MCP_TOOLS

    CHAT_API --> ZWEISTEIN_TOOL

    DR_API --> DR_GEMINI
    DR_API --> DR_CLAUDE
    DR_API --> DR_GPT5
    DR_API --> DR_SONAR
    DR_API --> DR_KIMI
    DR_API --> DR_MULTI

API Endpoints Table

| Method | Route | Module | Description |
| --- | --- | --- | --- |
| GET | /healthz | Core | Health check |
| Blinks | | | |
| POST | /blinks/generate | Blinks | Generate a "Blink" (structured content piece) from AI |
| Query | | | |
| POST | /query/ | Query | Query a concept's knowledge base (RAG) |
| Images | | | |
| POST | /images/find-images-simple | Images | Find or generate images |
| POST | /images/query-image | Images | Ask question about an image (OCR) |
| Chat | | | |
| POST | /chat/ | Chat | Non-streaming chat with RAG |
| POST | /chat/stream | Chat | SSE streaming chat (Gemini 3.1 Pro + FileSearch) |
| Agents | | | |
| POST | /agents/call | Agents | Main agent invocation (Smart, Supervisor, or custom) |
| POST | /agents/call-supervisor | Agents | Direct supervisor agent call |
| POST | /agents/call-data-retrieval | Agents | Agentic data retrieval |
| POST | /agents/call-pdf-filler | Agents | PDF auto-fill agent |
| POST | /agents/call-agentic-app | Agents | Agentic app execution |
| Crawler | | | |
| POST | /crawler/fetch-links | Crawler | Crawl website and extract links (optional smart scraping) |
| Deep Research | | | |
| POST | /deep-research | Deep Research | Multi-provider deep research (SSE streaming) |
| Quick Actions | | | |
| POST | /quick-actions/media/query | Quick Actions | Query a media file with AI |
| POST | /quick-actions/media/transcribe | Quick Actions | Transcribe audio/video (SSE streaming) |
| POST | /quick-actions/youtube/summarize | Quick Actions | Summarize a YouTube video |
| Generate Anything | | | |
| POST | /generate-anything/general | Generate Anything | General content generation (SSE streaming) |
| POST | /generate-anything/gemini | Generate Anything | Gemini-powered generation (SSE streaming) |
| Voice | | | |
| POST | /voice/transcribe | Voice | Audio file transcription (Blinkin inference + GPT-4o-mini cleanup) |
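
Several of these endpoints respond with Server-Sent Events (SSE). The source does not document the event payloads, so the sketch below only shows the generic wire format: `data:` lines accumulate until a blank line ends the event, and lines starting with `:` are comments. `iter_sse_data` is a hypothetical helper; other SSE fields such as `event:` and `id:` are ignored here for brevity.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in a text/event-stream body."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                    # blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):        # comment / keep-alive line
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # a single leading space after the colon is not part of the value
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is incomplete and is dropped.


sample = ["data: hello\n", "\n", ": ping\n", "event: chunk\n", "data: a\n", "data: b\n", "\n"]
events = list(iter_sse_data(sample))
```

In practice the `lines` argument could be fed from any HTTP client that exposes the response body line by line.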

Key File Paths

python_server/query_engine/
  main.py                      # FastAPI app entry point
  settings.py                  # Pydantic settings (LLM models, API keys)
  api/v1/
    router.py                  # Main router aggregating all endpoint routers
    endpoints/
      agents.py                # Agent invocation endpoints (largest file)
      blinks.py                # Blink generation
      chat.py                  # Chat endpoints (streaming and non-streaming)
      query.py                 # RAG query endpoint
      images.py                # Image search and generation
      crawler.py               # Web crawling
      deep_research.py         # Deep research (multi-provider)
      quick_actions.py         # Media query/transcription
      generate_anything.py     # Content generation
      voice_transcription.py   # Voice transcription with cleaning
  agents/
    chat_models.py             # LLM provider factory (OpenAI, Anthropic, Google, Groq, etc.)
    state.py                   # LangGraph agent state definition
    agent_with_tools.py        # Generic agent-with-tools graph builder
    smart_agent/
      smart_agent.py           # Single-turn smart agent
      bosch_smart_agent.py     # Custom Bosch variant
    supervisor/
      replanner.py             # Supervisor replanning logic
    deep_research/
      provider_router.py       # Routes to correct deep research provider
      provider_gemini.py       # Gemini deep research
      provider_claude.py       # Claude deep research
      provider_openai.py       # GPT-5 deep research
      provider_sonar.py        # Perplexity Sonar deep research
      provider_kimi.py         # Kimi K2.5 deep research
      deep_research.py         # Original multi-step implementation
    agentic_data_retrieval/
      graph.py                 # Agentic data retrieval LangGraph
      validator.py             # Response validation
      zweistein_retriever.py   # Vector retrieval integration
    agentic_apps/
      task_dispatcher.py       # Planned task dispatcher
      userintheloop.py         # User-in-the-loop step
    pdf_filler/
      pdf_filler.py            # PDF form auto-fill
      semantic_understanding.py # Semantic field matching
    steps/
      dispatcher.py            # Dispatcher step (routes to right agent)
      supervisor.py            # Supervisor orchestration steps
      planner.py               # Planning step
      reflector.py             # Self-reflection step
      answer_improver.py       # Answer improvement step
      chatbot.py               # Chatbot step
      memory.py                # Zettelkasten memory system
      tools.py                 # Tool binding utilities
      helper.py                # Helper utilities
      supervisor_steps/
        goal_analyzer.py       # Goal analysis
        planner.py             # Supervisor planning
        success_criteria_analyzer.py # Success evaluation
        human_conversation_interface.py # Human-in-the-loop
    tools/
      zweistein.py             # Zweistein RAG tool
      file_reader.py           # File reading tool
      image_generator.py       # fal.ai image generation
      image_generator_gpt.py   # GPT image generation
      image_explainer.py       # Image explanation tool
      tavily_search.py         # Tavily web search
      exa_search.py            # Exa web search
      video_analyzer.py        # Video analysis tool
      new_video_analyzer.py    # Updated video analyzer
      text_to_speech_elevenlabs.py # ElevenLabs TTS
      url_loader.py            # URL content loader
      email_sender.py          # Mailgun email sending
      deep_research.py         # Deep research as a tool
      document_ocr_tool.py     # Google Document AI OCR
      pdf_filler.py            # PDF filler as tool
      youtube.py               # YouTube tools
      control_loader.py        # Load control definitions
      dynamic_control_tools.py # Runtime control tools
      get_tools_from_agent_definition.py # Tool resolver from config
      gcp_cloud.py             # GCS upload utilities
      ovh_vlm.py               # OVH vision-language model
  zweistein/
    blink_generator.py         # Blink content generation logic
    image_search.py            # Image search and generation
    retrievers/
      zweistein_retriever.py   # Core RAG retriever
      chat.py                  # Chat retriever with context
    tools.py                   # Retriever tools
  common/
    file_downloader.py         # File download utility
    error_translator.py        # Error message translation
  usage/
    check_quota.py             # Quota checking
    enforce_quota.py           # Quota enforcement
    simple_token_tracker.py    # Token usage tracking
    wallet_client.py           # Wallet API client
    wallet_integration.py      # Wallet reporting integration
    token_use.py               # Token usage callback
  db/                          # Database utilities

6. LLM Provider Map

Model Configuration Table

| Setting | Model | Provider | Purpose |
| --- | --- | --- | --- |
| LLM_SIMPLE | gpt-4o-mini | OpenAI | Fast, cheap tasks: query rewriting, transcript cleaning, simple classification |
| LLM_ADVANCED | gpt-4o | OpenAI | Default LLM for smart agents, chat, content generation |
| LLM_DISPATCHER | gpt-5.2 | OpenAI | Agent dispatcher: decides which agent/tool to invoke |
| LLM_SUPERVISOR | claude_opus46 (Claude Opus 4.6) | Anthropic | Supervisor agent: multi-step planning and orchestration |
| GEMINI_25_PRO | gemini-2.5-pro | Google | Deep research (Gemini provider), general reasoning |
| GEMINI_25_FLASH | gemini-2.5-flash | Google | Memory generation (zettelkasten entries) |
| GEMINI_3_PRO | gemini-3-pro-preview | Google | Advanced Google tasks |
| GEMINI_31_PRO | gemini-3.1-pro-preview | Google | Streaming chat with FileSearch grounding (Knowledge Mode) |
| GEMINI_3_FLASH | gemini-3-flash-preview | Google | Fast Google tasks |
| GEMINI_20_FLASH | gemini-2.0-flash | Google | Legacy quick tasks |
| | llama-3.3-70b-versatile | Groq | Fast open-source inference |
| | deepseek-r1-distill-llama-70b | Groq | Reasoning model via Groq |
| | Perplexity Sonar | Perplexity | Web-grounded deep research |
| | Kimi K2.5 | NVIDIA NIM | Deep research (Kimi provider) |
| | OVH GPT OSS 120B | OVH | Alternative open-source model |
| | ElevenLabs | ElevenLabs | Text-to-speech |
| | fal.ai (Flux) | fal.ai | Image generation |
| EMBEDDING_MODEL | text-embedding-3-large | OpenAI | Vector embeddings for RAG |
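
The named settings in this table can be read as a lookup from setting name to model and provider. The sketch below is an illustration of that idea, not the contents of the Query Engine's `settings.py`: the values are copied from the rows that list a setting name, and `MODEL_SETTINGS`, `resolve_model`, and `settings_for_provider` are hypothetical names.

```python
# Setting name -> (model identifier, provider), from the rows that list a setting.
MODEL_SETTINGS: dict[str, tuple[str, str]] = {
    "LLM_SIMPLE": ("gpt-4o-mini", "OpenAI"),
    "LLM_ADVANCED": ("gpt-4o", "OpenAI"),
    "LLM_DISPATCHER": ("gpt-5.2", "OpenAI"),
    "LLM_SUPERVISOR": ("claude_opus46", "Anthropic"),
    "GEMINI_25_PRO": ("gemini-2.5-pro", "Google"),
    "GEMINI_25_FLASH": ("gemini-2.5-flash", "Google"),
    "GEMINI_3_PRO": ("gemini-3-pro-preview", "Google"),
    "GEMINI_31_PRO": ("gemini-3.1-pro-preview", "Google"),
    "GEMINI_3_FLASH": ("gemini-3-flash-preview", "Google"),
    "GEMINI_20_FLASH": ("gemini-2.0-flash", "Google"),
    "EMBEDDING_MODEL": ("text-embedding-3-large", "OpenAI"),
}


def resolve_model(setting: str) -> tuple[str, str]:
    """Return (model, provider) for a setting name, or raise for unknown names."""
    try:
        return MODEL_SETTINGS[setting]
    except KeyError:
        raise ValueError(f"unknown model setting: {setting}") from None


def settings_for_provider(provider: str) -> list[str]:
    """List the setting names served by one provider, in table order."""
    return [name for name, (_model, prov) in MODEL_SETTINGS.items() if prov == provider]
```

A lookup like `resolve_model("LLM_DISPATCHER")` returns the dispatcher's model and provider, and `settings_for_provider("Google")` lists the six Gemini settings.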

LLM Usage Diagram

graph LR
    subgraph "User Request"
        REQ["Incoming Query"]
    end

    subgraph "Routing (GPT-5.2)"
        DISPATCH["Dispatcher<br/>gpt-5.2"]
    end

    subgraph "Execution Agents"
        SMART_AGENT["Smart Agent<br/>gpt-4o"]
        SUPERVISOR_AGENT["Supervisor<br/>Claude Opus 4.6"]
        KNOWLEDGE["Knowledge Mode<br/>Gemini 3.1 Pro"]
    end

    subgraph "Supporting Tasks"
        REWRITE["Query Rewriting<br/>gpt-4o-mini"]
        MEMORY["Memory Generation<br/>Gemini 2.5 Flash"]
        CLEAN["Transcript Cleaning<br/>gpt-4o-mini"]
        EMBED["Embeddings<br/>text-embedding-3-large"]
    end

    subgraph "Deep Research"
        DR_GEMINI["Gemini 2.5 Pro"]
        DR_CLAUDE["Claude Opus"]
        DR_GPT5["GPT-5"]
        DR_SONAR["Perplexity Sonar"]
        DR_KIMI["Kimi K2.5"]
    end

    subgraph "Media & Generation"
        TTS_MODEL["ElevenLabs<br/>Text-to-Speech"]
        IMG_MODEL["fal.ai / GPT<br/>Image Generation"]
    end

    REQ --> DISPATCH
    DISPATCH -->|"simple query"| SMART_AGENT
    DISPATCH -->|"complex/multi-step"| SUPERVISOR_AGENT
    REQ -->|"knowledge mode"| KNOWLEDGE

    SMART_AGENT --> REWRITE
    SMART_AGENT --> EMBED
    SUPERVISOR_AGENT --> EMBED
    KNOWLEDGE --> EMBED
    KNOWLEDGE --> MEMORY

    REQ -->|"deep research"| DR_GEMINI
    REQ -->|"deep research"| DR_CLAUDE
    REQ -->|"deep research"| DR_GPT5
    REQ -->|"deep research"| DR_SONAR
    REQ -->|"deep research"| DR_KIMI

    SMART_AGENT --> TTS_MODEL
    SMART_AGENT --> IMG_MODEL
    SMART_AGENT --> CLEAN
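
The routing edges in the diagram can be sketched as a small lookup. This is a minimal illustration, not the platform's dispatcher: the model labels come from the diagram, while `route_request`, `Route`, and the `mode` / `is_complex` inputs are hypothetical names. In the diagram the complexity decision is made by the dispatcher model itself; here it is passed in as a flag.

```python
from dataclasses import dataclass

# Model labels are copied from the diagram; the routing rules below are a
# simplified assumption, not the actual dispatcher implementation.
MODELS = {
    "dispatcher": "gpt-5.2",
    "smart_agent": "gpt-4o",
    "supervisor": "Claude Opus 4.6",
    "knowledge": "Gemini 3.1 Pro",
}


@dataclass(frozen=True)
class Route:
    target: str  # diagram node that handles the request
    model: str   # model label shown for that node


def route_request(mode: str, is_complex: bool = False) -> Route:
    """Pick the diagram node for a request (hypothetical helper)."""
    if mode == "knowledge":
        # Knowledge mode goes straight to its model, bypassing the dispatcher.
        return Route("knowledge", MODELS["knowledge"])
    if mode == "agent":
        # Simple queries go to the Smart Agent, complex ones to the Supervisor.
        target = "supervisor" if is_complex else "smart_agent"
        return Route(target, MODELS[target])
    raise ValueError(f"unknown mode: {mode!r}")
```

Deep research is left out of the sketch because the diagram fans a request out to several models rather than choosing one.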

7. Data Processing Pipeline

sequenceDiagram
    participant User
    participant Admin as Admin Panel
    participant Server as NestJS Server
    participant GCS as Google Cloud Storage
    participant Redis as Redis Streams
    participant Worker as Ingestion Worker
    participant GFS as Google File Search
    participant WS as WebSocket

    User->>Admin: Upload file via drag & drop
    Admin->>Server: POST /api/files/:conceptId/:group<br/>(multipart form data)

    Server->>GCS: Upload original file<br/>(tenant-scoped path)
    GCS-->>Server: Cloud file path

    Server->>Server: Create FileEntity in PostgreSQL<br/>(status: "pending")

    Server->>Redis: XADD stream:zweistein<br/>{type: "file", conceptId, entityId,<br/>cloudFilePath, filename}

    Server-->>Admin: 200 OK (file entity)
    Admin->>WS: Subscribe to file status updates

    Note over Redis,Worker: Consumer Group Processing

    Redis->>Worker: XREADGROUP (picks up job)
    Worker->>Redis: Send "processing" notification
    Redis->>Server: Notification received
    Server->>WS: Push "processing" status to UI
    WS-->>Admin: File status: "processing"

    alt PDF / Text Document
        Worker->>GCS: Download file via signed URL
        Worker->>Worker: Parse document<br/>(PyMuPDF, text extraction)
        Worker->>GFS: Index text chunks<br/>(Google File Search)
    else Image
        Worker->>GCS: Download file
        Worker->>Worker: Call IMAGE_EXPLAINER_ENDPOINT<br/>(OCR / description)
        Worker->>GFS: Index image description
    else Video / Audio
        Worker->>GCS: Download file
        Worker->>Worker: Whisper transcription<br/>(local model)
        Worker->>GFS: Index transcript
        Worker->>GCS: Upload .md transcript
    else YouTube
        Worker->>Server: SSE to /quick-actions/media/transcribe
        Worker->>GFS: Index transcript
        Worker->>GCS: Upload .md transcript
    else Website URL
        Worker->>Worker: spider_rs scraping<br/>+ readabilipy cleanup
        Worker->>GFS: Index markdown content
        Worker->>GCS: Upload .md content
    end

    Worker->>Redis: XADD notification stream<br/>{phase: "done", payload}

    Redis->>Server: Notification: processing complete
    Server->>Server: Update FileEntity<br/>(status: "done",<br/>transcript path if applicable)
    Server->>WS: Push "done" status to UI
    WS-->>Admin: File status: "done" (green check)

    Note over User,Admin: File is now searchable via RAG
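
The `alt` branches in the sequence above amount to a dispatch on the job's kind. A minimal sketch of that dispatch follows. The step names, the extension sets, and the `plan_steps` helper are assumptions made for illustration; the real worker may classify files differently (for example by MIME type).

```python
from pathlib import PurePosixPath

# Illustrative extension sets (assumptions, not taken from the worker code).
IMAGE_EXT = {".png", ".jpg", ".jpeg", ".webp"}
MEDIA_EXT = {".mp3", ".wav", ".mp4", ".mov"}


def plan_steps(job: dict) -> list:
    """Return ordered step names for a job, mirroring the diagram's branches."""
    group = job.get("group")
    if group == "youtube":
        return ["transcribe_youtube", "index_transcript", "upload_md_transcript"]
    if group == "websites" or job.get("type") == "url":
        return ["scrape_and_clean", "index_markdown", "upload_md_content"]

    ext = PurePosixPath(job.get("filename", "")).suffix.lower()
    if ext in IMAGE_EXT:
        return ["download", "explain_image", "index_description"]
    if ext in MEDIA_EXT:
        return ["download", "transcribe_whisper", "index_transcript",
                "upload_md_transcript"]
    # Default branch: PDF and other text documents.
    return ["download", "parse_document", "index_chunks"]
```

Returning a list of step names keeps the branch selection separate from the side effects (downloads, indexing, uploads), which makes the selection easy to test on its own.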

8. Vector Search Architecture

sequenceDiagram
    participant User
    participant Admin as Admin Panel
    participant Server as NestJS Server
    participant QE as Query Engine (Python)
    participant GFS as Google File Search (Gemini FileSearch)
    participant LLM as LLM (Gemini 3.1 Pro)

    User->>Admin: Ask a question
    Admin->>Server: POST /api/chat or /api/agents/call<br/>{messages, conceptId, agentConfig}

    Server->>QE: Proxy to Python<br/>POST /chat/stream or /agents/call

    alt Knowledge Mode (Direct RAG)
        QE->>GFS: gfs_service.chat_stream()<br/>{concept_id, messages,<br/>system_instruction, model}
        GFS->>GFS: Retrieve relevant chunks<br/>from concept's vector store
        GFS->>LLM: Grounded generation<br/>(Gemini 3.1 Pro + FileSearch)
        LLM-->>QE: SSE stream (tokens + citations)
    else Agent Mode
        QE->>QE: Dispatcher step (GPT-5.2)<br/>→ Select agent type
        QE->>QE: Agent executes with tools
        QE->>GFS: Zweistein RAG tool<br/>→ vector search
        GFS-->>QE: Retrieved context
        QE->>LLM: Generate response<br/>with retrieved context
        LLM-->>QE: SSE stream
    end

    QE-->>Server: SSE event stream<br/>(tokens, citations, usage, state)
    Server-->>Admin: Forward SSE stream
    Admin-->>User: Render streaming response<br/>with inline citations
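
The final hops in this sequence forward a Server-Sent Events stream. As a rough illustration of what a consumer of that stream has to do, the sketch below parses raw SSE lines into `(event, data)` pairs. The event names (`token`, `usage`) and the assumption that each `data:` payload is JSON are hypothetical; the source only states that the stream carries tokens, citations, usage, and state.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, object]]:
    """Yield (event_name, decoded_data) pairs from raw SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
        # Comment lines (":" prefix) and other fields are ignored here.


# Hypothetical stream contents, for illustration only.
sample = [
    "event: token\n", 'data: {"text": "Hi"}\n', "\n",
    "event: usage\n", 'data: {"total_tokens": 12}\n', "\n",
]
events = list(parse_sse(sample))
```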

Vector Storage Details

  • Embedding model: text-embedding-3-large (OpenAI, 3072 dimensions)
  • Primary search: Google GenAI File Search (per-concept vector stores)
  • Collection naming: Each concept gets its own File Search store, identified by conceptId
  • Index types: DocumentFiles (text documents), ImageFiles (image descriptions)
  • Retrieval: Gemini's built-in FileSearch grounding retrieves relevant chunks automatically during generation

9. Database & Storage

PostgreSQL Entities

| Entity | Path | Purpose |
|---|---|---|
| FileEntity | server/src/files/file.entity.ts | Uploaded files metadata (path, status, concept, group) |
| ConversationEntity | server/src/chat/conversations/conversations.entity.ts | Chat conversation threads |
| MessageEntity | server/src/chat/messages/message.entity.ts | Individual chat messages |
| MessageFeedbackEntity | server/src/chat/message-feedback/message-feedback.entity.ts | User feedback on messages |
| BotEntity | server/src/crud-entities/bots.entity.ts | Bot configurations |
| AgentEntity | server/src/crud-entities/agents.entity.ts | Agent definitions |
| AgenticAppEntity | server/src/crud-entities/agentic-app.entity.ts | Agentic app definitions |
| ControlEntity | server/src/controls/controls.entity.ts | AI-generated React components |
| CollectionEntity | server/src/collection/collections.entity.ts | Knowledge collections |
| CollectionItemEntity | server/src/collection/collection_items.entity.ts | Items within collections |
| FavoriteEntity | server/src/favorite/favorite.entity.ts | User favorites |
| SubscriptionPlanEntity | server/src/subscriptions/entities/subscription-plan.entity.ts | Available subscription plans |
| TenantSubscriptionEntity | server/src/subscriptions/entities/tenant-subscription.entity.ts | Active tenant subscriptions |
| WalletEntity | server/src/wallet/entities/wallet.entity.ts | Tenant credit wallet (grant + paid balance) |
| UsageEventEntity | server/src/wallet/entities/usage-event.entity.ts | LLM usage events (model, tokens, cost) |
| UsageTransactionEntity | server/src/wallet/entities/usage-transaction.entity.ts | Balance deduction transactions |
| PriceTagEntity | server/src/wallet/entities/price-tag.entity.ts | Per-model token pricing |
| BillingEntity | server/src/billing/entity/billing.entity.ts | Billing records |
| ModelTokenPricingEntity | server/src/billing/entity/model-token-pricing.entity.ts | Model pricing configuration |
| PlanEntity | server/src/quotas/entities/plan.entity.ts | Usage plans |
| QuotaDefinitionEntity | server/src/quotas/entities/quota-definition.entity.ts | Quota limits |
| UsageRecordEntity | server/src/quotas/entities/usage-record.entity.ts | Usage counting records |
| UsageCounterEntity | server/src/quotas/entities/usage-counter.entity.ts | Usage counters |
| RecentlyUsedEntity | server/src/dashboard/recently-used.entity.ts | Recently accessed items |
| PopularInOrgLogsEntity | server/src/dashboard/popular-in-org-logs.entity.ts | Popular items analytics |
| TenantExternalSettingsEntity | server/src/external-auth/tenant-external-settings.entity.ts | External auth settings |
| ExternalUserAuditEntity | server/src/external-auth/audit/external-user-audit.entity.ts | External user audit log |
| ExternalUserEntity | server/src/crud-entities/external-user/external-user.entity.ts | External users |
| ExternalUserGroupEntity | server/src/crud-entities/external-user-group/external-user-group.entity.ts | External user groups |
| TenantToolConfigEntity | server/src/crud-entities/tenant-tool-config.entity.ts | Per-tenant tool settings |
| BaseEntity | server/src/crud-entities/base/base.entity.ts | Base entity with tenantId scoping |

Database connections:

  • Primary: PostgreSQL via TypeORM (DB_HOST, DB_PORT, DB_NAME), used for all Zweistein entities
  • Secondary (Studio): PostgreSQL via TypeORM (STUDIO_DB_*), a read-only connection to the Blinkin Studio DB

Google File Search (Vector Storage)

| Aspect | Detail |
|---|---|
| Service | Google GenAI File Search API |
| Store-per-concept | Each concept gets its own vector store |
| Embedding | text-embedding-3-large (OpenAI, used during ingestion) |
| Retrieval | Gemini FileSearch grounding (during query) |
| Text chunking | Automatic by Google File Search |
| File types indexed | Text, PDFs, images (as descriptions), audio/video (as transcripts), websites |

Redis Streams (Job Queue)

| Stream | Consumer Group | Purpose |
|---|---|---|
| stream:zweistein | group:zweistein | File/URL processing job queue |
| stream:zweistein:notifications | group:zweistein:notifications | Job completion notifications back to server |

Message format (job):

{
  "type": "file|file.text-replaced|file.deleted|url|conversation|concept.deleted",
  "conceptId": "uuid",
  "entityId": "uuid",
  "cloudFilePath": "tenant/concept/filename",
  "filename": "original-name.pdf",
  "group": "documents|youtube|websites",
  "metadata": { "url": "..." }
}

Message format (notification):

{
  "original_message": "{...json...}",
  "phase": "processing|done|error",
  "payload": "{...optional json...}"
}
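
The two message formats above can be exercised with a short consumer sketch. It is an illustration under stated assumptions rather than the worker's actual code: `encode_job`, `make_notification`, and `consume_once` are hypothetical helpers; `client` is expected to behave like a redis-py `redis.Redis` instance created with `decode_responses=True` (only `xreadgroup`, `xadd`, and `xack` are called); and JSON-encoding nested values such as `metadata` is an assumption that follows from Redis stream entries being flat field/value maps.

```python
import json
from typing import Optional

PHASES = {"processing", "done", "error"}


def encode_job(job: dict) -> dict:
    """Flatten a job into string fields suitable for XADD."""
    return {k: v if isinstance(v, str) else json.dumps(v) for k, v in job.items()}


def make_notification(original: dict, phase: str,
                      payload: Optional[dict] = None) -> dict:
    """Build a notification in the format shown above."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    note = {"original_message": json.dumps(original), "phase": phase}
    if payload is not None:
        note["payload"] = json.dumps(payload)
    return note


def consume_once(client, stream, group, consumer, notify_stream, handler):
    """Read up to one new job, run `handler`, publish notifications, then ack."""
    # ">" asks for entries never delivered to any consumer in this group.
    batches = client.xreadgroup(group, consumer, {stream: ">"}, count=1, block=1000)
    for _stream, entries in batches or []:
        for entry_id, fields in entries:
            client.xadd(notify_stream, make_notification(fields, "processing"))
            try:
                payload = handler(fields)
                client.xadd(notify_stream, make_notification(fields, "done", payload))
            except Exception as exc:
                client.xadd(notify_stream,
                            make_notification(fields, "error", {"error": str(exc)}))
            # Ack in both cases so a failing job is reported, not redelivered.
            client.xack(stream, group, entry_id)
```

Acknowledging after an error is a design choice in this sketch. A worker that wants retries would skip the `xack` and reclaim the pending entry later.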

Google Cloud Storage (File Storage)

| Aspect | Detail |
|---|---|
| Bucket | GCS_BUCKET_NAME (e.g., blinkin-ai-dev-storage) |
| Structure | {tenantId}/{conceptId}/{filename} |
| Access | Signed URLs (1-hour expiry for downloads) |
| Transcripts | Stored as .{random}.md alongside original file |
| Key file | GCS_KEY_FILE (service account JSON) |
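
The object layout above can be expressed as two small path helpers. Both function names are hypothetical. The transcript naming in particular is an assumption: the table only says transcripts are stored as `.{random}.md` alongside the original, so this sketch appends that suffix to the original object name.

```python
import secrets
from typing import Optional


def object_path(tenant_id: str, concept_id: str, filename: str) -> str:
    """Build an object name following {tenantId}/{conceptId}/{filename}."""
    for part in (tenant_id, concept_id, filename):
        # Reject empty parts and embedded slashes so a file cannot
        # escape its tenant/concept prefix.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return f"{tenant_id}/{concept_id}/{filename}"


def transcript_path(original: str, token: Optional[str] = None) -> str:
    """Derive a sibling transcript name ending in .{random}.md (assumed scheme)."""
    token = token or secrets.token_hex(4)
    return f"{original}.{token}.md"
```

Validating each component before joining keeps the tenant-scoped prefix trustworthy, which matters when the path is later used to sign a download URL.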

10. External Dependencies Table

| Service | Purpose | Config Variable(s) |
|---|---|---|
| OpenAI | LLM (GPT-4o, GPT-5.x), embeddings, Whisper | OPENAI_API_KEY |
| Anthropic | Claude Opus 4.6 (supervisor agent) | ANTHROPIC_API_KEY |
| Google GenAI | Gemini models, File Search, Document AI | GOOGLE_API_KEY |
| Groq | Fast inference (Llama 3.3, DeepSeek) | GROQ_API_KEY |
| Perplexity | Web-grounded deep research (Sonar) | PERPLEXITY_API_KEY |
| ElevenLabs | Text-to-speech | ELEVENLABS_API_KEY_PART1 |
| fal.ai | Image generation (Flux) | FAL_KEY |
| Pexels | Stock image search | PEXELS_API_KEY |
| Tavily | Web search for agents | TAVILY_API_KEY |
| Exa | Semantic web search | EXA_API_KEY |
| NVIDIA NIM | Kimi K2.5 model | NVIDIA_API_KEY |
| OVH Cloud | Open-source LLM (GPT-OSS 120B) | OVH_API_KEY |
| PostgreSQL | Primary relational database | DB_HOST, DB_PORT, DB_NAME, DB_USERNAME, DB_PASSWORD |
| Redis | Job queue (Streams), caching | REDIS_HOST, REDIS_PORT, REDIS_PASS |
| Google Cloud Storage | File storage (uploads, transcripts) | GCS_KEY_FILE, GCS_BUCKET_NAME |
| Azure Blob Storage | Alternative file storage | PICASSO_BLOB_URL |
| Auth0 | Authentication and authorization | AUTH0_DOMAIN, AUTH0_CLIENT_ID, AUTH0_AUDIENCE, AUTH0_ISSUER_URL |
| Stripe | Payment processing, subscriptions | STRIPE_API_KEY, STRIPE_WEBHOOK_SECRET, STRIPE_PUBLIC_KEY |
| Mailgun | Email ingestion and sending | MAILGUN_API_KEY, MAILGUN_DOMAIN, MAILGUN_FROM_EMAIL |
| Google OAuth | Google sign-in | GOOGLE_CLIENT_ID |
| LangSmith | LLM tracing and debugging | LANGCHAIN_ENDPOINT, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT |
| PostHog | Product analytics (frontend) | Configured in admin app |
| Blinkin Inference | Custom OCR, audio transcription | IMAGE_EXPLAINER_ENDPOINT, AUDIO_TRANSCRIBER_ENDPOINT |
| Blinkin Studio (Picasso) | Visual content creation | PICASSO_URL, PICASSO_API_URL, PICASSO_APP_URL |
| Blinkin Houston | Internal tooling | HOUSTON_URL |
| C3 (Chatwoot) | Customer messaging | C3_DOMAIN, C3_AGENT_ACCOUNT_ID, C3_AGENT_TOKEN |
| Canvas | Learning management integration | CANVAS_DOMAIN, CANVAS_TOKEN |
| Google Document AI | Advanced PDF/document OCR | DOCUMENT_AI_PROJECT_ID, DOCUMENT_AI_LOCATION, DOCUMENT_AI_PROCESSOR_ID |
| Bosch Gemini | Custom Bosch Gemini endpoint | BOSCH_GEMINI_API_KEY, BOSCH_GEMINI_BASE_URL |

11. Key Environment Variables Table

NestJS Server

| Variable | Purpose | Example |
|---|---|---|
| GLOBAL_PREFIX | API route prefix | /ai |
| DB_HOST | PostgreSQL host | 10.100.10.3 |
| DB_PORT | PostgreSQL port | 5432 |
| DB_NAME | Database name | zweistein_dev |
| DB_USERNAME | Database user | (secret) |
| DB_PASSWORD | Database password | (secret) |
| STUDIO_DB_HOST | Studio DB host (read-only) | |
| STUDIO_DB_PORT | Studio DB port | 5432 |
| STUDIO_DB_NAME | Studio database name | |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| REDIS_JOB_STREAM | Job stream name | stream:zweistein |
| REDIS_JOB_CONSUMER_GROUP | Server's consumer group | group:zweistein |
| REDIS_JOB_NOTIFICATION_STREAM | Notification stream | stream:zweistein:notifications |
| REDIS_JOB_NOTIFICATION_CONSUMER_GROUP | Notification consumer group | group:zweistein:notifications |
| SERVER_INSTANCE_ID | Unique server instance ID | server-instance-1 |
| SERVER_JWT_SECRET | JWT signing secret | (secret) |
| PYTHON_SERVER_URL | Query Engine URL | http://queryengine-service |
| IMAGE_EXPLAINER_URL | Image OCR service | https://ocr.blinkin.io/explain-image |
| STORAGE_TYPE | Cloud storage provider | gcp |
| GCS_BUCKET_NAME | GCS bucket name | blinkin-ai-dev-storage |
| AUTH0_DOMAIN | Auth0 domain | dev-w248kl0wxwpsp7q3.eu.auth0.com |
| AUTH0_CLIENT_ID | Auth0 client ID | |
| AUTH0_AUDIENCE | Auth0 audience | |
| AUTH0_ISSUER_URL | Auth0 issuer | |
| STRIPE_API_KEY | Stripe secret key | (secret) |
| STRIPE_WEBHOOK_SECRET | Stripe webhook secret | (secret) |
| GOOGLE_CLIENT_ID | Google OAuth client ID | |

Ingestion Worker

| Variable | Purpose | Example |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key | (secret) |
| EMBEDDING_MODEL | Embedding model name | text-embedding-3-large |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| TASKS_STREAM | Job stream | stream:zweistein |
| TASKS_GROUP | Consumer group | group:zweistein |
| NOTIFICATION_STREAM | Notification stream | stream:zweistein:notifications |
| PROCESSOR_ID | Unique worker ID | zweistein_processor_1 |
| HOSTNAME | Worker hostname (overrides PROCESSOR_ID) | |
| GCS_KEY_FILE | GCS service account key file | |
| GCS_BUCKET_NAME | GCS bucket name | blinkin-ai-dev-storage |
| IMAGE_EXPLAINER_ENDPOINT | OCR endpoint | https://ocr.akjo.tech/explain-image |
| AUDIO_TRANSCRIBER_ENDPOINT | Audio transcription endpoint | https://ocr.akjo.tech/transcribe-audio |
| CONVERSATION_SUMMARIZER_MODEL | Model for conversation summaries | gpt-4o |
| QUERY_ENGINE_URL | Query Engine for YouTube transcription | http://queryengine-service |
| GOOGLE_API_KEY | Google API key (File Search) | (secret) |
| C3_DOMAIN | C3 / Chatwoot domain | |
| CANVAS_DOMAIN | Canvas LMS domain | |
| CANVAS_TOKEN | Canvas API token | (secret) |
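
Because HOSTNAME overrides PROCESSOR_ID, each worker replica can obtain a distinct consumer name without per-pod configuration. A minimal sketch of that precedence follows; `resolve_consumer_name` is a hypothetical helper and the pod name used in the usage note is illustrative.

```python
from typing import Mapping


def resolve_consumer_name(env: Mapping[str, str]) -> str:
    """Prefer HOSTNAME, fall back to PROCESSOR_ID, fail if neither is set."""
    name = env.get("HOSTNAME") or env.get("PROCESSOR_ID")
    if not name:
        # Two consumers sharing a name would also share one pending-entries list.
        raise RuntimeError("set HOSTNAME or PROCESSOR_ID to name this consumer")
    return name
```

In a worker process this would typically be called as `resolve_consumer_name(os.environ)`. With a StatefulSet, a pod hostname such as `ingestionworker-0` (hypothetical) stays stable across restarts, so the same consumer name is reused.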

Query Engine

| Variable | Purpose | Example |
|---|---|---|
| OPENAI_API_KEY | OpenAI API key | (secret) |
| ANTHROPIC_API_KEY | Anthropic API key | (secret) |
| GOOGLE_API_KEY | Google API key | (secret) |
| GROQ_API_KEY | Groq API key | (secret) |
| PERPLEXITY_API_KEY | Perplexity API key | (secret) |
| TAVILY_API_KEY | Tavily search API key | (secret) |
| EXA_API_KEY | Exa search API key | (secret) |
| FAL_KEY | fal.ai API key | (secret) |
| PEXELS_API_KEY | Pexels stock photo key | (secret) |
| ELEVENLABS_API_KEY_PART1 | ElevenLabs TTS key | (secret) |
| EMBEDDING_MODEL | Embedding model | text-embedding-3-large |
| LLM_SIMPLE | Simple/cheap model | gpt-4o-mini |
| LLM_ADVANCED | Advanced model | gpt-4o |
| LLM_DISPATCHER | Dispatcher model | gpt-5.2 |
| LLM_SUPERVISOR | Supervisor model | claude_opus46 |
| GEMINI_25_PRO | Gemini 2.5 Pro model name | gemini-2.5-pro |
| GEMINI_25_FLASH | Gemini 2.5 Flash model name | gemini-2.5-flash |
| GEMINI_3_PRO | Gemini 3 Pro model name | gemini-3-pro-preview |
| GEMINI_31_PRO | Gemini 3.1 Pro model name | gemini-3.1-pro-preview |
| GEMINI_3_FLASH | Gemini 3 Flash model name | gemini-3-flash-preview |
| GEMINI_20_FLASH | Gemini 2.0 Flash model name | gemini-2.0-flash |
| UVICORN_PORT | FastAPI server port | 8000 |
| REDIS_HOST | Redis host | redis-master.redis.svc.cluster.local |
| REDIS_PORT | Redis port | 6379 |
| REDIS_PASS | Redis password | (secret) |
| GCS_KEY_FILE | GCS service account key | |
| GCS_BUCKET_NAME | GCS bucket | blinkin-ai-dev-storage |
| QUOTA_SERVICE_URL | Quota checking service URL | |
| INTERNAL_API_KEY | Internal wallet API key | |
| MAILGUN_DOMAIN | Mailgun domain | |
| MAILGUN_API_KEY | Mailgun API key | (secret) |
| MAILGUN_FROM_EMAIL | Sender email address | |
| LANGCHAIN_TRACING_V2 | Enable LangSmith tracing | true |
| LANGCHAIN_ENDPOINT | LangSmith endpoint | https://eu.api.smith.langchain.com |
| LANGCHAIN_API_KEY | LangSmith API key | (secret) |
| LANGCHAIN_PROJECT | LangSmith project name | dev |
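
A service reading these variables usually validates them once at startup. The sketch below shows one way to do that for the model-related Query Engine settings. It is an assumption-laden illustration: `ModelSettings`, `load_settings`, and `REQUIRED_SECRETS` are hypothetical names, the choice of which keys are mandatory is a guess, and the fallback values are simply the examples from the table, which may not match the service's real defaults.

```python
from dataclasses import dataclass
from typing import Mapping

# Assumed to be mandatory for this sketch; the real service may differ.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


@dataclass(frozen=True)
class ModelSettings:
    simple: str
    advanced: str
    dispatcher: str
    supervisor: str
    embedding: str
    port: int


def load_settings(env: Mapping[str, str]) -> ModelSettings:
    """Validate required keys and read model names from the environment."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    # Fallbacks mirror the Example column above (not confirmed defaults).
    return ModelSettings(
        simple=env.get("LLM_SIMPLE", "gpt-4o-mini"),
        advanced=env.get("LLM_ADVANCED", "gpt-4o"),
        dispatcher=env.get("LLM_DISPATCHER", "gpt-5.2"),
        supervisor=env.get("LLM_SUPERVISOR", "claude_opus46"),
        embedding=env.get("EMBEDDING_MODEL", "text-embedding-3-large"),
        port=int(env.get("UVICORN_PORT", "8000")),
    )
```

Taking the environment as a mapping argument, rather than reading `os.environ` inside the function, makes the loader straightforward to test with a plain dictionary.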

12. Development & Deployment

Local Development Commands

Admin Panel:

cd admin
yarn install          # Install dependencies
yarn dev              # Start Vite dev server (https://localhost:5173)
yarn build            # Production build
yarn storybook        # Start Storybook (http://localhost:6006)

NestJS Server:

cd server
yarn install          # Install dependencies
yarn start:dev        # Start with watch mode (http://localhost:3000)
yarn start:debug      # Start with debug + watch
yarn build            # Production build
yarn start:prod       # Start production (node dist/main)
yarn migration:generate --name=MigrationName  # Generate TypeORM migration
yarn migration:run    # Run pending migrations
yarn migration:rollback # Rollback last migration
yarn test             # Run unit tests
yarn test:e2e         # Run end-to-end tests

Python Ingestion Worker:

cd python_server/ingestion_worker
poetry install        # Install dependencies
python main.py        # Start the Redis consumer worker
# or
./start.sh            # Production start script

Python Query Engine:

cd python_server/query_engine
poetry install        # Install dependencies
python main.py        # Start FastAPI on port 8000
# or
uvicorn main:app --host 0.0.0.0 --port 8000 --reload  # Development with reload
# or
./start_dev.sh        # Development start script

Docker Images

| Service | Image Path |
|---|---|
| Admin | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-admin |
| Server | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-server |
| Query Engine | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-query-engine |
| Ingestion Worker | europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/blinkin-ai-data-ingestion-worker |

Helm Chart Structure

helm/blinkin-ai/
  Chart.yaml                         # Helm chart metadata
  values.dev.yaml                    # Development environment values
  values.uat.yaml                    # UAT environment values
  values.prod.yaml                   # Production environment values
  templates/
    # Admin Panel
    admin.deployment.yaml            # Admin Deployment
    admin.nginx.configmap.yaml       # Nginx config for serving admin SPA
    admin.service.yaml               # Admin Service (ClusterIP)

    # NestJS Server
    server.deployment.yaml           # Server Deployment
    server.configmap.yaml            # Server environment ConfigMap
    server.service.yaml              # Server Service (ClusterIP)

    # Query Engine
    queryengine.deployment.yaml      # Query Engine Deployment
    queryengine.configmap.yaml       # Query Engine environment ConfigMap
    queryengine.service.yaml         # Query Engine Service (ClusterIP)

    # Ingestion Worker
    ingestionworker.statefulset.yaml # Ingestion Worker StatefulSet (3 replicas)
    ingestionworker.configmap.yaml   # Ingestion Worker environment ConfigMap

    # Cluster / Ingress
    cluster.ingress.yaml             # GKE Ingress with managed certificate
    cluster.managedcertificate.yaml  # Google-managed TLS certificate
    cluster.frontendconfig.yaml      # Frontend config (HTTP→HTTPS redirect)

Deployment Topology

| Service | Type | Replicas (dev) | Notes |
|---|---|---|---|
| Admin | Deployment | 1 | Nginx serving static React build |
| Server | Deployment | 1 | NestJS on port 3000 |
| Query Engine | Deployment | 1 | FastAPI on port 8000 |
| Ingestion Worker | StatefulSet | 3 | Redis consumer group (parallel processing) |

Kubernetes Secrets

| Secret Name | Keys | Purpose |
|---|---|---|
| postgres-secrets | DB_USERNAME, DB_PASSWORD | PostgreSQL credentials |
| redis-secrets | REDIS_PASS | Redis password |
| query-engine-secrets | OPENAI_API_KEY, PERPLEXITY_API_KEY, FAL_KEY, PEXELS_API_KEY, TAVILY_API_KEY, GROQ_API_KEY, EXA_API_KEY, ANTHROPIC_API_KEY | API keys for query engine |
| google-api-secrets | GOOGLE_API_KEY | Google API key |
| gcs-keyfile-dev | gcs-key.json | GCS service account key file |
| canvas-secrets | CANVAS_TOKEN | Canvas LMS token |

Environments

| Environment | Domain | Helm Values |
|---|---|---|
| Development | app-dev.blinkin.io | values.dev.yaml |
| UAT | app-uat.blinkin.io | values.uat.yaml |
| Production | app.blinkin.io | values.prod.yaml |