# Infrastructure & Deployment

How the Blinkin platform is deployed, configured, and managed across environments. Source of truth: the blinkin-gitops repository.


## 1. Deployment Architecture

The platform runs on Kubernetes (both GKE on GCP and EKS on AWS) with NGINX Ingress Controllers routing external traffic to internal services. Each environment has its own cluster and namespace.

```mermaid
graph TB
    subgraph External
        User([End User / Browser])
        Auth0([Auth0 SSO])
        LLM_APIs([LLM APIs<br/>OpenAI / Anthropic / Groq / Perplexity])
        Deepgram([Deepgram STT])
        Mailgun([Mailgun Email])
        Stripe([Stripe Payments])
        Inference([Inference Server<br/>inference.blinkin.io])
        GCS([GCS / Azure Blob<br/>File Storage])
    end

    subgraph Kubernetes Cluster
        LB[Load Balancer]
        Ingress[NGINX Ingress Controller<br/>+ cert-manager TLS]

        subgraph studio-ns [Studio Namespace]
            Houston[houston<br/>Next.js SSR<br/>port 3000]
            PicassoFE[studio-fe<br/>Picasso Editor<br/>port 80]
            StudioAPI[studio-api<br/>NestJS Backend<br/>port 3000]
            SuperAdmin[studio-superadmin<br/>Admin Panel<br/>port 3000]
        end

        subgraph ai-ns [AI Namespace]
            Admin[admin<br/>Zweistein Admin SPA<br/>NGINX port 80]
            Server[server<br/>Zweistein NestJS<br/>port 3000]
            QueryEngine[queryengine<br/>Python FastAPI<br/>port 8000]
            IngestionWorker[ingestionworker<br/>Python StatefulSet<br/>3 replicas]
        end

        subgraph data-ns [Data Layer]
            Postgres[(PostgreSQL<br/>Cloud SQL / RDS)]
            Redis[(Redis<br/>In-cluster)]
            Qdrant[(Qdrant<br/>Vector DB)]
        end
    end

    User --> LB --> Ingress
    Ingress -->|/ path| Houston
    Ingress -->|/studio path| PicassoFE
    Ingress -->|studio-api-*.blinkin.io| StudioAPI
    Ingress -->|/ai path| Admin
    Ingress -->|/ai/api path| Server
    Ingress -->|/admin path| SuperAdmin

    Server --> QueryEngine
    Server --> Postgres
    Server --> Redis
    QueryEngine --> Qdrant
    QueryEngine --> Redis
    QueryEngine --> LLM_APIs
    QueryEngine --> GCS
    IngestionWorker --> Qdrant
    IngestionWorker --> Redis
    IngestionWorker --> GCS
    IngestionWorker --> Inference
    StudioAPI --> Postgres
    StudioAPI --> Redis
    Server --> Auth0
    Server --> Mailgun
    Server --> Stripe
    Server --> Deepgram
```

### Key points

| Aspect | Detail |
|---|---|
| Cluster Provider | GKE (GCP, `europe-west3`) for the GCP track; EKS (AWS, `eu-central-1`) for the AWS track |
| Container Registry | GCP: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`<br>AWS: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/` (dev)<br>AWS: `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` (prod) |
| Ingress | NGINX Ingress Controller with cert-manager for automatic Let's Encrypt TLS |
| Secrets Management | ExternalSecrets Operator syncing from GCP Secret Manager (ClusterSecretStore: `gcp-store`) |
| Database | GCP: Cloud SQL private IP (`10.100.10.3` dev, `10.50.100.3` uat, `10.20.100.3` prod)<br>AWS: RDS (`blinkin-*-postgres.*.eu-central-1.rds.amazonaws.com`) |

## 2. Environment Overview

The platform is deployed across three environments. The AWS track is currently the primary active deployment for DEV and PROD, while the GCP track serves UAT and remains available for GKE-based setups.

```mermaid
graph LR
    subgraph DEV [DEV Environment]
        D_AWS[AWS EKS Cluster<br/>Primary Active]
        D_GCP[GCP GKE Cluster<br/>Secondary]
    end

    subgraph UAT [UAT Environment]
        U_GCP[GCP GKE Cluster<br/>AI only]
    end

    subgraph PROD [PROD Environment]
        P_AWS[AWS EKS Cluster<br/>Primary Active]
        P_GCP[GCP GKE Cluster<br/>Scaled to 0]
    end

    DevPush[Dev Branch Push] -->|Auto-deploy| DEV
    ManualRelease[Manual Release] -->|Azure Releases| UAT
    ProdRelease[Production Release] -->|Controlled deploy| PROD
```

### Environment URL Table

| Service | DEV (AWS) | UAT (GCP) | PROD (AWS) |
|---|---|---|---|
| Houston (flows) | app-dev.blinkin.io | app-uat.blinkin.io | app.blinkin.io |
| Picasso Editor | app-dev.blinkin.io/studio | (not in UAT) | app.blinkin.io/studio |
| Studio API | studio-api-dev.blinkin.io | (not in UAT) | studio-api.blinkin.io |
| Zweistein Admin | app-dev.blinkin.io/ai | app-uat.blinkin.io/ai | app.blinkin.io/ai |
| Zweistein Server API | app-dev.blinkin.io/ai/api | app-uat.blinkin.io/ai/api | app.blinkin.io/ai/api |
| Superadmin | app-dev.blinkin.io/admin | (not in UAT) | app.blinkin.io/admin |
| WebSocket | app-dev.blinkin.io/socket.io | app-uat.blinkin.io/socket.io | app.blinkin.io/socket.io |

PROD (AWS) additionally serves tenant-specific domains: blinkin.blinkin.io, bosch.blinkin.io, demo.blinkin.io, emmy.blinkin.io, fortinet.blinkin.io, fraunhofer.blinkin.io, heinemann.blinkin.io, huberranner.blinkin.io, tapgig.blinkin.io, telekom.blinkin.io, wilo.blinkin.io.

### Deployment Triggers

| Environment | Trigger | Branch |
|---|---|---|
| DEV | Auto-deploy on push | `dev` |
| UAT | Manual deploy via Azure Releases | `main` / release tag |
| PROD | Controlled deploy | `main` / release tag |

## 3. GitOps Repository Structure

The blinkin-gitops repository is the single source of truth for all Kubernetes deployment configurations. It is organized by environment first, then by service group.

```text
blinkin-gitops/
|
+-- dev/                              # DEV environment
|   +-- ai/                           # Zweistein AI services (GCP cluster)
|   |   +-- blinkin-ai/               #   Helm chart
|   |   |   +-- Chart.yaml            #     Chart metadata (v0.1.6)
|   |   |   +-- templates/            #     K8s resource templates
|   |   +-- values.dev.yaml           #   Environment-specific values
|   |
|   +-- ai-aws/                       # Zweistein AI services (AWS cluster)
|   |   +-- blinkin-ai/               #   Same Helm chart structure
|   |   +-- values.dev.yaml           #   AWS-specific values (ECR images, RDS host)
|   |
|   +-- studio/                       # Studio services (GCP cluster)
|   |   +-- fe/                       #   Picasso Editor frontend
|   |   |   +-- studio-fe/Chart.yaml
|   |   |   +-- studio-fe/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- blinks/                   #   Houston frontend
|   |   |   +-- houston/Chart.yaml
|   |   |   +-- houston/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- api/                      #   Studio API backend
|   |       +-- studio-api/Chart.yaml
|   |       +-- studio-api/templates/
|   |       +-- values.dev.yaml
|   |
|   +-- studio-aws/                   # Studio services (AWS cluster)
|       +-- fe/                       #   Studio FE (same chart, ECR images)
|       +-- blinks/                   #   Houston (ECR images)
|       +-- api/                      #   Studio API (ECR images)
|       +-- superadmin/               #   Superadmin panel (AWS only)
|
+-- uat/                              # UAT environment
|   +-- ai/                           # Only AI services deployed to UAT
|       +-- values.uat.yaml
|
+-- prod/                             # PROD environment
|   +-- ai/                           # AI services (GCP) - scaled to 0 replicas
|   +-- ai-aws/                       # AI services (AWS) - PRIMARY ACTIVE
|   +-- studio/                       # Studio services (GCP) - scaled to 0 replicas
|   +-- studio-aws/                   # Studio services (AWS) - PRIMARY ACTIVE
|
+-- README.md
```

### What "ai" and "studio" groups mean

| Group | Meaning | Services Included |
|---|---|---|
| ai | Zweistein AI/ML platform | admin (SPA), server (NestJS), queryengine (Python), ingestionworker (Python) |
| studio | Picasso content platform | studio-fe (Picasso Editor), houston (Flow viewer), studio-api (NestJS backend) |
| `-aws` suffix | Same services, configured for AWS infrastructure | Uses ECR images, RDS endpoints, AWS-specific secrets |

Note: The GCP-track prod deployments (prod/ai/ and prod/studio/) have all replica counts set to 0, meaning they are effectively dormant. The active production runs entirely on the AWS track (prod/ai-aws/ and prod/studio-aws/).


## 4. Helm Chart Structure

### Zweistein AI Chart (blinkin-ai)

Chart version: 0.1.6 (gitops dev) / 0.1.5 (source repo). App version: 1.16.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| admin.deployment.yaml | Deployment | Zweistein Admin SPA (NGINX serving React app) |
| admin.service.yaml | Service | ClusterIP on port 80 -> container 80 |
| admin.nginx.configmap.yaml | ConfigMap | NGINX config for SPA routing + CSP headers |
| server.deployment.yaml | Deployment | Zweistein NestJS backend server |
| server.service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| server.configmap.yaml | ConfigMap | Server env vars from values file |
| queryengine.deployment.yaml | Deployment | Python FastAPI query engine |
| queryengine.service.yaml | Service | ClusterIP on port 80 -> container 8000 |
| queryengine.configmap.yaml | ConfigMap | Query engine env vars from values file |
| ingestionworker.statefulset.yaml | StatefulSet | Python ingestion workers (3 replicas) |
| ingestionworker.configmap.yaml | ConfigMap | Ingestion worker env vars from values file |
| cluster.ingress.yaml | Ingress | NGINX Ingress routing /ai, /ai/api, /socket.io |
| cluster.frontendconfig.yaml | FrontendConfig | GKE HTTPS redirect config |
| externalsecrets.yaml | ExternalSecret | Syncs secrets from GCP Secret Manager |
| externalsecrets-force-sync.job.yaml | Job | Forces re-sync of external secrets |

### Picasso FE Charts (studio-fe and houston)

Chart version: 0.1.0. App version: 1.16.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | Frontend app container |
| service.yaml | Service | ClusterIP on port 80 -> container port |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |

### Studio API Chart (studio-api)

Chart version: 0.1.0. App version: 1.0.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | NestJS API container |
| service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| migrations-job.yaml | Job (Helm Hook) | Runs `npm run migration:run` before deployments |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |
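
The migrations hook uses Helm's standard hook annotations to run before the Deployment rolls out. The following is a minimal sketch of such a Job template, not the actual chart contents: the resource names, values keys, and hook policies here are illustrative.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: studio-api-migrations            # name illustrative
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrations
          # values keys illustrative; the chart's actual image reference may differ
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["npm", "run", "migration:run"]
```

Because the hook runs `pre-upgrade`, a failed migration blocks the rollout, so the old pods keep serving until the schema change succeeds.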

## 5. Service Connectivity Map

This diagram shows how services communicate within the Kubernetes cluster.

```mermaid
graph LR
    subgraph Ingress Layer
        NGINX[NGINX Ingress]
    end

    subgraph Studio Services
        Houston[houston<br/>:3000]
        StudioFE[studio-fe<br/>:80]
        StudioAPI[studio-api<br/>:3000]
        SuperAdmin[studio-superadmin<br/>:3000]
    end

    subgraph AI Services
        Admin[admin<br/>:80 NGINX]
        Server[server<br/>:3000]
        QE[queryengine<br/>:8000]
        IW[ingestionworker<br/>x3 StatefulSet]
    end

    subgraph Data Stores
        PG[(PostgreSQL<br/>:5432)]
        Redis[(Redis<br/>:6379)]
        Qdrant[(Qdrant<br/>:6333)]
    end

    subgraph External APIs
        LLMs[OpenAI / Anthropic<br/>Groq / Perplexity]
        InferenceSvc[inference.blinkin.io]
    end

    %% Ingress routing
    NGINX -->|"/ (HTTP)"| Houston
    NGINX -->|"/studio (HTTP)"| StudioFE
    NGINX -->|"/ai (HTTP)"| Admin
    NGINX -->|"/ai/api (HTTP)"| Server
    NGINX -->|"/admin (HTTP)"| SuperAdmin
    NGINX -->|"studio-api-*.blinkin.io (HTTP)"| StudioAPI
    NGINX -->|"/socket.io (WS)"| Server

    %% Server -> internal services
    Server -->|"HTTP :80"| QE
    Server -->|"TCP :5432"| PG
    Server -->|"TCP :6379<br/>Streams + Pub/Sub"| Redis

    %% Query Engine connections
    QE -->|"TCP :6333<br/>gRPC/HTTP"| Qdrant
    QE -->|"TCP :6379"| Redis
    QE -->|"HTTPS"| LLMs

    %% Ingestion Worker connections
    IW -->|"TCP :6333"| Qdrant
    IW -->|"TCP :6379<br/>Stream consumer"| Redis
    IW -->|"HTTPS"| InferenceSvc

    %% Studio API connections
    StudioAPI -->|"TCP :5432"| PG
    StudioAPI -->|"TCP :6379"| Redis
```

### Key Communication Patterns

| From | To | Protocol | Purpose |
|---|---|---|---|
| server | queryengine-service | HTTP :80 | LLM queries, agent execution, RAG |
| server | PostgreSQL | TCP :5432 | Database reads/writes (TypeORM) |
| server | Redis | TCP :6379 | Job streams (`stream:zweistein`), notifications, pub/sub |
| queryengine | Qdrant | TCP :6333 | Vector similarity search for embeddings |
| queryengine | Redis | TCP :6379 | Caching and state |
| queryengine | LLM APIs | HTTPS | OpenAI, Anthropic, Groq, Perplexity calls |
| ingestionworker | Redis | TCP :6379 | Consumes from `stream:zweistein` (consumer group) |
| ingestionworker | Qdrant | TCP :6333 | Writes document embeddings |
| ingestionworker | Inference | HTTPS | Image explanation, audio transcription |
| studio-api | PostgreSQL | TCP :5432 | Flow/tenant data (TypeORM) |
| studio-api | Redis | TCP :6379 | Bull queues, caching |

### Internal DNS Names

Services discover each other via Kubernetes DNS:

| Service | Internal DNS | Port |
|---|---|---|
| Zweistein Server | server-service | 80 (-> 3000) |
| Zweistein Admin | admin-service | 80 (-> 80) |
| Query Engine | queryengine-service | 80 (-> 8000) |
| Studio API | studio-api | 80 (-> 3000) |
| Houston | houston | 80 (-> 3000) |
| Studio FE | studio-fe | 80 (-> 80) |
| Redis | redis-master.redis.svc.cluster.local | 6379 |
| Qdrant | qdrant.qdrant.svc.cluster.local | 6333 |

## 6. CI/CD Pipeline

```mermaid
graph TD
    Dev[Developer pushes<br/>to dev branch] --> Pipeline[Azure DevOps Pipeline]
    Pipeline --> Build[Build Docker image]
    Build --> Push[Push to Container Registry<br/>GCP Artifact Registry or AWS ECR]
    Push --> UpdateValues[Update image tag in<br/>blinkin-gitops values file]
    UpdateValues --> GitCommit[Commit + push to<br/>blinkin-gitops repo]
    GitCommit --> ArgoCD[ArgoCD / Flux detects<br/>git changes]
    ArgoCD --> HelmUpgrade[Helm upgrade/install<br/>in target namespace]
    HelmUpgrade --> K8s[Kubernetes applies<br/>new Deployment spec]
    K8s --> Rolling[Rolling update<br/>replaces pods]

    subgraph Pre-deploy Hooks
        MigrationJob[Migration Job<br/>npm run migration:run]
    end

    HelmUpgrade --> MigrationJob
    MigrationJob -->|success| K8s
```

### Pipeline Flow Details

1. A developer pushes code to the `dev` branch of a service repo (picasso-fe, studio-api, zweistein).
2. The Azure DevOps pipeline picks up the push and builds a Docker image with an auto-incremented version tag (e.g., `1.1.455-dev`).
3. The image is pushed to the container registry:
    - GCP track: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`
    - AWS track: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/` (dev) or `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` (prod)
4. The GitOps values file is updated with the new image tag (e.g., `dev/ai-aws/values.dev.yaml`).
5. The GitOps controller (ArgoCD or equivalent) detects the commit in blinkin-gitops and runs a Helm upgrade.
6. Helm hooks execute pre-deploy tasks such as database migrations (`migrations-job.yaml` runs `npm run migration:run` for studio-api).
7. Kubernetes performs a rolling update, replacing old pods with new ones.
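
The tag-bump step can be sketched in a few lines. The actual pipeline script is not part of this document, so the helper name and the assumption that the values file uses a plain `tag:` key are ours:

```python
import re
from pathlib import Path


def bump_image_tag(values_path: str, new_tag: str) -> str:
    """Rewrite the first `tag:` entry in a Helm values file and return the
    updated text. Sketch only: the real pipeline step may differ."""
    text = Path(values_path).read_text()
    # (?m) makes ^/$ match per line; count=1 touches only the first tag key
    updated, n = re.subn(r'(?m)^(\s*tag:\s*).*$', rf'\g<1>"{new_tag}"', text, count=1)
    if n == 0:
        raise ValueError(f"no `tag:` key found in {values_path}")
    Path(values_path).write_text(updated)
    return updated
```

The commit and push of the modified values file back to blinkin-gitops (step 5's trigger) would follow as ordinary `git` operations.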

### Image Tagging Convention

| Convention | Example | Meaning |
|---|---|---|
| `X.Y.Z-dev` | 1.1.455-dev | Development build, auto-incremented |
| `X.Y.Z` | 0.5.52 | Release build (UAT/PROD) |
| `latest` | latest | Mutable tag (used only for superadmin) |
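
The convention is simple enough to express as a small classifier; this helper is illustrative only (no such function exists in the pipelines), but it makes the three tag shapes explicit:

```python
import re

# Matches X.Y.Z with an optional -dev suffix (shapes from the table above)
_TAG_RE = re.compile(r"^\d+\.\d+\.\d+(-dev)?$")


def classify_tag(tag: str) -> str:
    """Return 'dev', 'release', or 'mutable' per the tagging convention."""
    if tag == "latest":
        return "mutable"
    m = _TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognized image tag: {tag}")
    return "dev" if m.group(1) else "release"
```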

## 7. Docker Images

Image repositories and versions below are taken from the DEV AWS and PROD AWS environment values files (the DEV AWS track is the most actively updated):

### AI Services (Zweistein)

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| admin | blinkin-ai/admin | 0.4.762-dev | 0.5.52 |
| server | blinkin-ai/server | 0.4.240-dev | 0.5.37 |
| queryengine | blinkin-ai/query-engine | 0.4.232-dev | 0.5.71 |
| ingestionworker | blinkin-ai/data-ingestion-worker | 0.4.29-dev | 0.5.3 |

### Studio Services

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| studio-fe | blinkin-studio/studio-fe | 1.1.455-dev | 1.1.40 |
| houston | blinkin-studio/houston | 1.1.156-dev | 1.1.32 |
| studio-api | blinkin-studio/studio-api | 1.1.142-dev | 1.1.32 |
| superadmin | blinkin-studio/studio-superadmin | latest | latest |

### Utility Images

| Image | Version | Purpose |
|---|---|---|
| db-redis-connection-checker | 1.0.0 | Init container that waits for PostgreSQL + Redis readiness before migrations |

## 8. Key Kubernetes Resources

### Deployments

| Name | Namespace Group | Container Port | Replicas (DEV) | Replicas (PROD AWS) | Health Check |
|---|---|---|---|---|---|
| admin | ai | 80 | 1 | 1 | N/A (static SPA) |
| server | ai | 3000 | 1 | 1 | /ai/healthz |
| queryengine | ai | 8000 | 1 | 1 | /healthz |
| studio-fe | studio | 80 | 1 | 1 | /healthz |
| houston | studio | 3000 | 1 | 1 | /healthz |
| studio-api | studio | 3000 | 1 | 1 | /health |
| studio-superadmin | studio-aws | 3000 | 1 | 1 | /health |

### StatefulSets

| Name | Namespace Group | Replicas (DEV) | Replicas (PROD AWS) | Purpose |
|---|---|---|---|---|
| ingestionworker | ai | 3 | 3 | Document ingestion with stable pod identity for Redis Stream consumers |

### Services (ClusterIP)

| Service Name | Target Port | Exposed Port | Protocol |
|---|---|---|---|
| admin-service | 80 | 80 | TCP/HTTP |
| server-service | 3000 | 80 | TCP/HTTP |
| queryengine-service | 8000 | 80 | TCP/HTTP |
| studio-fe | 80 | 80 | TCP/HTTP |
| houston | 3000 | 80 | TCP/HTTP |
| studio-api | 3000 | 80 | TCP/HTTP |
| studio-superadmin | 3000 | 80 | TCP/HTTP |

### ConfigMaps

| ConfigMap Name | Source Chart | Purpose |
|---|---|---|
| server-config-env | blinkin-ai | Zweistein server environment variables |
| queryengine-config-env | blinkin-ai | Query engine environment variables |
| ingestionworker-config-env | blinkin-ai | Ingestion worker environment variables |
| admin-nginx-config | blinkin-ai | NGINX configuration for admin SPA routing |

### Secrets (via ExternalSecrets Operator)

| ExternalSecret Name | Target K8s Secret | Keys Synced |
|---|---|---|
| gcp-keyfile-external-secrets | gcs-keyfile-{env}-es | gcs-key.json (GCS service account) |
| canvas-external-secrets | canvas-secrets-es | CANVAS_TOKEN |
| redis-external-secrets | redis-secrets-es | REDIS_PASS |
| external-api-external-secrets | external-api-secrets-es | ANTHROPIC_API_KEY, EXA_API_KEY, FAL_KEY, GROQ_API_KEY, OPENAI_API_KEY, PERPLEXITY_API_KEY, PEXELS_API_KEY, TAVILY_API_KEY, GOOGLE_API_KEY |
| postgres-external-secrets | postgres-secrets-es | DB_USERNAME, DB_PASSWORD, STUDIO_DB_USERNAME, STUDIO_DB_PASSWORD |
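
As a concrete illustration, the `redis-external-secrets` entry above would look roughly like this under the ExternalSecrets Operator `v1beta1` API and the `gcp-store` ClusterSecretStore; the refresh interval and the assumption that the remote key is also named `REDIS_PASS` are ours:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: redis-external-secrets
spec:
  refreshInterval: 1h                 # illustrative
  secretStoreRef:
    kind: ClusterSecretStore
    name: gcp-store                   # from the Key points table
  target:
    name: redis-secrets-es            # K8s Secret created in the namespace
  data:
    - secretKey: REDIS_PASS           # key inside the K8s Secret
      remoteRef:
        key: REDIS_PASS               # entry in GCP Secret Manager (assumed name)
```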

### Ingress Rules (DEV AWS example)

| Host | Path | Backend Service | Port |
|---|---|---|---|
| app-dev.blinkin.io | / | houston | 80 |
| app-dev.blinkin.io | /studio | studio-fe | 80 |
| app-dev.blinkin.io | /ai | admin-service | 80 |
| app-dev.blinkin.io | /ai/api | server-service | 80 |
| app-dev.blinkin.io | /admin | studio-superadmin | 80 |
| app-dev.blinkin.io | /socket.io | server-service | 80 |
| studio-api-dev.blinkin.io | / | studio-api | 80 |
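
A trimmed sketch of what the corresponding Ingress manifest plausibly looks like (showing two of the paths above; the resource name, cluster-issuer, and TLS secret name are illustrative, not copied from the chart):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blinkin-ai                    # illustrative
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # issuer name assumed
spec:
  ingressClassName: nginx
  tls:
    - hosts: [app-dev.blinkin.io]
      secretName: app-dev-tls         # TLS secret name assumed
  rules:
    - host: app-dev.blinkin.io
      http:
        paths:
          - path: /ai/api             # more specific path listed first
            pathType: Prefix
            backend:
              service: {name: server-service, port: {number: 80}}
          - path: /ai
            pathType: Prefix
            backend:
              service: {name: admin-service, port: {number: 80}}
```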

## 9. Redis Streams Architecture

Redis is used heavily for async job processing between the NestJS server and Python workers.

| Stream Name | Producer | Consumer Group | Consumer |
|---|---|---|---|
| stream:zweistein | server (NestJS) | group:zweistein | ingestionworker (Python) |
| stream:zweistein:notifications | ingestionworker (Python) | group:zweistein:notifications | server (NestJS) |

The server publishes ingestion tasks (document parsing, embedding creation) to the main stream. Ingestion workers consume from it via a consumer group, which lets multiple workers share the load. When a worker finishes, it publishes a notification to the notification stream, which the server consumes to update the UI in real time.
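
The handshake can be sketched with redis-py style calls. The stream and group names come from the table above; the helper functions, field names, and notification payload are illustrative, not the actual NestJS/Python implementations, and the consumer group is assumed to already exist (created once with `XGROUP CREATE`):

```python
import json

STREAM = "stream:zweistein"
GROUP = "group:zweistein"
NOTIFY_STREAM = "stream:zweistein:notifications"


def enqueue_task(r, payload: dict):
    """Producer side (server): append an ingestion job to the main stream.
    `r` is assumed to be a redis-py style client."""
    return r.xadd(STREAM, {"payload": json.dumps(payload)})


def consume_one(r, consumer: str):
    """Worker side: claim one new entry via the consumer group, process it,
    ack it, and publish a completion notification for the server."""
    resp = r.xreadgroup(GROUP, consumer, {STREAM: ">"}, count=1)
    if not resp:
        return None  # nothing new for this group
    _stream, entries = resp[0]
    entry_id, fields = entries[0]
    payload = json.loads(fields[b"payload"])
    # ... parse the document, create embeddings, write to Qdrant ...
    r.xack(STREAM, GROUP, entry_id)
    r.xadd(NOTIFY_STREAM, {"job": entry_id, "status": "done"})
    return payload
```

Because each entry is delivered to exactly one consumer in the group, adding worker replicas (the StatefulSet runs 3) scales ingestion horizontally without duplicating work.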


## 10. External Service Dependencies

| Service | Used By | Purpose |
|---|---|---|
| Auth0 | server | SSO authentication (domain: sso-dev.blinkin.io / picasso-auth-prod.eu.auth0.com) |
| OpenAI | queryengine, ingestionworker | LLM completions (GPT-4o) and embeddings (text-embedding-3-large) |
| Anthropic | queryengine | Claude LLM completions |
| Groq | queryengine | Fast LLM inference |
| Perplexity | queryengine | Search-augmented LLM |
| Deepgram | server | Speech-to-text transcription |
| Mailgun | server, queryengine | Inbound/outbound email (domain: ai.blinkin.io) |
| Stripe | server | Payment processing and subscriptions |
| Tavily | queryengine | Web search API for agents |
| Exa | queryengine | Semantic web search for agents |
| Fal | queryengine | AI image generation |
| Pexels | queryengine | Stock image search |
| LangSmith | queryengine | LLM tracing and observability |
| PostHog | server (PROD) | Product analytics |
| GCS | server, queryengine, ingestionworker | Cloud file storage (bucket: blinkin-ai-{env}-storage) |
| Azure Blob | studio-api | Picasso media storage |
| Inference Server | ingestionworker, server | Custom image/audio processing (inference.blinkin.io) |