# Infrastructure & Deployment

How the Blinkin platform is deployed, configured, and managed across environments. Source of truth: the blinkin-gitops repository.


## 1. Deployment Architecture

The platform runs on Kubernetes (both GKE on GCP and EKS on AWS) with NGINX Ingress Controllers routing external traffic to internal services. Each environment has its own cluster and namespace.

```mermaid
graph TB
    subgraph External
        User([End User / Browser])
        Auth0([Auth0 SSO])
        LLM_APIs([LLM APIs<br/>OpenAI / Anthropic / Groq / Perplexity])
        Deepgram([Deepgram STT])
        Mailgun([Mailgun Email])
        Stripe([Stripe Payments])
        Inference([Inference Server<br/>inference.blinkin.io])
        GCS([GCS / Azure Blob<br/>File Storage])
    end

    subgraph Kubernetes Cluster
        LB[Load Balancer]
        Ingress[NGINX Ingress Controller<br/>+ cert-manager TLS]

        subgraph studio-ns [Studio Namespace]
            Houston[houston<br/>Next.js SSR<br/>port 3000]
            PicassoFE[studio-fe<br/>Picasso Editor<br/>port 80]
            StudioAPI[studio-api<br/>NestJS Backend<br/>port 3000]
            SuperAdmin[studio-superadmin<br/>Admin Panel<br/>port 3000]
        end

        subgraph ai-ns [AI Namespace]
            Admin[admin<br/>Zweistein Admin SPA<br/>NGINX port 80]
            Server[server<br/>Zweistein NestJS<br/>port 3000]
            QueryEngine[queryengine<br/>Python FastAPI<br/>port 8000]
            IngestionWorker[ingestionworker<br/>Python StatefulSet<br/>3 replicas]
        end

        subgraph data-ns [Data Layer]
            Postgres[(PostgreSQL<br/>Cloud SQL / RDS)]
            Redis[(Redis<br/>In-cluster)]
            Qdrant[(Qdrant<br/>Vector DB)]
        end
    end

    User --> LB --> Ingress
    Ingress -->|/ path| Houston
    Ingress -->|/studio path| PicassoFE
    Ingress -->|studio-api-*.blinkin.io| StudioAPI
    Ingress -->|/ai path| Admin
    Ingress -->|/ai/api path| Server
    Ingress -->|/admin path| SuperAdmin

    Server --> QueryEngine
    Server --> Postgres
    Server --> Redis
    QueryEngine --> Qdrant
    QueryEngine --> Redis
    QueryEngine --> LLM_APIs
    QueryEngine --> GCS
    IngestionWorker --> Qdrant
    IngestionWorker --> Redis
    IngestionWorker --> GCS
    IngestionWorker --> Inference
    StudioAPI --> Postgres
    StudioAPI --> Redis
    Server --> Auth0
    Server --> Mailgun
    Server --> Stripe
    Server --> Deepgram
```

### Key points

| Aspect | Detail |
|---|---|
| Cluster Provider | GKE (GCP, `europe-west3`) for the GCP track; EKS (AWS, `eu-central-1`) for the AWS track |
| Container Registry | GCP: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`<br>AWS: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/` (dev)<br>AWS: `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` (prod) |
| Ingress | NGINX Ingress Controller with cert-manager for automatic Let's Encrypt TLS |
| Secrets Management | ExternalSecrets Operator syncing from GCP Secret Manager (ClusterSecretStore: `gcp-store`) |
| Database | GCP: Cloud SQL private IP (`10.100.10.3` dev, `10.50.100.3` uat, `10.20.100.3` prod)<br>AWS: RDS (`blinkin-*-postgres.*.eu-central-1.rds.amazonaws.com`) |

## 2. Environment Overview

The platform is deployed across three environments. The AWS track is currently the primary active deployment for DEV and PROD, while the GCP track serves UAT and remains available for GKE-based setups.

```mermaid
graph LR
    subgraph DEV [DEV Environment]
        D_AWS[AWS EKS Cluster<br/>Primary Active]
        D_GCP[GCP GKE Cluster<br/>Secondary]
    end

    subgraph UAT [UAT Environment]
        U_GCP[GCP GKE Cluster<br/>AI only]
    end

    subgraph PROD [PROD Environment]
        P_AWS[AWS EKS Cluster<br/>Primary Active]
        P_GCP[GCP GKE Cluster<br/>Scaled to 0]
    end

    DevPush[Dev Branch Push] -->|Auto-deploy| DEV
    ManualRelease[Manual Release] -->|Azure Releases| UAT
    ProdRelease[Production Release] -->|Controlled deploy| PROD
```

### Environment URL Table

| Service | DEV (AWS) | UAT (GCP) | PROD (AWS) |
|---|---|---|---|
| Houston (flows) | app-dev.blinkin.io | app-uat.blinkin.io | app.blinkin.io |
| Picasso Editor | app-dev.blinkin.io/studio | (not in UAT) | app.blinkin.io/studio |
| Studio API | studio-api-dev.blinkin.io | (not in UAT) | studio-api.blinkin.io |
| Zweistein Admin | app-dev.blinkin.io/ai | app-uat.blinkin.io/ai | app.blinkin.io/ai |
| Zweistein Server API | app-dev.blinkin.io/ai/api | app-uat.blinkin.io/ai/api | app.blinkin.io/ai/api |
| Superadmin | app-dev.blinkin.io/admin | (not in UAT) | app.blinkin.io/admin |
| WebSocket | app-dev.blinkin.io/socket.io | app-uat.blinkin.io/socket.io | app.blinkin.io/socket.io |

PROD (AWS) additionally serves tenant-specific domains: blinkin.blinkin.io, bosch.blinkin.io, demo.blinkin.io, emmy.blinkin.io, fortinet.blinkin.io, fraunhofer.blinkin.io, heinemann.blinkin.io, huberranner.blinkin.io, tapgig.blinkin.io, telekom.blinkin.io, wilo.blinkin.io.

### Deployment Triggers

| Environment | Trigger | Branch |
|---|---|---|
| DEV | Auto-deploy on push | `dev` |
| UAT | Manual deploy via Azure Releases | `main` / release tag |
| PROD | Controlled deploy | `main` / release tag |

## 3. GitOps Repository Structure

The blinkin-gitops repository is the single source of truth for all Kubernetes deployment configurations. It is organized by environment first, then by service group.

```text
blinkin-gitops/
|
+-- dev/                              # DEV environment
|   +-- ai/                           # Zweistein AI services (GCP cluster)
|   |   +-- blinkin-ai/               #   Helm chart
|   |   |   +-- Chart.yaml            #     Chart metadata (v0.1.6)
|   |   |   +-- templates/            #     K8s resource templates
|   |   +-- values.dev.yaml           #   Environment-specific values
|   |
|   +-- ai-aws/                       # Zweistein AI services (AWS cluster)
|   |   +-- blinkin-ai/               #   Same Helm chart structure
|   |   +-- values.dev.yaml           #   AWS-specific values (ECR images, RDS host)
|   |
|   +-- studio/                       # Studio services (GCP cluster)
|   |   +-- fe/                       #   Picasso Editor frontend
|   |   |   +-- studio-fe/Chart.yaml
|   |   |   +-- studio-fe/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- blinks/                   #   Houston frontend
|   |   |   +-- houston/Chart.yaml
|   |   |   +-- houston/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- api/                      #   Studio API backend
|   |       +-- studio-api/Chart.yaml
|   |       +-- studio-api/templates/
|   |       +-- values.dev.yaml
|   |
|   +-- studio-aws/                   # Studio services (AWS cluster)
|       +-- fe/                       #   Studio FE (same chart, ECR images)
|       +-- blinks/                   #   Houston (ECR images)
|       +-- api/                      #   Studio API (ECR images)
|       +-- superadmin/               #   Superadmin panel (AWS only)
|
+-- uat/                              # UAT environment
|   +-- ai/                           # Only AI services deployed to UAT
|       +-- values.uat.yaml
|
+-- prod/                             # PROD environment
|   +-- ai/                           # AI services (GCP) - scaled to 0 replicas
|   +-- ai-aws/                       # AI services (AWS) - PRIMARY ACTIVE
|   +-- studio/                       # Studio services (GCP) - scaled to 0 replicas
|   +-- studio-aws/                   # Studio services (AWS) - PRIMARY ACTIVE
|
+-- README.md
```

### What "ai" and "studio" groups mean

| Group | Meaning | Services Included |
|---|---|---|
| ai | Zweistein AI/ML platform | admin (SPA), server (NestJS), queryengine (Python), ingestionworker (Python) |
| studio | Picasso content platform | studio-fe (Picasso Editor), houston (Flow viewer), studio-api (NestJS backend) |
| `-aws` suffix | Same services, configured for AWS infrastructure | Uses ECR images, RDS endpoints, AWS-specific secrets |

Note: The GCP-track prod deployments (prod/ai/ and prod/studio/) have all replica counts set to 0, meaning they are effectively dormant. The active production runs entirely on the AWS track (prod/ai-aws/ and prod/studio-aws/).


## 4. Helm Chart Structure

### Zweistein AI Chart (blinkin-ai)

Chart version: 0.1.6 (gitops dev) / 0.1.5 (source repo). App version: 1.16.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| admin.deployment.yaml | Deployment | Zweistein Admin SPA (NGINX serving React app) |
| admin.service.yaml | Service | ClusterIP on port 80 -> container 80 |
| admin.nginx.configmap.yaml | ConfigMap | NGINX config for SPA routing + CSP headers |
| server.deployment.yaml | Deployment | Zweistein NestJS backend server |
| server.service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| server.configmap.yaml | ConfigMap | Server env vars from values file |
| queryengine.deployment.yaml | Deployment | Python FastAPI query engine |
| queryengine.service.yaml | Service | ClusterIP on port 80 -> container 8000 |
| queryengine.configmap.yaml | ConfigMap | Query engine env vars from values file |
| ingestionworker.statefulset.yaml | StatefulSet | Python ingestion workers (3 replicas) |
| ingestionworker.configmap.yaml | ConfigMap | Ingestion worker env vars from values file |
| cluster.ingress.yaml | Ingress | NGINX Ingress routing /ai, /ai/api, /socket.io |
| cluster.frontendconfig.yaml | FrontendConfig | GKE HTTPS redirect config |
| externalsecrets.yaml | ExternalSecret | Syncs secrets from GCP Secret Manager |
| externalsecrets-force-sync.job.yaml | Job | Forces re-sync of external secrets |

### Picasso FE Charts (studio-fe and houston)

Chart version: 0.1.0. App version: 1.16.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | Frontend app container |
| service.yaml | Service | ClusterIP on port 80 -> container port |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |

### Studio API Chart (studio-api)

Chart version: 0.1.0. App version: 1.0.0.

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | NestJS API container |
| service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| migrations-job.yaml | Job (Helm Hook) | Runs `npm run migration:run` before deployments |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |
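
The migrations hook uses Helm's standard hook annotations to run before the Deployment rolls out. The following is a minimal sketch of such a Job template, not the actual chart contents: the resource names, values keys, and hook policies here are illustrative.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: studio-api-migrations            # name illustrative
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrations
          # values keys illustrative; the chart's actual image reference may differ
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["npm", "run", "migration:run"]
```

Because the hook runs `pre-upgrade`, a failed migration blocks the rollout, so the old pods keep serving until the schema change succeeds.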

## 5. Service Connectivity Map

This diagram shows how services communicate within the Kubernetes cluster.

```mermaid
graph LR
    subgraph Ingress Layer
        NGINX[NGINX Ingress]
    end

    subgraph Studio Services
        Houston[houston<br/>:3000]
        StudioFE[studio-fe<br/>:80]
        StudioAPI[studio-api<br/>:3000]
        SuperAdmin[studio-superadmin<br/>:3000]
    end

    subgraph AI Services
        Admin[admin<br/>:80 NGINX]
        Server[server<br/>:3000]
        QE[queryengine<br/>:8000]
        IW[ingestionworker<br/>x3 StatefulSet]
    end

    subgraph Data Stores
        PG[(PostgreSQL<br/>:5432)]
        Redis[(Redis<br/>:6379)]
        Qdrant[(Qdrant<br/>:6333)]
    end

    subgraph External APIs
        LLMs[OpenAI / Anthropic<br/>Groq / Perplexity]
        InferenceSvc[inference.blinkin.io]
    end

    %% Ingress routing
    NGINX -->|"/ (HTTP)"| Houston
    NGINX -->|"/studio (HTTP)"| StudioFE
    NGINX -->|"/ai (HTTP)"| Admin
    NGINX -->|"/ai/api (HTTP)"| Server
    NGINX -->|"/admin (HTTP)"| SuperAdmin
    NGINX -->|"studio-api-*.blinkin.io (HTTP)"| StudioAPI
    NGINX -->|"/socket.io (WS)"| Server

    %% Server -> internal services
    Server -->|"HTTP :80"| QE
    Server -->|"TCP :5432"| PG
    Server -->|"TCP :6379<br/>Streams + Pub/Sub"| Redis

    %% Query Engine connections
    QE -->|"TCP :6333<br/>gRPC/HTTP"| Qdrant
    QE -->|"TCP :6379"| Redis
    QE -->|"HTTPS"| LLMs

    %% Ingestion Worker connections
    IW -->|"TCP :6333"| Qdrant
    IW -->|"TCP :6379<br/>Stream consumer"| Redis
    IW -->|"HTTPS"| InferenceSvc

    %% Studio API connections
    StudioAPI -->|"TCP :5432"| PG
    StudioAPI -->|"TCP :6379"| Redis
```

### Key Communication Patterns

| From | To | Protocol | Purpose |
|---|---|---|---|
| server | queryengine-service | HTTP :80 | LLM queries, agent execution, RAG |
| server | PostgreSQL | TCP :5432 | Database reads/writes (TypeORM) |
| server | Redis | TCP :6379 | Job streams (`stream:zweistein`), notifications, pub/sub |
| queryengine | Qdrant | TCP :6333 | Vector similarity search for embeddings |
| queryengine | Redis | TCP :6379 | Caching and state |
| queryengine | LLM APIs | HTTPS | OpenAI, Anthropic, Groq, Perplexity calls |
| ingestionworker | Redis | TCP :6379 | Consumes from `stream:zweistein` (consumer group) |
| ingestionworker | Qdrant | TCP :6333 | Writes document embeddings |
| ingestionworker | Inference | HTTPS | Image explanation, audio transcription |
| studio-api | PostgreSQL | TCP :5432 | Flow/tenant data (TypeORM) |
| studio-api | Redis | TCP :6379 | Bull queues, caching |

### Internal DNS Names

Services discover each other via Kubernetes DNS:

| Service | Internal DNS | Port |
|---|---|---|
| Zweistein Server | server-service | 80 (-> 3000) |
| Zweistein Admin | admin-service | 80 (-> 80) |
| Query Engine | queryengine-service | 80 (-> 8000) |
| Studio API | studio-api | 80 (-> 3000) |
| Houston | houston | 80 (-> 3000) |
| Studio FE | studio-fe | 80 (-> 80) |
| Redis | redis-master.redis.svc.cluster.local | 6379 |
| Qdrant | qdrant.qdrant.svc.cluster.local | 6333 |

## 6. CI/CD Pipeline

```mermaid
graph TD
    Dev[Developer pushes<br/>to dev branch] --> Pipeline[Azure DevOps Pipeline]
    Pipeline --> Build[Build Docker image]
    Build --> Push[Push to Container Registry<br/>GCP Artifact Registry or AWS ECR]
    Push --> UpdateValues[Update image tag in<br/>blinkin-gitops values file]
    UpdateValues --> GitCommit[Commit + push to<br/>blinkin-gitops repo]
    GitCommit --> ArgoCD[ArgoCD / Flux detects<br/>git changes]
    ArgoCD --> HelmUpgrade[Helm upgrade/install<br/>in target namespace]
    HelmUpgrade --> K8s[Kubernetes applies<br/>new Deployment spec]
    K8s --> Rolling[Rolling update<br/>replaces pods]

    subgraph Pre-deploy Hooks
        MigrationJob[Migration Job<br/>npm run migration:run]
    end

    HelmUpgrade --> MigrationJob
    MigrationJob -->|success| K8s
```

### Pipeline Flow Details

1. A developer pushes code to the `dev` branch of a service repo (picasso-fe, studio-api, zweistein).
2. The Azure DevOps pipeline picks up the push and builds a Docker image with an auto-incremented version tag (e.g., `1.1.455-dev`).
3. The image is pushed to the container registry:
    - GCP track: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`
    - AWS track: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/` (dev) or `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` (prod)
4. The GitOps values file is updated with the new image tag (e.g., `dev/ai-aws/values.dev.yaml`).
5. The GitOps controller (ArgoCD or equivalent) detects the commit in blinkin-gitops and runs a Helm upgrade.
6. Helm hooks execute pre-deploy tasks such as database migrations (`migrations-job.yaml` runs `npm run migration:run` for studio-api).
7. Kubernetes performs a rolling update, replacing old pods with new ones.
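
The tag-bump step can be sketched in a few lines. The actual pipeline script is not part of this document, so the helper name and the assumption that the values file uses a plain `tag:` key are ours:

```python
import re
from pathlib import Path


def bump_image_tag(values_path: str, new_tag: str) -> str:
    """Rewrite the first `tag:` entry in a Helm values file and return the
    updated text. Sketch only: the real pipeline step may differ."""
    text = Path(values_path).read_text()
    # (?m) makes ^/$ match per line; count=1 touches only the first tag key
    updated, n = re.subn(r'(?m)^(\s*tag:\s*).*$', rf'\g<1>"{new_tag}"', text, count=1)
    if n == 0:
        raise ValueError(f"no `tag:` key found in {values_path}")
    Path(values_path).write_text(updated)
    return updated
```

The commit and push of the modified values file back to blinkin-gitops (step 5's trigger) would follow as ordinary `git` operations.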

### Image Tagging Convention

| Convention | Example | Meaning |
|---|---|---|
| `X.Y.Z-dev` | 1.1.455-dev | Development build, auto-incremented |
| `X.Y.Z` | 0.5.52 | Release build (UAT/PROD) |
| `latest` | latest | Mutable tag (used only for superadmin) |
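
The convention is simple enough to express as a small classifier; this helper is illustrative only (no such function exists in the pipelines), but it makes the three tag shapes explicit:

```python
import re

# Matches X.Y.Z with an optional -dev suffix (shapes from the table above)
_TAG_RE = re.compile(r"^\d+\.\d+\.\d+(-dev)?$")


def classify_tag(tag: str) -> str:
    """Return 'dev', 'release', or 'mutable' per the tagging convention."""
    if tag == "latest":
        return "mutable"
    m = _TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognized image tag: {tag}")
    return "dev" if m.group(1) else "release"
```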

## 7. Docker Images

Image repositories and versions below are taken from the DEV AWS and PROD AWS environment values files (the DEV AWS track is the most actively updated):

### AI Services (Zweistein)

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| admin | blinkin-ai/admin | 0.4.762-dev | 0.5.52 |
| server | blinkin-ai/server | 0.4.240-dev | 0.5.37 |
| queryengine | blinkin-ai/query-engine | 0.4.232-dev | 0.5.71 |
| ingestionworker | blinkin-ai/data-ingestion-worker | 0.4.29-dev | 0.5.3 |

### Studio Services

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| studio-fe | blinkin-studio/studio-fe | 1.1.455-dev | 1.1.40 |
| houston | blinkin-studio/houston | 1.1.156-dev | 1.1.32 |
| studio-api | blinkin-studio/studio-api | 1.1.142-dev | 1.1.32 |
| superadmin | blinkin-studio/studio-superadmin | latest | latest |

### Utility Images

| Image | Version | Purpose |
|---|---|---|
| db-redis-connection-checker | 1.0.0 | Init container that waits for PostgreSQL + Redis readiness before migrations |

## 8. Key Kubernetes Resources

### Deployments

| Name | Namespace Group | Container Port | Replicas (DEV) | Replicas (PROD AWS) | Health Check |
|---|---|---|---|---|---|
| admin | ai | 80 | 1 | 1 | N/A (static SPA) |
| server | ai | 3000 | 1 | 1 | /ai/healthz |
| queryengine | ai | 8000 | 1 | 1 | /healthz |
| studio-fe | studio | 80 | 1 | 1 | /healthz |
| houston | studio | 3000 | 1 | 1 | /healthz |
| studio-api | studio | 3000 | 1 | 1 | /health |
| studio-superadmin | studio-aws | 3000 | 1 | 1 | /health |

### StatefulSets

| Name | Namespace Group | Replicas (DEV) | Replicas (PROD AWS) | Purpose |
|---|---|---|---|---|
| ingestionworker | ai | 3 | 3 | Document ingestion with stable pod identity for Redis Stream consumers |

### Services (ClusterIP)

| Service Name | Target Port | Exposed Port | Protocol |
|---|---|---|---|
| admin-service | 80 | 80 | TCP/HTTP |
| server-service | 3000 | 80 | TCP/HTTP |
| queryengine-service | 8000 | 80 | TCP/HTTP |
| studio-fe | 80 | 80 | TCP/HTTP |
| houston | 3000 | 80 | TCP/HTTP |
| studio-api | 3000 | 80 | TCP/HTTP |
| studio-superadmin | 3000 | 80 | TCP/HTTP |

### ConfigMaps

| ConfigMap Name | Source Chart | Purpose |
|---|---|---|
| server-config-env | blinkin-ai | Zweistein server environment variables |
| queryengine-config-env | blinkin-ai | Query engine environment variables |
| ingestionworker-config-env | blinkin-ai | Ingestion worker environment variables |
| admin-nginx-config | blinkin-ai | NGINX configuration for admin SPA routing |

### Secrets (via ExternalSecrets Operator)

| ExternalSecret Name | Target K8s Secret | Keys Synced |
|---|---|---|
| gcp-keyfile-external-secrets | gcs-keyfile-{env}-es | gcs-key.json (GCS service account) |
| canvas-external-secrets | canvas-secrets-es | CANVAS_TOKEN |
| redis-external-secrets | redis-secrets-es | REDIS_PASS |
| external-api-external-secrets | external-api-secrets-es | ANTHROPIC_API_KEY, EXA_API_KEY, FAL_KEY, GROQ_API_KEY, OPENAI_API_KEY, PERPLEXITY_API_KEY, PEXELS_API_KEY, TAVILY_API_KEY, GOOGLE_API_KEY |
| postgres-external-secrets | postgres-secrets-es | DB_USERNAME, DB_PASSWORD, STUDIO_DB_USERNAME, STUDIO_DB_PASSWORD |
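
As a concrete illustration, the `redis-external-secrets` entry above would look roughly like this under the ExternalSecrets Operator `v1beta1` API and the `gcp-store` ClusterSecretStore; the refresh interval and the assumption that the remote key is also named `REDIS_PASS` are ours:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: redis-external-secrets
spec:
  refreshInterval: 1h                 # illustrative
  secretStoreRef:
    kind: ClusterSecretStore
    name: gcp-store                   # from the Key points table
  target:
    name: redis-secrets-es            # K8s Secret created in the namespace
  data:
    - secretKey: REDIS_PASS           # key inside the K8s Secret
      remoteRef:
        key: REDIS_PASS               # entry in GCP Secret Manager (assumed name)
```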

### Ingress Rules (DEV AWS example)

| Host | Path | Backend Service | Port |
|---|---|---|---|
| app-dev.blinkin.io | / | houston | 80 |
| app-dev.blinkin.io | /studio | studio-fe | 80 |
| app-dev.blinkin.io | /ai | admin-service | 80 |
| app-dev.blinkin.io | /ai/api | server-service | 80 |
| app-dev.blinkin.io | /admin | studio-superadmin | 80 |
| app-dev.blinkin.io | /socket.io | server-service | 80 |
| studio-api-dev.blinkin.io | / | studio-api | 80 |
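
A trimmed sketch of what the corresponding Ingress manifest plausibly looks like (showing two of the paths above; the resource name, cluster-issuer, and TLS secret name are illustrative, not copied from the chart):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blinkin-ai                    # illustrative
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # issuer name assumed
spec:
  ingressClassName: nginx
  tls:
    - hosts: [app-dev.blinkin.io]
      secretName: app-dev-tls         # TLS secret name assumed
  rules:
    - host: app-dev.blinkin.io
      http:
        paths:
          - path: /ai/api             # more specific path listed first
            pathType: Prefix
            backend:
              service: {name: server-service, port: {number: 80}}
          - path: /ai
            pathType: Prefix
            backend:
              service: {name: admin-service, port: {number: 80}}
```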

## 9. Redis Streams Architecture

Redis is used heavily for async job processing between the NestJS server and Python workers.

| Stream Name | Producer | Consumer Group | Consumer |
|---|---|---|---|
| stream:zweistein | server (NestJS) | group:zweistein | ingestionworker (Python) |
| stream:zweistein:notifications | ingestionworker (Python) | group:zweistein:notifications | server (NestJS) |

The server publishes ingestion tasks (document parsing, embedding creation) to the main stream. Ingestion workers consume from it via a consumer group, which lets multiple workers share the load. When a worker finishes, it publishes a notification to the notification stream, which the server consumes to update the UI in real time.
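
The handshake can be sketched with redis-py style calls. The stream and group names come from the table above; the helper functions, field names, and notification payload are illustrative, not the actual NestJS/Python implementations, and the consumer group is assumed to already exist (created once with `XGROUP CREATE`):

```python
import json

STREAM = "stream:zweistein"
GROUP = "group:zweistein"
NOTIFY_STREAM = "stream:zweistein:notifications"


def enqueue_task(r, payload: dict):
    """Producer side (server): append an ingestion job to the main stream.
    `r` is assumed to be a redis-py style client."""
    return r.xadd(STREAM, {"payload": json.dumps(payload)})


def consume_one(r, consumer: str):
    """Worker side: claim one new entry via the consumer group, process it,
    ack it, and publish a completion notification for the server."""
    resp = r.xreadgroup(GROUP, consumer, {STREAM: ">"}, count=1)
    if not resp:
        return None  # nothing new for this group
    _stream, entries = resp[0]
    entry_id, fields = entries[0]
    payload = json.loads(fields[b"payload"])
    # ... parse the document, create embeddings, write to Qdrant ...
    r.xack(STREAM, GROUP, entry_id)
    r.xadd(NOTIFY_STREAM, {"job": entry_id, "status": "done"})
    return payload
```

Because each entry is delivered to exactly one consumer in the group, adding worker replicas (the StatefulSet runs 3) scales ingestion horizontally without duplicating work.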


## 10. External Service Dependencies

| Service | Used By | Purpose |
|---|---|---|
| Auth0 | server | SSO authentication (domain: sso-dev.blinkin.io / picasso-auth-prod.eu.auth0.com) |
| OpenAI | queryengine, ingestionworker | LLM completions (GPT-4o) and embeddings (text-embedding-3-large) |
| Anthropic | queryengine | Claude LLM completions |
| Groq | queryengine | Fast LLM inference |
| Perplexity | queryengine | Search-augmented LLM |
| Deepgram | server | Speech-to-text transcription |
| Mailgun | server, queryengine | Inbound/outbound email (domain: ai.blinkin.io) |
| Stripe | server | Payment processing and subscriptions |
| Tavily | queryengine | Web search API for agents |
| Exa | queryengine | Semantic web search for agents |
| Fal | queryengine | AI image generation |
| Pexels | queryengine | Stock image search |
| LangSmith | queryengine | LLM tracing and observability |
| PostHog | server (PROD) | Product analytics |
| GCS | server, queryengine, ingestionworker | Cloud file storage (bucket: blinkin-ai-{env}-storage) |
| Azure Blob | studio-api | Picasso media storage |
| Inference Server | ingestionworker, server | Custom image/audio processing (inference.blinkin.io) |