# Infrastructure & Deployment

How the Blinkin platform is deployed, configured, and managed across environments.

Source of truth: the `blinkin-gitops` repository.
## 1. Deployment Architecture
The platform runs on Kubernetes (both GKE on GCP and EKS on AWS) with NGINX Ingress Controllers routing external traffic to internal services. Each environment has its own cluster and namespace.
```mermaid
graph TB
    subgraph External
        User([End User / Browser])
        Auth0([Auth0 SSO])
        LLM_APIs([LLM APIs<br/>OpenAI / Anthropic / Groq / Perplexity])
        Deepgram([Deepgram STT])
        Mailgun([Mailgun Email])
        Stripe([Stripe Payments])
        Inference([Inference Server<br/>inference.blinkin.io])
        GCS([GCS / Azure Blob<br/>File Storage])
    end
    subgraph Kubernetes Cluster
        LB[Load Balancer]
        Ingress[NGINX Ingress Controller<br/>+ cert-manager TLS]
        subgraph studio-ns [Studio Namespace]
            Houston[houston<br/>Next.js SSR<br/>port 3000]
            PicassoFE[studio-fe<br/>Picasso Editor<br/>port 80]
            StudioAPI[studio-api<br/>NestJS Backend<br/>port 3000]
            SuperAdmin[studio-superadmin<br/>Admin Panel<br/>port 3000]
        end
        subgraph ai-ns [AI Namespace]
            Admin[admin<br/>Zweistein Admin SPA<br/>NGINX port 80]
            Server[server<br/>Zweistein NestJS<br/>port 3000]
            QueryEngine[queryengine<br/>Python FastAPI<br/>port 8000]
            IngestionWorker[ingestionworker<br/>Python StatefulSet<br/>3 replicas]
        end
        subgraph data-ns [Data Layer]
            Postgres[(PostgreSQL<br/>Cloud SQL / RDS)]
            Redis[(Redis<br/>In-cluster)]
            Qdrant[(Qdrant<br/>Vector DB)]
        end
    end
    User --> LB --> Ingress
    Ingress -->|/ path| Houston
    Ingress -->|/studio path| PicassoFE
    Ingress -->|studio-api-*.blinkin.io| StudioAPI
    Ingress -->|/ai path| Admin
    Ingress -->|/ai/api path| Server
    Ingress -->|/admin path| SuperAdmin
    Server --> QueryEngine
    Server --> Postgres
    Server --> Redis
    QueryEngine --> Qdrant
    QueryEngine --> Redis
    QueryEngine --> LLM_APIs
    QueryEngine --> GCS
    IngestionWorker --> Qdrant
    IngestionWorker --> Redis
    IngestionWorker --> GCS
    IngestionWorker --> Inference
    StudioAPI --> Postgres
    StudioAPI --> Redis
    StudioAPI --> GCS
    Server --> Auth0
    Server --> Mailgun
    Server --> Stripe
    Server --> Deepgram
```
### Key points

| Aspect | Detail |
|---|---|
| Cluster Provider | GKE (GCP, `europe-west3`) for the GCP track; EKS (AWS, `eu-central-1`) for the AWS track |
| Container Registry | GCP: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`<br/>AWS dev: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/`<br/>AWS prod: `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` |
| Ingress | NGINX Ingress Controller with cert-manager for automatic Let's Encrypt TLS |
| Secrets Management | ExternalSecrets Operator syncing from GCP Secret Manager (ClusterSecretStore: `gcp-store`) |
| Database | GCP: Cloud SQL private IP (`10.100.10.3` dev, `10.50.100.3` uat, `10.20.100.3` prod)<br/>AWS: RDS (`blinkin-*-postgres.*.eu-central-1.rds.amazonaws.com`) |
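
The cert-manager setup above usually hinges on a ClusterIssuer that the Ingress resources reference. A minimal sketch, assuming a standard cert-manager + Let's Encrypt configuration; the issuer name and contact email are illustrative, not taken from the repo:

```yaml
# Hypothetical ClusterIssuer matching the cert-manager + NGINX Ingress
# setup described above. Name and email are placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com              # placeholder contact
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx                # matches the NGINX Ingress Controller
```

Ingress resources then request certificates by annotating with the issuer name, and cert-manager provisions the TLS secret automatically.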
## 2. Environment Overview
The platform is deployed across three environments. Currently, the AWS track is the primary active deployment for DEV and PROD, while the GCP track exists for GKE-based setups and UAT.
```mermaid
graph LR
    subgraph DEV [DEV Environment]
        D_AWS[AWS EKS Cluster<br/>Primary Active]
        D_GCP[GCP GKE Cluster<br/>Secondary]
    end
    subgraph UAT [UAT Environment]
        U_GCP[GCP GKE Cluster<br/>AI only]
    end
    subgraph PROD [PROD Environment]
        P_AWS[AWS EKS Cluster<br/>Primary Active]
        P_GCP[GCP GKE Cluster<br/>Scaled to 0]
    end
    DevPush[Dev Branch Push] -->|Auto-deploy| DEV
    ManualRelease[Manual Release] -->|Azure Releases| UAT
    ProdRelease[Production Release] -->|Controlled deploy| PROD
```
### Environment URLs

| Service | DEV (AWS) | UAT (GCP) | PROD (AWS) |
|---|---|---|---|
| Houston (flows) | app-dev.blinkin.io | app-uat.blinkin.io | app.blinkin.io |
| Picasso Editor | app-dev.blinkin.io/studio | (not in UAT) | app.blinkin.io/studio |
| Studio API | studio-api-dev.blinkin.io | (not in UAT) | studio-api.blinkin.io |
| Zweistein Admin | app-dev.blinkin.io/ai | app-uat.blinkin.io/ai | app.blinkin.io/ai |
| Zweistein Server API | app-dev.blinkin.io/ai/api | app-uat.blinkin.io/ai/api | app.blinkin.io/ai/api |
| Superadmin | app-dev.blinkin.io/admin | (not in UAT) | app.blinkin.io/admin |
| WebSocket | app-dev.blinkin.io/socket.io | app-uat.blinkin.io/socket.io | app.blinkin.io/socket.io |
PROD (AWS) additionally serves tenant-specific domains: blinkin.blinkin.io, bosch.blinkin.io, demo.blinkin.io, emmy.blinkin.io, fortinet.blinkin.io, fraunhofer.blinkin.io, heinemann.blinkin.io, huberranner.blinkin.io, tapgig.blinkin.io, telekom.blinkin.io, wilo.blinkin.io.
### Deployment Triggers

| Environment | Trigger | Branch |
|---|---|---|
| DEV | Auto-deploy on push | `dev` |
| UAT | Manual deploy via Azure Releases | `main` / release tag |
| PROD | Controlled deploy | `main` / release tag |
## 3. GitOps Repository Structure
The blinkin-gitops repository is the single source of truth for all Kubernetes deployment configurations. It is organized by environment first, then by service group.
```text
blinkin-gitops/
|
+-- dev/                          # DEV environment
|   +-- ai/                       # Zweistein AI services (GCP cluster)
|   |   +-- blinkin-ai/           # Helm chart
|   |   |   +-- Chart.yaml        # Chart metadata (v0.1.6)
|   |   |   +-- templates/        # K8s resource templates
|   |   +-- values.dev.yaml       # Environment-specific values
|   |
|   +-- ai-aws/                   # Zweistein AI services (AWS cluster)
|   |   +-- blinkin-ai/           # Same Helm chart structure
|   |   +-- values.dev.yaml       # AWS-specific values (ECR images, RDS host)
|   |
|   +-- studio/                   # Studio services (GCP cluster)
|   |   +-- fe/                   # Picasso Editor frontend
|   |   |   +-- studio-fe/Chart.yaml
|   |   |   +-- studio-fe/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- blinks/               # Houston frontend
|   |   |   +-- houston/Chart.yaml
|   |   |   +-- houston/templates/
|   |   |   +-- values.dev.yaml
|   |   +-- api/                  # Studio API backend
|   |       +-- studio-api/Chart.yaml
|   |       +-- studio-api/templates/
|   |       +-- values.dev.yaml
|   |
|   +-- studio-aws/               # Studio services (AWS cluster)
|       +-- fe/                   # Studio FE (same chart, ECR images)
|       +-- blinks/               # Houston (ECR images)
|       +-- api/                  # Studio API (ECR images)
|       +-- superadmin/           # Superadmin panel (AWS only)
|
+-- uat/                          # UAT environment
|   +-- ai/                       # Only AI services deployed to UAT
|       +-- values.uat.yaml
|
+-- prod/                         # PROD environment
|   +-- ai/                       # AI services (GCP) - scaled to 0 replicas
|   +-- ai-aws/                   # AI services (AWS) - PRIMARY ACTIVE
|   +-- studio/                   # Studio services (GCP) - scaled to 0 replicas
|   +-- studio-aws/               # Studio services (AWS) - PRIMARY ACTIVE
|
+-- README.md
```
### What "ai" and "studio" groups mean

| Group | Meaning | Services Included |
|---|---|---|
| ai | Zweistein AI/ML platform | admin (SPA), server (NestJS), queryengine (Python), ingestionworker (Python) |
| studio | Picasso content platform | studio-fe (Picasso Editor), houston (flow viewer), studio-api (NestJS backend) |
| -aws suffix | Same services, configured for AWS infrastructure | Uses ECR images, RDS endpoints, AWS-specific secrets |
Note: The GCP-track prod deployments (prod/ai/ and prod/studio/) have all replica counts set to 0, meaning they are effectively dormant. The active production runs entirely on the AWS track (prod/ai-aws/ and prod/studio-aws/).
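
To make the values-file mechanics concrete, here is a hedged sketch of what an entry in `dev/ai-aws/values.dev.yaml` might look like. The key names and structure are assumptions, not copied from the repo; only the registry path and image tag come from the tables in this document:

```yaml
# Hypothetical excerpt of dev/ai-aws/values.dev.yaml -- the real chart's
# values schema may differ.
server:
  image:
    repository: 598323198652.dkr.ecr.eu-central-1.amazonaws.com/blinkin-ai/server
    tag: 0.4.240-dev
  replicas: 1

# The dormant GCP-track prod values would instead pin replicas to 0:
# server:
#   replicas: 0
```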
## 4. Helm Chart Structure

### Zweistein AI Chart (blinkin-ai)

- Chart version: 0.1.6 (gitops dev) / 0.1.5 (source repo)
- App version: 1.16.0

| Template File | K8s Resource | Purpose |
|---|---|---|
| admin.deployment.yaml | Deployment | Zweistein Admin SPA (NGINX serving React app) |
| admin.service.yaml | Service | ClusterIP on port 80 -> container 80 |
| admin.nginx.configmap.yaml | ConfigMap | NGINX config for SPA routing + CSP headers |
| server.deployment.yaml | Deployment | Zweistein NestJS backend server |
| server.service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| server.configmap.yaml | ConfigMap | Server env vars from values file |
| queryengine.deployment.yaml | Deployment | Python FastAPI query engine |
| queryengine.service.yaml | Service | ClusterIP on port 80 -> container 8000 |
| queryengine.configmap.yaml | ConfigMap | Query engine env vars from values file |
| ingestionworker.statefulset.yaml | StatefulSet | Python ingestion workers (3 replicas) |
| ingestionworker.configmap.yaml | ConfigMap | Ingestion worker env vars from values file |
| cluster.ingress.yaml | Ingress | NGINX Ingress routing /ai, /ai/api, /socket.io |
| cluster.frontendconfig.yaml | FrontendConfig | GKE HTTPS redirect config |
| externalsecrets.yaml | ExternalSecret | Syncs secrets from GCP Secret Manager |
| externalsecrets-force-sync.job.yaml | Job | Forces a re-sync of external secrets |
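
The SPA routing handled by `admin.nginx.configmap.yaml` typically comes down to a `try_files` fallback so client-side routes resolve to `index.html`. A hedged sketch of the shape such a ConfigMap might take; the server block and CSP value are illustrative, not copied from the chart:

```yaml
# Hypothetical admin NGINX ConfigMap -- the real chart's config differs.
apiVersion: v1
kind: ConfigMap
metadata:
  name: admin-nginx-config
data:
  default.conf: |
    server {
      listen 80;
      root /usr/share/nginx/html;
      # SPA routing: unknown paths fall back to index.html
      location / {
        try_files $uri $uri/ /index.html;
      }
      # Example CSP header (placeholder policy)
      add_header Content-Security-Policy "default-src 'self'";
    }
```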
### Picasso FE Charts (studio-fe and houston)

- Chart version: 0.1.0
- App version: 1.16.0

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | Frontend app container |
| service.yaml | Service | ClusterIP on port 80 -> container port |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |
### Studio API Chart (studio-api)

- Chart version: 0.1.0
- App version: 1.0.0

| Template File | K8s Resource | Purpose |
|---|---|---|
| deployment.yaml | Deployment | NestJS API container |
| service.yaml | Service | ClusterIP on port 80 -> container 3000 |
| ingress.yaml | Ingress | NGINX Ingress with TLS |
| migrations-job.yaml | Job (Helm hook) | Runs `npm run migration:run` before deployments |
| serviceaccount.yaml | ServiceAccount | Pod identity (creation disabled) |
| _helpers.tpl | Helper | Naming conventions and labels |
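
A pre-deploy migration Job like `migrations-job.yaml` is normally wired up with standard Helm hook annotations. A hedged sketch under that assumption; the Job name, hook policies, and image reference are illustrative, not copied from the chart:

```yaml
# Hypothetical shape of migrations-job.yaml as a Helm pre-upgrade hook.
apiVersion: batch/v1
kind: Job
metadata:
  name: studio-api-migrations
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrations
          image: blinkin-studio/studio-api:1.1.142-dev   # same app image, illustrative tag
          command: ["npm", "run", "migration:run"]
```

With `pre-install,pre-upgrade`, Helm blocks the rollout until the Job succeeds, which is how the migration step gates deployments in the pipeline.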
## 5. Service Connectivity Map
This diagram shows how services communicate within the Kubernetes cluster.
```mermaid
graph LR
    subgraph Ingress Layer
        NGINX[NGINX Ingress]
    end
    subgraph Studio Services
        Houston[houston<br/>:3000]
        StudioFE[studio-fe<br/>:80]
        StudioAPI[studio-api<br/>:3000]
        SuperAdmin[studio-superadmin<br/>:3000]
    end
    subgraph AI Services
        Admin[admin<br/>:80 NGINX]
        Server[server<br/>:3000]
        QE[queryengine<br/>:8000]
        IW[ingestionworker<br/>x3 StatefulSet]
    end
    subgraph Data Stores
        PG[(PostgreSQL<br/>:5432)]
        Redis[(Redis<br/>:6379)]
        Qdrant[(Qdrant<br/>:6333)]
    end
    subgraph External APIs
        LLMs[OpenAI / Anthropic<br/>Groq / Perplexity]
        InferenceSvc[inference.blinkin.io]
    end
    %% Ingress routing
    NGINX -->|"/ (HTTP)"| Houston
    NGINX -->|"/studio (HTTP)"| StudioFE
    NGINX -->|"/ai (HTTP)"| Admin
    NGINX -->|"/ai/api (HTTP)"| Server
    NGINX -->|"/admin (HTTP)"| SuperAdmin
    NGINX -->|"studio-api-*.blinkin.io (HTTP)"| StudioAPI
    NGINX -->|"/socket.io (WS)"| Server
    %% Server -> internal services
    Server -->|"HTTP :80"| QE
    Server -->|"TCP :5432"| PG
    Server -->|"TCP :6379<br/>Streams + Pub/Sub"| Redis
    %% Query Engine connections
    QE -->|"TCP :6333<br/>gRPC/HTTP"| Qdrant
    QE -->|"TCP :6379"| Redis
    QE -->|"HTTPS"| LLMs
    %% Ingestion Worker connections
    IW -->|"TCP :6333"| Qdrant
    IW -->|"TCP :6379<br/>Stream consumer"| Redis
    IW -->|"HTTPS"| InferenceSvc
    %% Studio API connections
    StudioAPI -->|"TCP :5432"| PG
    StudioAPI -->|"TCP :6379"| Redis
```
### Key Communication Patterns

| From | To | Protocol | Purpose |
|---|---|---|---|
| server | queryengine-service | HTTP :80 | LLM queries, agent execution, RAG |
| server | PostgreSQL | TCP :5432 | Database reads/writes (TypeORM) |
| server | Redis | TCP :6379 | Job streams (`stream:zweistein`), notifications, pub/sub |
| queryengine | Qdrant | TCP :6333 | Vector similarity search for embeddings |
| queryengine | Redis | TCP :6379 | Caching and state |
| queryengine | LLM APIs | HTTPS | OpenAI, Anthropic, Groq, Perplexity calls |
| ingestionworker | Redis | TCP :6379 | Consumes from `stream:zweistein` (consumer group) |
| ingestionworker | Qdrant | TCP :6333 | Writes document embeddings |
| ingestionworker | Inference | HTTPS | Image explanation, audio transcription |
| studio-api | PostgreSQL | TCP :5432 | Flow/tenant data (TypeORM) |
| studio-api | Redis | TCP :6379 | Bull queues, caching |
### Internal DNS Names

Services discover each other via Kubernetes DNS:

| Service | Internal DNS | Port |
|---|---|---|
| Zweistein Server | server-service | 80 (-> 3000) |
| Zweistein Admin | admin-service | 80 (-> 80) |
| Query Engine | queryengine-service | 80 (-> 8000) |
| Studio API | studio-api | 80 (-> 3000) |
| Houston | houston | 80 (-> 3000) |
| Studio FE | studio-fe | 80 (-> 80) |
| Redis | redis-master.redis.svc.cluster.local | 6379 |
| Qdrant | qdrant.qdrant.svc.cluster.local | 6333 |
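
To illustrate how these DNS names get consumed, here is a hedged sketch of the kind of environment entries a server ConfigMap might carry. The variable names are assumptions, not taken from the actual `server-config-env`:

```yaml
# Hypothetical env entries showing in-cluster DNS usage -- the real
# server-config-env uses its own variable names.
apiVersion: v1
kind: ConfigMap
metadata:
  name: server-config-env
data:
  QUERY_ENGINE_URL: "http://queryengine-service:80"
  REDIS_HOST: "redis-master.redis.svc.cluster.local"
  REDIS_PORT: "6379"
  QDRANT_URL: "http://qdrant.qdrant.svc.cluster.local:6333"
```

Short names like `queryengine-service` resolve only within the same namespace; the fully qualified `*.svc.cluster.local` forms work across namespaces, which is why Redis and Qdrant are addressed that way.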
## 6. CI/CD Pipeline
```mermaid
graph TD
    Dev[Developer pushes<br/>to dev branch] --> Pipeline[Azure DevOps Pipeline]
    Pipeline --> Build[Build Docker image]
    Build --> Push[Push to Container Registry<br/>GCP Artifact Registry or AWS ECR]
    Push --> UpdateValues[Update image tag in<br/>blinkin-gitops values file]
    UpdateValues --> GitCommit[Commit + push to<br/>blinkin-gitops repo]
    GitCommit --> ArgoCD[ArgoCD / Flux detects<br/>git changes]
    ArgoCD --> HelmUpgrade[Helm upgrade/install<br/>in target namespace]
    HelmUpgrade --> K8s[Kubernetes applies<br/>new Deployment spec]
    K8s --> Rolling[Rolling update<br/>replaces pods]
    subgraph Pre-deploy Hooks
        MigrationJob[Migration Job<br/>npm run migration:run]
    end
    HelmUpgrade --> MigrationJob
    MigrationJob -->|success| K8s
```
### Pipeline Flow Details

1. Developer pushes code to the `dev` branch of any service repo (picasso-fe, studio-api, zweistein).
2. An Azure DevOps pipeline picks up the push and builds a Docker image with an auto-incremented version tag (e.g., `1.1.455-dev`).
3. The image is pushed to the container registry:
   - GCP track: `europe-west3-docker.pkg.dev/blinkin-ai-prod/blinkin-docker-registry/`
   - AWS track: `598323198652.dkr.ecr.eu-central-1.amazonaws.com/` (dev) or `306630622817.dkr.ecr.eu-central-1.amazonaws.com/` (prod)
4. The GitOps values file is updated with the new image tag (e.g., in `dev/ai-aws/values.dev.yaml`).
5. The GitOps controller (ArgoCD or equivalent) detects the commit in `blinkin-gitops` and runs a Helm upgrade.
6. Helm hooks execute pre-deploy tasks such as database migrations (`migrations-job.yaml` runs `npm run migration:run` for studio-api).
7. Kubernetes performs a rolling update, replacing old pods with new ones.
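
The first few steps of that flow can be sketched as an Azure Pipelines fragment. Step names, the `REGISTRY`/`VERSION` variables, and the gitops-update script are illustrative assumptions; the real pipelines differ:

```yaml
# Hypothetical azure-pipelines.yml fragment for the dev auto-deploy flow.
trigger:
  branches:
    include:
      - dev

steps:
  - script: docker build -t $(REGISTRY)/blinkin-ai/server:$(VERSION)-dev .
    displayName: Build Docker image
  - script: docker push $(REGISTRY)/blinkin-ai/server:$(VERSION)-dev
    displayName: Push to container registry
  - script: |
      # Bump the image tag in blinkin-gitops and push; the GitOps
      # controller picks up the commit from there (script is hypothetical)
      ./scripts/update-gitops-tag.sh dev/ai-aws/values.dev.yaml $(VERSION)-dev
    displayName: Update gitops values file
```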
### Image Tagging Convention

| Convention | Example | Meaning |
|---|---|---|
| X.Y.Z-dev | 1.1.455-dev | Development build, auto-incremented |
| X.Y.Z | 0.5.52 | Release build (UAT/PROD) |
| latest | latest | Mutable tag (used only for superadmin) |
## 7. Docker Images

Image repositories and versions below are taken from the environment values files; the DEV AWS values are the most actively updated.
### AI Services (Zweistein)

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| admin | blinkin-ai/admin | 0.4.762-dev | 0.5.52 |
| server | blinkin-ai/server | 0.4.240-dev | 0.5.37 |
| queryengine | blinkin-ai/query-engine | 0.4.232-dev | 0.5.71 |
| ingestionworker | blinkin-ai/data-ingestion-worker | 0.4.29-dev | 0.5.3 |
### Studio Services

| Service | Image Repository | Version (DEV AWS) | Version (PROD AWS) |
|---|---|---|---|
| studio-fe | blinkin-studio/studio-fe | 1.1.455-dev | 1.1.40 |
| houston | blinkin-studio/houston | 1.1.156-dev | 1.1.32 |
| studio-api | blinkin-studio/studio-api | 1.1.142-dev | 1.1.32 |
| superadmin | blinkin-studio/studio-superadmin | latest | latest |
### Utility Images

| Image | Version | Purpose |
|---|---|---|
| db-redis-connection-checker | 1.0.0 | Init container that waits for PostgreSQL + Redis readiness before migrations |
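
A hedged sketch of how such a checker is typically wired in as an init container gating the migrations pod. The env variable names and the RDS host placeholder are illustrative, not copied from the charts:

```yaml
# Hypothetical pod spec excerpt: the init container blocks until
# PostgreSQL and Redis answer, then migrations run.
spec:
  initContainers:
    - name: wait-for-db-and-redis
      image: db-redis-connection-checker:1.0.0
      env:
        - name: DB_HOST
          value: "blinkin-dev-postgres.<id>.eu-central-1.rds.amazonaws.com"  # placeholder
        - name: REDIS_HOST
          value: "redis-master.redis.svc.cluster.local"
  containers:
    - name: migrations
      image: blinkin-studio/studio-api:1.1.142-dev   # illustrative tag
      command: ["npm", "run", "migration:run"]
```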
## 8. Key Kubernetes Resources

### Deployments

| Name | Namespace Group | Container Port | Replicas (DEV) | Replicas (PROD AWS) | Health Check |
|---|---|---|---|---|---|
| admin | ai | 80 | 1 | 1 | N/A (static SPA) |
| server | ai | 3000 | 1 | 1 | /ai/healthz |
| queryengine | ai | 8000 | 1 | 1 | /healthz |
| studio-fe | studio | 80 | 1 | 1 | /healthz |
| houston | studio | 3000 | 1 | 1 | /healthz |
| studio-api | studio | 3000 | 1 | 1 | /health |
| studio-superadmin | studio-aws | 3000 | 1 | 1 | /health |
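
The health-check paths above map onto standard Kubernetes probes. A hedged sketch for the server container, using the `/ai/healthz` path from the table; the probe timings are illustrative defaults, not values from the chart:

```yaml
# Hypothetical probe block for the server Deployment.
containers:
  - name: server
    ports:
      - containerPort: 3000
    livenessProbe:
      httpGet:
        path: /ai/healthz
        port: 3000
      initialDelaySeconds: 15
      periodSeconds: 20
    readinessProbe:
      httpGet:
        path: /ai/healthz
        port: 3000
      periodSeconds: 10
```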
### StatefulSets

| Name | Namespace Group | Replicas (DEV) | Replicas (PROD AWS) | Purpose |
|---|---|---|---|---|
| ingestionworker | ai | 3 | 3 | Document ingestion with stable pod identity for Redis Stream consumers |
### Services (ClusterIP)

| Service Name | Target Port | Exposed Port | Protocol |
|---|---|---|---|
| admin-service | 80 | 80 | TCP/HTTP |
| server-service | 3000 | 80 | TCP/HTTP |
| queryengine-service | 8000 | 80 | TCP/HTTP |
| studio-fe | 80 | 80 | TCP/HTTP |
| houston | 3000 | 80 | TCP/HTTP |
| studio-api | 3000 | 80 | TCP/HTTP |
| studio-superadmin | 3000 | 80 | TCP/HTTP |
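
As an example of the 80 -> 3000 mapping, a hedged sketch of what the `server-service` manifest might look like; the selector label is an assumption:

```yaml
# Hypothetical server-service manifest implementing the mapping above.
apiVersion: v1
kind: Service
metadata:
  name: server-service
spec:
  type: ClusterIP
  selector:
    app: server          # assumed pod label
  ports:
    - port: 80           # exposed in-cluster port
      targetPort: 3000   # container port
      protocol: TCP
```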
### ConfigMaps

| ConfigMap Name | Source Chart | Purpose |
|---|---|---|
| server-config-env | blinkin-ai | Zweistein server environment variables |
| queryengine-config-env | blinkin-ai | Query engine environment variables |
| ingestionworker-config-env | blinkin-ai | Ingestion worker environment variables |
| admin-nginx-config | blinkin-ai | NGINX configuration for admin SPA routing |
### Secrets (via ExternalSecrets Operator)

| ExternalSecret Name | Target K8s Secret | Keys Synced |
|---|---|---|
| gcp-keyfile-external-secrets | gcs-keyfile-{env}-es | gcs-key.json (GCS service account) |
| canvas-external-secrets | canvas-secrets-es | CANVAS_TOKEN |
| redis-external-secrets | redis-secrets-es | REDIS_PASS |
| external-api-external-secrets | external-api-secrets-es | ANTHROPIC_API_KEY, EXA_API_KEY, FAL_KEY, GROQ_API_KEY, OPENAI_API_KEY, PERPLEXITY_API_KEY, PEXELS_API_KEY, TAVILY_API_KEY, GOOGLE_API_KEY |
| postgres-external-secrets | postgres-secrets-es | DB_USERNAME, DB_PASSWORD, STUDIO_DB_USERNAME, STUDIO_DB_PASSWORD |
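
A hedged sketch of how one of these ExternalSecrets ties together: it references the `gcp-store` ClusterSecretStore and materializes the target K8s Secret. The refresh interval and the remote GCP Secret Manager key name are assumptions:

```yaml
# Hypothetical ExternalSecret wiring redis-secrets-es via gcp-store.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: redis-external-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: gcp-store
  target:
    name: redis-secrets-es
  data:
    - secretKey: REDIS_PASS
      remoteRef:
        key: redis-pass-dev        # assumed GCP Secret Manager name
```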
### Ingress Rules (DEV AWS example)

| Host | Path | Backend Service | Port |
|---|---|---|---|
| app-dev.blinkin.io | / | houston | 80 |
| app-dev.blinkin.io | /studio | studio-fe | 80 |
| app-dev.blinkin.io | /ai | admin-service | 80 |
| app-dev.blinkin.io | /ai/api | server-service | 80 |
| app-dev.blinkin.io | /admin | studio-superadmin | 80 |
| app-dev.blinkin.io | /socket.io | server-service | 80 |
| studio-api-dev.blinkin.io | / | studio-api | 80 |
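
Two of these rules, expressed as a hedged Ingress excerpt; the resource name, annotations, and TLS secret name are illustrative, not copied from the charts:

```yaml
# Hypothetical Ingress excerpt for the /ai and /ai/api rules above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blinkin-dev
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed issuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts: [app-dev.blinkin.io]
      secretName: app-dev-tls
  rules:
    - host: app-dev.blinkin.io
      http:
        paths:
          - path: /ai
            pathType: Prefix
            backend:
              service:
                name: admin-service
                port: { number: 80 }
          - path: /ai/api
            pathType: Prefix
            backend:
              service:
                name: server-service
                port: { number: 80 }
```

NGINX Ingress matches the longest prefix first, so `/ai/api` requests reach `server-service` even though `/ai` also matches.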
## 9. Redis Streams Architecture
Redis is used heavily for async job processing between the NestJS server and Python workers.

| Stream Name | Producer | Consumer Group | Consumer |
|---|---|---|---|
| stream:zweistein | server (NestJS) | group:zweistein | ingestionworker (Python) |
| stream:zweistein:notifications | ingestionworker (Python) | group:zweistein:notifications | server (NestJS) |
The server publishes ingestion tasks (document parsing, embedding creation) to the main stream. Ingestion workers consume from this stream using consumer groups (allowing multiple workers to share the load). When a worker finishes, it publishes a notification back to the notification stream, which the server consumes to update the UI in real-time.
## 10. External Service Dependencies

| Service | Used By | Purpose |
|---|---|---|
| Auth0 | server | SSO authentication (domains: sso-dev.blinkin.io / picasso-auth-prod.eu.auth0.com) |
| OpenAI | queryengine, ingestionworker | LLM completions (GPT-4o) and embeddings (text-embedding-3-large) |
| Anthropic | queryengine | Claude LLM completions |
| Groq | queryengine | Fast LLM inference |
| Perplexity | queryengine | Search-augmented LLM |
| Deepgram | server | Speech-to-text transcription |
| Mailgun | server, queryengine | Inbound/outbound email (domain: ai.blinkin.io) |
| Stripe | server | Payment processing and subscriptions |
| Tavily | queryengine | Web search API for agents |
| Exa | queryengine | Semantic web search for agents |
| Fal | queryengine | AI image generation |
| Pexels | queryengine | Stock image search |
| LangSmith | queryengine | LLM tracing and observability |
| PostHog | server (PROD) | Product analytics |
| GCS | server, queryengine, ingestionworker | Cloud file storage (bucket: blinkin-ai-{env}-storage) |
| Azure Blob | studio-api | Picasso media storage |
| Inference Server | ingestionworker, server | Custom image/audio processing (inference.blinkin.io) |