Architecture

Technology Stack

Layer           Technology
Runtime         Next.js 15 (App Router) + TypeScript
UI              Mantine v7 + Tailwind CSS
Database        SQLite (default) or MongoDB (multi-tenant)
Authentication  JWT via jose (Edge-compatible)
Logging         Winston (structured, JSON/pretty)
Cache           Memory or Redis (ioredis)
Email           Nodemailer + Handlebars templates
Build           Turbopack
Testing         Vitest
Deployment      Docker + Kubernetes (Helm)

Request Flow

Every request passes through the same pipeline:

Client Request
          │
          ▼
┌────────────────────┐
│ Middleware         │  ← JWT validation, feature checks, CORS
│ (Edge)             │     Injects: x-user-id, x-tenant-id, x-request-id
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ withRequestContext │  ← Establishes AsyncLocalStorage context
│ (Route wrapper)    │     Checks isShuttingDown → 503
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Route Handler      │  ← requireApiToken (client routes)
│                    │     Validation, quota checks
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Domain Service     │  ← Business logic
│ (inference,        │     Uses: runtimePool, withResilience, cache
│  vector, etc.)     │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Provider           │  ← External API call (LLM, vector DB, etc.)
│ Runtime            │     Wrapped in withResilience (retry + circuit breaker)
└────────────────────┘
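
The route-wrapper step of the pipeline above can be sketched as follows, assuming Node's AsyncLocalStorage from node:async_hooks. The GatewayRequest and GatewayResponse shapes, the getRequestContext and beginShutdown helpers, and the fallback request ID are hypothetical simplifications for illustration, not the gateway's actual API.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Hypothetical, simplified shapes; real route handlers receive Next.js objects.
interface GatewayRequest {
  headers: Record<string, string | undefined>;
}
interface GatewayResponse {
  status: number;
  body: string;
}
interface RequestContext {
  requestId: string;
  userId: string | undefined;
  tenantId: string | undefined;
}
type Handler = (req: GatewayRequest) => GatewayResponse | Promise<GatewayResponse>;

const storage = new AsyncLocalStorage<RequestContext>();
let isShuttingDown = false;

// Any code running inside a wrapped handler can read the current context.
export function getRequestContext(): RequestContext | undefined {
  return storage.getStore();
}

// Assumed to be called by the lifecycle module once a shutdown signal arrives.
export function beginShutdown(): void {
  isShuttingDown = true;
}

export function withRequestContext(handler: Handler): Handler {
  return (req) => {
    // Reject new work while the process is draining.
    if (isShuttingDown) {
      return { status: 503, body: "Service is shutting down" };
    }
    // Headers injected by the Edge middleware become the per-request context.
    const context: RequestContext = {
      requestId: req.headers["x-request-id"] ?? randomUUID(),
      userId: req.headers["x-user-id"],
      tenantId: req.headers["x-tenant-id"],
    };
    // Everything the handler calls, sync or async, sees this context.
    return storage.run(context, () => handler(req));
  };
}
```

Because the context travels with the async call chain, a logger or domain service can call getRequestContext() without the request being passed down explicitly.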

Multi-Tenant Data Model

The gateway supports two database backends. SQLite is the default for zero-dependency setups; MongoDB is available for production deployments requiring higher concurrency.

SQLite (default)

Each tenant gets a separate .sqlite file, which gives full data isolation with no external services:

data/
├── console_main.sqlite       # Shared database (tenant registry)
├── tenant_acme.sqlite        # Tenant "acme"
├── tenant_globex.sqlite      # Tenant "globex"
└── ...
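
As a rough illustration of the file layout above, the helpers below map a tenant slug to its database file. The function names, the dataDir parameter, and the slug pattern are assumptions made for this sketch; the gateway's real path resolution may differ.

```typescript
import { join } from "node:path";

// Assumed slug format: lowercase letters, digits, hyphens and underscores.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9_-]*$/;

// Path of the shared registry database, following the layout above.
export function mainDbPath(dataDir: string): string {
  return join(dataDir, "console_main.sqlite");
}

// Path of one tenant's database file. Rejecting unexpected characters keeps a
// slug such as "../console_main" from escaping the data directory.
export function tenantDbPath(dataDir: string, slug: string): string {
  if (!SLUG_PATTERN.test(slug)) {
    throw new Error(`Invalid tenant slug: ${slug}`);
  }
  return join(dataDir, `tenant_${slug}.sqlite`);
}
```

Validating the slug before building the path matters here because the slug is the only thing separating one tenant's file from another's.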

MongoDB

MongoDB Server
├── console_main             # Shared database
│   └── tenants              # Tenant metadata, license, slug
├── tenant_{slug}            # Per-company database
│   ├── users                # User accounts
│   ├── api_tokens           # API tokens
│   ├── projects             # Projects
│   ├── models               # Model configurations
│   ├── provider_configs     # Provider credentials (encrypted)
│   ├── vector_indexes       # Vector index metadata
│   ├── model_usage_logs     # Inference usage logs
│   ├── agent_tracing_*      # Tracing sessions & events
│   ├── guardrails           # Guardrail policies
│   ├── prompts              # Prompt templates
│   ├── rag_modules          # RAG configurations
│   ├── memory_stores        # Memory store configs
│   └── ...
└── tenant_{another_slug}    # Another company (complete isolation)

Switch between providers with DB_PROVIDER=sqlite (default) or DB_PROVIDER=mongodb.
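
A minimal sketch of how such a switch might be read at startup is shown below. The resolveDbProvider function is hypothetical, and treating the value case-insensitively and an empty value as the default are assumptions of this example rather than documented behavior.

```typescript
type DbProvider = "sqlite" | "mongodb";

// Reads DB_PROVIDER from an environment-like object, defaulting to SQLite.
export function resolveDbProvider(env: Record<string, string | undefined>): DbProvider {
  // Assumption: unset or empty means "use the default".
  const raw = env["DB_PROVIDER"]?.trim().toLowerCase() || "sqlite";
  if (raw === "sqlite" || raw === "mongodb") {
    return raw;
  }
  // Failing fast avoids silently starting against the wrong backend.
  throw new Error(`Unsupported DB_PROVIDER "${raw}" (expected "sqlite" or "mongodb")`);
}
```

In an application this would typically be called once as resolveDbProvider(process.env) during config validation.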

Core Module Stack

The infrastructure layer (src/lib/core/) provides the cross-cutting modules that all services rely on:

Module           Purpose
Config           Centralized typed configuration with ENV abstraction
Logger           Structured logging with automatic request context
Request Context  Per-request AsyncLocalStorage propagation
Cache            Provider-based caching (none / memory / Redis)
Resilience       Retry with exponential back-off + circuit breaker
Runtime Pool     LRU cache for provider SDK client instances
Async Tasks      Fire-and-forget background operations
Health Checks    Subsystem health registry for readiness probes
Lifecycle        Graceful shutdown with ordered handler teardown
CORS             Configurable CORS for client API endpoints
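
To make the Resilience row concrete, here is a small sketch that combines retry with exponential back-off and a simple circuit breaker. The createResilience factory, its option names, and the exact breaker policy (open after N consecutive failed calls, allow a single probe after a cool-down) are assumptions for illustration; the gateway's withResilience may be configured and implemented differently.

```typescript
interface ResilienceOptions {
  maxAttempts: number; // total tries per call, including the first
  baseDelayMs: number; // delay before the first retry; doubles on each retry
  failureThreshold: number; // consecutive failed calls before the circuit opens
  cooldownMs: number; // how long the circuit stays open
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical factory: every call through the returned wrapper shares one breaker.
export function createResilience(options: ResilienceOptions) {
  let consecutiveFailures = 0;
  let openedAt: number | null = null;

  return async function withResilience<T>(operation: () => Promise<T>): Promise<T> {
    // While the circuit is open, fail fast until the cool-down has elapsed.
    if (openedAt !== null) {
      if (Date.now() - openedAt < options.cooldownMs) {
        throw new Error("Circuit open: call rejected without contacting the provider");
      }
      openedAt = null; // half-open: let this call through as a probe
    }

    let lastError: unknown;
    for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
      try {
        const result = await operation();
        consecutiveFailures = 0; // any success closes the circuit again
        return result;
      } catch (error) {
        lastError = error;
        if (attempt < options.maxAttempts - 1) {
          // Back-off grows as baseDelayMs * 1, 2, 4, ...
          await sleep(options.baseDelayMs * 2 ** attempt);
        }
      }
    }

    // All attempts failed: count the call and open the circuit at the threshold.
    consecutiveFailures += 1;
    if (consecutiveFailures >= options.failureThreshold) {
      openedAt = Date.now();
    }
    throw lastError;
  };
}
```

Retries absorb short transient errors, while the breaker stops a failing provider from being hit repeatedly once retries alone are clearly not helping.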

Bootstrap Sequence

The application bootstrap is managed by src/instrumentation.ts (Next.js instrumentation hook):

1. Validate config (getConfig + validateConfig)
2. Initialize lifecycle (signal handlers: SIGTERM, SIGINT)
3. Initialize cache provider (memory / Redis / none)
4. Register health check for cache
5. Register shutdown handlers (async-tasks → cache → runtime-pool)
6. Start background schedulers (inference monitoring, alerts)
7. Log startup summary
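
Steps 2 and 5 can be pictured with the sketch below: shutdown handlers are registered in the order they should run, and a signal triggers them one after another. The registerShutdownHandler, shutdown, and initLifecycle names are hypothetical, as is the choice to continue past a failing handler.

```typescript
type ShutdownHandler = () => void | Promise<void>;

interface RegisteredHandler {
  name: string;
  handler: ShutdownHandler;
}

const handlers: RegisteredHandler[] = [];
let shutdownPromise: Promise<string[]> | null = null;

// Handlers run in registration order, so register them in teardown order
// (for example async-tasks, then cache, then runtime-pool).
export function registerShutdownHandler(name: string, handler: ShutdownHandler): void {
  handlers.push({ name, handler });
}

// Runs every handler once, sequentially, and resolves with the names that completed.
export function shutdown(): Promise<string[]> {
  if (shutdownPromise === null) {
    shutdownPromise = (async () => {
      const completed: string[] = [];
      for (const { name, handler } of handlers) {
        try {
          await handler();
          completed.push(name);
        } catch (error) {
          // Assumption: a failing handler should not block the remaining steps.
          console.error(`Shutdown handler "${name}" failed`, error);
        }
      }
      return completed;
    })();
  }
  // Repeated signals reuse the same in-flight teardown.
  return shutdownPromise;
}

// Installs the signal handlers that trigger the ordered teardown.
export function initLifecycle(): void {
  for (const signal of ["SIGTERM", "SIGINT"] as const) {
    process.once(signal, () => {
      void shutdown().then(() => process.exit(0));
    });
  }
}
```

Draining background tasks first lets them finish while the cache and pooled provider clients are still available to them.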

Service Layer

Business logic is in src/lib/services/:

Service                    Responsibility
models/inferenceService    Chat completions & embeddings (OpenAI-compatible)
models/runtimeService      Provider runtime creation with pool caching
models/modelService        Model CRUD with metadata caching
vector/vectorService       Vector index CRUD, upsert, query
files/fileService          File upload/download/delete via provider runtime
guardrail/                 Input/output evaluation with regex/keyword/LLM
agentTracing/              Tracing session & event persistence
providers/providerService  Provider config CRUD, credential encryption
rag/                       RAG module management, document ingestion, queries
memory/                    Memory store CRUD, semantic search
prompts/                   Prompt template management, versioning
apiTokenAuth               API token validation with caching
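
The "pool caching" used by models/runtimeService can be illustrated with a small LRU cache. The RuntimePool class below, its getOrCreate method, the onEvict callback, and the key format in the usage example are hypothetical; they only show the general technique of keeping a bounded set of provider clients and evicting the least recently used one.

```typescript
// Minimal LRU cache that relies on Map preserving insertion order.
export class RuntimePool<T> {
  private readonly entries = new Map<string, T>();
  private readonly maxSize: number;
  private readonly onEvict: (key: string, runtime: T) => void;

  constructor(maxSize: number, onEvict: (key: string, runtime: T) => void = () => {}) {
    this.maxSize = maxSize;
    this.onEvict = onEvict;
  }

  // Returns the cached runtime for the key, creating it on a miss.
  getOrCreate(key: string, create: () => T): T {
    if (this.entries.has(key)) {
      const cached = this.entries.get(key) as T;
      // Re-insert so the entry becomes the most recently used.
      this.entries.delete(key);
      this.entries.set(key, cached);
      return cached;
    }
    const runtime = create();
    this.entries.set(key, runtime);
    if (this.entries.size > this.maxSize) {
      // The first entry in iteration order is the least recently used.
      const [oldestKey, oldest] = this.entries.entries().next().value as [string, T];
      this.entries.delete(oldestKey);
      this.onEvict(oldestKey, oldest); // e.g. close sockets held by the client
    }
    return runtime;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A caller might key the pool by tenant and provider, for example pool.getOrCreate("acme:openai", () => createClient()), where createClient stands in for whatever builds the provider SDK client.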

Community edition is AGPL-3.0. Commercial licensing and support are available separately.