Smoke Testing (End-to-End)
The smoke suite is the L8 layer described in TESTING.md: it boots the real application server over HTTP, signs up a brand-new tenant the same way the UI sign-up form does, and then drives every module's API end to end — auth hook → RBAC → service layer → SQLite — exactly as a real client would.
Where the unit (L2) and API-plugin (L3) layers mount pieces of the app in isolation with mocked dependencies, the smoke layer answers a different question: "does a freshly-started server actually work for a brand-new user across every module?"
Running it
npm run test:smoke
That's it. No external services, no .env setup, no database to provision. The runner:
- Creates a throwaway SQLite data directory under the OS temp folder and points the process at it (SQLITE_DATA_DIR), with in-memory cache + rate limiting. Your dev data is never touched.
- Boots the real Fastify API (fastifyApiPlugin, mounted under /api) on an ephemeral port.
- Registers a tenant + owner via POST /api/auth/register and proves the session cookie works.
- Runs every per-module suite against the live server.
- Prints a grouped summary, writes reports to scripts/smoke/reports/latest.{json,md}, and exits non-zero on any failure.
- Deletes the temp data directory.
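The isolation and cleanup steps in the list above can be sketched as follows. This is a minimal illustration, not the runner's actual code: `createIsolatedDataDir` and the `smoke-` directory prefix are hypothetical, and only the `SQLITE_DATA_DIR` variable name comes from the description above.

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: creates a throwaway directory under the OS temp
// folder and points the current process at it via SQLITE_DATA_DIR.
function createIsolatedDataDir(): { dir: string; cleanup: () => void } {
  const dir = mkdtempSync(join(tmpdir(), "smoke-"));
  process.env.SQLITE_DATA_DIR = dir;
  return {
    dir,
    // force: true makes cleanup a no-op if the directory is already gone.
    cleanup: () => rmSync(dir, { recursive: true, force: true }),
  };
}

const isolated = createIsolatedDataDir();
try {
  // Boot the server and run the suites here.
  console.log(`data dir ${isolated.dir} exists: ${existsSync(isolated.dir)}`);
} finally {
  isolated.cleanup();
}
```

Running the cleanup in a `finally` block means the temp directory is removed even when a suite throws.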
A successful run looks like:
────────────────────────────────────────────
TOTAL 127 | PASS 127 FAIL 0 SKIP 0
────────────────────────────────────────────
What it covers
The suite exercises the full dashboard (cookie-session) API surface plus the OpenAI-compatible client API:
| Area | Coverage |
|---|---|
| Auth | register, session, login, wrong-password rejection, forgot-password, unauthenticated → 401 |
| Health | /api/health/live, /api/health/ready |
| CRUD lifecycles | projects, prompts, guardrails, PII policies, evaluation datasets, analysis definitions, rerankers, alert rules, config groups/items, vector providers+indexes (+upsert/query against the built-in SQLite vector store), users (invite/delete), API tokens |
| Read paths | models, providers, memory, RAG, tracing, redteam, audit, automations, cluster, crawler, browser, files, inference-monitoring, license, MCP, OCR jobs, quota, sandbox, tools |
| Client API | Bearer-token auth enforcement, GET /api/client/v1/prompts, policy-based POST /api/client/v1/pii/detect |
| Metrics | Prometheus /api/metrics with an API token |
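The table separates cookie-session dashboard routes from Bearer-token client routes. A test client could build headers for each along these lines. This is a sketch under stated assumptions: `CookieJar`, `authHeaders`, the `session` cookie name, and the token value are all hypothetical and are not taken from scripts/smoke/client.ts.

```typescript
// Hypothetical minimal cookie jar: keeps only name=value pairs and ignores
// attributes such as Path, HttpOnly, or Expires.
class CookieJar {
  private readonly cookies = new Map<string, string>();

  // Accepts raw Set-Cookie header values, one string per cookie.
  store(setCookieHeaders: string[]): void {
    for (const header of setCookieHeaders) {
      const pair = header.split(";")[0] ?? "";
      const eq = pair.indexOf("=");
      if (eq <= 0) continue; // skip malformed entries
      this.cookies.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
    }
  }

  // Builds the value for an outgoing Cookie header.
  header(): string {
    return Array.from(this.cookies.entries())
      .map(([name, value]) => `${name}=${value}`)
      .join("; ");
  }
}

// Dashboard routes authenticate with the session cookie; client routes
// authenticate with an API token sent as a Bearer credential.
function authHeaders(jar: CookieJar, apiToken?: string): Record<string, string> {
  return apiToken
    ? { authorization: `Bearer ${apiToken}` }
    : { cookie: jar.header() };
}

const jar = new CookieJar();
jar.store(["session=abc123; Path=/; HttpOnly"]); // illustrative cookie only
const dashboardHeaders = authHeaders(jar);
const clientHeaders = authHeaders(jar, "example-token");
```

On Node.js 20, `response.headers.getSetCookie()` returns an array of raw `Set-Cookie` values that could be passed straight to `store`.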
Each step asserts a specific set of acceptable HTTP status codes. Fully self-contained operations (no external provider/credentials, no network egress) run a complete create → read → delete lifecycle; operations that require external infrastructure are verified at the read + validation-contract level.
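The per-step status check described above amounts to set membership. A minimal sketch, assuming a hypothetical `StepResult` shape and `evaluateStep` helper (the real recording logic lives in scripts/smoke/client.ts and may differ):

```typescript
// Hypothetical result shape for one recorded step.
interface StepResult {
  name: string;
  expected: number[];
  actual: number;
  ok: boolean;
}

// A step passes only when the actual status is one of the acceptable codes.
function evaluateStep(name: string, expected: number[], actual: number): StepResult {
  return { name, expected, actual, ok: expected.includes(actual) };
}

// A create endpoint may legitimately answer 200 or 201.
const created = evaluateStep("create thing", [200, 201], 201);
// A validation-contract check expects 400, so a 500 counts as a failure.
const rejected = evaluateStep("empty body is rejected", [400], 500);

console.log(created.ok, rejected.ok); // prints "true false"
```

Listing the acceptable codes explicitly keeps a step from passing on an unexpected but technically successful response.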
Reports
After every run, two machine- and human-readable reports are written (git-ignored):
- scripts/smoke/reports/latest.json: full structured results (per-step method, path, expected vs. actual status, duration).
- scripts/smoke/reports/latest.md: the same as a Markdown table, with a Failures section when applicable.
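To make the two report formats concrete, here is a sketch of how per-step records could be serialized to JSON and rendered as a Markdown table with a Failures section. The `StepRecord` field names, the `renderMarkdown` helper, and the sample rows are hypothetical. They mirror the fields listed above, not the runner's actual schema.

```typescript
// Hypothetical per-step record mirroring the fields listed above.
interface StepRecord {
  module: string;
  name: string;
  method: string;
  path: string;
  expected: number[];
  actual: number | null; // null when the request itself failed
  durationMs: number;
}

const passed = (s: StepRecord): boolean =>
  s.actual !== null && s.expected.includes(s.actual);

function renderMarkdown(steps: StepRecord[]): string {
  const row = (s: StepRecord): string =>
    `| ${s.module} | ${s.name} | ${s.method} ${s.path} | ${s.expected.join("/")} | ${s.actual ?? "n/a"} | ${s.durationMs} |`;
  const lines = [
    "| Module | Step | Request | Expected | Actual | ms |",
    "|---|---|---|---|---|---|",
    ...steps.map(row),
  ];
  // The Failures section is appended only when at least one step failed.
  const failures = steps.filter((s) => !passed(s));
  if (failures.length > 0) {
    lines.push("", "Failures");
    for (const s of failures) {
      lines.push(`- ${s.module}: ${s.name} (got ${s.actual ?? "no response"})`);
    }
  }
  return lines.join("\n");
}

// Made-up sample data for illustration only.
const sample: StepRecord[] = [
  { module: "health", name: "live", method: "GET", path: "/api/health/live", expected: [200], actual: 200, durationMs: 3 },
  { module: "my-feature", name: "list things", method: "GET", path: "/api/my-feature/things", expected: [200], actual: 500, durationMs: 12 },
];
const markdown = renderMarkdown(sample);
const json = JSON.stringify(sample, null, 2);
```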
How it's structured
scripts/smoke/
├── server.ts # Boots the real Fastify API over HTTP (no Next.js)
├── client.ts # fetch wrapper + cookie jar + step/skip recording
├── suites.ts # Per-module test definitions (the bulk of the coverage)
└── run.ts # Orchestrator: env isolation → boot → signup → suites → report
Adding coverage for a new module
When you add a new API plugin, add a suite entry in scripts/smoke/suites.ts:
{
  module: 'my-feature',
  run: async (c, ctx) => {
    // Read path — proves the full stack works for this module.
    await c.step('list things', 'GET', '/api/my-feature/things', [200]);
    // Optional lifecycle when the create path is self-contained.
    const created = await c.step('create thing', 'POST', '/api/my-feature/things', [200, 201], {
      body: { name: `Smoke Thing ${ctx.stamp}` },
    });
    const id = idOf(created?.body);
    if (id) {
      await c.step('delete thing', 'DELETE', `/api/my-feature/things/${id}`, [200]);
    } else {
      c.skip('thing lifecycle', 'no id returned');
    }
  },
}
Guidelines:
- Use ctx.stamp in any unique keys/emails so repeated runs never collide.
- c.step(name, method, path, expectedStatuses, { body, headers }) records the result and returns the response (or null if it failed) so you can chain.
- Prefer real create → read → delete lifecycles when the operation needs nothing external. Otherwise assert the read path + the validation contract (e.g. empty body → 400).
- Clean up anything you create. The temp database is destroyed at the end of every run, so cleanup is mostly about keeping a single run's state tidy.
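The ctx.stamp guideline can be illustrated with a short sketch. How the runner actually generates the stamp is not described here, so `makeStamp`, the stamp format, and the `example.test` address are assumptions chosen for illustration.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical stamp generator: a base-36 millisecond timestamp plus a
// random suffix, so two runs started in the same millisecond still differ.
function makeStamp(): string {
  return `${Date.now().toString(36)}-${randomBytes(8).toString("hex")}`;
}

const ctx = { stamp: makeStamp() };

// Derive every unique value from the stamp rather than hard-coding it.
const ownerEmail = `smoke-owner-${ctx.stamp}@example.test`;
const thingName = `Smoke Thing ${ctx.stamp}`;
```

Because every key and email embeds the same stamp, leftovers from one run are easy to identify and cannot collide with the next run's data.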
When to run it
- Locally before opening a PR that touches API plugins, services, auth, or the database layer.
- In CI as the L8 stage (see the CI/CD sequence in TESTING.md). It needs no secrets and runs fully self-contained, so it's safe in any environment with Node.js 20+.