Guardrails

This example demonstrates policy enforcement before and after model execution.

Source: examples/guardrails/guardrails.ts

Use this when

input must be blocked before it reaches the model
output must be filtered before it reaches the user
you want policy violations recorded in runtime state

What it shows

regex-based and code-based guardrail checks
phase-aware interception before or after generation
policy enforcement without rewriting core agent logic

Run it

cd examples
npm run example:guardrails

Core code

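// Request-phase guardrail: blocks input that matches "password" or "secret"
// (case-insensitive) before it reaches the model.
const passwordGuardrail = createRegexGuardrail(/password|secret/i, {
	guardrailId: "password-filter",
	guardrailTitle: "Sensitive Secret Filter",
	phases: [GuardrailPhase.Request],
	rule: { failureMessage: "Outbound request blocked: sensitive secret detected." },
});

// Response-phase guardrail: blocks model output flagged by the code check.
const codeGuardrail = createCodeGuardrail({
	guardrailId: "code-ban",
	guardrailTitle: "No Code Responses",
	phases: [GuardrailPhase.Response],
	rule: { disposition: "block" },
});

// Register both guardrails on the agent; the core agent logic stays unchanged.
// See examples/guardrails/guardrails.ts for the imports and the fakeModel setup.
const agent = createAgent({
	model: fakeModel,
	guardrails: [passwordGuardrail, codeGuardrail],
});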

End-to-end flow

A request guardrail scans user input for disallowed patterns.
A response guardrail checks model output after generation.
The runtime blocks or rewrites the request or response according to each guardrail's configured disposition.
Incidents are stored in state for inspection.
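
The four steps above can be sketched without the library as a small phase-aware pipeline. This is an illustration only, not the runtime's implementation: `Phase`, `Guardrail`, `Incident`, `runWithGuardrails`, and the stub model are hypothetical names, the code check is a crude regex stand-in, and only a "block" disposition is modeled.

```typescript
// Hypothetical types for illustration; not the library's actual API.
type Phase = "request" | "response";

interface Incident {
  guardrailId: string;
  phase: Phase;
  message: string;
}

interface Guardrail {
  guardrailId: string;
  phases: Phase[];
  // Returns a failure message when the text violates policy, otherwise null.
  check(text: string): string | null;
}

interface RunResult {
  content: string;
  blocked: boolean;
  incidents: Incident[];
}

// Runs every guardrail registered for the given phase and collects incidents.
function runPhase(guardrails: Guardrail[], phase: Phase, text: string): Incident[] {
  const incidents: Incident[] = [];
  for (const g of guardrails) {
    if (g.phases.indexOf(phase) === -1) continue;
    const message = g.check(text);
    if (message !== null) incidents.push({ guardrailId: g.guardrailId, phase, message });
  }
  return incidents;
}

function runWithGuardrails(
  input: string,
  model: (prompt: string) => string,
  guardrails: Guardrail[],
): RunResult {
  // Request phase: a violation stops the run before the model is called.
  const requestIncidents = runPhase(guardrails, "request", input);
  if (requestIncidents.length > 0) {
    return { content: requestIncidents[0].message, blocked: true, incidents: requestIncidents };
  }
  const output = model(input);
  // Response phase: a violation replaces the model output with the failure message.
  const responseIncidents = runPhase(guardrails, "response", output);
  if (responseIncidents.length > 0) {
    return { content: responseIncidents[0].message, blocked: true, incidents: responseIncidents };
  }
  return { content: output, blocked: false, incidents: [] };
}

const secretFilter: Guardrail = {
  guardrailId: "password-filter",
  phases: ["request"],
  check: (text) =>
    /password|secret/i.test(text) ? "Outbound request blocked: sensitive secret detected." : null,
};

const codeBan: Guardrail = {
  guardrailId: "code-ban",
  phases: ["response"],
  // Crude stand-in for code detection: looks for a function declaration.
  check: (text) => (/\bfunction\s+\w+\s*\(/.test(text) ? "Response blocked: code detected." : null),
};

// Stub model: returns code when asked for it, plain text otherwise.
const stubModel = (prompt: string): string =>
  prompt.indexOf("code") !== -1 ? "function add(a, b) { return a + b; }" : "Hello!";

const guardrails = [secretFilter, codeBan];
const blockedRequest = runWithGuardrails("What is my password?", stubModel, guardrails);
const blockedResponse = runWithGuardrails("Write some code", stubModel, guardrails);
console.log(blockedRequest.incidents, blockedResponse.incidents);
```

Keeping the incidents on the returned result, rather than only swapping the content, mirrors the idea that violations should remain inspectable after the run.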

Why it matters

Guardrails are useful when product policy must remain explicit and testable rather than hidden inside prompt wording.

What to inspect

request blocking before model execution
response filtering after model generation
incidents recorded on state.guardrailResult
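
A small helper can make the recorded incidents easier to read while debugging. The exact shape of `state.guardrailResult` is not shown on this page, so the sketch below assumes a hypothetical shape with an `incidents` array; `GuardrailIncident`, `AgentStateLike`, and `summarizeIncidents` are illustrative names, and the field names should be adjusted to whatever the runtime actually stores.

```typescript
// ASSUMPTION: hypothetical shape. The real structure of state.guardrailResult may differ.
interface GuardrailIncident {
  guardrailId: string;
  phase: "request" | "response";
  message: string;
}

interface AgentStateLike {
  guardrailResult?: { incidents: GuardrailIncident[] };
}

// Formats each incident as one line; returns an empty list when nothing was recorded.
function summarizeIncidents(state: AgentStateLike): string[] {
  const incidents = state.guardrailResult?.incidents ?? [];
  return incidents.map((i) => `[${i.phase}] ${i.guardrailId}: ${i.message}`);
}

// Hand-written example state, standing in for the state returned by an agent run.
const exampleState: AgentStateLike = {
  guardrailResult: {
    incidents: [
      {
        guardrailId: "password-filter",
        phase: "request",
        message: "Outbound request blocked: sensitive secret detected.",
      },
    ],
  },
};

console.log(summarizeIncidents(exampleState).join("\n"));
```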

Production takeaway

Guardrails are strongest when they are explicit runtime policy. This example shows that enforcement path without mixing it into prompt wording.

Expected output

the first invocation prints a blocked request result
the second invocation prints a filtered or blocked response result
guardrail incidents are visible in state

Common failure modes

the regex or code rule is too weak, so the test input does not trigger the guardrail
you inspect only final content and miss the richer guardrail incident data in state
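
To rule out the first failure mode, it can help to check the pattern against the test inputs directly, before involving the agent. The sketch below uses only the standard `RegExp` API with the pattern from the core code; the sample inputs and the `findMismatches` helper are hypothetical.

```typescript
// The same pattern passed to createRegexGuardrail in the core code.
const sensitivePattern = /password|secret/i;

interface PatternCase {
  input: string;
  shouldMatch: boolean;
}

// Hypothetical inputs; replace them with the prompts the example actually sends.
const samples: PatternCase[] = [
  { input: "My PASSWORD is hunter2", shouldMatch: true },
  { input: "Keep this secret", shouldMatch: true },
  { input: "What is the weather today?", shouldMatch: false },
  // An obfuscated spelling slips past the pattern, which is the kind of gap to look for.
  { input: "My passw0rd is hunter2", shouldMatch: false },
];

// Returns the inputs whose actual match result differs from the expectation.
function findMismatches(pattern: RegExp, cases: PatternCase[]): string[] {
  return cases
    .filter((c) => pattern.test(c.input) !== c.shouldMatch)
    .map((c) => c.input);
}

const mismatches = findMismatches(sensitivePattern, samples);
console.log(mismatches.length === 0 ? "pattern behaves as expected" : mismatches);
```

The pattern has no `g` flag, so repeated `test` calls carry no state between inputs.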
