how it works

How unhardcoded works.

Policies route calls. Workflows compose policies. Traces make every decision replayable.

object 1 · policy

A policy routes one call.

A policy is a routing rule your backend sends with a single LLM call. The router validates it, evaluates it against the live catalog, selects the cheapest model that passes, and records the decision. A gateway forwards a request to a model you already chose; the policy layer makes the choice — at request time, over your own keys.

Generate

Your backend builds a policy_ir from your own tenants, tiers, and logic. No dashboard, no redeploy.

Send with the call

Attach it to one OpenAI-compatible request. The router validates it against the grammar and hashes it.

Route

It evaluates the rule over the live catalog and routes to the cheapest model that passes — over your keys.

Trace

Every candidate, the pick, any fallback hop, latency, and cost are written to a replayable record.

filter

Drop every model that fails a hard requirement — context window, tools, region, or price ceiling.

rank

Order the survivors by what you optimize for — usually cost, sometimes latency or a quality score.

select

Take the top-ranked model that passes. The cheapest survivor wins by default.

fallback

If the pick errors or times out, advance to the next survivor — same rules, no redeploy.

Read the policy_ir grammar in the docs

object 2 · workflow

A workflow composes policies.

Not every task is one call. Some need retrieval, drafting, verification, classification, or synthesis. A workflow is a bounded DAG of LLM steps sent with one call — and each step carries its own policy, so every step can route differently. Cheap long context here, a stronger model where it matters.

lineartriage → draft → guard · cheap where it can be, a strict final gate before send

Ticket

triageclassify · extract account id · cheapest JSONgemini-3.5-flash

draftreply from ticket + triage · quality floorclaude-sonnet-4-6

guardbrand voice · PII · refund limits · no-logclaude-opus-4-8can abort

Each step routes independentlyEvery node carries its own policy and resolves over the live catalog on its own.

Failover is per stepIf a node's model errors or times out, that node falls back without restarting the flow.

One stitched traceThe whole graph writes a single replayable record — every node's decision, in order.

billing A flow counts as one billable run, up to 5 nodes.

Read the flow_ir docs

object 3 · trace

The trace is the product memory.

Every run — one call or a whole flow — writes a receipt: the rule that was sent, the model selected and why, the fallback path if one was taken, the latency, the cost, and the fingerprint you need to replay the decision months later.

trace · req_8f41c2 200 OK

selectedgemini-3.5-flash

reasontools ✓, price ceiling passed, cheapest survivor

fallbackclaude-sonnet-4-6 → gemini-3.5-flash · 504 timed out

latency412 ms

cost$0.018 · per run, not per token

policysupport · fingerprint pol_8f41c2 · sigma-pol/v2

A flow stitches one trace across every node — so a replay shows each step's decision, in order. Read the trace schema →

open core

Open policy layer. Managed host with your keys.

The language is yours; the upkeep is ours. The open layer defines how decisions are expressed and executed. The host is what you'd otherwise build and operate yourself — that's maintenance, not lock-in.

Open source · Apache-2.0

The open layer

The language for model decisions. Anything expressible as a policy or workflow can be generated by code, validated, hashed, and run deterministically by any conformant interpreter.

The policy_ir and flow_ir language
Canonical encoding and fingerprints
Reference interpreter and conformance tests
A self-host path with your own catalog

Read the spec

Managed host · your keys

The host, maintained

The router you'd probably build internally, already operated. Bring your own provider keys, connect the providers you use, and skip running the host yourself.

Provider modules and live catalog data, maintained
Key management and OAuth
Replayable traces, history, failover, and uptime
Priced per run, not per token

Request access

That's the system. Now wire it in.

A policy routes a call, a workflow composes policies, and the trace proves what happened. The docs take it from here — quickstart, the OpenAI-compatible API, the full grammar, and copyable presets.

Read the docs Request access

No SDK rewriteYour provider keysOpen policy_irEvery run traced