How unhardcoded works.
Policies route calls. Workflows compose policies. Traces make every decision replayable.
A policy routes one call.
A policy is a routing rule your backend sends with a single LLM call. The router validates it, evaluates it against the live catalog, selects the cheapest model that passes, and records the decision. A gateway forwards a request to a model you already chose; the policy layer makes the choice — at request time, over your own keys.
Generate
Your backend builds a policy_ir from your own tenants, tiers, and logic. No dashboard, no redeploy.
Send with the call
Attach it to one OpenAI-compatible request. The router validates it against the grammar and hashes it.
Route
It evaluates the rule over the live catalog and routes to the cheapest model that passes — over your keys.
Trace
Every candidate, the pick, any fallback hop, latency, and cost are written to a replayable record.
Drop every model that fails a hard requirement — context window, tools, region, or price ceiling.
Order the survivors by what you optimize for — usually cost, sometimes latency or a quality score.
Take the top-ranked model that passes. The cheapest survivor wins by default.
If the pick errors or times out, advance to the next survivor — same rules, no redeploy.
A workflow composes policies.
Not every task is one call. Some need retrieval, drafting, verification, classification, or synthesis. A workflow is a bounded DAG of LLM steps sent with one call — and each step carries its own policy, so every step can route differently. Cheap long context here, a stronger model where it matters.
billing A flow counts as one billable run, up to 5 nodes.
The trace is the product memory.
Every run — one call or a whole flow — writes a receipt: the rule that was sent, the model selected and why, the fallback path if one was taken, the latency, the cost, and the fingerprint you need to replay the decision months later.
A flow stitches one trace across every node — so a replay shows each step's decision, in order. Read the trace schema →
Open policy layer. Managed host with your keys.
The language is yours; the upkeep is ours. The open layer defines how decisions are expressed and executed. The host is what you'd otherwise build and operate yourself — that's maintenance, not lock-in.
The open layer
The language for model decisions. Anything expressible as a policy or workflow can be generated by code, validated, hashed, and run deterministically by any conformant interpreter.
- The policy_ir and flow_ir language
- Canonical encoding and fingerprints
- Reference interpreter and conformance tests
- A self-host path with your own catalog
The host, maintained
The router you'd probably build internally, already operated. Bring your own provider keys, connect the providers you use, and skip running the host yourself.
- Provider modules and live catalog data, maintained
- Key management and OAuth
- Replayable traces, history, failover, and uptime
- Priced per run, not per token
That's the system. Now wire it in.
A policy routes a call, a workflow composes policies, and the trace proves what happened. The docs take it from here — quickstart, the OpenAI-compatible API, the full grammar, and copyable presets.