Your policy decides which model runs.
Most routers hide the interesting part: why a model was chosen. unhardcoded makes that decision explicit. Each request carries a policy; the router applies it at runtime, selects the cheapest model that passes, and records the decision as a replayable trace.
One request, one explicit decision.
unhardcoded is not a black box that decides for you. The policy travels with the request, the router applies it at runtime, the call runs on your provider keys, and the decision is written down. Five beats, in order.
The policy travels with the request
Your backend builds the routing rule and sends it on one OpenAI-compatible call — no dashboard, no redeploy. The router validates it against the grammar and fingerprints it. In the wire format that rule is a policy_ir (sigma-pol/v1).
The router filters and ranks the catalog
It evaluates the rule over the live catalog and routes to the cheapest model that passes. Hard requirements drop the rest; survivors are ranked by what you optimize for, usually cost.
Execution runs on your provider keys
The selected model is called with your own keys. unhardcoded never resells tokens — you pay providers directly for inference and pay us per run. If the pick errors or times out, the router advances to the next survivor under the same rule.
The trace records the decision
Every run writes a receipt: the policy that was sent, the candidates, accept and reject reasons, the fallback path, the model used, latency, and cost. The decision is explicit, not inferred. See it below.
Replay makes the decision auditable
The trace carries the policy fingerprint, so any run replays months later — same rule, same reasons, same outcome. That is what makes routing here auditable rather than opaque.
Each step makes its own decision.
Some tasks are one call; some are a bounded graph of steps — triage, draft, verify, synthesize. The mechanism is the same at each node: a step carries its own policy and resolves to the cheapest model that passes. Cheap long context here, a stronger model where it matters.
billing A workflow counts as one billable run, up to 5 nodes.
This is the decision, written down.
Here is the receipt from a single run. The selected model, why it won, the fallback path that stood ready, the cost per run, and the policy fingerprint that replays it. Nothing about the choice is hidden.
A workflow stitches one trace across every node — so a replay shows each step's decision, in order. Read the trace schema →
Questions, answered plainly.
Do I need to rewrite my app?
baseURL at the endpoint, replace the model name with a policy:* name, and send the policy_ir with the call. Your messages and parameters pass through unchanged.Do I need to give you provider keys?
Can I self-host?
policy_ir and flow_ir language, its canonical encoding, and the reference interpreter are open source, so you can run the decision layer yourself. The managed host adds provider modules, the live catalog, key management, traces, and failover.What happens when a model fails?
fallback step in your policy decides. By default the router advances to the next survivor under the floor, cheapest-first, without a redeploy. Every hop — with latency, cost, and reason — is written to the trace.How is this different from a model router or orchestrator?
How do I know why a model was chosen?
More on pricing, self-hosting, and the open core in the docs →
A decision layer can't be a black box if you can read it. The policy_ir and flow_ir language, its canonical encoding, and the reference interpreter are open source — the language is yours, the upkeep is ours. The managed host runs the provider modules, live catalog, key management, traces, and failover, priced per run.
That's the mechanism. Now wire it in.
Point your existing SDK at one endpoint, send a policy with the call, keep your provider keys, and get a trace for every decision. The docs take it from here — quickstart, the OpenAI-compatible API, the full grammar, and copyable presets.