Stop hardcoding model decisions.
Send a policy with each LLM call. unhardcoded routes to the cheapest model that passes your rules — over your own provider keys — and records why it chose.
One request carries the rule, the route, and the receipt.
Model choice is frozen in code.
A pinned model name is a decision baked into your application — it can't adapt to the request in front of it, and no one can explain it later. It shows up as three production problems.
Cost
Easy requests still hit your most expensive model, every time.
Reliability
A provider outage turns into fallback logic you hand-write in app code.
Control
After the fact, no one can say why a given request used a given model.
The policy travels with the request.
Four steps, the same every time.
No dashboard config and no redeploy. Your backend builds the rule, sends it with the call, and the router does the rest — the same way every time, with a record you can replay.
- Built in your backend, per request — from your own tenants and tiers.
- Validated and fingerprinted before it runs — unknown ops are rejected.
- Replayable from the trace — reconstruct any decision months later.
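As a rough sketch of the three properties above, a backend could build the rule per request, check it for unknown ops, fingerprint it, and attach it to the call. The `policy_ir` field and the `policy:` model prefix come from this page. Everything else here (the op names, the `quality_floor` field, the tier table, the policy name `policy:support`) is hypothetical and should not be read as the actual unhardcoded schema. The validation and fingerprinting shown locally only illustrate what such a check might look like.

```python
import hashlib
import json

# Hypothetical op whitelist; the real set of ops is not listed on this page.
ALLOWED_OPS = {"filter", "sort", "fallback"}

# Hypothetical per-tier floors owned by your own backend.
TIER_FLOORS = {"free": 0.6, "pro": 0.8}


def build_policy(tier: str) -> dict:
    """Build a rule for one request from the caller's tier."""
    return {
        "steps": [
            {"op": "filter", "quality_floor": TIER_FLOORS[tier]},
            {"op": "sort", "by": "cost"},
            {"op": "fallback", "max_hops": 2},
        ]
    }


def validate(policy: dict) -> None:
    """Reject any policy that uses an op outside the whitelist."""
    for step in policy["steps"]:
        if step["op"] not in ALLOWED_OPS:
            raise ValueError(f"unknown op: {step['op']}")


def fingerprint(policy: dict) -> str:
    """Hash a canonical JSON form so equal policies share one fingerprint."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_request(tier: str, messages: list) -> dict:
    """Assemble the call body: messages pass through, the policy rides along."""
    policy = build_policy(tier)
    validate(policy)
    return {
        "model": "policy:support",  # hypothetical policy name
        "messages": messages,
        "policy_ir": policy,
    }
```

Hashing a canonical form (sorted keys, fixed separators) is one way to make the fingerprint independent of dictionary key order, so the same rule always maps to the same identifier.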
One decision layer, three production wins.
The same primitive — a policy sent with the call — answers each of the three problems above.
Spend less
Use the cheapest model that clears the floor, instead of your priciest model for every easy call. Cut spend →
Fail over safely
When a provider errors, advance to the next passing model — no retry code, no redeploy. Reliability →
Know why
Every run records which models passed, which were rejected and why, and what it cost. Traces →
Per-customer rules and multi-step workflows run on the same mechanism. See all use cases →
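The first two wins can be pictured as a single loop: keep the candidates that clear the floor, try them cheapest-first, and advance on a provider error. This is a minimal sketch, not the router's implementation. It assumes the floor is a numeric quality score, and the model names, prices, and scores are made up for illustration.

```python
from dataclasses import dataclass


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


@dataclass(frozen=True)
class Candidate:
    name: str
    cost_per_1k: float  # hypothetical price per 1k tokens
    quality: float      # hypothetical score in [0, 1]


def route(candidates, quality_floor, call):
    """Try passing candidates cheapest-first; return (winner, output, hops)."""
    passing = sorted(
        (c for c in candidates if c.quality >= quality_floor),
        key=lambda c: c.cost_per_1k,
    )
    hops = []
    for cand in passing:
        try:
            output = call(cand)
        except ProviderError as exc:
            # Record the failed hop and advance to the next passing model.
            hops.append({"model": cand.name, "status": "error", "reason": str(exc)})
            continue
        hops.append({"model": cand.name, "status": "ok", "reason": "cheapest passing model"})
        return cand, output, hops
    raise RuntimeError("no passing candidate succeeded")


# Made-up candidates: "medium" is the cheapest that clears 0.8 but is down.
MODELS = [
    Candidate("small", 0.2, 0.65),
    Candidate("medium", 1.0, 0.85),
    Candidate("large", 5.0, 0.95),
]


def flaky_call(cand):
    if cand.name == "medium":
        raise ProviderError("503 from provider")
    return f"answer from {cand.name}"


winner, output, hops = route(MODELS, 0.8, flaky_call)
```

In this run `small` never gets tried because it sits below the floor, `medium` errors, and `large` answers; the `hops` list keeps both attempts.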
Every run leaves a receipt.
The trace is the receipt: the rule that was sent, every candidate it considered, the model that won, and why. Replay it any time.
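To make the idea of a replayable receipt concrete, here is one possible shape for a trace and a replay check. The field names and the selection rule (cheapest candidate whose quality clears a floor) are assumptions for this sketch; the page does not specify the real trace format.

```python
import json


def decide(policy: dict, candidates: list) -> dict:
    """Evaluate every candidate and return a trace of the decision."""
    floor = policy["quality_floor"]
    considered = []
    for cand in candidates:
        passed = cand["quality"] >= floor
        considered.append({
            "model": cand["name"],
            "cost": cand["cost"],
            "quality": cand["quality"],
            "passed": passed,
            "reason": "clears floor" if passed else "below quality floor",
        })
    passing = [entry for entry in considered if entry["passed"]]
    winner = min(passing, key=lambda e: e["cost"])["model"] if passing else None
    return {"policy": policy, "considered": considered, "winner": winner}


def replay(trace_json: str) -> bool:
    """Re-run the decision from a stored trace and confirm the same winner."""
    trace = json.loads(trace_json)
    candidates = [
        {"name": e["model"], "cost": e["cost"], "quality": e["quality"]}
        for e in trace["considered"]
    ]
    return decide(trace["policy"], candidates)["winner"] == trace["winner"]


# Made-up candidates and floor, stored as JSON the way a trace might be.
trace = decide(
    {"quality_floor": 0.8},
    [
        {"name": "small", "cost": 0.2, "quality": 0.65},
        {"name": "medium", "cost": 1.0, "quality": 0.85},
        {"name": "large", "cost": 5.0, "quality": 0.95},
    ],
)
stored = json.dumps(trace)
```

Because the trace carries both the rule and every candidate's inputs, `replay(stored)` needs nothing beyond the stored record to reproduce the decision.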
Questions, answered plainly.
Do I have to rewrite my app?
Point your baseURL at the endpoint, replace the model name with a policy:* name, and send the policy_ir with the call. Your messages and parameters pass through unchanged.
Are you reselling tokens?
What happens when a model fails?
The fallback step in your policy decides. By default the router moves to the next candidate that clears the floor, cheapest-first. Every hop is written to the trace with its latency, cost, and reason.
More on pricing, self-hosting, and the open core in the docs →
Stop hardcoding. Send the decision.
Point your SDK at one endpoint, keep your keys, and send your first policy. We route it through your providers and trace every decision.
Prefer to dig in first? Read the docs →