use cases

Where teams put the model decision layer to work.

The same primitive — a policy sent with the call — solves several production problems. Pick the one that hurts.

cost

Cut model spend

Use the cheapest model that passes your floor: for example, gemini-3.5-flash at $0.018 instead of a pinned gpt-5.5 (illustrative figures).
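
A minimal sketch of that selection rule, assuming a hypothetical in-memory catalog. The field names (`price`, `score`), the score values, and the gpt-5.5 price are placeholders for illustration; only the $0.018 figure comes from the text above, and none of this is the product's actual API.

```python
# Hypothetical catalog rows. Scores and the gpt-5.5 price are placeholders.
CATALOG = [
    {"model": "gpt-5.5", "price": 0.250, "score": 90},
    {"model": "gemini-3.5-flash", "price": 0.018, "score": 78},
]

def cheapest_passing(catalog, min_score):
    """Return the lowest-priced model whose score meets the floor, or None."""
    passing = [m for m in catalog if m["score"] >= min_score]
    return min(passing, key=lambda m: m["price"], default=None)

winner = cheapest_passing(CATALOG, min_score=75)
print(winner["model"])
```

Raising the floor above what the cheaper model scores moves the choice to the more expensive one, which is the point: the floor decides, not a pinned name.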

reliability

Handle provider failures

Fallback is policy, not retry code — on a timeout the router advances to the next model that still passes your rules.
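
One way to picture fallback as an ordered walk rather than retry code. This is a sketch under assumptions: `ranked_models` is taken to be already filtered and ranked by the policy, and `fake_send` is a stand-in for a real provider call, not part of any SDK.

```python
def call_with_fallback(ranked_models, send):
    """Try models in ranked order; on a timeout, advance to the next one.

    Returns (model, reply, skipped), where skipped lists the models that timed out.
    """
    skipped = []
    for model in ranked_models:
        try:
            return model, send(model), skipped
        except TimeoutError:
            skipped.append(model)  # record it and move on
    raise RuntimeError(f"all candidates timed out: {skipped}")

# Stand-in provider call: the first model always times out.
def fake_send(model):
    if model == "model-a":
        raise TimeoutError(model)
    return f"reply from {model}"

model, reply, skipped = call_with_fallback(["model-a", "model-b"], fake_send)
```

Because the candidate list only contains models that passed the rules, the fallback target is never a model the policy would have rejected.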

portability

Remove hardcoded model names

Route by requirement, not a model string — a tools floor, a price ceiling, a bench_intelligence score, never a literal gpt-5.5.
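
A sketch of what a requirement-based policy could look like as data. The `Policy` fields, the catalog rows, and every number here are hypothetical; the only name borrowed from the text is `bench_intelligence`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    needs_tools: bool
    max_price: float
    min_bench_intelligence: int

def passes(model: dict, policy: Policy) -> bool:
    """True when the model meets every requirement in the policy."""
    return (
        (model["tools"] or not policy.needs_tools)
        and model["price"] <= policy.max_price
        and model["bench_intelligence"] >= policy.min_bench_intelligence
    )

# Placeholder catalog; names and values are illustrative only.
catalog = [
    {"name": "model-a", "tools": True, "price": 0.30, "bench_intelligence": 88},
    {"name": "model-b", "tools": True, "price": 0.02, "bench_intelligence": 74},
    {"name": "model-c", "tools": False, "price": 0.01, "bench_intelligence": 80},
]

policy = Policy(needs_tools=True, max_price=0.05, min_bench_intelligence=70)
eligible = [m["name"] for m in catalog if passes(m, policy)]
```

No model string appears in the policy, so a catalog change alters which models are eligible without touching the calling code.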

tenant policy

Enforce per-customer rules

Different tenants, regions, and plans carry different policies — compiled per request, no dashboard, no redeploy.
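
A sketch of compiling a policy per request by layering tenant attributes over a base. The keys, plan names, and override tables below are assumptions made for illustration, not a documented schema.

```python
# Hypothetical base policy and overrides; every value is a placeholder.
BASE = {"max_price": 0.10, "min_bench_intelligence": 70, "regions": {"us", "eu"}}
PLAN_OVERRIDES = {
    "free": {"max_price": 0.02},
    "enterprise": {"min_bench_intelligence": 85},
}
REGION_OVERRIDES = {"eu": {"regions": {"eu"}}}

def compile_policy(plan: str, region: str) -> dict:
    """Build the policy for one request: base, then plan, then region."""
    policy = dict(BASE)  # copy so the shared base is never mutated
    policy.update(PLAN_OVERRIDES.get(plan, {}))
    policy.update(REGION_OVERRIDES.get(region, {}))
    return policy

eu_free = compile_policy("free", "eu")
us_enterprise = compile_policy("enterprise", "us")
```

Since the policy is assembled in the backend at request time, changing a tenant's rules is a data change rather than a deployment.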

traces

Trace every decision

See why the winner won and why each rejected model failed — by the rule that filtered it, with a fingerprint to replay it.
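
A sketch of what such a trace might record. The rule names, the trace shape, and the fingerprint scheme (a truncated SHA-256 over the policy and a catalog snapshot) are assumptions chosen to illustrate replayability, not the product's actual format.

```python
import hashlib
import json

# Hypothetical named rules; each takes a model row and a policy.
RULES = {
    "price_ceiling": lambda m, p: m["price"] <= p["max_price"],
    "intelligence_floor": lambda m, p: m["score"] >= p["min_score"],
}

def decide(catalog, policy):
    """Pick the cheapest passing model and record why each other one was rejected."""
    rejected, passing = {}, []
    for m in catalog:
        failed = next((name for name, rule in RULES.items() if not rule(m, policy)), None)
        if failed:
            rejected[m["name"]] = failed  # first rule that filtered this model
        else:
            passing.append(m)
    winner = min(passing, key=lambda m: m["price"])["name"] if passing else None
    # Same policy and catalog snapshot always hash to the same fingerprint.
    payload = json.dumps({"policy": policy, "catalog": catalog}, sort_keys=True)
    fingerprint = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return {"winner": winner, "rejected": rejected, "fingerprint": fingerprint}

catalog = [
    {"name": "model-a", "price": 0.30, "score": 88},
    {"name": "model-b", "price": 0.02, "score": 74},
    {"name": "model-c", "price": 0.01, "score": 60},
]
trace = decide(catalog, {"max_price": 0.05, "min_score": 70})
```

Here `trace["rejected"]` maps each losing model to the rule that removed it, and rerunning `decide` on the same inputs reproduces the same fingerprint.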

composition

Route multi-step tasks

Each step gets its own policy: classify, draft, verify, return. A workflow is just policies composed into one bounded call.
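
A sketch of per-step policies resolved into one plan. It reads "bounded" as a total price budget, which is an assumption; the catalog, step policies, and numbers are placeholders, and the final "return" step is omitted because it needs no model.

```python
# Hypothetical catalog and per-step policies; all numbers are placeholders.
CATALOG = [
    {"name": "small", "price": 0.01, "score": 65},
    {"name": "mid", "price": 0.05, "score": 82},
    {"name": "large", "price": 0.25, "score": 92},
]

WORKFLOW = [
    ("classify", {"min_score": 60, "max_price": 0.02}),
    ("draft", {"min_score": 80, "max_price": 0.30}),
    ("verify", {"min_score": 85, "max_price": 0.30}),
]

def pick(policy):
    """Cheapest catalog entry that satisfies one step's policy."""
    passing = [
        m for m in CATALOG
        if m["score"] >= policy["min_score"] and m["price"] <= policy["max_price"]
    ]
    if not passing:
        raise LookupError(f"no model passes {policy}")
    return min(passing, key=lambda m: m["price"])

def plan(workflow, budget):
    """Resolve one model per step, then enforce a total price bound."""
    chosen = {step: pick(policy) for step, policy in workflow}
    total = sum(m["price"] for m in chosen.values())
    if total > budget:
        raise ValueError(f"plan costs {total:.2f}, over budget {budget:.2f}")
    return {step: m["name"] for step, m in chosen.items()}

assignment = plan(WORKFLOW, budget=0.50)
```

Each step lands on the cheapest model that clears its own floor, so a light classification step does not pay for the model the verification step needs.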

mechanism first, workload second

One decision layer, many workloads.

We do not ship a separate product for each workload. The mechanism is the same in every case: a policy your backend generates and sends with the call, evaluated as filter → rank → select → fallback over the live catalog.

support · RAG · agents · batch extraction · compliance · legal review

Whatever the workload, the primitive underneath is the same: policy is the thing you send, and the cheapest model that passes your rules answers — with a trace to prove it.

Find your problem, then send a policy.

Cost, reliability, hardcoded names, tenant rules, traces — one primitive sits under all of them. Start with the use case that hurts most, or join the waitlist and we'll help you wire it in.

Read the docs

No SDK rewrite · Your provider keys · Every request traced