use cases

Where model choice shouldn’t live in app code.

Use unhardcoded wherever model choice should not be buried in application code. The policy travels with the request, the cheapest model that passes answers, and the trace shows why — the mechanism is the same everywhere. Pick the place it bites.

one mechanism, five places it bites

Same policy, same trace. Pick where it hurts.

It’s the same policy and the same kind of trace every time — only the decision inside changes. Switch tabs to see it re-shot for each place.

cost

Easy requests should not hit expensive models by default.

Set a quality floor and a price ceiling; the router takes the cheapest model that clears them — here gemini-3.5-flash at $0.018, not a pinned premium model.

See the cost use case

cost · cheapest that passesreq-204815

deepseek-v4-flash0.42 · $0.011below floor

gemini-3.5-flash0.54 · $0.018selected

gpt-5.50.60 · $0.063over floor

cost$0.018 · ↓71% vs gpt-5.5

The cheapest model that clears the floor wins — the trace shows the spread you’d otherwise overpay.

The workload changes; the primitive does not. Policy is the thing you send — and the same trace proves every decision.

Move model choice out of app code, then send a policy.

Cost, reliability, tenant rules, workflows, audit — one control layer sits under all of them. Point your existing SDK at one endpoint, send a policy with the call, keep your provider keys, and get a trace for every decision.

Read the docs

No SDK rewriteYour provider keysEvery request traced