use case · routing traces

Know why every model decision happened.

Logs tell you which model answered. The trace tells you why it was chosen — the policy sent, every candidate considered, which models were rejected and by which rule, and the fingerprint to replay it months later. New here? unhardcoded is a runtime LLM policy layer — you send a policy with your OpenAI-compatible call, and it routes to the cheapest model that passes your rules, over your own provider keys.

the problem

Your logs show what answered, not why it was chosen.

A line in your logs says gemini-3.5-flash returned a reply. It doesn't say which models were in the running, which rule dropped the cheaper one, whether a fallback fired, or what the call would have cost on your old baseline. When a finance lead asks "why did this run cost what it did," or a customer disputes an answer, you're reconstructing the decision from memory and scattered config.

the unhardcoded way

Every run writes a replayable receipt.

The router runs your policy — filter → rank → select → fallback — and records the decision in one object. Nothing is implicit. The receipt below is the proof that the rules were followed; every row is in it because the router put it there.

trace · req-204815 · 200 OK

selected: gemini-3.5-flash · score 0.54

reason: tools ✓, over quality floor, price ceiling passed, cheapest survivor

rejected: deepseek-v4-flash · filtered: below quality floor (score 0.42 < 0.5)

rejected: mistral-small-4 · filtered: no tools

fallback: claude-sonnet-4-6 → gpt-5.5 · standby cascade, not triggered this run

latency: 412 ms

cost: $0.018 · ↓71% vs gpt-5.5 baseline ($0.063) · illustrative

policy: support · fingerprint 301140696-1054914287 · sigma-pol/v1

Every rule on the receipt points at a catalog field you can inspect — the quality floor is bench_intelligence, not a black box. The same receipt is written whether the run was one call or a 5-node workflow that stitches one trace across every step. Read the trace schema →
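The filter → rank → select → fallback flow can be sketched in a few lines. This is an illustration under stated assumptions, not the router's actual implementation: only bench_intelligence is a field name taken from the text, while the Model and Policy classes, the tools and price fields, the price ceiling value, and the prices of the two rejected models are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    name: str
    bench_intelligence: float  # catalog field named in the text above
    tools: bool                # hypothetical field: supports tool calls
    price: float               # hypothetical field: illustrative USD per run


@dataclass
class Policy:
    name: str
    quality_floor: float
    price_ceiling: float
    require_tools: bool
    fallback: list = field(default_factory=list)


def route(policy, catalog):
    """Filter the catalog, rank survivors by price, pick the cheapest, record why."""
    rejected, survivors = [], []
    for m in catalog:
        if policy.require_tools and not m.tools:
            rejected.append((m.name, "no tools"))
        elif m.bench_intelligence < policy.quality_floor:
            rule = f"below quality floor (score {m.bench_intelligence} < {policy.quality_floor})"
            rejected.append((m.name, rule))
        elif m.price > policy.price_ceiling:
            rejected.append((m.name, "over price ceiling"))
        else:
            survivors.append(m)
    ranked = sorted(survivors, key=lambda m: m.price)  # cheapest first
    winner = ranked[0] if ranked else None
    return {
        "policy": policy.name,
        "selected": winner.name if winner else None,
        "score": winner.bench_intelligence if winner else None,
        "rejected": rejected,
        "fallback": list(policy.fallback),  # standby cascade, recorded even if unused
    }


CATALOG = [
    Model("deepseek-v4-flash", 0.42, True, 0.009),  # price is made up
    Model("mistral-small-4", 0.55, False, 0.012),   # score and price are made up
    Model("gemini-3.5-flash", 0.54, True, 0.018),
]
SUPPORT = Policy("support", quality_floor=0.5, price_ceiling=0.05,
                 require_tools=True, fallback=["claude-sonnet-4-6", "gpt-5.5"])

trace = route(SUPPORT, CATALOG)
print(trace["selected"], trace["rejected"])
```

The point of the sketch is that every rejection is appended with the rule that caused it, so the receipt is a by-product of routing rather than something reconstructed afterwards.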

who uses it

One receipt, three different questions.

The trace is the same object for everyone who needs to answer for a model decision — they just read different rows of it.

Engineering · investigate a reply

A reply looks wrong, or a customer disputes one. Pull that request's trace by its id and read the decision: which model produced it, which rule let it through, and whether a fallback quietly changed the pick.

Finance · cost review

Every run carries the baseline it beat. Roll the traces up to see what routing to the cheapest passing model actually saved — per workload, with the pricier rejected candidates listed.

Audit · governance

Show the rule that was enforced on a request, the policy fingerprint that produced it, and the models excluded by region or capability — on the record, not reconstructed after the fact.
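For the cost review, the roll-up is plain arithmetic over the cost and baseline on each trace. A minimal sketch, assuming each trace is available as a dict; the key names policy, cost and baseline_cost are hypothetical, and the single trace uses the illustrative figures from the receipt above.

```python
from collections import defaultdict


def roll_up(traces):
    """Sum cost and baseline per workload, then report the savings."""
    totals = defaultdict(lambda: {"cost": 0.0, "baseline": 0.0})
    for t in traces:
        totals[t["policy"]]["cost"] += t["cost"]
        totals[t["policy"]]["baseline"] += t["baseline_cost"]

    report = {}
    for workload, s in totals.items():
        saved = s["baseline"] - s["cost"]
        pct = 100 * saved / s["baseline"] if s["baseline"] else 0.0
        report[workload] = {"saved": round(saved, 3), "saved_pct": round(pct)}
    return report


# One trace, using the receipt's figures: $0.018 against a $0.063 baseline.
traces = [{"policy": "support", "cost": 0.018, "baseline_cost": 0.063}]
report = roll_up(traces)
print(report)
```

With the receipt's numbers the saving is 0.063 − 0.018 = 0.045, which is 71% of the baseline once rounded, matching the ↓71% shown on the cost row.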

replayable

Reconstruct the decision months later.

The fingerprint on the receipt, 301140696-1054914287 (sigma-pol/v1), pins the exact policy that was sent. Dry-run it against the catalog with POST /x/rank and GET /x/fields to confirm the same model still passes, with no inference and no production traffic. The decision is auditable because it is reproducible.
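The replay check itself is a comparison between the recorded decision and a fresh dry run. A rough sketch only: the request and response shapes of POST /x/rank are not shown on this page, so the dry run is represented here by a hypothetical callable that takes a fingerprint and returns the names of the models that currently pass, cheapest first. The trace key names are assumptions as well.

```python
def replay(trace, dry_run):
    """Compare a recorded decision with a fresh dry run of the same policy."""
    ranked = dry_run(trace["fingerprint"])
    return {
        "still_passes": trace["selected"] in ranked,
        "still_wins": bool(ranked) and ranked[0] == trace["selected"],
    }


# Stand-in for the real dry-run call; its return shape is an assumption.
def fake_dry_run(fingerprint):
    return ["gemini-3.5-flash"]


recorded = {"fingerprint": "301140696-1054914287", "selected": "gemini-3.5-flash"}
result = replay(recorded, fake_dry_run)
print(result)
```

Separating "still passes" from "still wins" is a design choice for the sketch: a model can keep satisfying the rules while a newer, cheaper candidate takes first place.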

Pair traces with cost control to prove the savings are real, not estimated — every run carries the baseline it beat.

Make every model decision answer for itself.

A replayable trace on every run — the policy sent, the rejected candidates and the rule, the winner and its cost versus baseline. Join the waitlist and put the decision on the record.

No SDK rewrite · Your provider keys · Every request traced