Runtime router · OpenAI-compatible

Stop hardcoding model choices.

Route each LLM call to the cheapest model that still passes your rules. Keep your OpenAI SDK, bring your provider keys, and get a trace showing why each model was accepted, rejected, or used.

No SDK rewrite · Your provider keys · Cheapest model that passes · Automatic failover · Every request traced
POST api.unhardcoded.com/v1/chat/completions
Request · generated at runtime, sent with the call · policy_ir
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.unhardcoded.com/v1",
  apiKey: process.env.UNHARDCODED_KEY,
});

const res = await client.chat.completions.create({
  model: "policy:support", // free-form trace label
  policy_ir, // the routing rule your backend generated
  messages,
});
policy_ir hashed & validated before it runs
Response · routing decision ready
deepseek-v4-flash    score 86   $0.011   below floor
mistral-small-4      no tools   $0.014   filtered
gemini-3.5-flash     score 94   $0.018   selected
claude-sonnet-4-6    score 95   $0.041   over floor
gpt-5.5              score 98   $0.062   over floor
claude-opus-4-8      score 97   $0.121   over floor
cost: $0.018 · ↓71% vs gpt-5.5
trace: req_8f41
latency: 412 ms
failover: claude-sonnet-4-6 → gpt-5.5 · standby cascade
The policy travels with the request.
Demo trace using a fixed example request; no inference runs in your browser. Replay it any time.
how it works

From hardcoded model to runtime decision.

A pinned model name is a decision frozen in application code — it overpays, fails without a plan, and no one can explain it later. unhardcoded turns model selection into a rule your backend generates and sends with each call.

Cost

Easy calls still hit your most expensive model.

Reliability

Provider failures need manual fallback logic in your app code.

Control

No one can explain why a model was used for a specific request.

Send the rule with the call, and the router does the rest:

1. Generate

Your backend builds a policy_ir — a small JSON rule — at request time, from your own tenants, tiers, and logic.

2. Send with the call

Attach it to one OpenAI-compatible request. The router validates it against the grammar and hashes it; unknown ops or fields are rejected.

3. Route & trace

The router sends the call, over your own keys, to the cheapest model that passes your rules, fails over if needed, and writes a replayable, auditable trace.
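The selection step can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the router's implementation: the `Candidate`, `Rules`, and `route` names are hypothetical, the quality floor of 90 and the two-model fallback cap are guesses that happen to fit the demo trace, and the `tools` flags are inferred from it. The scores and costs are the demo figures from the top of the page.

```typescript
// Hypothetical types for illustration; not the router's actual API.
type Candidate = { model: string; score: number | null; cost: number; tools: boolean };
type Rules = { requireTools: boolean; qualityFloor: number; maxFallbacks: number };
type Decision = {
  selected: string | null;
  failover: string[];
  verdicts: Record<string, string>;
};

// Figures copied from the demo trace; the `tools` flags are inferred from it.
const catalog: Candidate[] = [
  { model: "deepseek-v4-flash", score: 86, cost: 0.011, tools: true },
  { model: "mistral-small-4", score: null, cost: 0.014, tools: false },
  { model: "gemini-3.5-flash", score: 94, cost: 0.018, tools: true },
  { model: "claude-sonnet-4-6", score: 95, cost: 0.041, tools: true },
  { model: "gpt-5.5", score: 98, cost: 0.062, tools: true },
  { model: "claude-opus-4-8", score: 97, cost: 0.121, tools: true },
];

function route(candidates: Candidate[], rules: Rules): Decision {
  const verdicts: Record<string, string> = {};
  const passing: Candidate[] = [];
  for (const c of candidates) {
    if (rules.requireTools && !c.tools) {
      verdicts[c.model] = "filtered";
    } else if (c.score === null || c.score < rules.qualityFloor) {
      verdicts[c.model] = "below floor";
    } else {
      verdicts[c.model] = "over floor";
      passing.push(c);
    }
  }
  // Cheapest passing model wins; the next cheapest form the standby cascade.
  passing.sort((a, b) => a.cost - b.cost);
  const [first, ...rest] = passing;
  if (first) verdicts[first.model] = "selected";
  return {
    selected: first ? first.model : null,
    failover: rest.slice(0, rules.maxFallbacks).map((c) => c.model),
    verdicts,
  };
}

const decision = route(catalog, { requireTools: true, qualityFloor: 90, maxFallbacks: 2 });
console.log(decision.selected, decision.failover);
```

With those assumed rules the sketch reproduces the demo decision: gemini-3.5-flash is selected, and claude-sonnet-4-6 then gpt-5.5 stand by as the failover cascade.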

policy_ir

The rule is just data. It travels with the call and is validated and hashed before it runs, and the decision it produces is traced.
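To make "validated and hashed" concrete, here is a rough sketch of what such a check could look like. Everything specific in it is an assumption: the term shape, the `KNOWN_OPS` table, and the sorted-key canonical form are stand-ins, since the real grammar and canonical encoding are defined by the open spec. Only `createHash` from Node's `crypto` module is a real API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical term shape and op table; the real grammar lives in the open spec.
type Term = { op: string; [field: string]: unknown };
const KNOWN_OPS = new Map<string, string[]>([
  ["require", ["capability"]],
  ["floor", ["score"]],
  ["fallback", ["max"]],
]);

// Reject anything the (assumed) grammar does not declare.
function validate(policy: unknown): Term[] {
  if (!Array.isArray(policy)) throw new Error("policy_ir must be a JSON array");
  for (const term of policy) {
    if (typeof term !== "object" || term === null || typeof term.op !== "string") {
      throw new Error("each term needs a string op");
    }
    const allowed = KNOWN_OPS.get(term.op);
    if (!allowed) throw new Error(`unknown op: ${term.op}`);
    for (const field of Object.keys(term)) {
      if (field !== "op" && !allowed.includes(field)) {
        throw new Error(`unknown field: ${field}`);
      }
    }
  }
  return policy as Term[];
}

// Assumed canonical form: JSON with object keys sorted at every level.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

function fingerprint(policy: Term[]): string {
  return createHash("sha256").update(canonicalize(policy)).digest("hex");
}

const a = validate([{ op: "floor", score: 90 }, { op: "fallback", max: 2 }]);
const b = validate([{ score: 90, op: "floor" }, { max: 2, op: "fallback" }]);
// Key order does not affect the fingerprint because keys are sorted first.
console.log(fingerprint(a) === fingerprint(b));
```

Hashing a canonical form rather than the raw request bytes is what lets two backends that serialize the same rule differently end up with the same fingerprint in the trace.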

See how policies work
where teams use it

One rule, many workloads.

Support

Cheap models for routine replies; stronger models when tools or a quality floor are required.

RAG

Require long context only when the request actually needs it.

Agents

Route planning, tool use, and verification with different floors.

Batch jobs

Run high-volume extraction, classification, and evals on the cheapest passing model.

Compliance

Keep region, capability, and budget rules visible in every trace.

Workflows

Multi-step flows where each step carries its own policy. See workflows in Product →
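As an illustration of per-step policies, the sketch below builds a different rule for the planning, tool-use, and verification steps of an agent flow at request time. The term shape, the `buildPolicy` helper, the tier names, and every floor value are hypothetical; the real policy_ir grammar is defined by the open spec.

```typescript
// Hypothetical term shape; the real policy_ir grammar is defined by the open spec.
type Term =
  | { op: "require"; capability: string }
  | { op: "floor"; score: number }
  | { op: "fallback"; max: number };
type Step = "plan" | "tool_use" | "verify";
type Tier = "free" | "pro";

// Assumed floors per step; tune these against your own evals.
const BASE_FLOOR: Record<Step, number> = { plan: 90, tool_use: 85, verify: 95 };

function buildPolicy(step: Step, tier: Tier): Term[] {
  const policy: Term[] = [];
  // Only the tool-use step needs a tool-capable model.
  if (step === "tool_use") policy.push({ op: "require", capability: "tools" });
  // Paying tenants get a slightly higher floor in this sketch.
  const bump = tier === "pro" ? 3 : 0;
  policy.push({ op: "floor", score: BASE_FLOOR[step] + bump });
  policy.push({ op: "fallback", max: 2 });
  return policy;
}

const steps: Step[] = ["plan", "tool_use", "verify"];
const flow = steps.map((step) => ({ step, policy_ir: buildPolicy(step, "pro") }));
console.log(JSON.stringify(flow, null, 2));
```

Because each rule is plain data built by your backend, changing a floor for one tenant or one step is a code change in your service, not a config edit in someone else's dashboard.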

trust from precision, not promises

Built for control, not lock-in.

No vanity metrics, no logos we can't show yet — here is exactly what the router does with your traffic and your keys.

Your provider accounts stay yours

Bring your OpenAI, Anthropic, and Gemini keys. unhardcoded routes through them — no token resale, no markup on inference.

Open policy layer

The policy language, canonical encoding, and reference interpreter are open source under Apache-2.0. Self-host against your own catalog if you want. Read the spec →

Every decision is replayable

Each routing decision is recorded and replayable by fingerprint — reconstruct months later exactly why a model was chosen, from the policy_ir that was sent.

Managed host runs the parts you don't want to maintain

Provider modules, live catalog, key management, traces, history, failover, and uptime — maintained for you, priced per run.

Unlike config-based gateways, unhardcoded sends the routing rule with each call. Read the comparison →

pricing

Priced per run, not per token.

Start free and keep your own provider keys; you pay your providers for inference. What you pay us covers running and maintaining the host you would otherwise build yourself, not a margin on tokens. A flow of up to 5 nodes counts as one billable run.

Free

$0 · 10K runs / mo

One policy, BYO keys, full replayable traces.

Pro

$49 / mo · 250K runs

Unlimited policies & flows, quality floors, 90-day history.

Scale

$199 / mo · 1.5M runs

Roles, SSO, audit log, shared policy library, priority support.

All figures illustrative · priced per run, not per token · a flow = one billable run up to 5 nodes · keep your provider accounts

FAQ · before you wire it in

Questions, answered plainly.

Do I have to rewrite my app?
No. unhardcoded is OpenAI-compatible. Point the SDK's baseURL at the endpoint, replace the model name with a policy:* name, and send the policy_ir with the call. Your messages and parameters pass through unchanged.
What does "send the policy with the call" actually mean?
The policy is a policy_ir, a JSON array that travels in the request body. The router hashes it, validates it (rejecting unknown ops or undeclared fields), then evaluates it over the live catalog. The same call site can carry a different policy on every request; nothing is pinned to a dashboard config.
Are you reselling tokens?
No. You bring your own provider keys and pay your providers directly for inference. We price the routing per run, never the tokens.
Can I run it myself?
Yes. The policy_ir term language, its canonical encoding, and the reference interpreter are open source. You can self-host the router against your own catalog, or use the managed endpoint for traces, history, and support.
Why pay if the core is open source?
The open core gives you the policy language and interpreter. The hosted product gives you the maintained host: provider modules, live catalog data, key management, OAuth, traces, history, failover, uptime, and support. You can self-host, but most teams do not want to operate their own LLM routing infrastructure.
What happens when a model fails?
The fallback step in your policy decides. By default the router moves to the next candidate that clears the floor, cheapest first. Every hop is written to the trace with its latency, cost, and reason.
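The default failover behavior described in the last answer can be sketched as a loop over the cascade. This is an illustration only: the `Hop` shape, the `runWithFailover` helper, the stub provider, its latencies, and the choice to record zero cost for a failed hop are all assumptions, not the router's trace format.

```typescript
// Hypothetical shapes for illustration; not the router's actual trace format.
type Hop = { model: string; ok: boolean; latencyMs: number; cost: number; reason: string };
type Attempt = { ok: boolean; latencyMs: number };
type CallModel = (model: string) => Attempt;

function runWithFailover(
  cascade: { model: string; cost: number }[],
  call: CallModel,
): { served: string | null; hops: Hop[] } {
  const hops: Hop[] = [];
  for (const { model, cost } of cascade) {
    const attempt = call(model);
    // Record every hop, successful or not, so the decision can be replayed.
    hops.push({
      model,
      ok: attempt.ok,
      latencyMs: attempt.latencyMs,
      cost: attempt.ok ? cost : 0, // assumption: a failed hop is not billed
      reason: attempt.ok ? "served" : "provider error, moved to next candidate",
    });
    if (attempt.ok) return { served: model, hops };
  }
  return { served: null, hops };
}

// Cascade from the demo trace, already ordered cheapest first.
const cascade = [
  { model: "gemini-3.5-flash", cost: 0.018 },
  { model: "claude-sonnet-4-6", cost: 0.041 },
  { model: "gpt-5.5", cost: 0.062 },
];

// Stub provider: the first model fails, the rest succeed. Latencies are made up.
const stub: CallModel = (model) =>
  model === "gemini-3.5-flash" ? { ok: false, latencyMs: 120 } : { ok: true, latencyMs: 430 };

const result = runWithFailover(cascade, stub);
console.log(result.served, result.hops.length);
```

In this run the stubbed failure on gemini-3.5-flash produces two hops, and the request is served by claude-sonnet-4-6, the next cheapest model in the cascade.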

Stop hardcoding. Send the decision.

Point your traffic at one endpoint, generate a policy or workflow at runtime, and send it with the call. unhardcoded routes through your provider keys and traces every decision.

No credit card · Keep your provider accounts · Every request traced