open source · policy routing for AI models

Build AI systems without hardcoded model choices.

Send a policy with each request. unhardcoded filters the live model catalog, picks the cheapest model that satisfies your rules, runs it through your provider keys, and returns a trace of the decision.

request

Prompt + policy

tools · intel ge 0.5 · cheapest

→

candidate | price_out | intel | verdict
deepseek-v4-flash | $0.40 | 0.465 | below floor
minimax-m2.7 | $0.50 | 0.496 | below floor
deepseek-v4-pro | $1.50 | 0.515 | selected
glm-5.1 | $2.00 | 0.514 | passes
gpt-5.5 | $10.00 | 0.602 | passes

→

trace

deepseek-v4-pro

cheapest passing · 412 ms
fp 301140696-1054914287
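The decision in the table above can be reproduced in a few lines. This is a minimal sketch, not unhardcoded's implementation: the `Candidate` shape and the `cheapestAbove` helper are hypothetical, and it assumes every listed model already satisfies the tools requirement.

```typescript
// Hypothetical candidate shape: the three columns from the table above.
interface Candidate {
  model: string;
  priceOut: number; // dollar price from the price_out column
  intel: number;    // score from the intel column
}

const catalog: Candidate[] = [
  { model: "deepseek-v4-flash", priceOut: 0.4, intel: 0.465 },
  { model: "minimax-m2.7", priceOut: 0.5, intel: 0.496 },
  { model: "deepseek-v4-pro", priceOut: 1.5, intel: 0.515 },
  { model: "glm-5.1", priceOut: 2.0, intel: 0.514 },
  { model: "gpt-5.5", priceOut: 10.0, intel: 0.602 },
];

// Filter: drop anything under the floor. Rank: cheapest first. Select: take the top one.
// filter() returns a new array, so the sort does not reorder the catalog itself.
function cheapestAbove(floor: number, candidates: Candidate[]): Candidate | undefined {
  return candidates
    .filter((c) => c.intel >= floor)
    .sort((a, b) => a.priceOut - b.priceOut)[0];
}

const pick = cheapestAbove(0.5, catalog);
console.log(pick?.model); // deepseek-v4-pro
```

Raising the floor changes the pick without touching application code, which is the point of sending the policy with the request.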

Route one call

Point your SDK at the host, attach a policy, read the trace.

Quickstart →

Run a workflow

Compose routed calls into a DAG: triage, fan-out, judge, merge.

See the patterns →

Read a trace

Every decision leaves a receipt: who was considered, who won, why.

Inspect a decision →

the mental model

The model routing loop

One policy decides one call. Same inputs, same catalog, same decision. Every time, and written down.

Request

Your call hits the OpenAI-compatible endpoint, carrying a policy.

Policy sigma-pol/v2

A small term: filter, rank, select, mutate, fallback.

Catalog

The live set of (provider, model) candidates with prices, benchmarks, capabilities.

Filter

Drop anything that fails a rule. No silent downgrade.

Rank

Score the survivors: cheapest, strongest, fastest, your call.

Select

Take the top one, or a top_k failover cascade.

Run

Inference over your provider keys, with failover on error.

Trace

A receipt of every candidate, the winner, and the fingerprint.
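The last three stages of the loop (select, run, trace) can be sketched as a bounded failover cascade. Everything named here is an assumption for illustration: `runCascade`, `Attempt`, `RunResult`, and the `callProvider` callback are hypothetical, and the real trace schema may differ.

```typescript
// Hypothetical shapes; the real trace schema may differ.
interface Attempt {
  model: string;
  ok: boolean;
  error?: string;
}

interface RunResult {
  output: string | null; // null when every candidate in the cascade failed
  attempts: Attempt[];   // one entry per candidate tried, in order
}

// `ranked` lists the survivors, best first. `topK` bounds the failover cascade.
// `callProvider` stands in for inference over your provider keys.
async function runCascade(
  ranked: string[],
  topK: number,
  callProvider: (model: string) => Promise<string>,
): Promise<RunResult> {
  const attempts: Attempt[] = [];
  for (const model of ranked.slice(0, topK)) {
    try {
      const output = await callProvider(model);
      attempts.push({ model, ok: true });
      return { output, attempts };
    } catch (err) {
      // Failover on error: record the failure and move to the next candidate.
      const message = err instanceof Error ? err.message : String(err);
      attempts.push({ model, ok: false, error: message });
    }
  }
  return { output: null, attempts };
}

// Usage: the first provider call fails, so the cascade falls through to the second.
async function demo(): Promise<RunResult> {
  return runCascade(["deepseek-v4-pro", "glm-5.1", "gpt-5.5"], 2, async (model) => {
    if (model === "deepseek-v4-pro") throw new Error("provider timeout");
    return `answer from ${model}`;
  });
}
```

Recording every attempt, including the failed ones, is what lets the trace show which candidate actually ran and why earlier ones did not.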

How policies work, in depth →

compose routed calls

Workflow patterns

A workflow is a bounded, acyclic graph of routed steps. Each node carries its own policy and routes independently; the whole graph writes one stitched trace. This is where unhardcoded stops being "routing with rules" and becomes a system.

Support ticket (linear · guard)

Triage on the cheapest model, draft on the cheapest model above a quality floor, then hand off to a strong no-log guard that can refuse before anything ships.

Why it matters: the last step is a policy-enforced gate that can abort the send, not a hoped-for check.

Ticket

→

triage: classify · extract id (deepseek-v4-flash)

→

draft: reply · quality floor (gemini-3.1-pro-preview)

→

guard: brand · PII · no-log (gpt-5.5 · can abort)

→

Reply

View flow_ir

flow.support-ticket.json

["flow", {
  "u": {"kind": "input"},
  "t": {"kind": "llm", "system": "Classify the ticket and extract the account id as JSON.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["has_cap", "supports_json_mode"]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "d": {"kind": "llm", "system": "Write a reply using the ticket and the triage.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_intelligence", "ge", 0.55]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "t"], "template": "Ticket:\n$1\n\nTriage:\n$2"},
  "g": {"kind": "llm", "system": "Check brand voice, PII, refund limits. Refuse if any fail.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "no_log"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "out": {"kind": "output", "inputs": ["g"]}
}]
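How `inputs` and `template` fit together in the flow_ir above can be sketched as follows. This is an illustration under stated assumptions, not the actual executor: `FlowNode`, `topoOrder`, and `renderPrompt` are hypothetical names, policies are omitted, and joining inputs with blank lines when no template is given is a guess rather than documented behavior.

```typescript
// Hypothetical, simplified node shape based on the flow_ir above (policies omitted).
interface FlowNode {
  kind: "input" | "llm" | "output";
  inputs?: string[];
  template?: string; // "$1", "$2", ... refer to inputs by position
}

type Flow = { [id: string]: FlowNode };

// Order nodes so each one comes after its inputs; throw on a cycle or unknown input.
function topoOrder(flow: Flow): string[] {
  const order: string[] = [];
  const state: { [id: string]: "visiting" | "done" } = {};
  const visit = (id: string): void => {
    if (state[id] === "done") return;
    if (state[id] === "visiting") throw new Error(`cycle at node ${id}`);
    const node = flow[id];
    if (!node) throw new Error(`unknown node ${id}`);
    state[id] = "visiting";
    for (const dep of node.inputs ?? []) visit(dep);
    state[id] = "done";
    order.push(id);
  };
  for (const id of Object.keys(flow)) visit(id);
  return order;
}

// Fill "$1", "$2", ... with upstream outputs. Without a template, this sketch
// joins the inputs with blank lines (an assumption, not documented behavior).
function renderPrompt(node: FlowNode, values: string[]): string {
  if (!node.template) return values.join("\n\n");
  return node.template.replace(/\$(\d+)/g, (_m, n: string) => values[Number(n) - 1] ?? "");
}

const flow: Flow = {
  u: { kind: "input" },
  t: { kind: "llm", inputs: ["u"] },
  d: { kind: "llm", inputs: ["u", "t"], template: "Ticket:\n$1\n\nTriage:\n$2" },
  g: { kind: "llm", inputs: ["d"] },
  out: { kind: "output", inputs: ["g"] },
};

const order = topoOrder(flow);
const prompt = renderPrompt(flow["d"], ["Where is my refund?", '{"account_id":"a-17"}']);
console.log(order.join(" → ")); // u → t → d → g → out
console.log(prompt);
```

Rejecting cycles up front is what keeps the graph acyclic, so each node runs exactly once after all of its inputs are available.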

Draft → critique → revise (linear · refine)

A cheap first draft, a strong critic that lists the flaws, then a rewrite that fixes every point.

Why it matters: a quality jump on a budget, with most of the tokens running on the cheap model.

Question

→

draft: cheapest (deepseek-v4-flash)

→

critique: list the flaws (gpt-5.5)

→

revise: fix every point · fan-in (gpt-5.5)

→

Answer

View flow_ir

flow.draft-critique-revise.json

["flow", {
  "u": {"kind": "input"},
  "d": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "c": {"kind": "llm", "system": "Critique the draft: list concrete flaws and gaps.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "r": {"kind": "llm", "system": "Rewrite the answer, fixing every point in the critique.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "d", "c"], "template": "Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3"},
  "out": {"kind": "output", "inputs": ["r"]}
}]

Best-of-N + judge (ensemble)

N seeded draws from the same strong policy, then a judge picks the single best.

Why it matters: sample spreads the draws deterministically across the top of the ranking, so the diversity is reproducible rather than luck.

Prompt

→

draft A: sample · T=0.5 (gpt-5.5)

draft B: sample · T=0.5 (gpt-5.4)

draft C: sample · T=0.5 (gemini-3.1-pro-preview)

→

judge: rank · pick winner (gpt-5.5)

→

Answer

View flow_ir

flow.best-of-n.json

["flow", {
  "u": {"kind": "input"},
  "n1": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  // n2, n3: the same policy, two more seeded draws
  "j": {"kind": "llm", "system": "Pick the single best candidate; return it verbatim.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["n1", "n2", "n3"], "template": "A:\n$1\n\nB:\n$2\n\nC:\n$3"},
  "out": {"kind": "output", "inputs": ["j"]}
}]
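The page does not spell out how `sample` picks a candidate, so the following is only one conventional reading: a softmax over the ranking scores at temperature T, driven by a seeded generator so that the same seed always yields the same draw. The names `mulberry32` (a well-known small PRNG), `sampleOne`, and `Scored` are choices made for this sketch, and the scores are borrowed from the example table at the top of the page.

```typescript
// Small seeded PRNG (mulberry32): same seed, same sequence of numbers in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Scored {
  model: string;
  score: number;
}

// Draw one candidate with probability proportional to exp(score / temperature).
// This is an assumed semantics for ["sample", T], not the documented one.
function sampleOne(ranked: Scored[], temperature: number, rand: () => number): Scored {
  if (ranked.length === 0) throw new Error("nothing to sample from");
  const max = Math.max(...ranked.map((c) => c.score));
  const weights = ranked.map((c) => Math.exp((c.score - max) / temperature));
  const total = weights.reduce((sum, w) => sum + w, 0);
  let r = rand() * total;
  for (let i = 0; i < ranked.length; i++) {
    r -= weights[i];
    if (r < 0) return ranked[i];
  }
  return ranked[ranked.length - 1]; // guard against floating-point leftovers
}

const pool: Scored[] = [
  { model: "gpt-5.5", score: 0.602 },
  { model: "deepseek-v4-pro", score: 0.515 },
  { model: "glm-5.1", score: 0.514 },
];

// One seed per draft node; rerunning with the same seeds repeats the same picks.
const draws = [1, 2, 3].map((seed) => sampleOne(pool, 0.5, mulberry32(seed)).model);
console.log(draws);
```

Because the randomness comes only from the seed, a replay with the same catalog and seeds reproduces the same three drafts.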

Model panel (fan-out · fuse)

Three models pinned by family draft in parallel; a fourth synthesizes the best single answer.

Why it matters: family_eq pins an exact model line, so a panel is reproducible, not "whatever was cheapest today."

Question

→

draft a: family_eq (gemini-3.1-pro-preview)

draft b: family_eq (claude-opus-4-8)

draft c: family_eq (deepseek-v4-flash)

→

fuse: synthesize · fan-in (gpt-5.5)

→

Answer

View flow_ir

flow.panel.json

["flow", {
  "u": {"kind": "input"},
  "a": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "gemini-3.1-pro-preview"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  // b: family_eq "claude-opus-4-8"  ·  c: family_eq "deepseek-v4-flash"
  "f": {"kind": "llm", "system": "Synthesize the single best answer from the drafts.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["a", "b", "c"]},
  "out": {"kind": "output", "inputs": ["f"]}
}]

Specialist split (fan-out · merge)

A reasoner and a coder run in parallel under different filters, then a merge step fuses them.

Why it matters: each branch picks the best model for its job (reasoning vs coding) instead of one model doing everything.

Question

→

reason: is cap_reasoning (gpt-5.5)

code: coding top-5 (gpt-5.4)

→

merge: one answer · fan-in (gemini-3.1-pro-preview)

→

Answer

View flow_ir

flow.specialist-split.json

["flow", {
  "u": {"kind": "input"},
  "rz": {"kind": "llm", "system": "Reason through the problem step by step.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "cd": {"kind": "llm", "system": "Produce any code the problem needs.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_coding_rank", "le", 5]],
      ["field", "bench_coding"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "m": {"kind": "llm", "system": "Merge the reasoning and the code into one answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["rz", "cd"], "template": "Reasoning:\n$1\n\nCode:\n$2"},
  "out": {"kind": "output", "inputs": ["m"]}
}]

The full workflow guide: node kinds, graph limits, templates →

ship the first request

Quickstart

unhardcoded is OpenAI-compatible. Three changes from a normal call.

Point your SDK at the host.

Change the baseURL; everything else in the SDK stays the same.

client.ts

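import OpenAI from "openai";

// Same SDK as a normal OpenAI call; only the baseURL and the key change.
const client = new OpenAI({
  baseURL: "https://<your-host>/v1",
  apiKey: process.env.UNHARDCODED_KEY,
});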

Attach a policy to the call.

Build a policy_ir in your backend and send it alongside messages. Routing comes from the policy, so model is just a trace label.

route.ts

const res = await client.chat.completions.create({
  model: "policy:support",                             // trace label only
  // @ts-expect-error policy_ir is an unhardcoded extension, absent from the OpenAI SDK types
  policy_ir: ["policy",
    ["and", ["meets_req"], ["not", ["is", "disabled"]],
           ["cmp", "bench_intelligence", "ge", 0.5]],  // filter
    ["neg", ["normalize", ["field", "price_out"]]],    // rank: cheapest
    ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
  messages,
});
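Building the policy_ir in your backend can be as simple as a function that returns the nested array. The helper below is a hypothetical sketch: `cheapestWithFloor` and the loose `Term` type are not part of any SDK, and the position labels in the comments assume the five terms follow the order filter, rank, select, mutate, fallback listed in the mental model.

```typescript
// A policy term is a nested JSON array; this loose type is an assumption of the sketch.
type Term = string | number | { [key: string]: string } | Term[];

// Hypothetical helper: "cheapest model whose bench_intelligence is at least `floor`".
function cheapestWithFloor(floor: number): Term[] {
  return [
    "policy",
    // filter
    ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_intelligence", "ge", floor]],
    // rank: negated normalized price, so the cheapest candidate scores highest
    ["neg", ["normalize", ["field", "price_out"]]],
    // select
    ["argmax"],
    // mutate (presumably the identity, leaving the request unchanged)
    ["id"],
    // fallback
    ["always", { action: "next_candidate" }],
  ];
}

console.log(JSON.stringify(cheapestWithFloor(0.5)));
```

Keeping the floor as a parameter lets one helper serve several endpoints with different quality bars.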

Read the trace.

The response carries the decision: the chosen model, the candidates it ranked and rejected, and the policy fingerprint.

response · trace (illustrative)

{
  "chosen": { "model_family": "deepseek-v4-pro", "price_out": 1.5 },
  "trace": {
    "policy_fingerprint": "301140696-1054914287",
    "rejected": [{ "model_family": "deepseek-v4-flash", "reason": "cmp bench_intelligence ge 0.5" }],
    "total_latency_ms": 425
  }
}
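A small typed reader makes the trace easier to log or assert on. The types below cover only the fields in the illustrative response above; `RoutedResponse`, `Rejected`, and `summarize` are hypothetical names, and how the trace is attached to the SDK's response object is not shown here.

```typescript
// Hypothetical types covering only the fields in the illustrative response above.
interface Rejected {
  model_family: string;
  reason: string;
}

interface RoutedResponse {
  chosen: { model_family: string; price_out: number };
  trace: {
    policy_fingerprint: string;
    rejected: Rejected[];
    total_latency_ms: number;
  };
}

// Turn a decision into log lines: the winner, each rejection with its rule, the fingerprint.
function summarize(res: RoutedResponse): string[] {
  const lines = [`chosen ${res.chosen.model_family} in ${res.trace.total_latency_ms} ms`];
  for (const r of res.trace.rejected) {
    lines.push(`rejected ${r.model_family}: ${r.reason}`);
  }
  lines.push(`fingerprint ${res.trace.policy_fingerprint}`);
  return lines;
}

const example: RoutedResponse = {
  chosen: { model_family: "deepseek-v4-pro", price_out: 1.5 },
  trace: {
    policy_fingerprint: "301140696-1054914287",
    rejected: [{ model_family: "deepseek-v4-flash", reason: "cmp bench_intelligence ge 0.5" }],
    total_latency_ms: 425,
  },
};

const summary = summarize(example);
console.log(summary.join("\n"));
```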

Full quickstart: auth, dry-runs, a runnable example →

copy a starting point

Policy presets

Common routing patterns as cards. Read the rules, copy the policy, adjust the floor and ceiling.

Cheapest decent

Cut cost without dropping below a quality bar.

filter: tools-met · bench_intelligence ge 0.5

rank: cheapest price_out

fallback: next passing model

View JSON

cheapest-decent.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["cmp", "bench_intelligence", "ge", 0.5]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
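To make the filter term concrete, here is a sketch of how such a term could be evaluated against one catalog entry. The semantics are assumptions inferred from the presets on this page: `Entry`, `passes`, and the treatment of `meets_req`, `is`, `has_cap`, and `cmp` are hypothetical, and only the `ge` and `le` comparisons that appear in the examples are handled.

```typescript
// Hypothetical catalog entry: boolean flags, capability names, numeric fields.
interface Entry {
  meetsReq: boolean;                  // whether the entry satisfies the request's needs
  flags: string[];                    // e.g. "disabled", "no_log"
  caps: string[];                     // e.g. "supports_tools"
  fields: { [name: string]: number }; // e.g. bench_intelligence, price_out
}

type Filter = (string | number | Filter)[];

// Recursively evaluate a filter term; unknown terms fail loudly rather than pass silently.
function passes(term: Filter, e: Entry): boolean {
  const [op, ...args] = term;
  switch (op) {
    case "meets_req":
      return e.meetsReq;
    case "is":
      return e.flags.indexOf(args[0] as string) !== -1;
    case "has_cap":
      return e.caps.indexOf(args[0] as string) !== -1;
    case "not":
      return !passes(args[0] as Filter, e);
    case "and":
      return args.every((a) => passes(a as Filter, e));
    case "cmp": {
      const [field, cmpOp, bound] = args as [string, string, number];
      const value = e.fields[field];
      if (value === undefined) return false; // missing data fails the rule in this sketch
      if (cmpOp === "ge") return value >= bound;
      if (cmpOp === "le") return value <= bound;
      throw new Error(`unsupported comparison ${cmpOp}`);
    }
    default:
      throw new Error(`unknown filter term ${String(op)}`);
  }
}

const decent: Filter = ["and", ["meets_req"], ["not", ["is", "disabled"]],
  ["cmp", "bench_intelligence", "ge", 0.5]];
const flash: Entry = { meetsReq: true, flags: [], caps: [], fields: { bench_intelligence: 0.465 } };
const pro: Entry = { meetsReq: true, flags: [], caps: [], fields: { bench_intelligence: 0.515 } };

console.log(passes(decent, flash), passes(decent, pro)); // false true
```

Treating a missing field as a failure matches the "no silent downgrade" stance: a candidate without the data cannot sneak past the rule.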

Smart balance

No strong preference: weigh capability against price.

filter: tools-met · not disabled

rank: 0.6 intelligence + 0.4 cheap

fallback: next passing model

View JSON

smart-balance.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["add",
    ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
    ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
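A worked example helps show what the weighted rank does. The sketch below assumes `normalize` is min-max scaling to [0, 1] over the surviving candidates, which this page does not confirm, and it reuses the five models from the example table at the top of the page. `Row`, `normalize`, and `smartBalance` are hypothetical names.

```typescript
interface Row {
  model: string;
  bench_intelligence: number;
  price_out: number;
}

// Min-max scaling to [0, 1]; an assumption about what `normalize` does.
function normalize(xs: number[]): number[] {
  const lo = Math.min(...xs);
  const hi = Math.max(...xs);
  return xs.map((x) => (hi === lo ? 0 : (x - lo) / (hi - lo)));
}

// score = 0.6 * normalized intelligence + 0.4 * (negated normalized price), best first.
function smartBalance(rows: Row[]): { model: string; score: number }[] {
  const intel = normalize(rows.map((r) => r.bench_intelligence));
  const price = normalize(rows.map((r) => r.price_out));
  return rows
    .map((r, i) => ({ model: r.model, score: 0.6 * intel[i] + 0.4 * -price[i] }))
    .sort((a, b) => b.score - a.score);
}

const rows: Row[] = [
  { model: "deepseek-v4-flash", bench_intelligence: 0.465, price_out: 0.4 },
  { model: "minimax-m2.7", bench_intelligence: 0.496, price_out: 0.5 },
  { model: "deepseek-v4-pro", bench_intelligence: 0.515, price_out: 1.5 },
  { model: "glm-5.1", bench_intelligence: 0.514, price_out: 2.0 },
  { model: "gpt-5.5", bench_intelligence: 0.602, price_out: 10.0 },
];

const balanced = smartBalance(rows);
console.log(balanced.map((r) => `${r.model} ${r.score.toFixed(3)}`).join("\n"));
```

Under these assumptions gpt-5.5 ranks first: its full intelligence score (0.6) outweighs its full price penalty (0.4), while deepseek-v4-pro comes second. Shifting the weights toward price would reverse that order, which is the knob this preset exposes.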

Best intelligence

Critical task, no cost ceiling: take the most capable.

filter: tools-met · not disabled

rank: highest bench_intelligence

fallback: next passing model

View JSON

best-intelligence.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Reasoning only

The task needs a reasoning-capable model.

filter: + is cap_reasoning

rank: highest intelligence

fallback: next passing model

View JSON

reasoning-only.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Vision · cheapest

Image input: the cheapest model that can see.

filter: + is in_image

rank: cheapest price_out

fallback: next passing model

View JSON

vision-cheapest.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "in_image"]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Long-context RAG

Needs a large context window, then the cheapest that fits.

filter: + context ge 200000

rank: cheapest price_out

fallback: next passing model

View JSON

long-context-rag.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "context", "ge", 200000]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Agentic fleet

Best tool-users for an agent loop.

filter: + tools · bench_agentic_rank le 5

rank: highest bench_agentic

fallback: next passing model

View JSON

agentic-fleet.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["has_cap", "supports_tools"], ["cmp", "bench_agentic_rank", "le", 5]],
  ["field", "bench_agentic"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Private / compliant

TEE-only and no logging: capability first.

filter: + is has_tee · is no_log

rank: highest intelligence

fallback: next passing model

View JSON

private-compliant.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["is", "has_tee"], ["is", "no_log"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

All presets, plus practical recipes →

the trust object

Every decision leaves a receipt

A trace is a structured, replayable record of how the model was chosen: which models were considered, why each passed or failed, what ran, and how to reproduce it.

decision_trace

deepseek-v4-flash: rejected · cmp bench_intelligence ge 0.5

minimax-m2.7: rejected · cmp bench_intelligence ge 0.5

deepseek-v4-pro: ranked #1 · chosen

glm-5.1: ranked #2 · cascade

gpt-5.5: ranked #3 · cascade

deepseek-v4-pro: decision_path · attempted · 412 ms

policy_fingerprint 301140696-1054914287 · sigma-pol/v2

rejected[]

Every filtered-out candidate with the exact rule that dropped it.

ranked[]

The survivors in selection order; the first is the pick, the rest are the failover cascade.