Docs & Quickstart — unhardcoded · Ship a Policy in 5 Minutes

quickstart

From your SDK to a traced call.

No app rewrite, no server config. Point the OpenAI SDK at one endpoint, build a policy in your backend, send it with the call, and read the trace it returns.

You need an unhardcoded API key for the apiKey below, and your provider keys (OpenAI, Anthropic, Gemini, …) configured on the workspace — inference always runs over your own accounts. During early access, both are set up with you during onboarding.

Install

Keep the OpenAI SDK you already use. The policy_ir is plain JSON — no extra package required.

terminal · shell

$ npm i openai

Point the client at the endpoint

Same SDK, one new baseURL. Your messages and parameters pass through unchanged.

client.ts · typescript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.unhardcoded.com/v1",
  apiKey: process.env.UNHARDCODED_KEY,
});

Build a policy and send it with the call

Generate the policy_ir term at request time, then attach it to create(). The router picks the cheapest survivor that clears your rules. The model field is a free-form label used only to group traces — routing comes from the attached policy_ir, so any string works.

route.ts · typescript

// built in your backend, at request time — a plain JSON term
const policy_ir = [
  "policy",
  ["ev_zero"],
  ["and", ["meets_req"], ["not", ["is", "disabled"]], ["has_cap", "supports_tools"],
         ["cmp", "bench_intelligence", "ge", 0.5]],  // filter
  ["neg", ["normalize", ["field", "price_out"]]],          // cheapest survivor
  ["argmax"], ["id"],
  ["always", { action: "next_candidate" }],
];

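// example input; replace with your own conversation
const messages = [{ role: "user" as const, content: "Where is my order?" }];

const res = await client.chat.completions.create({
  model: "policy:support",  // free-form trace label, not a route
  // policy_ir is not declared in the SDK's TypeScript types; if tsc rejects the
  // extra property, add a ts-expect-error directive or cast the params object
  policy_ir,
  messages,
});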

Not on the OpenAI SDK? It is a plain HTTP call — policy_ir is a top-level sibling of model and messages in the JSON body:

POST /v1/chat/completions · bash

$ curl https://api.unhardcoded.com/v1/chat/completions \
    -H "Authorization: Bearer $UNHARDCODED_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "policy:support",
      "messages": [{ "role": "user", "content": "…" }],
      "policy_ir": ["policy", ["ev_zero"], …]
    }'
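For runtimes that have fetch but not the OpenAI SDK, the same call can be sketched in TypeScript. This is a minimal sketch: buildChatRequest is a hypothetical helper name, and the URL, headers, and body fields simply mirror the curl call above. It only builds the request; nothing is sent.

```typescript
// A policy term is plain nested JSON arrays, so it is typed loosely here.
type PolicyTerm = unknown[];

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the URL and fetch options without sending anything.
function buildChatRequest(
  apiKey: string,
  label: string,
  messages: ChatMessage[],
  policy_ir: PolicyTerm,
): BuiltRequest {
  return {
    url: "https://api.unhardcoded.com/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      // policy_ir sits at the top level of the body, next to model and messages
      body: JSON.stringify({ model: label, messages, policy_ir }),
    },
  };
}
```

To send it, pass the result to fetch: `const { url, init } = buildChatRequest(key, "policy:support", messages, policy_ir); const res = await fetch(url, init);`.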

The response carries the decision — which model was selected, why, the cost, and a replayable trace id (one illustrative run; the live pick depends on your catalog at request time):

response · trace (illustrative) · json

{
  "selected": "gemini-3.5-flash",
  "reason":   "cheapest passing candidate",
  "policy":   "301140696-1054914287",
  "cost":     "$0.018",
  "trace":    "req-204815"
}

api reference

One OpenAI-compatible surface.

An OpenAI-compatible completions endpoint, dry-run helpers, a field-schema lookup, and one header. Carry a policy_ir or flow_ir in the request body; inference runs over the provider keys you configured on the host.

POST /v1/chat/completions: OpenAI-compatible completions. Carry policy_ir or flow_ir in the body; everything else is the standard request shape. The router resolves the model over the live catalog and writes the decision to the trace.
POST /x/rank (dry-run): Returns the candidate ranking and per-model verdicts without running inference. Use it to preview which models clear the floor and why before a single token is spent.
POST /x/policy/normalize (dry-run): Admits a policy_ir and returns its canonical form, content fingerprint, and grammar version, so you can identify and cache a term without running it.
POST /x/flow/normalize (dry-run): Admits and identifies a flow_ir (the bounded graph and each node's policy) before use, so a malformed workflow fails fast instead of mid-run.
GET /x/fields: Returns the live field vocabulary — the core fields plus this host's registered extensions — that policies gate on with cmp/is and score on with field. The source of truth for valid field names (e.g. price_out, context, bench_intelligence).
Authorization: Bearer <key>: Authenticate every request with your unhardcoded key. The key identifies your workspace and its trace history — never a provider account.

Note: BYO provider keys are configured on the host. unhardcoded routes through your own OpenAI, Anthropic, Gemini, and other accounts — it does not resell tokens or mark up inference, and it never bills the model spend.

the sdk

The raw `policy_ir` term.

The real interface is the term itself: a plain JSON array you inspect, hash, log, and replay. A policy is a seven-element array: ["policy", evidence, filter, rank, select, mutate, fallback]. You author the filter, rank, and select; keep evidence (["ev_zero"]), mutate (["id"]), and fallback (["always", {"action":"next_candidate"}]) as-is. The evidence slot reserves room for future provenance-weighting of candidates; today every policy leaves it ["ev_zero"] (none attached).

policy:support.ir.json sigma-pol/v1

[ "policy",
  // evidence — the fixed evidence slot, kept as-is
  ["ev_zero"],
  // filter — the gate; narrows the host floor, never widens it
  ["and", ["meets_req"], ["not",["is","disabled"]],
        ["has_cap","supports_tools"],
        ["cmp","bench_intelligence","ge",0.5],
        ["cmp","price_out","le",5]],
  // rank — score the survivors, cheapest wins
  ["neg",["normalize",["field","price_out"]]],
  // select — take the single top of the ranked list
  ["argmax"],
  // mutate — pass the request through unchanged
  ["id"],
  // fallback — on any failure, next candidate
  ["always",{"action":"next_candidate"}] ]

The term is just data — built in your backend, sent with the call. A typed builder (@unhardcoded/policy / buildPolicy(...)) is planned convenience sugar over this language — it will lower a compact spec to the same array. It is not shipping yet; today the raw term above is the interface, and every preset below is one of these arrays you can copy as-is.

The full sigma-pol/v1 operator vocabulary, grouped by position. These are the core ops — there is no filter/rank/select/fallback wrapper.

filter — predicates: and / or / not — boolean combinators · has_cap <name> — a candidate capability flag (e.g. supports_tools, supports_json_mode) · is <flag> — a boolean model field (e.g. cap_tools, cap_reasoning, in_image, no_log, has_tee, disabled); image and reasoning gates use is (in_image / cap_reasoning), not has_cap · cmp <field> <op> <value> — a numeric bound (ge, le, …) · meets_req — the request's own implied requirements, auto-derived from the call (tool calls require tool support, image inputs require vision, a JSON response format requires structured output) · family_eq <name> — keep only candidates in one model family (pin a node to a single provider or model line). Returns the surviving candidates.
rank — scorers: field <name> — a raw catalog field · normalize — scale a sub-score to 0–1 · neg — invert (cheapest, lowest-latency) · scale <w> — weight a sub-score · add — sum weighted sub-scores · zero — a constant score (when the filter already leaves a single survivor). Produces one score per survivor.
select — selectors: argmax — the single highest-scoring model · top_k <n> <sel> — keep the top n as an ordered cascade · sample <t> — a reproducible stochastic pick.
mutate — request mutators: id — pass the request through unchanged (the default). Request-mutators such as clamp_param also exist for shaping parameters on the chosen call.
fallback: always {"action":"next_candidate"} — on a provider failure, advance to the next survivor in the cascade.
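Because the term is positional, a few lines of TypeScript are enough to assemble it from the three slots you author. This is a minimal sketch: makePolicy is a hypothetical local helper, not the planned buildPolicy, and it only fills the fixed evidence, mutate, and fallback slots with the values given above.

```typescript
// A term is plain nested JSON arrays.
type Term = unknown[];

// Hypothetical helper: places filter, rank, and select into the seven-element
// array and keeps evidence, mutate, and fallback at their fixed values.
function makePolicy(filter: Term, rank: Term, select: Term): Term {
  return [
    "policy",
    ["ev_zero"],                              // evidence, kept as-is
    filter,
    rank,
    select,
    ["id"],                                   // mutate, kept as-is
    ["always", { action: "next_candidate" }], // fallback, kept as-is
  ];
}

// The policy:support term from above, rebuilt through the helper.
const support = makePolicy(
  ["and", ["meets_req"], ["not", ["is", "disabled"]],
    ["has_cap", "supports_tools"],
    ["cmp", "bench_intelligence", "ge", 0.5],
    ["cmp", "price_out", "le", 5]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"],
);
```

The result is the same kind of array you would write by hand, so it can be logged, hashed, or sent as policy_ir unchanged.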

routing semantics

Filter first. Rank survivors. No silent downgrade.

Routing is deterministic and ordered. Spend ceilings and quality minimums live in the filter, not the score — a cheap model can never win on points against your rules. If it does not clear the floor, it is not a candidate at all.

Filter first

Candidates that lack a required capability, miss the quality floor, or exceed the price ceiling are eliminated — removed, never silently substituted.

Rank the survivors

Among the models that cleared the floor, the scorer orders them — usually cheapest-first.

Select the top

One model, chosen deterministically — the cheapest that passed.

Fall back in order — and the floor is a guarantee, not a suggestion.

If the selected model times out or errors, the router moves to the next passing candidate, cheapest-first, and every hop is recorded. It optimizes cost only among candidates that clear your floor, never around it. If no model meets the requirements, the request fails loudly; you never get a silent downgrade you didn't ask for.
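The ordering described in this section can be sketched locally. This is an illustration of the semantics, not the host's interpreter: the catalog, the model ids, the Floor shape, and the call stand-in are all hypothetical.

```typescript
interface Candidate {
  id: string;
  price_out: number;          // illustrative figures only
  bench_intelligence: number;
  supports_tools: boolean;
}

interface Floor {
  minIntelligence: number;
  maxPriceOut: number;
  needsTools: boolean;
}

// Hypothetical catalog; real candidates come from the live catalog.
const catalog: Candidate[] = [
  { id: "model-a", price_out: 1, bench_intelligence: 0.4, supports_tools: true },
  { id: "model-c", price_out: 4, bench_intelligence: 0.9, supports_tools: true },
  { id: "model-b", price_out: 2, bench_intelligence: 0.7, supports_tools: true },
];

// Filter first, then rank the survivors cheapest-first.
function cascade(models: Candidate[], floor: Floor): Candidate[] {
  return models
    .filter((c) =>
      (!floor.needsTools || c.supports_tools) &&
      c.bench_intelligence >= floor.minIntelligence &&
      c.price_out <= floor.maxPriceOut)
    .sort((a, b) => a.price_out - b.price_out);
}

// Try survivors in order and record each failed hop. Throws when nothing
// clears the floor or when every survivor fails; it never picks a rejected model.
function route(
  models: Candidate[],
  floor: Floor,
  call: (id: string) => boolean, // stand-in for the provider call: true on success
): { selected: string; hops: string[] } {
  const survivors = cascade(models, floor);
  if (survivors.length === 0) throw new Error("no model passes the floor");
  const hops: string[] = [];
  for (const c of survivors) {
    if (call(c.id)) return { selected: c.id, hops };
    hops.push(c.id);
  }
  throw new Error("every passing candidate failed");
}
```

With a floor of 0.5, model-a is never a candidate even though it is the cheapest; price only decides the order among model-b and model-c.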

playground · interactive

Change the rule. Watch the winner move.

The same filter-then-rank semantics, made tangible. Toggle the rules and the winner moves; tighten the floor past what any model can meet and the call fails loudly instead.

Playground controls: tools · quality · price ≤ · vision

Illustrative catalog and figures. Against the live catalog, POST /x/rank returns the same per-model verdicts without running inference.

trace schema

The trace object.

Every completion carries a trace: a structured record of how the model was chosen. The trace is what makes every routing decision replayable and auditable — the same term, the same catalog snapshot, the same verdicts.

trace: The replayable identifier/fingerprint for the run — e.g. "req-204815".
policy.fingerprint: Content fingerprint of the normalized policy_ir. Identical terms share a fingerprint.
policy.version: The grammar version — "sigma-pol/v1".
selected: The chosen model id.
candidates[]: One entry per considered model: { model, status: "winner" | "passed" | "rejected", dropped_by: "<rule that eliminated it>" | null }. A rejected model carries the exact rule that dropped it; a passing model carries null.
reason: Human-readable why-this-model.
fallback[]: Ordered hops taken on failure: { from, to, cause }. An empty array if the first pick succeeded.
latency_ms: Total routing + inference latency.
cost: USD billed for the run.
usage: { prompt_tokens, completion_tokens } — informational only; billing is per run, not per token.
created: ISO timestamp.

Every rejected model names the exact rule that eliminated it in dropped_by, so the trace reads as a complete account of the decision. To see the same per-model verdicts before spending a token, dry-run with POST /x/rank — or try the playground above.
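A small TypeScript sketch of reading those verdicts out of a trace. The types cover only the fields used here and follow the field list above; the function names and the sample values in the usage note are hypothetical, and the exact text of a dropped_by rule is not specified in this document.

```typescript
type Status = "winner" | "passed" | "rejected";

interface TraceCandidate {
  model: string;
  status: Status;
  dropped_by: string | null; // the rule that eliminated it, or null if it passed
}

interface FallbackHop {
  from: string;
  to: string;
  cause: string;
}

// Only the slice of the trace this sketch reads; other fields are omitted.
interface TraceSlice {
  selected: string;
  candidates: TraceCandidate[];
  fallback: FallbackHop[];
}

// Map each rejected model to the rule that dropped it.
function rejections(t: TraceSlice): Record<string, string> {
  const out: Record<string, string> = {};
  for (const c of t.candidates) {
    if (c.status === "rejected" && c.dropped_by !== null) {
      out[c.model] = c.dropped_by;
    }
  }
  return out;
}

// True when the first pick failed and the router moved on at least once.
function usedFallback(t: TraceSlice): boolean {
  return t.fallback.length > 0;
}
```

For example, a trace whose candidates are `{ model: "model-a", status: "rejected", dropped_by: "price ceiling" }` and `{ model: "model-b", status: "winner", dropped_by: null }` yields `{ "model-a": "price ceiling" }` from rejections.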

workflows · flow_ir

Compose policies into a workflow.

Not every task is one call. A workflow is a bounded (up to 256 nodes), acyclic graph of LLM steps — ["flow", { id: node, … }] — with exactly one input node and one output node. Each llm node carries its own system prompt and a full policy_ir term, so every step routes independently at request time over the live catalog. You send the whole graph as flow_ir on a single call, and it writes one stitched, replayable trace.

Edges are pull-model: a node's inputs list names the nodes it consumes, so "b": { inputs: ["a"] } means a → b. A node with two or more inputs is a fusion step, and there are no loops or conditionals. In the example below, each named step is an llm node with its own policy and model, and the first and last entries are the input and output nodes.

linear: triage → draft → guard · cheap where it can be, a strict final gate before send

Ticket (input)

triage: classify · extract account id · cheapest JSON · gemini-3.5-flash

draft: reply from ticket + triage · quality floor · claude-sonnet-4-6

guard: brand voice · PII · refund limits · no-log · claude-opus-4-8 · can abort

Reply (output)

That graph is just data. Below is a workflow written out as flow_ir — the same input / llm / output nodes, the same inputs edges, each llm node carrying its own policy:

flow_ir · draft → critique → revise · json

[
  "flow",
  {
    "u":        { "kind": "input" },

    "draft":    { "kind": "llm",
      "system": "Draft an answer.",
      "policy": [ "policy", … ],                // a full policy_ir term — cheapest survivor
      "inputs": [ "u" ] },

    "critique": { "kind": "llm",
      "system": "List the concrete flaws in the draft.",
      "policy": [ "policy", … ],                // strongest model, no cost ceiling
      "inputs": [ "draft" ] },

    "revise":   { "kind": "llm",
      "system": "Rewrite the answer, fixing every point.",
      "policy": [ "policy", … ],                // strongest model
      "inputs": [ "u", "draft", "critique" ],    // fan-in: three predecessors
      "template": "Q:\n$1\n\nDraft:\n$2\n\nCritique:\n$3" },

    "out":      { "kind": "output", "inputs": [ "revise" ] }
  }
]

Each policy is a full policy_ir term — the same seven-element array from The SDK above, elided here as …. Copyable end-to-end workflows are in Presets below.

A workflow has three node kinds. Author the llm nodes; input and output are the single entry and exit.

input: The single entry node — { "kind": "input" }. The call's messages enter the graph here. Exactly one per workflow.
llm: A routed step. system — its prompt · policy — a full policy_ir term, resolved over the live catalog at runtime · inputs — the node ids it consumes · template (optional) — joins multiple inputs with $1, $2, … placeholders in input order. Each node routes — and fails over — on its own.
output: The single exit node — { "kind": "output", "inputs": ["…"] }. Its inputs name the node whose result is returned to the caller. Exactly one per workflow.
edges & shape: Pull-model: inputs defines the DAG ("b": { inputs: ["a"] } is a → b). Fan-out is one node feeding several; fan-in is several feeding one fusion node. Acyclic, bounded, no loops or conditionals — every node runs once, so cost and latency are knowable up front.
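How a template joins a node's inputs can be sketched as a single substitution pass. This is an assumption-laden sketch: renderTemplate is a hypothetical name, and leaving a placeholder with no matching input untouched is a choice made here, not documented host behavior.

```typescript
// Replace $1, $2, … with the inputs in order ($1 is the first entry of inputs).
// A placeholder with no matching input is left as-is (an assumption of this sketch).
function renderTemplate(template: string, inputs: string[]): string {
  return template.replace(/\$(\d+)/g, (match: string, n: string) => {
    const value = inputs[Number(n) - 1];
    return value === undefined ? match : value;
  });
}

// The revise node's template from the flow_ir example above.
const reviseTemplate = "Q:\n$1\n\nDraft:\n$2\n\nCritique:\n$3";
const prompt = renderTemplate(reviseTemplate, ["question", "first draft", "list of flaws"]);
```

Substitution happens in one pass, so a "$2" that appears inside an input's text is not expanded again.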

Send a workflow as flow_ir on any completion; dry-run with POST /x/flow/normalize to admit the DAG and each node's policy before a single token is spent. Limits: ≤ 256 nodes, in-degree ≤ 32. Billing counts a workflow as one run up to 5 nodes, and the whole graph writes one stitched trace — see the trace object. For the picture, the workflow diagrams on the product page show three workflow shapes end to end.
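Before calling POST /x/flow/normalize, the structural rules above can be checked locally. The sketch below is a hypothetical pre-check, not the host's admission logic: checkFlow and the FlowNode types are names chosen here, and it covers only the limits stated in this section (one input, one output, at most 256 nodes, in-degree at most 32, known inputs, no cycles).

```typescript
type FlowNode =
  | { kind: "input" }
  | { kind: "llm"; system: string; policy: unknown[]; inputs: string[]; template?: string }
  | { kind: "output"; inputs: string[] };

type Flow = ["flow", Record<string, FlowNode>];

// Returns a list of problems; an empty list means the graph passed these checks.
function checkFlow(flow: Flow): string[] {
  const nodes = flow[1];
  const ids = Object.keys(nodes);
  const known = new Set(ids);
  const errors: string[] = [];

  const count = (kind: FlowNode["kind"]): number =>
    ids.filter((id) => nodes[id].kind === kind).length;
  if (ids.length > 256) errors.push("more than 256 nodes");
  if (count("input") !== 1) errors.push("need exactly one input node");
  if (count("output") !== 1) errors.push("need exactly one output node");

  const inputsOf = (id: string): string[] => {
    const node = nodes[id];
    return node.kind === "input" ? [] : node.inputs;
  };

  // Remaining unresolved inputs per node, counting only inputs that exist.
  const pending = new Map<string, number>();
  for (const id of ids) {
    const ins = inputsOf(id);
    if (ins.length > 32) errors.push(`${id}: in-degree over 32`);
    for (const src of ins) {
      if (!known.has(src)) errors.push(`${id}: unknown input ${src}`);
    }
    pending.set(id, ins.filter((src) => known.has(src)).length);
  }

  // Kahn's algorithm: repeatedly resolve nodes whose inputs are all resolved.
  const ready = ids.filter((id) => pending.get(id) === 0);
  let resolved = 0;
  while (ready.length > 0) {
    const done = ready.pop() as string;
    resolved++;
    for (const id of ids) {
      const uses = inputsOf(id).filter((src) => src === done).length;
      if (uses === 0) continue;
      const left = (pending.get(id) ?? 0) - uses;
      pending.set(id, left);
      if (left === 0) ready.push(id);
    }
  }
  if (resolved < ids.length) errors.push("graph has a cycle");
  return errors;
}
```

A local check like this only catches shape errors early; the dry-run endpoint remains the authority, since it also admits each node's policy.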

presets

Copy a term, send it.

Ready-to-send policies and workflows. Each is a piece of data — drop a policy into policy_ir or a workflow into flow_ir on any OpenAI-compatible call, and the host admits it, fingerprints it, and interprets it deterministically over the live catalog. Dry-run first with POST /x/rank for policies or POST /x/flow/normalize for workflows.


Each is a complete policy_ir term — the seven-element array from above, with the filter, rank, and select filled in for one job. Copy one as your starting point and edit the rules.

More starting points

Workflows · flow_ir

Six ready-to-send workflows — each a bounded graph where every node carries its own policy. The full flow_ir grammar is in Workflows above; the complete term on each card is folded, so open it to copy. Dry-run with POST /x/flow/normalize first.

troubleshooting

Common failures.

What each one means and what to change. Dry-run with POST /x/rank to see the decision before it costs anything.

No model passes the floor: Your filter excluded every candidate, so the request fails loudly instead of silently downgrading. Loosen a cmp bound or a hard ["is", …], and run /x/rank to see which rule dropped each model.
Unknown field or operator: Admission rejects a term that names a field the host doesn't serve or an op the interpreter doesn't know (invalid_policy). Use a real field from GET /x/fields — e.g. price_out, not price.
Provider key missing: The chosen model's provider isn't configured on your workspace, so the upstream call can't authenticate. Add the provider key during onboarding — inference always runs over your own accounts.
Provider timeout or error: The selected model errored or timed out; the router fails over to the next passing candidate and writes the hop to the trace. If every candidate fails, the request errors — widen the cascade with ["top_k", N, ["argmax"]].
Workflow exceeds the limits: A workflow is admitted only within bounds (≤ 256 nodes, in-degree ≤ 32); past that it's rejected before running — split the workflow. (Billing counts a workflow as one billable run up to 5 nodes; the structural cap is separate.)
Dry run passes, live call fails: /x/rank and /x/*/normalize admit and evaluate the term but run no inference — a live failure is a provider/runtime issue (rate limit, timeout, auth), not a policy error. Read the trace's decision path for the failing hop.
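The top_k fix for provider failures amounts to replacing one slot of the term. A minimal sketch, assuming the seven-element layout from the SDK section; widenCascade is a hypothetical helper name, and the replacement term is the one quoted above.

```typescript
type Term = unknown[];

// Position of select in ["policy", evidence, filter, rank, select, mutate, fallback].
const SELECT_SLOT = 4;

// Hypothetical helper: returns a copy of the policy whose select keeps the top n
// survivors as an ordered cascade. The original term is not modified.
function widenCascade(policy: Term, n: number): Term {
  if (policy[0] !== "policy" || policy.length !== 7) {
    throw new Error("expected a seven-element policy term");
  }
  if (!Number.isInteger(n) || n < 1) {
    throw new Error("n must be a positive integer");
  }
  const copy = [...policy];
  copy[SELECT_SLOT] = ["top_k", n, ["argmax"]];
  return copy;
}
```

Because the widened term is a different array, it presumably gets a different fingerprint than the original, so dry-run it with POST /x/rank before relying on it.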
