use case · per-customer policies

Different customers, different model rules.

In B2B SaaS, no two tenants want the same thing from your AI. Compile each customer's state into a policy your backend sends with the call — no dashboard, no redeploy.

the problem

One model rule can't serve every tenant.

Your enterprise account wants approved providers only. The free tier needs to stay cheap. An EU customer needs region-allowed models. A regulated tenant can't send data to certain providers and wants nothing logged. A premium account is paying for the best answer. Same product, five different model decisions, and they change as customers upgrade, expand to new regions, or sign a DPA.

the old way

Branching code and gateway-config sprawl.

So you branch: a check like if tier == "enterprise" picks one model, an EU flag swaps in another, and a sensitivity check forks a third path. The logic spreads across product code and a stack of per-tenant gateway configs that drift out of sync. Adding a region or a provider restriction means a code change, a review, and a deploy, and you still can't easily prove which rule applied to a given request.

the unhardcoded way

Compile tenant state into a runtime policy.

You already know who the customer is at request time, because their plan, region, and data class live in your database. Read that state and build a policy_ir: a small JSON rule the router validates, fingerprints, and evaluates against the live catalog. The router then selects the cheapest model that passes that tenant's rules, using your own provider keys. The decision lives in data your backend generates per request, not in a dashboard you click through or a deploy you wait on.
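As a rough illustration of "compile tenant state into a policy", here is a minimal Python sketch. It reuses the clause shapes shown in the examples on this page; the Tenant record, its field names, and the outer "filter" / "no_log" envelope are assumptions made for the sketch, not a documented policy_ir schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative quality floors per plan, taken from the examples on this page.
# Plans without an entry get no floor clause in this sketch.
FLOORS = {"free": 0.5, "vip": 0.55}


@dataclass
class Tenant:
    """Hypothetical tenant record; field names are assumptions."""
    plan: str
    region: Optional[str] = None
    approved_providers: List[str] = field(default_factory=list)
    blocked_providers: List[str] = field(default_factory=list)
    sensitive: bool = False


def build_policy(tenant: Tenant) -> dict:
    """Compile one tenant's state into a policy: one clause per attribute."""
    clauses = []
    if tenant.plan in FLOORS:
        clauses.append(["cmp", "bench_intelligence", "ge", FLOORS[tenant.plan]])
    if tenant.approved_providers:
        clauses.append(["in", "provider", tenant.approved_providers])
    if tenant.blocked_providers:
        clauses.append(["nin", "provider", tenant.blocked_providers])
    if tenant.region:
        clauses.append(["eq", "region", tenant.region])
    # Assumed envelope: a list of filter clauses plus the no_log flag.
    return {"filter": clauses, "no_log": tenant.sensitive}
```

Because the policy is plain data built per request, a tenant that upgrades or signs a DPA changes a database row, and the next request picks up the new rule.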

1. Read tenant state

Plan, region, data class, and any DPA flags, straight from your own records, at request time.

2. Build the policy

Compile that state into a policy_ir and send it with one OpenAI-compatible call. No per-tenant config to maintain.

3. Route & trace

The router runs filter → rank → select → fallback and writes a trace showing which tenant rule applied.
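To make filter → rank → select → fallback concrete, here is a small local sketch in Python. It is not the router's implementation: the clause semantics, the cost_per_run ranking key, the trace shape, and the catalog entries are all assumptions. The catalog reuses the model names, the two scores, and the one price quoted on this page; every other number, provider, and region is a placeholder.

```python
# Hypothetical catalog; values not quoted on this page are placeholders.
CATALOG = [
    {"id": "gemini-3.5-flash", "provider": "google", "region": "eu",
     "bench_intelligence": 0.52, "cost_per_run": 0.018},
    {"id": "claude-sonnet-4-6", "provider": "anthropic", "region": "eu",
     "bench_intelligence": 0.57, "cost_per_run": 0.09},
    {"id": "gpt-5.5", "provider": "openai", "region": "us",
     "bench_intelligence": 0.60, "cost_per_run": 0.12},
]


def matches(model: dict, clause: list) -> bool:
    """Evaluate one clause against one catalog entry (assumed semantics)."""
    op = clause[0]
    if op == "cmp":                        # ["cmp", field, "ge", value]
        _, name, rel, value = clause
        if rel != "ge":
            raise ValueError(f"unsupported comparison: {rel}")
        return model[name] >= value
    if op in ("eq", "in", "nin"):          # [op, field, value]
        _, name, value = clause
        if op == "eq":
            return model[name] == value
        if op == "in":
            return model[name] in value
        return model[name] not in value
    raise ValueError(f"unknown operator: {op}")


def route(catalog: list, clauses: list) -> dict:
    """Filter, rank by cost, select the cheapest, keep the rest as fallbacks."""
    survivors = [m for m in catalog if all(matches(m, c) for c in clauses)]
    ranked = sorted(survivors, key=lambda m: m["cost_per_run"])
    if not ranked:
        raise LookupError("no model passes this tenant's rules")
    return {
        "selected": ranked[0]["id"],
        "fallbacks": [m["id"] for m in ranked[1:]],
        "clauses": clauses,                # the rules that applied
    }
```

With a floor of 0.5 the cheapest survivor wins; raising the floor to 0.55 drops it from the candidate set and the next-cheapest model is selected instead.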

A tenant policy is the same primitive as any other — a rule sent with the call. See how runtime policies work end to end.

examples

One field of tenant state, one rule.

Each customer attribute maps to a clause your backend adds to that request's policy. Model names and figures below are illustrative.

Free tier → cheaper floor

Relax the quality floor so the cheapest passing model can win — usually gemini-3.5-flash at $0.018 a run instead of a frontier model. Keeps unit economics sane on accounts that don't pay.

["cmp","bench_intelligence","ge",0.5]

Enterprise → approved providers only

The account's contract names which providers are allowed. The filter drops everything off-list before ranking, so routing can only ever land on a model their procurement team signed off on.

["in","provider",["…approved"]]

EU tenant → region-allowed models

Restrict candidates to models served from permitted regions. A model hosted outside the allowed set is filtered out — no manual gateway swap, no separate EU deployment.

["eq","region","eu"]

Sensitive data → block providers & no_log

For regulated workloads, filter out providers the tenant won't allow and set no_log so request bodies aren't retained. The trace still records the decision; the payload doesn't persist.

["nin","provider",["…blocked"]] · no_log

VIP account → higher quality floor

Raise the floor so only stronger models survive the filter, e.g. claude-sonnet-4-6 (score 0.57) or gpt-5.5 (score 0.60). The account pays for the best answer; the policy enforces a minimum score without pinning a single model name.

["cmp","bench_intelligence","ge",0.55]

Mixed estate → one endpoint

All of the above run through the same OpenAI-compatible endpoint. The only thing that varies per customer is the policy your backend builds — preview it with GET /x/fields and POST /x/rank before it ships.

dry-run · /x/fields · /x/rank
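A dry run could be wired up roughly as below. Only the two paths, GET /x/fields and POST /x/rank, come from this page. The base URL, the JSON body shape, and the "policy_ir" key are assumptions for illustration, and authentication is left out. The sketch builds the requests without sending them.

```python
import json
import urllib.request

BASE_URL = "https://router.example.com"  # hypothetical host


def fields_request() -> urllib.request.Request:
    """Build the GET /x/fields request (lists fields a policy can reference)."""
    return urllib.request.Request(BASE_URL + "/x/fields", method="GET")


def rank_request(policy: dict) -> urllib.request.Request:
    """Build the POST /x/rank request for a candidate policy.

    The body shape {"policy_ir": ...} is an assumption, not a documented schema.
    """
    body = json.dumps({"policy_ir": policy}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/x/rank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either request to urllib.request.urlopen would send it; keeping construction separate makes the dry run easy to check in tests before a tenant policy goes live.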

Every tenant's rules (the floor, the allowed providers, the region, the no_log flag) are visible in that request's trace, keyed by the policy fingerprint, here pol_8f41c2 (sigma-pol/v1). See why every decision is auditable.
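One common way to derive a stable fingerprint from a policy is to hash a canonical JSON encoding of it. The sketch below does that with SHA-256 and a pol_ prefix followed by six hex characters, to mirror the look of pol_8f41c2. The router's actual fingerprinting scheme is not described on this page, so treat the algorithm and the truncation length as assumptions.

```python
import hashlib
import json


def fingerprint(policy: dict) -> str:
    """Hash a canonical JSON form so equal policies get equal fingerprints."""
    # sort_keys and fixed separators make the encoding independent of key order
    # and whitespace; list order (the clause order) still matters.
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "pol_" + digest[:6]
```

Two tenants whose state compiles to the same rules share a fingerprint, so a trace can be matched to a rule set without storing the tenant's payload.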

Read how the trace proves the rule that applied

Give every customer their own model rules.

Compile tenant state into a policy your backend sends with the call: cheaper floors for the free tier, approved providers for enterprise, region limits for the EU, and a trace that proves which rule applied. Join the waitlist to build it on your own keys.

No SDK rewrite · Your provider keys · Every request traced