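Open source · A policy routing layer for AI models

Build AI systems without hard-coding model choices.

Every request carries a policy. unhardcoded filters the live model catalog, picks the cheapest candidate that satisfies the rules, runs it with your own provider keys, and returns a trace of the decision.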

request

Prompt + policy

tools · intel ge 0.5 · cheapest

→

candidate            price_out   intel   verdict
deepseek-v4-flash    $0.40       0.465   below floor
minimax-m2.7         $0.50       0.496   below floor
deepseek-v4-pro      $1.50       0.515   winner
glm-5.1              $2.00       0.514   pass
gpt-5.5              $10.00      0.602   pass

→

trace

deepseek-v4-pro

cheapest passing · 412 ms
fp 301140696-1054914287

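Route a single call

Point the SDK at the hosted address, attach a policy, read the trace.

Quickstart →

Run a workflow

Compose routed calls into a DAG: classify, fan out, judge, merge.

See the patterns →

Read the trace

Every decision leaves a receipt: which candidates were considered, which one won, and why.

Inspect a decision →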

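Core concepts

The model routing loop

One policy decides one call. Same input, same catalog, same decision, every time, with a written record.

Request

Your call arrives at an OpenAI-compatible endpoint, carrying a policy.

Policy sigma-pol/v2

A small term with five parts: filter, rank, select, mutate, fallback.

Catalog

The live set of (provider, model) candidates, each with prices, benchmark scores, and capabilities.

Filter

Drops candidates that break the rules. There is no silent downgrade.

Rank

Scores the candidates that passed the filter: cheapest, most capable, or fastest. You decide.

Select

Takes the top-ranked candidate, or a top_k fallback chain.

Run

Runs inference with your provider keys and falls back automatically on errors.

Trace

A receipt recording every candidate, the winner, and the fingerprint.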
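
The filter, rank, and select steps of the loop can be sketched in a few lines. This is a minimal illustration, not unhardcoded's implementation: the `Candidate` and `Decision` shapes and the `routeCheapest` helper are hypothetical, the rows are the example candidates from the top of this page, and a single rule (an intelligence floor, cheapest first) is hard-wired instead of being read from a policy term.

```typescript
// Hypothetical catalog row; field names mirror the ones used in the policies on this page.
interface Candidate {
  model_family: string;
  price_out: number;
  bench_intelligence: number;
  disabled: boolean;
}

// Hypothetical decision record: winner, ordered fallback chain, and rejections with reasons.
interface Decision {
  chosen: Candidate | undefined;
  passed: Candidate[];
  rejected: { model_family: string; reason: string }[];
}

// Filter: drop disabled rows and rows below the floor.
// Rank + select: order the survivors by output price and take the first.
function routeCheapest(catalog: Candidate[], floor: number): Decision {
  const passed: Candidate[] = [];
  const rejected: Decision["rejected"] = [];
  for (const c of catalog) {
    if (c.disabled) {
      rejected.push({ model_family: c.model_family, reason: "is disabled" });
    } else if (c.bench_intelligence < floor) {
      rejected.push({ model_family: c.model_family, reason: `cmp bench_intelligence ge ${floor}` });
    } else {
      passed.push(c);
    }
  }
  // Ties are broken by name so the same catalog always yields the same order.
  passed.sort((a, b) => a.price_out - b.price_out || a.model_family.localeCompare(b.model_family));
  return { chosen: passed[0], passed, rejected };
}

const catalog: Candidate[] = [
  { model_family: "deepseek-v4-flash", price_out: 0.4, bench_intelligence: 0.465, disabled: false },
  { model_family: "minimax-m2.7", price_out: 0.5, bench_intelligence: 0.496, disabled: false },
  { model_family: "deepseek-v4-pro", price_out: 1.5, bench_intelligence: 0.515, disabled: false },
  { model_family: "glm-5.1", price_out: 2.0, bench_intelligence: 0.514, disabled: false },
  { model_family: "gpt-5.5", price_out: 10.0, bench_intelligence: 0.602, disabled: false },
];

const decision = routeCheapest(catalog, 0.5);
// decision.chosen is deepseek-v4-pro; the rest of decision.passed is the fallback order.
```

Keeping the rejected rows and their reasons alongside the winner is what makes the result usable as a trace rather than just an answer.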

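Learn more about how policies work →

Compose routed calls

Workflow patterns

A workflow is a bounded acyclic graph of routed steps. Each node carries its own policy and routes independently; the whole graph produces a single unified trace. This is where unhardcoded goes from "routing with rules" to a complete system.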
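
Because each node names its `inputs`, an execution order can be derived from the graph alone. The sketch below is one way to do that, assuming a node may run as soon as all of its inputs have finished. `FlowNode`, `Flow`, and `executionWaves` are hypothetical names, the node shape is reduced to what ordering needs, and the example graph mirrors the model panel pattern on this page.

```typescript
// Reduced, assumed node shape: only the fields needed to order execution.
interface FlowNode {
  kind: "input" | "llm" | "output";
  inputs?: string[];
}
type Flow = Record<string, FlowNode>;

// Kahn-style layering: each wave holds nodes whose inputs all sit in earlier waves,
// so the nodes inside one wave could run in parallel. Throws on cycles and unknown inputs.
function executionWaves(flow: Flow): string[][] {
  const pending = new Map<string, Set<string>>();
  for (const id of Object.keys(flow)) {
    const deps = flow[id].inputs ?? [];
    for (const dep of deps) {
      if (!Object.prototype.hasOwnProperty.call(flow, dep)) {
        throw new Error(`node "${id}" reads unknown input "${dep}"`);
      }
    }
    pending.set(id, new Set(deps));
  }
  const waves: string[][] = [];
  while (pending.size > 0) {
    const ready = Array.from(pending.keys()).filter((id) => pending.get(id)!.size === 0);
    if (ready.length === 0) throw new Error("flow contains a cycle");
    for (const id of ready) pending.delete(id);
    pending.forEach((deps) => ready.forEach((id) => deps.delete(id)));
    waves.push(ready);
  }
  return waves;
}

const panel: Flow = {
  u: { kind: "input" },
  a: { kind: "llm", inputs: ["u"] },
  b: { kind: "llm", inputs: ["u"] },
  c: { kind: "llm", inputs: ["u"] },
  f: { kind: "llm", inputs: ["a", "b", "c"] },
  out: { kind: "output", inputs: ["f"] },
};

const waves = executionWaves(panel);
// waves is [["u"], ["a", "b", "c"], ["f"], ["out"]]
```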

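Support ticket (linear · guard)

Triage cheaply, draft the reply under a quality floor, then let a strong no-log guard decide whether to refuse before anything is sent.

Key: the last step is a policy-enforced gate that can abort the send, rather than a check you merely hope will work.

Ticket
→
triage: classify · extract id (deepseek-v4-flash)
→
draft: reply · quality floor (gemini-3.1-pro-preview)
→
guard: brand · PII · no-log (gpt-5.5 · can abort)
→
Reply

View flow_ir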

flow.support-ticket.json

["flow", {
  "u": {"kind": "input"},
  "t": {"kind": "llm", "system": "Classify the ticket and extract the account id as JSON.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["has_cap", "supports_json_mode"]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "d": {"kind": "llm", "system": "Write a reply using the ticket and the triage.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_intelligence", "ge", 0.55]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "t"], "template": "Ticket:\n$1\n\nTriage:\n$2"},
  "g": {"kind": "llm", "system": "Check brand voice, PII, refund limits. Refuse if any fail.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "no_log"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "out": {"kind": "output", "inputs": ["g"]}
}]

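Draft → critique → revise (linear · refine)

A cheap model writes the first draft, a strong critic lists its flaws, and a final rewrite fixes each one.

Key: a quality jump within budget, because most tokens are spent on the cheap model.

Question
→
draft: cheapest (deepseek-v4-flash)
→
critique: list the flaws (gpt-5.5)
→
revise: fix every point · fan-in (gpt-5.5)
→
Answer

View flow_ir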

flow.draft-critique-revise.json

["flow", {
  "u": {"kind": "input"},
  "d": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "c": {"kind": "llm", "system": "Critique the draft: list concrete flaws and gaps.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "r": {"kind": "llm", "system": "Rewrite the answer, fixing every point in the critique.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "d", "c"], "template": "Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3"},
  "out": {"kind": "output", "inputs": ["r"]}
}]
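
The `template` fields in these flows appear to use positional placeholders. The sketch below shows one plausible reading, assuming `$n` stands for the output of the n-th entry in `inputs` (1-based). That reading and the `renderTemplate` helper are assumptions for illustration, not documented behavior, and the sample strings are made up.

```typescript
// Assumption: "$n" is replaced by the n-th input (1-based). Placeholders without a
// matching input are left untouched. A replacer function is used so that "$" sequences
// inside the inputs themselves are never interpreted as placeholders.
function renderTemplate(template: string, inputs: string[]): string {
  return template.replace(/\$(\d+)/g, (match: string, index: string) => {
    const value = inputs[Number(index) - 1];
    return value === undefined ? match : value;
  });
}

// The revise step reads three inputs: the question, the draft, and the critique.
const revisePrompt = renderTemplate("Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3", [
  "Why is the sky blue?",
  "Because it reflects the ocean.",
  "The draft is wrong: the cause is Rayleigh scattering.",
]);
```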

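Best-of-N + judge (ensemble)

Take N seeded draws from the same strong policy, then have a judge pick the winner.

Key: sample spreads its draws deterministically across the top-ranked candidates: reproducible diversity, not randomness.

Prompt
→
draft A: sample · T=0.5 (gpt-5.5)
draft B: sample · T=0.5 (gpt-5.4)
draft C: sample · T=0.5 (gemini-3.1-pro-preview)
→
judge: rank · pick winner (gpt-5.5)
→
Answer

View flow_ir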

flow.best-of-n.json

["flow", {
  "u": {"kind": "input"},
  "n1": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  // n2, n3: the same policy, two more seeded draws
  "j": {"kind": "llm", "system": "Pick the single best candidate; return it verbatim.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["n1", "n2", "n3"], "template": "A:\n$1\n\nB:\n$2\n\nC:\n$3"},
  "out": {"kind": "output", "inputs": ["j"]}
}]
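
How `sample` spreads its draws is not spelled out here, so the following is only a sketch of what seeded, reproducible sampling can look like: a softmax over rank scores at temperature T, with the draw taken from a seeded generator. The `Scored` shape, `sampleSeeded`, the use of the mulberry32 generator, and the choice of seeds 1 to 3 are all assumptions. The scores are the `intel` values from the example table at the top of this page.

```typescript
// mulberry32: a small seeded PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Scored {
  model_family: string;
  score: number;
}

// Softmax with temperature over the scores, then one draw from the seeded PRNG.
// The same (candidates, temperature, seed) triple always returns the same candidate.
function sampleSeeded(candidates: Scored[], temperature: number, seed: number): Scored {
  if (candidates.length === 0) throw new Error("no candidates to sample from");
  const best = Math.max(...candidates.map((c) => c.score));
  const weights = candidates.map((c) => Math.exp((c.score - best) / temperature));
  const total = weights.reduce((sum, w) => sum + w, 0);
  let threshold = mulberry32(seed)() * total;
  for (let i = 0; i < candidates.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) return candidates[i];
  }
  return candidates[candidates.length - 1]; // guards against rounding at the boundary
}

const ranked: Scored[] = [
  { model_family: "gpt-5.5", score: 0.602 },
  { model_family: "deepseek-v4-pro", score: 0.515 },
  { model_family: "glm-5.1", score: 0.514 },
];

// One draw per node, each with its own seed.
const draws = [1, 2, 3].map((seed) => sampleSeeded(ranked, 0.5, seed).model_family);
```

Lower temperatures concentrate the draws on the top-scoring candidate, while higher ones spread them more evenly. Either way, a fixed seed keeps the outcome repeatable.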

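Model panel (fan-out · fuse)

Three family-pinned models draft in parallel, and a fourth fuses the drafts into a single better answer.

Key: family_eq pins the exact model family, so the panel is reproducible rather than "whatever is cheapest today".

Question
→
draft a: family_eq (gemini-3.1-pro-preview)
draft b: family_eq (claude-opus-4-8)
draft c: family_eq (deepseek-v4-flash)
→
fuse: synthesize · fan-in (gpt-5.5)
→
Answer

View flow_ir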

flow.panel.json

["flow", {
  "u": {"kind": "input"},
  "a": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "gemini-3.1-pro-preview"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  // b: family_eq "claude-opus-4-8"  ·  c: family_eq "deepseek-v4-flash"
  "f": {"kind": "llm", "system": "Synthesize the single best answer from the drafts.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["a", "b", "c"]},
  "out": {"kind": "output", "inputs": ["f"]}
}]

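Specialist split (fan-out · merge)

A reasoner and a code generator run in parallel under different filters, then a merge step combines their outputs.

Key: each branch picks the best model for its task (reasoning vs. coding) instead of making one model do everything.

Question
→
reason: is cap_reasoning (gpt-5.5)
code: coding top-5 (gpt-5.4)
→
merge: one answer · fan-in (gemini-3.1-pro-preview)
→
Answer

View flow_ir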

flow.specialist-split.json

["flow", {
  "u": {"kind": "input"},
  "rz": {"kind": "llm", "system": "Reason through the problem step by step.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "cd": {"kind": "llm", "system": "Produce any code the problem needs.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_coding_rank", "le", 5]],
      ["field", "bench_coding"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "m": {"kind": "llm", "system": "Merge the reasoning and the code into one answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["rz", "cd"], "template": "Reasoning:\n$1\n\nCode:\n$2"},
  "out": {"kind": "output", "inputs": ["m"]}
}]

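Full workflow guide: node types, graph limits, templates →

Send your first request

Quickstart

unhardcoded is OpenAI-compatible. Compared with an ordinary call, only three things change.

Point the SDK at the hosted address.

Change baseURL; the rest of the SDK stays the same.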

client.ts

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://<your-host>/v1",
  apiKey: process.env.UNHARDCODED_KEY,
});

Attach a policy to the call.

Build policy_ir on your backend and send it alongside messages. Routing comes from the policy, so model is only a label in the trace.

route.ts

const res = await client.chat.completions.create({
  model: "policy:support",
  policy_ir: ["policy",
    ["and", ["meets_req"], ["not", ["is", "disabled"]],
           ["cmp", "bench_intelligence", "ge", 0.5]],   // filter
    ["neg", ["normalize", ["field", "price_out"]]],          // rank: cheapest
    ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
  messages,
});
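
Since a policy is plain nested arrays, it can be assembled by an ordinary function on the backend. The sketch below builds the same policy as the call above from a single parameter. The `Term` type, `baseFilter`, and `cheapestAbove` are hypothetical helper names, and the loose typing is an assumption because the term schema is not described here.

```typescript
// Loose type for a policy term: an operator name followed by its arguments.
type Term = [string, ...unknown[]];

// The two filter clauses that every policy on this page starts with.
const baseFilter: Term[] = [["meets_req"], ["not", ["is", "disabled"]]];

// Cheapest candidate at or above an intelligence floor:
// filter, rank (cheapest), select (argmax), mutate (id), fallback (next candidate).
function cheapestAbove(floor: number): Term {
  return ["policy",
    ["and", ...baseFilter, ["cmp", "bench_intelligence", "ge", floor]],
    ["neg", ["normalize", ["field", "price_out"]]],
    ["argmax"], ["id"], ["always", { action: "next_candidate" }]];
}

const supportPolicy = cheapestAbove(0.5);
```

Passing `supportPolicy` as `policy_ir` would then be equivalent to writing the array inline, with the floor kept in one place.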

Read the trace.

The response includes the decision: the chosen model, the candidates that were ranked and rejected, and the policy fingerprint.

response · trace (illustrative)

{
  "chosen": { "model_family": "deepseek-v4-pro", "price_out": 1.5 },
  "trace": {
    "policy_fingerprint": "301140696-1054914287",
    "rejected": [{ "model_family": "deepseek-v4-flash", "reason": "cmp bench_intelligence ge 0.5" }],
    "total_latency_ms": 425
  }
}
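
A typed view of that payload makes the trace easier to log or assert on. This is a sketch based only on the illustrative response above: the `RoutedResponse` interface and `summarize` helper are hypothetical, and a real response may carry more fields or a different layout.

```typescript
// Shape inferred from the illustrative response; not an official type.
interface RoutedResponse {
  chosen: { model_family: string; price_out: number };
  trace: {
    policy_fingerprint: string;
    rejected: { model_family: string; reason: string }[];
    total_latency_ms: number;
  };
}

// One line for the winner, then one line per rejected candidate.
function summarize(res: RoutedResponse): string[] {
  const { chosen, trace } = res;
  const lines = [
    `chosen ${chosen.model_family} at $${chosen.price_out.toFixed(2)} ` +
      `(policy ${trace.policy_fingerprint}, ${trace.total_latency_ms} ms)`,
  ];
  for (const r of trace.rejected) {
    lines.push(`rejected ${r.model_family}: ${r.reason}`);
  }
  return lines;
}

const sample: RoutedResponse = {
  chosen: { model_family: "deepseek-v4-pro", price_out: 1.5 },
  trace: {
    policy_fingerprint: "301140696-1054914287",
    rejected: [{ model_family: "deepseek-v4-flash", reason: "cmp bench_intelligence ge 0.5" }],
    total_latency_ms: 425,
  },
};

const summary = summarize(sample);
// summary[0]: "chosen deepseek-v4-pro at $1.50 (policy 301140696-1054914287, 425 ms)"
// summary[1]: "rejected deepseek-v4-flash: cmp bench_intelligence ge 0.5"
```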

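Full quickstart: authentication, dry runs, runnable examples →

Copy a starting point

Policy presets

Common routing patterns presented as cards. Read the rule, copy the policy, and tune the floors and caps.

Cheapest decent

Cut cost without dropping below a quality floor.

filter: tools-met · bench_intelligence ge 0.5

rank: cheapest price_out

fallback: next passing candidate

View JSON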

cheapest-decent.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["cmp", "bench_intelligence", "ge", 0.5]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Smart balance

No strong preference: weigh capability against price.

filter: tools-met · not disabled

rank: 0.6 intelligence + 0.4 cheap

fallback: next passing candidate

View JSON

smart-balance.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["add",
    ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
    ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
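
To see how the 0.6 / 0.4 weighting plays out, the rank expression can be evaluated by hand on the example candidates from the top of this page. The sketch assumes `normalize` means min-max scaling to [0, 1] across the candidates being ranked; that definition, the `Row` shape, and the `smartBalance` helper are assumptions, so the actual ordering may differ.

```typescript
interface Row {
  model_family: string;
  price_out: number;
  bench_intelligence: number;
}

// Assumed meaning of "normalize": min-max scaling across the candidate set.
function normalize(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map((v) => (max === min ? 0 : (v - min) / (max - min)));
}

// score = 0.6 * normalize(bench_intelligence) + 0.4 * neg(normalize(price_out)),
// returned best-first, which is the order argmax and the fallback chain would follow.
function smartBalance(rows: Row[]): { model_family: string; score: number }[] {
  const intel = normalize(rows.map((r) => r.bench_intelligence));
  const price = normalize(rows.map((r) => r.price_out));
  return rows
    .map((r, i) => ({ model_family: r.model_family, score: 0.6 * intel[i] + 0.4 * -price[i] }))
    .sort((a, b) => b.score - a.score);
}

const rows: Row[] = [
  { model_family: "deepseek-v4-flash", price_out: 0.4, bench_intelligence: 0.465 },
  { model_family: "minimax-m2.7", price_out: 0.5, bench_intelligence: 0.496 },
  { model_family: "deepseek-v4-pro", price_out: 1.5, bench_intelligence: 0.515 },
  { model_family: "glm-5.1", price_out: 2.0, bench_intelligence: 0.514 },
  { model_family: "gpt-5.5", price_out: 10.0, bench_intelligence: 0.602 },
];

const ranking = smartBalance(rows);
// Under this assumption gpt-5.5 ranks first with a score of about 0.2.
```

Under min-max scaling the most capable model wins despite being the most expensive, because its intelligence lead is worth 0.6 while its price penalty costs only 0.4. Shifting the weights toward price changes that outcome.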

Best intelligence

Critical work with no cost cap: pick the most capable model.

filter: tools-met · not disabled

rank: highest bench_intelligence

fallback: next passing candidate

View JSON

best-intelligence.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Reasoning only

The task needs a model with reasoning capability.

filter: + is cap_reasoning

rank: highest intelligence

fallback: next passing candidate

View JSON

reasoning-only.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Vision · cheapest

Image input: the cheapest model that can handle images.

filter: + is in_image

rank: cheapest price_out

fallback: next passing candidate

View JSON

vision-cheapest.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "in_image"]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Long-context RAG

When a large context window is required, pick the cheapest model that provides it.

filter: + context ge 200000

rank: cheapest price_out

fallback: next passing candidate

View JSON

long-context-rag.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "context", "ge", 200000]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Agentic fleet

Pick a model that excels at tool calling for agent loops.

filter: + tools · bench_agentic_rank le 5

rank: highest bench_agentic

fallback: next passing candidate

View JSON

agentic-fleet.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["has_cap", "supports_tools"], ["cmp", "bench_agentic_rank", "le", 5]],
  ["field", "bench_agentic"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]

Private / compliant

TEE only, no logging: capability first.

filter: + is has_tee · is no_log

rank: highest intelligence

fallback: next passing candidate

View JSON

private-compliant.json

["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["is", "has_tee"], ["is", "no_log"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]