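Policies
How routing decides: filter first, then rank the survivors, with no silent downgrades. Copyable policy presets, grouped by goal.
Routing semantics
Filter first, then rank the survivors, with no silent downgrades. Routing is deterministic and ordered: cost ceilings and quality floors live in the filter, not in the score, so a cheap model cannot score its way around your rules. A model that fails a gate never becomes a candidate at all.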
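Filter first
Candidates that lack a required capability, fall below the quality floor, or exceed the price ceiling are excluded, never quietly substituted.
Rank the survivors
Among the models that pass the gates, the scorer orders them, typically cheapest first.
Pick the top-ranked model
One model is selected deterministically. The same policy, the same input, and the same catalog always select the same model.
Fall back in order
If the selected model times out or returns an error, the router switches to the next candidate that passed the filter (cheapest first) and logs every switch.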
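The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the router's actual implementation: `Candidate`, `route`, `flaky_call`, and the model ids are hypothetical names, and ties are broken on the model id purely to keep the order deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    id: str
    price_out: float           # USD per million output tokens
    bench_intelligence: float


def route(catalog, passes, score, call):
    """Filter, rank the survivors, then try them in rank order."""
    survivors = [m for m in catalog if passes(m)]
    if not survivors:
        # Explicit failure instead of a silent downgrade.
        raise LookupError("no_candidates")
    # Highest score first; the id tie-break keeps the order deterministic.
    ranked = sorted(survivors, key=lambda m: (-score(m), m.id))
    switches = []  # one entry per fallback, so every switch is recorded
    for model in ranked:
        try:
            return call(model), model.id, switches
        except (TimeoutError, RuntimeError) as exc:
            switches.append((model.id, type(exc).__name__))
    raise RuntimeError(f"all candidates failed: {switches}")


# Hypothetical catalog: "b" is cheapest but sits below the 0.5 floor.
catalog = [
    Candidate("a", 1.0, 0.52),
    Candidate("b", 0.4, 0.45),
    Candidate("c", 2.0, 0.60),
]


def flaky_call(model):
    # Simulate a timeout on the top-ranked survivor.
    if model.id == "a":
        raise TimeoutError("simulated timeout")
    return f"answer from {model.id}"


result, chosen, switches = route(
    catalog,
    passes=lambda m: m.bench_intelligence >= 0.5,
    score=lambda m: -m.price_out,   # cheapest first
    call=flaky_call,
)
```

Here "b" never becomes a candidate, "a" is tried first and times out, and the request is served by "c" with one recorded switch.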
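Worked decision example
One policy (requires tool capability, bench_intelligence ge 0.5, cheapest wins) applied to a live catalog (price_out is the price in USD per million output tokens). The filter excludes two models that fall below the floor; the cheapest survivor wins: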
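| model | price_out | intel | verdict |
| --- | --- | --- | --- |
| deepseek-v4-flash | $0.40 | 0.465 | below floor |
| minimax-m2.7 | $0.50 | 0.496 | below floor |
| deepseek-v4-pro | $1.50 | 0.515 | wins |
| glm-5.1 | $2.00 | 0.514 | above floor |
| gpt-5.5 | $10.00 | 0.602 | above floor |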
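The decision in this table can be reproduced with a short Python sketch. The prices and intelligence scores are copied from the table; the assumption that all five models support tools, the `decide` helper, and the id tie-break are illustrative and not taken from the router itself.

```python
# (model id, price_out in USD per million output tokens, bench_intelligence)
CATALOG = [
    ("deepseek-v4-flash", 0.40, 0.465),
    ("minimax-m2.7", 0.50, 0.496),
    ("deepseek-v4-pro", 1.50, 0.515),
    ("glm-5.1", 2.00, 0.514),
    ("gpt-5.5", 10.00, 0.602),
]
FLOOR = 0.5


def decide(catalog, floor):
    """Apply the floor as a filter, then rank survivors cheapest first."""
    # Assumption: every model listed supports tools, so only the floor filters.
    excluded = {
        model_id: f"bench_intelligence ge {floor}"
        for model_id, _, intel in catalog
        if intel < floor
    }
    survivors = sorted(
        (row for row in catalog if row[2] >= floor),
        key=lambda row: (row[1], row[0]),   # price ascending, then id
    )
    if not survivors:
        raise LookupError("no_candidates")
    return survivors[0][0], [row[0] for row in survivors], excluded


winner, ranked, excluded = decide(CATALOG, FLOOR)
print(winner)    # deepseek-v4-pro
print(ranked)    # ['deepseek-v4-pro', 'glm-5.1', 'gpt-5.5']
```

The two cheapest models are removed by the floor before any scoring happens, which is why price alone cannot pull them back in.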
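The floor is a constraint, not a suggestion. Routing optimizes cost within your floor and never goes around it. If no model qualifies, the request fails explicitly (no_candidates); you never receive a silently downgraded result that you did not ask for. Terms are validated and fingerprinted before execution, so a malformed policy is rejected (invalid_policy) rather than misrouted. To preview a decision without incurring inference cost, POST /x/rank returns the ranked survivors along with the rule that excluded each rejected model.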
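As a rough sketch of what a preview call might look like, the Python below builds (but does not send) a POST /x/rank request using only the standard library. Only the path and the policy terms come from this page. The host `router.example.com`, the `{"policy_ir": ...}` body shape, and the `build_rank_request` helper are assumptions, and any authentication your deployment needs is omitted.

```python
import json
import urllib.request

BASE_URL = "https://router.example.com"   # hypothetical host

# Cost-first policy with an intelligence floor, written as a Python list.
LOW_COST_POLICY = [
    "policy",
    ["and", ["meets_req"], ["not", ["is", "disabled"]],
     ["cmp", "bench_intelligence", "ge", 0.5]],
    ["neg", ["normalize", ["field", "price_out"]]],
    ["argmax"], ["id"], ["always", {"action": "next_candidate"}],
]


def build_rank_request(policy_ir, base_url=BASE_URL):
    """Build a dry-run request for POST /x/rank without sending it."""
    # Assumption: the endpoint accepts the policy under a "policy_ir" key.
    body = json.dumps({"policy_ir": policy_ir}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/x/rank",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_rank_request(LOW_COST_POLICY)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The response schema is not specified here beyond the ranked survivors and the excluding rule per model, so parse it defensively.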
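Watching a policy decide
The same principle shows up in workflows: each node declares its own policy, a filter plus a ranking rule. The diagram below shows how each step excludes candidates that miss the bar and settles on the cheapest survivor (the output node declares no model and only returns the previous step's result).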
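Policy presets
Copyable routing patterns, grouped by goal. Every preset is a plain policy_ir: paste it, adjust the floors and ceilings, and it is ready to use. Dry-run it with POST /x/rank first.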
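Several of the presets below score candidates with add, scale, neg, and normalize terms. To make that arithmetic concrete, here is a small Python interpreter for just that subset. Its semantics are assumptions rather than documented behavior: normalize is taken to be min-max scaling across the candidates being ranked, and `evaluate` and `argmax_id` are hypothetical helpers. The rows reuse the catalog from the worked example.

```python
def evaluate(term, rows):
    """Return one score per row for a scorer term (illustrative subset)."""
    op, *args = term
    if op == "field":
        return [float(row[args[0]]) for row in rows]
    if op == "normalize":
        # Assumption: min-max scaling over the rows being ranked.
        values = evaluate(args[0], rows)
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]
    if op == "neg":
        return [-v for v in evaluate(args[0], rows)]
    if op == "scale":
        weight, inner = args
        return [weight * v for v in evaluate(inner, rows)]
    if op == "add":
        parts = [evaluate(arg, rows) for arg in args]
        return [sum(column) for column in zip(*parts)]
    raise ValueError(f"unsupported term: {op}")


def argmax_id(rows, scores):
    """Pick the highest score; break ties on id for determinism."""
    best = min(zip(rows, scores), key=lambda pair: (-pair[1], pair[0]["id"]))
    return best[0]["id"]


ROWS = [
    {"id": "deepseek-v4-flash", "price_out": 0.40, "bench_intelligence": 0.465},
    {"id": "minimax-m2.7", "price_out": 0.50, "bench_intelligence": 0.496},
    {"id": "deepseek-v4-pro", "price_out": 1.50, "bench_intelligence": 0.515},
    {"id": "glm-5.1", "price_out": 2.00, "bench_intelligence": 0.514},
    {"id": "gpt-5.5", "price_out": 10.00, "bench_intelligence": 0.602},
]

# 60% intelligence, 40% (negated) price, with no floor in the filter.
BALANCED = ["add",
            ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
            ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]]

scores = evaluate(BALANCED, ROWS)
best = argmax_id(ROWS, scores)
```

Under these assumptions the most capable model wins despite its price, because no floor or ceiling is filtering and intelligence carries the larger weight. A scorer only trades off among survivors, so hard limits belong in the filter.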
Smart balance
Balances capability and price without complicated trade-offs.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["add",
["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Low cost, still usable
Cuts cost within an intelligence floor; the core mechanism of unhardcoded.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "bench_intelligence", "ge", 0.5]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Free only
Only models with zero output cost.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "price_out", "le", 0]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Intelligence first
No cost ceiling; picks the most capable model, for critical tasks.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Reasoning models only
Only models with reasoning capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Vision · cost first
Supports image input; picks the cheapest such model.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "in_image"]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Long-context RAG
Requires a large context window, then picks the cheapest model among those that qualify.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "context", "ge", 200000]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Structured output
Filters by capability, then balances capability against cost.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_json_mode"]],
["add",
["scale", 0.5, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.5, ["neg", ["normalize", ["field", "price_out"]]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Agent fleet
Stronger tool calling; limited to the top 5 agentic models.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_tools"],
["cmp", "bench_agentic_rank", "le", 5]],
["field", "bench_agentic"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Price-capped coding
Top 5 coding models with tool calling and a hard ceiling on output price.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_tools"],
["cmp", "bench_coding_rank", "le", 5],
["cmp", "price_out", "le", 5]],
["field", "bench_coding"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Reproducible sampling
Reproducible random sampling for ensemble diversity.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["field", "bench_intelligence"],
["sample", 0.3], ["id"], ["always", {"action": "next_candidate"}]]
Low-latency chat
Filters by latency, then balances speed against capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "latency_ms", "le", 2000]],
["add",
["scale", 0.7, ["neg", ["normalize", ["field", "latency_ms"]]]],
["scale", 0.3, ["normalize", ["field", "bench_intelligence"]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Private / compliant
TEE only with no logging, ranked by capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["is", "has_tee"], ["is", "no_log"]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Resilient cascade fallback
Keeps the top 3 by composite rank (intelligence + reliability) to form a fallback cascade.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["add",
["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.4, ["normalize", ["field", "success_rate"]]]],
["top_k", 3, ["argmax"]], ["id"], ["always", {"action": "next_candidate"}]]
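One way to read the top_k term in this last preset is as building an ordered fallback chain: the best composite score is the primary and the next two are backups. The Python sketch below illustrates that reading only. The composite scores and model ids are made-up placeholders, and `fallback_chain` is a hypothetical helper, not part of the policy language.

```python
def fallback_chain(scores, k=3):
    """Return the ids of the k best-scoring models, best first."""
    # Highest score first; ties broken on id so the chain is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [model_id for model_id, _ in ranked[:k]]


# Placeholder composite scores (0.6 * intelligence + 0.4 * success rate).
composite = {
    "model-a": 0.71,
    "model-b": 0.42,
    "model-c": 0.88,
    "model-d": 0.65,
    "model-e": 0.30,
}

chain = fallback_chain(composite)
primary, backups = chain[0], chain[1:]
print(chain)   # ['model-c', 'model-a', 'model-d']
```

With the next_candidate action, a failure on the primary would move to the first backup and then the second, after which the chain is exhausted.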