Policies
How routing decides: filter first, then rank the survivors, never downgrade silently. Reusable policy presets, grouped by goal.
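Routing semantics
Filter first, then rank the survivors; there is no silent downgrade. Routing is deterministic and ordered: cost ceilings and quality floors live in the filter, not in the score, so a cheap model can never outscore your rules. A model that does not clear the minimum bar is simply not a candidate.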
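Filter first
Candidates that lack a required capability, fall below the quality floor, or exceed the price ceiling are eliminated. They are never quietly substituted.
Rank the survivors
Among the models that clear the bar, the scorer ranks them, usually cheapest first.
Pick the top-ranked model
Exactly one model is selected, deterministically. The same policy, the same input, and the same catalog always yield the same model.
Fall back in order
If the selected model times out or errors, the router moves to the next passing candidate in order (cheapest first) and logs every switch.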
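The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the router's implementation: the `Model` record, `route`, `call_with_fallback`, and the `NoCandidates` exception are hypothetical names, and tie-breaking by model id is an assumption made only to keep the sketch deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    # Hypothetical catalog record; field names echo the policy fields.
    id: str
    price_out: float           # USD per million output tokens
    bench_intelligence: float
    supports_tools: bool = True


class NoCandidates(Exception):
    """Hypothetical error mirroring the no_candidates failure."""


def route(catalog, *, floor, price_cap=None, need_tools=False):
    """Filter first, then return survivors ordered cheapest first."""
    survivors = [
        m for m in catalog
        if m.bench_intelligence >= floor
        and (price_cap is None or m.price_out <= price_cap)
        and (m.supports_tools or not need_tools)
    ]
    if not survivors:
        raise NoCandidates("no model satisfies the policy")
    # Assumed tie-break on id so equal prices still give a stable order.
    return sorted(survivors, key=lambda m: (m.price_out, m.id))


def call_with_fallback(ordered, call):
    """Try each survivor in order and record every switch."""
    switches = []
    for m in ordered:
        try:
            return call(m), switches
        except Exception as exc:  # timeouts and errors are treated alike here
            switches.append((m.id, repr(exc)))
    raise RuntimeError(f"all candidates failed: {switches}")


catalog = [Model("a", 2.0, 0.60), Model("b", 1.0, 0.55), Model("c", 0.5, 0.40)]
ordered = route(catalog, floor=0.5)
print([m.id for m in ordered])  # ['b', 'a']
```

Model `c` is the cheapest but never appears in the ordering, because it is removed before any scoring happens.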
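A step-by-step decision
A policy (requires tool capability, bench_intelligence ge 0.5, cheapest wins) applied to the live catalog (price_out is the price in USD per million output tokens). The filter eliminates the two models below the floor; the cheapest survivor wins: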
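| model | price_out | intel | verdict |
| deepseek-v4-flash | $0.40 | 0.465 | below floor |
| minimax-m2.7 | $0.50 | 0.496 | below floor |
| deepseek-v4-pro | $1.50 | 0.515 | winner |
| glm-5.1 | $2.00 | 0.514 | above floor |
| gpt-5.5 | $10.00 | 0.602 | above floor |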
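A floor is a constraint, not a suggestion. Routing optimizes cost within your floor and never goes around it. If no model meets the requirements, the request fails explicitly (no_candidates); you never get a silent downgrade you did not ask for. Terms are validated and fingerprinted before they run, so a malformed policy is rejected (invalid_policy) rather than misrouted. To preview a decision without incurring inference cost, POST /x/rank returns the ranked survivors along with the rule that eliminated each rejected model.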
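The decision in the table can be reproduced with a short Python sketch. It uses the five rows exactly as listed and assumes all five models have tool capability, since the table only shows floor verdicts. The `decide` helper and the use of `LookupError` are illustrative choices; the documentation names only the `no_candidates` code, not an exception type.

```python
CATALOG = [
    # (model id, price_out in USD per million output tokens, bench_intelligence)
    ("deepseek-v4-flash", 0.40, 0.465),
    ("minimax-m2.7", 0.50, 0.496),
    ("deepseek-v4-pro", 1.50, 0.515),
    ("glm-5.1", 2.00, 0.514),
    ("gpt-5.5", 10.00, 0.602),
]


def decide(catalog, floor):
    """Label every row, then pick the cheapest model at or above the floor."""
    survivors = [row for row in catalog if row[2] >= floor]
    if not survivors:
        # Hypothetical error type standing in for the no_candidates failure.
        raise LookupError("no_candidates")
    winner = min(survivors, key=lambda row: (row[1], row[0]))[0]
    verdicts = {}
    for model_id, _price, intel in catalog:
        if intel < floor:
            verdicts[model_id] = "below floor"
        elif model_id == winner:
            verdicts[model_id] = "winner"
        else:
            verdicts[model_id] = "above floor"
    return winner, verdicts


winner, verdicts = decide(CATALOG, floor=0.5)
print(winner)  # deepseek-v4-pro
```

Raising the floor above 0.602 leaves no survivors, so `decide` raises instead of returning a cheaper model that misses the floor.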
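Watching a policy decide
The same principle shows up in workflows: every node declares its own policy, a filter plus a ranking rule. The diagram below shows how each step eliminates candidates that miss the bar and locks in the cheapest survivor (the output node declares no model; it only returns the previous step's result). The model IDs and costs in the diagram come from real catalog data.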
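Policy presets
Reusable routing patterns grouped by goal. Each preset is a plain policy_ir: paste it, then adjust the floors and ceilings to fit. Dry-run it with POST /x/rank first.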
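Because a preset is plain JSON, adjusting a floor or ceiling can be done programmatically. The sketch below loads one preset from this page (the one that minimizes cost within an intelligence floor) and rewrites its threshold. `set_threshold` is a hypothetical helper, and it assumes comparisons always have the four-element shape `["cmp", field, op, value]` seen in the presets.

```python
import copy
import json

PRESET = json.loads("""
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
    ["cmp", "bench_intelligence", "ge", 0.5]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
""")


def set_threshold(policy, field, value):
    """Return a copy of policy with every ["cmp", field, op, x] set to value."""
    def walk(node):
        if isinstance(node, list):
            if len(node) == 4 and node[0] == "cmp" and node[1] == field:
                return [node[0], node[1], node[2], value]
            return [walk(child) for child in node]
        return node  # strings, numbers, and dicts pass through unchanged

    return walk(copy.deepcopy(policy))


stricter = set_threshold(PRESET, "bench_intelligence", 0.55)
print(json.dumps(stricter[1][3]))  # ["cmp", "bench_intelligence", "ge", 0.55]
```

The original preset is left untouched, so the same base policy can be reused with several thresholds.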
Smart balance
Balances capability and price without complex trade-offs.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["add",
["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Cheap and capable
Lowers cost within an intelligence floor; the core mechanism of unhardcoded.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "bench_intelligence", "ge", 0.5]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Free only
Restricted to models with zero output cost.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "price_out", "le", 0]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Intelligence first
No cost ceiling; picks the most capable model. Suited to critical tasks.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Reasoning models only
Restricted to models with reasoning capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Vision · cost first
Requires image input support, then picks the cheapest model.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "in_image"]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Long-context RAG
Requires a large context window, then picks the cheapest model that qualifies.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "context", "ge", 200000]],
["neg", ["normalize", ["field", "price_out"]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Structured output
Filters by capability, then balances capability against cost.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_json_mode"]],
["add",
["scale", 0.5, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.5, ["neg", ["normalize", ["field", "price_out"]]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Agent swarm
Requires tool calling and keeps only the top 5 agentic models.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_tools"],
["cmp", "bench_agentic_rank", "le", 5]],
["field", "bench_agentic"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Price-capped coding
Top 5 coding models with tool calling and a hard ceiling on output price.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["has_cap", "supports_tools"],
["cmp", "bench_coding_rank", "le", 5],
["cmp", "price_out", "le", 5]],
["field", "bench_coding"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Reproducible sampling
Reproducible random sampling for ensemble diversity.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["field", "bench_intelligence"],
["sample", 0.3], ["id"], ["always", {"action": "next_candidate"}]]
Low-latency chat
Filters by latency, then balances speed against capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["cmp", "latency_ms", "le", 2000]],
["add",
["scale", 0.7, ["neg", ["normalize", ["field", "latency_ms"]]]],
["scale", 0.3, ["normalize", ["field", "bench_intelligence"]]]],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Private / compliant
TEE only with no logging, ranked by capability.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
["is", "has_tee"], ["is", "no_log"]],
["field", "bench_intelligence"],
["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
Resilient cascade
Keeps the top 3 by combined score (intelligence + reliability) to form a fallback cascade.
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
["add",
["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
["scale", 0.4, ["normalize", ["field", "success_rate"]]]],
["top_k", 3, ["argmax"]], ["id"], ["always", {"action": "next_candidate"}]]
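Several presets on this page blend two normalized fields with `add` and `scale`. The Python sketch below shows how such a blended score could be computed. It rests on a loud assumption: `normalize` is treated as min-max scaling over the surviving candidates, which this page does not confirm. The `min_max` and `blended_scores` helpers are hypothetical, and the rows are the five models from the worked example earlier on this page.

```python
CATALOG = [
    # (model id, price_out in USD per million output tokens, bench_intelligence)
    ("deepseek-v4-flash", 0.40, 0.465),
    ("minimax-m2.7", 0.50, 0.496),
    ("deepseek-v4-pro", 1.50, 0.515),
    ("glm-5.1", 2.00, 0.514),
    ("gpt-5.5", 10.00, 0.602),
]


def min_max(values):
    """Assumed meaning of normalize: rescale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def blended_scores(catalog, w_intel=0.6, w_price=0.4):
    """Score = w_intel * norm(intelligence) + w_price * (-norm(price_out))."""
    intel = min_max([row[2] for row in catalog])
    price = min_max([row[1] for row in catalog])
    return {
        row[0]: w_intel * i + w_price * (-p)
        for row, i, p in zip(catalog, intel, price)
    }


scores = blended_scores(CATALOG)
best = max(scores, key=scores.get)  # plays the role of argmax
print(best)  # gpt-5.5
```

Under this assumption the 0.6 / 0.4 blend selects the most expensive model in the list, because no price ceiling is in the filter. A hard limit therefore belongs in the filter, as a `cmp` term, not in the score weights.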