Composing routed calls

Workflows

Five workflow patterns. Each is a bounded graph of policy-routed steps that writes a single unified trace.

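Five common shapes. Each tab shows the flow and the model that each step routes to in this example. Open the flow_ir view to see the full, copyable term. Dry-run it first with POST /x/flow/normalize. Each node declares its own policy (filter + rank), and the model IDs and costs in the diagram show actual routing results. Values inside the terms, such as system prompts and field names, are kept in English.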

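Classify at low cost, draft to a quality bar, then run a strong no-log guard that can refuse before anything is sent. Each node in the diagram resolves its own policy: the "excluded" label marks candidates that miss the bar, and "confirmed · run" marks the selected model.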

View flow_ir

flow.support-ticket.json

["flow", {
  "u": {"kind": "input"},
  "t": {"kind": "llm", "system": "Classify the ticket and extract the account id as JSON.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["has_cap", "supports_json_mode"]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "d": {"kind": "llm", "system": "Write a reply using the ticket and the triage.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_intelligence", "ge", 0.55]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "t"], "template": "Ticket:\n$1\n\nTriage:\n$2"},
  "g": {"kind": "llm", "system": "Check brand voice, PII, refund limits. Refuse if any fail.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "no_log"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "out": {"kind": "output", "inputs": ["g"]}
}]
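
The split between "excluded" candidates and the selected model can be pictured as evaluating a node's filter term against each candidate record and then ranking the survivors. What follows is a minimal sketch only. The candidate records, the record layout, and the meaning given to each operator (meets_req and is read as boolean flags on the record, has_cap as membership in a capability list) are assumptions made for illustration, not the router's documented behavior.

```python
import operator

# "ge" and "le" appear in the flow terms; the other comparison names are assumed.
CMP = {"ge": operator.ge, "le": operator.le, "gt": operator.gt,
       "lt": operator.lt, "eq": operator.eq}

def passes(term, cand):
    """Evaluate a filter term against one candidate dict (assumed semantics)."""
    op, *args = term
    if op == "and":
        return all(passes(t, cand) for t in args)
    if op == "not":
        return not passes(args[0], cand)
    if op == "meets_req":
        return bool(cand.get("meets_req", False))
    if op == "is":
        return bool(cand.get(args[0], False))
    if op == "has_cap":
        return args[0] in cand.get("caps", ())
    if op == "cmp":
        field, name, bound = args
        return field in cand and CMP[name](cand[field], bound)
    raise ValueError(f"unsupported filter operator: {op}")

# Filter of the guard node "g" from the flow above.
guard_filter = ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "no_log"]]

# Hypothetical candidates; ids and values are invented for illustration.
candidates = [
    {"id": "model-a", "meets_req": True, "no_log": True, "bench_intelligence": 0.61},
    {"id": "model-b", "meets_req": True, "no_log": False, "bench_intelligence": 0.74},
    {"id": "model-c", "meets_req": True, "no_log": True, "disabled": True,
     "bench_intelligence": 0.80},
    {"id": "model-d", "meets_req": True, "no_log": True, "bench_intelligence": 0.68},
]

kept = [c for c in candidates if passes(guard_filter, c)]
excluded = [c["id"] for c in candidates if not passes(guard_filter, c)]
# "g" ranks by ["field", "bench_intelligence"] and selects with ["argmax"].
chosen = max(kept, key=lambda c: c["bench_intelligence"])
```

With these invented records, model-b fails the no_log check and model-c is disabled, so the guard is chosen from the two remaining candidates by bench_intelligence alone.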

A cheap draft, a strong critic, then a rewrite: a quality jump within budget. In the diagram, watch each step resolve its policy independently, moving through "pending" → "running" → "confirmed".

View flow_ir

flow.draft-critique-revise.json

["flow", {
  "u": {"kind": "input"},
  "d": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "c": {"kind": "llm", "system": "Critique the draft: list concrete flaws and gaps.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "r": {"kind": "llm", "system": "Rewrite the answer, fixing every point in the critique.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "d", "c"], "template": "Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3"},
  "out": {"kind": "output", "inputs": ["r"]}
}]
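
The template field of node "r" suggests that $1, $2, and $3 stand for the outputs of the node's inputs, in the order they are listed. The sketch below assumes exactly that positional reading; the render helper and the sample upstream outputs are hypothetical and only illustrate how such a prompt might be assembled.

```python
import re

def render(template, inputs, outputs):
    """Fill $1, $2, ... with the outputs of the node's inputs, by position (assumed)."""
    def sub(match):
        index = int(match.group(1)) - 1  # $1 refers to inputs[0]
        return outputs[inputs[index]]
    return re.sub(r"\$(\d+)", sub, template)

# Node "r" from the flow above, reduced to the two fields used here.
node_r = {"inputs": ["u", "d", "c"],
          "template": "Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3"}

# Hypothetical upstream outputs, invented for illustration.
outputs = {"u": "What is 2 + 2?", "d": "It is 5.", "c": "The sum is wrong."}

prompt = render(node_r["template"], node_r["inputs"], outputs)
print(prompt)
```

Under this reading, the rewrite step sees the original question, the draft, and the critique together, which is why "r" lists all three nodes as inputs rather than only "c".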

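N seeded draws from the same strong policy (sample adds diversity near the top of the ranking), after which a judge picks the best. In this example N = 3 (n1, n2, n3). In the diagram, see how the parallel draft nodes each resolve their policy independently.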

View flow_ir

flow.best-of-n.json

["flow", {
  "u": {"kind": "input"},
  "n1": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "n2": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "n3": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "j": {"kind": "llm", "system": "Pick the single best candidate; return it verbatim.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["n1", "n2", "n3"], "template": "A:\n$1\n\nB:\n$2\n\nC:\n$3"},
  "out": {"kind": "output", "inputs": ["j"]}
}]

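Three models pinned by family draft in parallel, and a fourth synthesizes the single best answer. Each pin uses family_eq with the constant zero scorer (the filter already leaves only one family), and the "final model" label in the diagram points to the synthesis step.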

View flow_ir

flow.panel.json

["flow", {
  "u": {"kind": "input"},
  "a": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "gemini-3.1-pro-preview"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "b": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "claude-opus-4-8"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "c": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "deepseek-v4-flash"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "f": {"kind": "llm", "system": "Synthesize the single best answer from the drafts.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["add", ["scale", 0.7, ["normalize", ["field", "bench_intelligence"]]],
             ["scale", 0.3, ["neg", ["normalize", ["field", "price_in"]]]]],
      ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["a", "b", "c"]},
  "out": {"kind": "output", "inputs": ["f"]}
}]
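
The scorer on node "f" blends quality and price: 0.7 times normalized bench_intelligence plus 0.3 times negated normalized price_in. The sketch below evaluates such a term over a small candidate list. It assumes that normalize means min-max scaling across the surviving candidates, which the page does not state, and the candidate ids and numbers are invented.

```python
def normalize(values):
    """Min-max scale to [0, 1]; this choice of normalization is an assumption."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def score(term, cands):
    """Return one score per candidate for a scorer term (assumed semantics)."""
    op, *args = term
    if op == "field":
        return [float(c[args[0]]) for c in cands]
    if op == "zero":
        return [0.0 for _ in cands]
    if op == "neg":
        return [-s for s in score(args[0], cands)]
    if op == "normalize":
        return normalize(score(args[0], cands))
    if op == "scale":
        weight, inner = args
        return [weight * s for s in score(inner, cands)]
    if op == "add":
        parts = [score(t, cands) for t in args]
        return [sum(col) for col in zip(*parts)]
    raise ValueError(f"unsupported scorer operator: {op}")

# Scorer of the synthesis node "f" from the flow above.
scorer_f = ["add", ["scale", 0.7, ["normalize", ["field", "bench_intelligence"]]],
                   ["scale", 0.3, ["neg", ["normalize", ["field", "price_in"]]]]]

# Hypothetical candidates; ids and values are invented for illustration.
cands = [
    {"id": "model-x", "bench_intelligence": 0.80, "price_in": 10.0},
    {"id": "model-y", "bench_intelligence": 0.74, "price_in": 2.0},
    {"id": "model-z", "bench_intelligence": 0.50, "price_in": 1.0},
]

scores = score(scorer_f, cands)
# ["argmax"] is read here as "take the highest-scoring candidate".
best = max(zip(scores, cands), key=lambda pair: pair[0])[1]
```

In this invented data the most capable candidate is also by far the most expensive, so the 0.3 price weight is enough to tip the choice to model-y. The same evaluator also covers the pinned nodes: with ["zero"] every candidate ties, which is harmless when the filter has already narrowed the set to one family.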

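A reasoner and a coder run in parallel under different filters, and a merge step combines their outputs. This pattern suits tasks that mix reasoning and code. In the diagram, see how the two branches each go through "excluded" / "confirmed" independently.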

View flow_ir

flow.specialist-split.json

["flow", {
  "u": {"kind": "input"},
  "rz": {"kind": "llm", "system": "Reason through the problem step by step.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "cd": {"kind": "llm", "system": "Produce any code the problem needs.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_coding_rank", "le", 5]],
      ["add", ["scale", 0.7, ["normalize", ["field", "bench_coding"]]],
             ["scale", 0.3, ["neg", ["normalize", ["field", "price_out"]]]]],
      ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "m": {"kind": "llm", "system": "Merge the reasoning and the code into one answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["add", ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
             ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
      ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["rz", "cd"], "template": "Reasoning:\n$1\n\nCode:\n$2"},
  "out": {"kind": "output", "inputs": ["m"]}
}]
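
Because each flow is a bounded graph, the inputs lists alone determine which nodes can run side by side. The sketch below groups the nodes of the flow above into waves, where every node in a wave depends only on earlier waves. The levels helper is hypothetical, and whether the actual runtime schedules nodes this way is an assumption.

```python
def levels(nodes):
    """Group node names into waves; nodes within a wave do not depend on each other."""
    done, waves = set(), []
    pending = dict(nodes)
    while pending:
        ready = sorted(name for name, node in pending.items()
                       if all(i in done for i in node.get("inputs", [])))
        if not ready:
            # No node can make progress: the graph has a cycle or an unknown input.
            raise ValueError("cycle or missing input among: " + ", ".join(sorted(pending)))
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del pending[name]
    return waves

# Shape of the specialist-split flow above, reduced to kind and inputs.
nodes = {
    "u": {"kind": "input"},
    "rz": {"kind": "llm", "inputs": ["u"]},
    "cd": {"kind": "llm", "inputs": ["u"]},
    "m": {"kind": "llm", "inputs": ["rz", "cd"]},
    "out": {"kind": "output", "inputs": ["m"]},
}

waves = levels(nodes)
print(waves)
```

The second wave holds both "cd" and "rz", matching the two parallel branches, and "m" can only start once both have produced output. A term whose inputs form a cycle raises an error here instead of looping forever.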
