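Open source · Policy routing for AI models

Build AI systems without hardcoded model choices.

Send a policy with every request. unhardcoded filters the live model catalog, picks the cheapest model that satisfies the rules, runs it through the provider keys, and returns a decision trace.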

request

Prompt + policy

tools · intel ge 0.5 · cheapest

→

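| candidate | price_out | intel | verdict |
| --- | --- | --- | --- |
| deepseek-v4-flash | $0.40 | 0.465 | below floor |
| minimax-m2.7 | $0.50 | 0.496 | below floor |
| deepseek-v4-pro | $1.50 | 0.515 | chosen |
| glm-5.1 | $2.00 | 0.514 | passes |
| gpt-5.5 | $10.00 | 0.602 | passes |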

→

trace

deepseek-v4-pro

cheapest passing · 412 ms
fp 301140696-1054914287
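
The selection illustrated above can be reproduced in a few lines. This is a minimal sketch rather than unhardcoded's implementation: the `Candidate` type and `pickCheapestPassing` function are hypothetical names, the rows are the five candidates listed above, and the `tools` requirement is assumed to be met by all of them.

```typescript
// Hypothetical shape for one catalog row; fields mirror the columns above.
interface Candidate {
  model: string;
  priceOut: number; // the price_out column, in dollars
  intel: number;    // the intel column
}

const candidates: Candidate[] = [
  { model: "deepseek-v4-flash", priceOut: 0.4, intel: 0.465 },
  { model: "minimax-m2.7", priceOut: 0.5, intel: 0.496 },
  { model: "deepseek-v4-pro", priceOut: 1.5, intel: 0.515 },
  { model: "glm-5.1", priceOut: 2.0, intel: 0.514 },
  { model: "gpt-5.5", priceOut: 10.0, intel: 0.602 },
];

// Filter by the intelligence floor, then take the cheapest survivor.
function pickCheapestPassing(rows: Candidate[], floor: number): Candidate | undefined {
  const passing = rows.filter((c) => c.intel >= floor);
  return passing.reduce<Candidate | undefined>(
    (best, c) => (best === undefined || c.priceOut < best.priceOut ? c : best),
    undefined,
  );
}

const chosen = pickCheapestPassing(candidates, 0.5);
console.log(chosen?.model); // deepseek-v4-pro
```

If no candidate clears the floor, the function returns `undefined` instead of quietly picking a weaker model.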

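Route one call

Point the SDK at your host, attach a policy, and read the trace.

Quickstart →

Run a workflow

Compose routed calls into a DAG: classify, fan out, judge, merge.

See patterns →

Read the trace

Every decision leaves a receipt: which candidates were considered, which one was chosen, and why.

Inspect a decision →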

Mental model

The model routing loop

One policy decides one call. Same inputs, same catalog, same decision, every time, and every decision is recorded.

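Request

A call arrives at the OpenAI-compatible endpoint, carrying its policy.

Policy sigma-pol/v2

A small term: filter, rank, select, mutate, fallback.

Catalog

The live set of (provider, model) candidates, with prices, benchmarks, and capabilities.

Filter

Drops candidates that fail the rules. No silent downgrades.

Rank

Scores the candidates that pass: cheapest, strongest, or fastest. The choice is yours.

Select

Picks the top candidate, or uses a top_k fallback cascade.

Run

Runs inference through the provider keys and falls back on error.

Trace

A receipt of every candidate, the chosen model, and the fingerprint.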
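
The loop above can be sketched as one pure function. This is an illustrative sketch under stated assumptions, not the project's code: `route`, `Trace`, and the `call` callback (a stand-in for provider inference that throws on error) are hypothetical, and `topK` models the top_k fallback cascade as "try the best K in order".

```typescript
interface Candidate { model: string; priceOut: number; intel: number; }

interface Trace {
  ranked: string[];    // passing candidates, best first
  rejected: string[];  // candidates dropped by the filter
  attempted: string[]; // candidates tried, in order
  chosen: string | null;
}

function route(
  catalog: Candidate[],
  passes: (c: Candidate) => boolean,
  score: (c: Candidate) => number,
  topK: number,
  call: (c: Candidate) => string, // hypothetical inference call
): { output: string | null; trace: Trace } {
  const passing = catalog.filter(passes);
  const rejected = catalog.filter((c) => !passes(c)).map((c) => c.model);
  // Descending score; ties broken by model name so the order is deterministic.
  const ranked = [...passing].sort(
    (a, b) => score(b) - score(a) || (a.model < b.model ? -1 : 1),
  );
  const trace: Trace = {
    ranked: ranked.map((c) => c.model), rejected, attempted: [], chosen: null,
  };
  for (const c of ranked.slice(0, topK)) {
    trace.attempted.push(c.model);
    try {
      const output = call(c);
      trace.chosen = c.model;
      return { output, trace };
    } catch {
      // Provider error: fall through to the next candidate in the cascade.
    }
  }
  return { output: null, trace };
}

const catalog: Candidate[] = [
  { model: "deepseek-v4-flash", priceOut: 0.4, intel: 0.465 },
  { model: "deepseek-v4-pro", priceOut: 1.5, intel: 0.515 },
  { model: "glm-5.1", priceOut: 2.0, intel: 0.514 },
];
const result = route(
  catalog,
  (c) => c.intel >= 0.5,   // filter
  (c) => -c.priceOut,      // rank: cheapest first
  2,                       // select: top 2 cascade
  (c) => {
    if (c.model === "deepseek-v4-pro") throw new Error("simulated provider outage");
    return `answer from ${c.model}`;
  },
);
console.log(result.trace.chosen); // glm-5.1
```

In this run the cheapest passing model fails, so the cascade moves to the next passing candidate and the trace records both attempts.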

Deep dive: how policies work →

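Compose routed calls

Workflow patterns

A workflow is a bounded acyclic graph of routed steps. Each node has its own policy and is routed independently, and the whole graph writes one unified trace. This is where unhardcoded goes beyond "rule-based routing" and becomes a system.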
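
To make "bounded acyclic graph" concrete, here is a hedged sketch of a validator that orders nodes so each one follows its inputs. The `FlowNode` and `Flow` types follow the `flow_ir` shape used on this page; `topoOrder` and the `maxNodes` limit of 32 are hypothetical, since the real graph limits are not stated here.

```typescript
interface FlowNode { kind: "input" | "llm" | "output"; inputs?: string[]; }
type Flow = ["flow", Record<string, FlowNode>];

// Returns node ids in an order where every node comes after its inputs.
// Throws on an unknown input, a cycle, or a graph larger than maxNodes.
function topoOrder(flow: Flow, maxNodes = 32): string[] {
  const nodes = flow[1];
  const ids = Object.keys(nodes);
  if (ids.length > maxNodes) throw new Error(`flow has more than ${maxNodes} nodes`);
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();
  const visit = (id: string): void => {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") throw new Error(`cycle through node "${id}"`);
    const node = nodes[id];
    if (node === undefined) throw new Error(`unknown input "${id}"`);
    state.set(id, "visiting");
    for (const dep of node.inputs ?? []) visit(dep);
    state.set(id, "done");
    order.push(id);
  };
  for (const id of ids) visit(id);
  return order;
}

// A linear flow: input, three routed steps, output.
const linearFlow: Flow = ["flow", {
  u: { kind: "input" },
  t: { kind: "llm", inputs: ["u"] },
  d: { kind: "llm", inputs: ["u", "t"] },
  g: { kind: "llm", inputs: ["d"] },
  out: { kind: "output", inputs: ["g"] },
}];
console.log(topoOrder(linearFlow).join(" -> ")); // u -> t -> d -> g -> out
```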

Support ticket · linear · guard

Triage cheaply, draft to a quality floor, then run a strong no-log guard that can refuse before anything is sent.

Key point: the last step is a policy-controlled gate that can abort the send, not wishful thinking.

Ticket

→

triage · classify · extract id · deepseek-v4-flash

→

draft · reply · quality floor · gemini-3.1-pro-preview

→

guard · brand · PII · no-log · gpt-5.5 · can abort

→

Answer

View flow_ir

flow.support-ticket.json

```
["flow", {
  "u": {"kind": "input"},
  "t": {"kind": "llm", "system": "Classify the ticket and extract the account id as JSON.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["has_cap", "supports_json_mode"]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "d": {"kind": "llm", "system": "Write a reply using the ticket and the triage.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_intelligence", "ge", 0.55]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "t"], "template": "Ticket:\n$1\n\nTriage:\n$2"},
  "g": {"kind": "llm", "system": "Check brand voice, PII, refund limits. Refuse if any fail.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "no_log"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "out": {"kind": "output", "inputs": ["g"]}
}]
```
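
The `template` field in node `d` refers to its inputs as `$1` and `$2`. A minimal sketch of how such a template might be filled, assuming placeholders are 1-based and map to `inputs` in order (`renderTemplate` is a hypothetical helper, and the real substitution rules may differ):

```typescript
// Replace $1, $2, ... with the outputs of the node's inputs, in order.
// Assumption: indexes are 1-based; placeholders with no matching input are left as-is.
function renderTemplate(template: string, inputs: string[]): string {
  return template.replace(/\$(\d+)/g, (match: string, digits: string) => {
    const value = inputs[Number(digits) - 1];
    return value === undefined ? match : value;
  });
}

const prompt = renderTemplate("Ticket:\n$1\n\nTriage:\n$2", [
  "My invoice is wrong.",                        // output of node "u"
  '{"category":"billing","account_id":"A-17"}',  // made-up output of node "t"
]);
console.log(prompt);
```

A replacer function is used instead of a replacement string so that `$` characters inside the inserted text are never reinterpreted.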

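Draft → critique → revise · linear · refine

A cheap first draft, a strong critic that lists the flaws, and a rewrite that fixes every point.

Key point: a quality jump within budget, because most tokens are processed by the cheap model.

Question

→

draft · cheapest · deepseek-v4-flash

→

critique · list the flaws · gpt-5.5

→

revise · fix every point · fan-in · gpt-5.5

→

Answer

View flow_ir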

flow.draft-critique-revise.json

```
["flow", {
  "u": {"kind": "input"},
  "d": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["neg", ["normalize", ["field", "price_out"]]], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  "c": {"kind": "llm", "system": "Critique the draft: list concrete flaws and gaps.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["d"]},
  "r": {"kind": "llm", "system": "Rewrite the answer, fixing every point in the critique.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u", "d", "c"], "template": "Question:\n$1\n\nDraft:\n$2\n\nCritique:\n$3"},
  "out": {"kind": "output", "inputs": ["r"]}
}]
```

Best-of-N + judge · ensemble

N seeded draws from the same strong policy, then a judge picks the best one.

Key point: sample spreads draws deterministically across the top of the ranking, giving reproducible diversity rather than luck.

Prompt

→

draft A · sample · T=0.5 · gpt-5.5

draft B · sample · T=0.5 · gpt-5.4

draft C · sample · T=0.5 · gemini-3.1-pro-preview

→

judge · rank · pick winner · gpt-5.5

→

Answer

View flow_ir

flow.best-of-n.json

```
["flow", {
  "u": {"kind": "input"},
  "n1": {"kind": "llm", "system": "Answer the question.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["sample", 0.5], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["u"]},
  // n2, n3: the same policy, two more seeded draws
  "j": {"kind": "llm", "system": "Pick the single best candidate; return it verbatim.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["n1", "n2", "n3"], "template": "A:\n$1\n\nB:\n$2\n\nC:\n$3"},
  "out": {"kind": "output", "inputs": ["j"]}
}]
```
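
The exact semantics of `["sample", 0.5]` are not spelled out on this page. One plausible reading, sketched here purely as an assumption, is a softmax draw at temperature 0.5 over the rank scores, driven by a seeded generator so the same seed always yields the same pick. `minstd` is the classic Park-Miller generator; `sampleSeeded` and the seed values are hypothetical.

```typescript
// Park-Miller "minimal standard" generator: a small seeded PRNG.
// Returns floats in [0, 1). The seed is assumed to be a non-negative integer.
function minstd(seed: number): () => number {
  let state = (seed % 2147483646) + 1; // keep state in 1..2147483646
  const next = (): number => {
    state = (state * 48271) % 2147483647;
    return (state - 1) / 2147483646;
  };
  next(); // discard the first output, which is tiny for small seeds
  return next;
}

interface Scored { model: string; score: number; }

// Softmax-with-temperature draw over scores, driven by the seeded PRNG.
function sampleSeeded(rows: Scored[], temperature: number, seed: number): string {
  if (rows.length === 0) throw new Error("no candidates to sample from");
  const rand = minstd(seed);
  const max = Math.max(...rows.map((r) => r.score));
  const weights = rows.map((r) => Math.exp((r.score - max) / temperature));
  const total = weights.reduce((sum, w) => sum + w, 0);
  let threshold = rand() * total;
  for (let i = 0; i < rows.length; i++) {
    threshold -= weights[i];
    if (threshold < 0) return rows[i].model;
  }
  return rows[rows.length - 1].model; // guard against floating-point rounding
}

// Scores reuse the intel values shown at the top of this page.
const ranked: Scored[] = [
  { model: "gpt-5.5", score: 0.602 },
  { model: "deepseek-v4-pro", score: 0.515 },
  { model: "glm-5.1", score: 0.514 },
];
const draws = [1, 2, 3].map((seed) => sampleSeeded(ranked, 0.5, seed));
console.log(draws); // identical on every run for the same seeds
```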

Model panel · fan-out · fuse

Three family-pinned models draft in parallel, and a fourth synthesizes the single best answer.

Key point: family_eq pins the exact model line, so the panel is reproducible rather than "whatever is cheapest today".

Question

→

draft a · family_eq · gemini-3.1-pro-preview

draft b · family_eq · claude-opus-4-8

draft c · family_eq · deepseek-v4-flash

→

fuse · synthesize · fan-in · gpt-5.5

→

Answer

View flow_ir

flow.panel.json

```
["flow", {
  "u": {"kind": "input"},
  "a": {"kind": "llm", "system": "Draft an answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", "gemini-3.1-pro-preview"]],
      ["zero"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  // b: family_eq "claude-opus-4-8"  ·  c: family_eq "deepseek-v4-flash"
  "f": {"kind": "llm", "system": "Synthesize the single best answer from the drafts.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["a", "b", "c"]},
  "out": {"kind": "output", "inputs": ["f"]}
}]
```
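
Nodes `b` and `c` are abbreviated in the listing above. Since they differ from `a` only in the pinned family, they can be generated from one helper. `panelNode` is a hypothetical convenience function; the policy it emits copies the shape of node `a`.

```typescript
type Term = unknown[];
interface LlmNode { kind: "llm"; system: string; policy: Term; inputs: string[]; }

// Builds one panel node pinned to a model family, mirroring node "a".
function panelNode(family: string): LlmNode {
  return {
    kind: "llm",
    system: "Draft an answer.",
    policy: ["policy",
      ["and", ["meets_req"], ["not", ["is", "disabled"]], ["family_eq", family]],
      ["zero"], ["argmax"], ["id"], ["always", { action: "next_candidate" }]],
    inputs: ["u"],
  };
}

const panel: Record<string, LlmNode> = {
  a: panelNode("gemini-3.1-pro-preview"),
  b: panelNode("claude-opus-4-8"),
  c: panelNode("deepseek-v4-flash"),
};
console.log(JSON.stringify(panel.b.policy[1]));
```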

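Specialist split · fan-out · merge

A reasoner and a coder run in parallel under different filters, and a merge step combines them.

Key point: each branch picks the model best suited to its role (reasoning vs. coding) instead of one model handling everything.

Question

→

reason · is cap_reasoning · gpt-5.5

code · coding top-5 · gpt-5.4

→

merge · one answer · fan-in · gemini-3.1-pro-preview

→

Answer

View flow_ir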

flow.specialist-split.json

```
["flow", {
  "u": {"kind": "input"},
  "rz": {"kind": "llm", "system": "Reason through the problem step by step.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "cd": {"kind": "llm", "system": "Produce any code the problem needs.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "bench_coding_rank", "le", 5]],
      ["field", "bench_coding"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]], "inputs": ["u"]},
  "m": {"kind": "llm", "system": "Merge the reasoning and the code into one answer.",
    "policy": ["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
      ["field", "bench_intelligence"], ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
    "inputs": ["rz", "cd"], "template": "Reasoning:\n$1\n\nCode:\n$2"},
  "out": {"kind": "output", "inputs": ["m"]}
}]
```
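
Which nodes can actually run side by side falls out of the `inputs` lists. The sketch below groups node ids into levels, where each level depends only on earlier ones. `levels` is a hypothetical helper, and how unhardcoded schedules nodes internally is not described here.

```typescript
interface FlowNode { kind: string; inputs?: string[]; }

// Groups node ids into levels: every node in a level depends only on earlier
// levels, so the nodes within one level could run concurrently.
function levels(nodes: Record<string, FlowNode>): string[][] {
  const done = new Set<string>();
  const remaining = new Set<string>(Object.keys(nodes));
  const result: string[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining).filter((id) =>
      (nodes[id].inputs ?? []).every((dep) => done.has(dep)),
    );
    if (ready.length === 0) throw new Error("cycle or missing input");
    for (const id of ready) {
      done.add(id);
      remaining.delete(id);
    }
    result.push(ready);
  }
  return result;
}

// Only the graph shape of the specialist split; policies are omitted.
const specialistSplit: Record<string, FlowNode> = {
  u: { kind: "input" },
  rz: { kind: "llm", inputs: ["u"] },
  cd: { kind: "llm", inputs: ["u"] },
  m: { kind: "llm", inputs: ["rz", "cd"] },
  out: { kind: "output", inputs: ["m"] },
};
console.log(JSON.stringify(levels(specialistSplit))); // [["u"],["rz","cd"],["m"],["out"]]
```

The second level contains both `rz` and `cd`, which is the fan-out; `m` is the fan-in.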

Full workflow guide: node kinds, graph limits, templates →

Send your first request

Quickstart

unhardcoded is OpenAI-compatible. Only three things change from a normal call.

Point the SDK at your host.

Change baseURL. The rest of the SDK stays the same.
client.ts
```
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://<your-host>/v1",
  apiKey: process.env.UNHARDCODED_KEY,
});
```

Attach a policy to the call.

Build policy_ir on your backend and send it alongside messages. Routing comes from the policy, so model is only a trace label.

route.ts

```
const res = await client.chat.completions.create({
  model: "policy:support",
  policy_ir: ["policy",
    ["and", ["meets_req"], ["not", ["is", "disabled"]],
           ["cmp", "bench_intelligence", "ge", 0.5]],   // filter
    ["neg", ["normalize", ["field", "price_out"]]],          // rank: cheapest
    ["argmax"], ["id"], ["always", {"action": "next_candidate"}]],
  messages,
});
```
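
Because `policy_ir` is plain JSON, it can be assembled by a small function on the backend instead of being written inline. A hedged sketch: `cheapestAbove` is a hypothetical helper, the slot comments follow the five terms named in the mental model (filter, rank, select, mutate, fallback), and the message content is made up.

```typescript
type Term = [string, ...unknown[]];

// Hypothetical helper: the cheapest model whose bench_intelligence is at least `floor`.
function cheapestAbove(floor: number): Term {
  return ["policy",
    ["and", ["meets_req"], ["not", ["is", "disabled"]],
      ["cmp", "bench_intelligence", "ge", floor]],   // filter
    ["neg", ["normalize", ["field", "price_out"]]],  // rank: cheapest
    ["argmax"],                                      // select
    ["id"],                                          // mutate slot
    ["always", { action: "next_candidate" }]];       // fallback
}

// Request body with the same fields as the call above.
const body = {
  model: "policy:support",
  policy_ir: cheapestAbove(0.5),
  messages: [{ role: "user", content: "Where is my refund?" }],
};
console.log(JSON.stringify(body.policy_ir));
```

Raising or lowering the floor then becomes a one-argument change rather than an edit to a nested array.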

Read the trace.

The response carries the decision: the chosen model, the ranked and rejected candidates, and the policy fingerprint.

response · trace (illustrative)

```
{
  "chosen": { "model_family": "deepseek-v4-pro", "price_out": 1.5 },
  "trace": {
    "policy_fingerprint": "301140696-1054914287",
    "rejected": [{ "model_family": "deepseek-v4-flash", "reason": "cmp bench_intelligence ge 0.5" }],
    "total_latency_ms": 425
  }
}
```
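
A typed reader for the illustrative payload above might look like the sketch below. The `RoutedResponse` interface covers only the fields shown in that payload, and `summarize` is a hypothetical helper; where these fields sit inside the full chat completion response is not specified here.

```typescript
// Only the fields that appear in the illustrative payload.
interface RoutedResponse {
  chosen: { model_family: string; price_out: number };
  trace: {
    policy_fingerprint: string;
    rejected: { model_family: string; reason: string }[];
    total_latency_ms: number;
  };
}

// One-line summary of a routing decision, suitable for a log line.
function summarize(res: RoutedResponse): string {
  const rejected = res.trace.rejected
    .map((r) => `${r.model_family} (${r.reason})`)
    .join(", ");
  return `chose ${res.chosen.model_family} in ${res.trace.total_latency_ms} ms; ` +
    `rejected: ${rejected || "none"}; policy ${res.trace.policy_fingerprint}`;
}

const raw = `{
  "chosen": { "model_family": "deepseek-v4-pro", "price_out": 1.5 },
  "trace": {
    "policy_fingerprint": "301140696-1054914287",
    "rejected": [{ "model_family": "deepseek-v4-flash", "reason": "cmp bench_intelligence ge 0.5" }],
    "total_latency_ms": 425
  }
}`;
const parsed = JSON.parse(raw) as RoutedResponse;
console.log(summarize(parsed));
```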

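Full quickstart: authentication, dry runs, runnable examples →

Copy a starting point

Policy presets

Common routing patterns as cards. Read the rule, copy the policy, and adjust the floors and ceilings.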

Cheapest decent

Cut cost without dropping below a quality floor.

filter: tools-met · bench_intelligence ge 0.5

rank: cheapest price_out

fallback: next passing model

View JSON

cheapest-decent.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["cmp", "bench_intelligence", "ge", 0.5]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Smart balance

No strong preference: balances capability against price.

filter: tools-met · not disabled

rank: 0.6 intelligence + 0.4 cheap

fallback: next passing model

View JSON

smart-balance.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["add",
    ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
    ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```
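
To see how a weighted rank term could turn into scores, here is a small evaluator for the `field`, `normalize`, `neg`, `scale`, and `add` terms. It is a sketch under one loud assumption: `normalize` is treated as min-max scaling to [0, 1] across the candidates, which this page does not confirm. `evalRank` is a hypothetical name, and the rows reuse the five candidates shown at the top of the page.

```typescript
type Row = Record<string, number>;
type RankTerm =
  | ["field", string]
  | ["normalize", RankTerm]
  | ["neg", RankTerm]
  | ["scale", number, RankTerm]
  | ["add", RankTerm, RankTerm];

// Evaluates a rank term to one score per row.
// Assumption: "normalize" is min-max scaling across the given rows.
function evalRank(term: RankTerm, rows: Row[]): number[] {
  switch (term[0]) {
    case "field": { const name = term[1]; return rows.map((r) => r[name]); }
    case "normalize": {
      const xs = evalRank(term[1], rows);
      const lo = Math.min(...xs);
      const hi = Math.max(...xs);
      return xs.map((x) => (hi === lo ? 0 : (x - lo) / (hi - lo)));
    }
    case "neg": return evalRank(term[1], rows).map((x) => -x);
    case "scale": { const k = term[1]; return evalRank(term[2], rows).map((x) => k * x); }
    case "add": {
      const a = evalRank(term[1], rows);
      const b = evalRank(term[2], rows);
      return a.map((x, i) => x + b[i]);
    }
    default: throw new Error("unknown rank term");
  }
}

const smartBalance: RankTerm = ["add",
  ["scale", 0.6, ["normalize", ["field", "bench_intelligence"]]],
  ["scale", 0.4, ["neg", ["normalize", ["field", "price_out"]]]]];

const models = ["deepseek-v4-flash", "minimax-m2.7", "deepseek-v4-pro", "glm-5.1", "gpt-5.5"];
const rows: Row[] = [
  { price_out: 0.4, bench_intelligence: 0.465 },
  { price_out: 0.5, bench_intelligence: 0.496 },
  { price_out: 1.5, bench_intelligence: 0.515 },
  { price_out: 2.0, bench_intelligence: 0.514 },
  { price_out: 10.0, bench_intelligence: 0.602 },
];
const scores = evalRank(smartBalance, rows);
const best = scores.indexOf(Math.max(...scores)); // argmax
console.log(models[best]); // gpt-5.5
```

Under this assumption the 0.6 weight on intelligence outweighs the price penalty for gpt-5.5, so the outcome differs from the cheapest-first presets. A different normalization would shift the scores.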

Best intelligence

Critical work, no cost ceiling: picks the most capable model.

filter: tools-met · not disabled

rank: highest bench_intelligence

fallback: next passing model

View JSON

best-intelligence.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Reasoning only

The task needs a reasoning-capable model.

filter: + is cap_reasoning

rank: highest intelligence

fallback: next passing model

View JSON

reasoning-only.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "cap_reasoning"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Vision · cheapest

Image input: the cheapest model that can see.

filter: + is in_image

rank: cheapest price_out

fallback: next passing model

View JSON

vision-cheapest.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["is", "in_image"]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Long-context RAG

Requires a large context window, then picks the cheapest.

filter: + context ge 200000

rank: cheapest price_out

fallback: next passing model

View JSON

long-context-rag.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]], ["cmp", "context", "ge", 200000]],
  ["neg", ["normalize", ["field", "price_out"]]],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Agentic fleet

The best tool users for agent loops.

filter: + tools · bench_agentic_rank le 5

rank: highest bench_agentic

fallback: next passing model

View JSON

agentic-fleet.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["has_cap", "supports_tools"], ["cmp", "bench_agentic_rank", "le", 5]],
  ["field", "bench_agentic"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```

Private / compliant

TEE-only and no-log: capability first.

filter: + is has_tee · is no_log

rank: highest intelligence

fallback: next passing model

View JSON

private-compliant.json

```
["policy", ["and", ["meets_req"], ["not", ["is", "disabled"]],
         ["is", "has_tee"], ["is", "no_log"]],
  ["field", "bench_intelligence"],
  ["argmax"], ["id"], ["always", {"action": "next_candidate"}]]
```
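
The filter terms used across these presets (`and`, `not`, `is`, `has_cap`, `cmp`, `meets_req`) can be read as a small boolean expression language. The evaluator below is a sketch under assumptions: flags and capabilities are looked up as booleans on a flat candidate record, `meets_req` is treated as a precomputed flag, and only the `ge` and `le` comparisons seen on this page are handled. `passesFilter` and the two sample candidates are hypothetical.

```typescript
type Attrs = Record<string, boolean | number>;
type FilterTerm =
  | ["and", ...FilterTerm[]]
  | ["not", FilterTerm]
  | ["is", string]
  | ["has_cap", string]
  | ["cmp", string, "ge" | "le", number]
  | ["meets_req"];

// Returns true when the candidate satisfies the filter term.
function passesFilter(term: FilterTerm, c: Attrs): boolean {
  switch (term[0]) {
    case "and": {
      const [, ...rest] = term;
      return rest.every((t) => passesFilter(t, c));
    }
    case "not": return !passesFilter(term[1], c);
    case "is": return c[term[1]] === true;       // assumption: boolean flag lookup
    case "has_cap": return c[term[1]] === true;  // assumption: same lookup as "is"
    case "cmp": {
      const value = c[term[1]];
      if (typeof value !== "number") return false;
      return term[2] === "ge" ? value >= term[3] : value <= term[3];
    }
    case "meets_req": return c["meets_req"] === true; // assumption: precomputed flag
    default: throw new Error("unknown filter term");
  }
}

// The filter of the private / compliant preset.
const privateFilter: FilterTerm = ["and", ["meets_req"], ["not", ["is", "disabled"]],
  ["is", "has_tee"], ["is", "no_log"]];

const enclaveModel: Attrs = { meets_req: true, disabled: false, has_tee: true, no_log: true };
const loggingModel: Attrs = { meets_req: true, disabled: false, has_tee: true, no_log: false };
console.log(passesFilter(privateFilter, enclaveModel)); // true
console.log(passesFilter(privateFilter, loggingModel)); // false
```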