unhardcoded vs workflow & agent builders.
They build the graph; we route every node by policy. Vellum, LangGraph, n8n, and Dify orchestrate multi-step flows. unhardcoded decides which model runs each step — and proves it with a trace.
unhardcoded vs Vellum · LangGraph · n8n · Dify
Build the flow, or decide the model.
A flow builder lays out the steps and ships the app. unhardcoded sits inside each LLM node and picks the cheapest model that clears your rules. Different jobs — they stack.
| unhardcoded | Workflow / agent builders | |
|---|---|---|
| What it builds | The model decision, per step | The multi-step flow or agent |
| Where the route is decided | Sent with the call (policy_ir) | Pinned in the node, or by your code |
| Provider keys & billing | Your keys, your pricing | Your keys, through their runtime |
| Choosing a model | Cheapest that clears your rules | The model you wired into the node |
| Quality floor | Enforced per call, fails loud | Up to your prompts and evals |
| Decision audit | Replayable by fingerprint | Run logs and traces of the flow |
| Open & self-host | Open policy_ir, self-host free | Varies — several are open-source |
Costs and the quality score are illustrative; the quality floor is the catalog field bench_intelligence, a 0..1 score. See the full lineup on the compare hub.
Strong tools for building flows and agents.
Vellum, LangGraph, n8n, and Dify are credible builders. They give you visual or code orchestration, prompt management, evals, and deployment for multi-step flows and agents — the scaffolding that turns a chain of calls into a shippable product.
Orchestration
Lay out steps, branches, loops, and human-in-the-loop as a visual canvas or in code — Vellum and Dify lean visual, LangGraph leans code, n8n bridges into your wider automation.
Prompt & eval workflow
Manage prompts, version them, and run evals against test cases so a flow can be measured before it ships and watched after.
Deployment
Turn the graph into an endpoint or a hosted app, with logs and run history for the flow as a whole.
Integrations
Reach the tools, data sources, and APIs a real task touches — retrieval, webhooks, queues, and the rest of your stack.
We don't build the flow — we decide the model.
unhardcoded is not a workflow or agent builder. It is the runtime policy layer that sits inside an LLM node: your backend sends a policy_ir with the call, and the router runs filter → rank → select → fallback over the live catalog to pick the cheapest model that clears your rules — over your own provider keys.
Decision, not orchestration
A builder picks the order of steps. unhardcoded picks which model runs a given step — gemini-3.5-flash at $0.018 when it clears the floor, a stronger model only where the rules demand it. Figures illustrative.
An explicit quality floor
The floor is the catalog field bench_intelligence, a 0..1 score, e.g. ["cmp","bench_intelligence","ge",0.5]. deepseek-v4-flash (0.42) is filtered out; gemini-3.5-flash (0.54) passes. No silent downgrade — if nothing clears, it fails loud.
A replayable trace
Every node writes a receipt: candidates, the winner and why, which rule rejected the rest, fallback, latency, cost vs baseline, and the fingerprint pol_8f41c2 (sigma-pol/v1) to replay it later.
Dry-run a node before production: GET /x/fields for the live vocabulary and POST /x/rank to preview which models pass and which rule rejected the rest — no inference. More on how flows work in unhardcoded.
Pick the builder for the graph, unhardcoded for the model.
These aren't substitutes. The question is which job you're doing right now — building the flow, or deciding the model inside it.
Choose a workflow / agent builder when…
You need to design and ship a multi-step flow or agent: branching logic, tool calls, retrieval, human approval, a visual canvas, prompt versioning, evals, and a deployable endpoint. That orchestration is their core job, and it's a good one.
Choose unhardcoded when…
The app needs to express the model decision as policy at runtime — cheapest model that clears the floor, automatic per-step fallback, no hardcoded model strings, and a trace that proves why each model won. That's the node, not the graph.
Keep your builder. Route each node through unhardcoded.
The honest answer is that they're complementary. unhardcoded speaks the OpenAI-compatible API, so it sits underneath the builder you already use: keep the graph where it is, and point each LLM node at unhardcoded.
What you get by stacking them
- Cheapest passing model per node — each step routes on its own policy; classify on
gemini-3.5-flash, synthesize onclaude-sonnet-4-6, guard onclaude-opus-4-8only where it earns its cost. - Per-step fallback — when a node's model errors or times out, it advances to the next passing candidate without restarting the flow or redeploying.
- One audit you can replay — every node's decision lands in the trace, keyed by fingerprint, so a flow run is reviewable months later.
And if you'd rather express the whole task in unhardcoded, flow_ir is a bounded DAG where each step carries its own policy — a composition of policies, one stitched trace, billed as one run up to 5 nodes. It composes policies; it does not replace your builder.
Build the graph your way. Route every node by policy.
Keep the workflow or agent builder you like, and let unhardcoded pick the cheapest passing model for each step — with per-step fallback and a replayable trace. Join the waitlist to put the model decision layer under your flows.