compare · workflow & agent builders

unhardcoded vs workflow & agent builders.

They build the graph; we route every node by policy. Vellum, LangGraph, n8n, and Dify orchestrate multi-step flows. unhardcoded decides which model runs each step — and proves it with a trace.

unhardcoded vs Vellum · LangGraph · n8n · Dify

at a glance

Build the flow, or decide the model.

A flow builder lays out the steps and ships the app. unhardcoded sits inside each LLM node and picks the cheapest model that clears your rules. Different jobs — they stack.

	unhardcoded	Workflow / agent builders
What it builds	The model decision, per step	The multi-step flow or agent
Where the route is decided	Sent with the call (policy_ir)	Pinned in the node, or by your code
Provider keys & billing	Your keys, your pricing	Your keys, through their runtime
Choosing a model	Cheapest that clears your rules	The model you wired into the node
Quality floor	Enforced per call, fails loud	Up to your prompts and evals
Decision audit	Replayable by fingerprint	Run logs and traces of the flow
Open & self-host	Open policy_ir, self-host free	Varies — several are open-source

Costs and the quality score are illustrative; the quality floor is the catalog field bench_intelligence, a 0..1 score. See the full lineup on the compare hub.

01 · what they're good at

Strong tools for building flows and agents.

Vellum, LangGraph, n8n, and Dify are credible builders. They give you visual or code orchestration, prompt management, evals, and deployment for multi-step flows and agents — the scaffolding that turns a chain of calls into a shippable product.

Orchestration

Lay out steps, branches, loops, and human-in-the-loop as a visual canvas or in code — Vellum and Dify lean visual, LangGraph leans code, n8n bridges into your wider automation.

Prompt & eval workflow

Manage prompts, version them, and run evals against test cases so a flow can be measured before it ships and watched after.

Deployment

Turn the graph into an endpoint or a hosted app, with logs and run history for the flow as a whole.

Integrations

Reach the tools, data sources, and APIs a real task touches — retrieval, webhooks, queues, and the rest of your stack.

02 · where unhardcoded is different

We don't build the flow — we decide the model.

unhardcoded is not a workflow or agent builder. It is the runtime policy layer that sits inside an LLM node: your backend sends a policy_ir with the call, and the router runs filter → rank → select → fallback over the live catalog to pick the cheapest model that clears your rules — over your own provider keys.

Decision, not orchestration

A builder picks the order of steps. unhardcoded picks which model runs a given step — gemini-3.5-flash at $0.018 when it clears the floor, a stronger model only where the rules demand it. Figures illustrative.

An explicit quality floor

The floor is the catalog field bench_intelligence, a 0..1 score, e.g. ["cmp","bench_intelligence","ge",0.5]. deepseek-v4-flash (0.42) is filtered out; gemini-3.5-flash (0.54) passes. No silent downgrade — if nothing clears, it fails loud.

A replayable trace

Every node writes a receipt: candidates, the winner and why, which rule rejected the rest, fallback, latency, cost vs baseline, and the fingerprint pol_8f41c2 (sigma-pol/v1) to replay it later.

Dry-run a node before production: GET /x/fields for the live vocabulary and POST /x/rank to preview which models pass and which rule rejected the rest — no inference. More on how flows work in unhardcoded.

03–04 · when to choose which

Pick the builder for the graph, unhardcoded for the model.

These aren't substitutes. The question is which job you're doing right now — building the flow, or deciding the model inside it.

Choose a workflow / agent builder when…

You need to design and ship a multi-step flow or agent: branching logic, tool calls, retrieval, human approval, a visual canvas, prompt versioning, evals, and a deployable endpoint. That orchestration is their core job, and it's a good one.

Choose unhardcoded when…

The app needs to express the model decision as policy at runtime — cheapest model that clears the floor, automatic per-step fallback, no hardcoded model strings, and a trace that proves why each model won. That's the node, not the graph.

05 · use them together

Keep your builder. Route each node through unhardcoded.

The honest answer is that they're complementary. unhardcoded speaks the OpenAI-compatible API, so it sits underneath the builder you already use: keep the graph where it is, and point each LLM node at unhardcoded.

What you get by stacking them

Cheapest passing model per node — each step routes on its own policy; classify on gemini-3.5-flash, synthesize on claude-sonnet-4-6, guard on claude-opus-4-8 only where it earns its cost.
Per-step fallback — when a node's model errors or times out, it advances to the next passing candidate without restarting the flow or redeploying.
One audit you can replay — every node's decision lands in the trace, keyed by fingerprint, so a flow run is reviewable months later.

And if you'd rather express the whole task in unhardcoded, flow_ir is a bounded DAG where each step carries its own policy — a composition of policies, one stitched trace, billed as one run up to 5 nodes. It composes policies; it does not replace your builder.

See how the per-step decision cuts model spend

Build the graph your way. Route every node by policy.

Keep the workflow or agent builder you like, and let unhardcoded pick the cheapest passing model for each step — with per-step fallback and a replayable trace. Join the waitlist to put the model decision layer under your flows.

See the full comparison

Sits under your builderYour provider keysOpen policy_irEvery node traced