Budget-Aware Model Selection
cycgraph can dynamically choose which LLM model to use for each agent at runtime. Instead of hardcoding a model, agents declare a capability tier (high, medium, or low), and the engine resolves it to a concrete model — downgrading automatically when the workflow budget is running low.
How it works

- An agent declares `model_preference: 'high'` (or `'medium'`/`'low'`) instead of relying solely on its static `model` field
- You provide a tier map that maps each tier to concrete models per provider
- Before each agent execution, the engine's model resolver checks the remaining budget and picks the best model the workflow can afford
- If no resolver is configured, the agent's static `model` is used as a fallback
Capability tiers

| Tier | Use Case | Example Models |
|---|---|---|
| `high` | Complex reasoning, planning, code generation | `claude-opus-4-20250514`, `o3` |
| `medium` | General-purpose tasks, summarization | `claude-sonnet-4-20250514`, `gpt-4o` |
| `low` | Simple formatting, extraction, classification | `claude-haiku-4-5-20251001`, `gpt-4o-mini` |
Setting up a tier map

A `ModelTierMap` maps each capability tier to concrete model IDs per provider:

```ts
import { defaultModelResolver } from '@cycgraph/orchestrator';
import type { ModelTierMap } from '@cycgraph/orchestrator';

const tierMap: ModelTierMap = {
  high: { anthropic: 'claude-opus-4-20250514', openai: 'o3' },
  medium: { anthropic: 'claude-sonnet-4-20250514', openai: 'gpt-4o' },
  low: { anthropic: 'claude-haiku-4-5-20251001', openai: 'gpt-4o-mini' },
};

const modelResolver = defaultModelResolver(tierMap);
```

You only need to include the tiers and providers your workflow uses. If a tier/provider combination is missing, the agent falls back to its static model.
Configuring agents

Set `model_preference` on the agent config. The `model` field still serves as the fallback when no resolver is configured or the tier can't be resolved:

```ts
const researcherId = registry.register({
  name: 'Researcher',
  model: 'claude-sonnet-4-20250514', // fallback
  model_preference: 'high', // prefers high-tier when budget allows
  provider: 'anthropic',
  system_prompt: 'You are a research specialist...',
  tools: [{ type: 'mcp', server_id: 'web-search' }],
  permissions: { read_keys: ['topic'], write_keys: ['notes'] },
});

const formatterId = registry.register({
  name: 'Formatter',
  model: 'claude-haiku-4-5-20251001', // fallback
  model_preference: 'low', // always use cheapest tier
  provider: 'anthropic',
  system_prompt: 'You format text into clean markdown...',
  tools: [],
  permissions: { read_keys: ['draft'], write_keys: ['formatted'] },
});
```

Wiring the resolver into GraphRunner
Wire the registries globally once at startup, then pass `modelResolver` via `GraphRunnerOptions`:

```ts
import {
  GraphRunner,
  configureAgentFactory,
  configureProviderRegistry,
} from '@cycgraph/orchestrator';

// Once at startup:
configureProviderRegistry(providers);
configureAgentFactory(registry);

// Per run:
const runner = new GraphRunner(graph, initialState, {
  modelResolver, // ← budget-aware resolution
});

const finalState = await runner.run();
```

Budget-aware downgrade logic
The default resolver uses a simple heuristic:

- Look up the preferred model from the tier map for the agent's provider
- If no budget is set → use the preferred model
- Estimate the call's cost using conservative token budgets per tier
- If estimated cost < 50% of remaining budget → use the preferred model (plenty of headroom)
- Otherwise, step down one tier and return the next cheaper model (`high` → `medium`, `medium` → `low`)
- If already at the lowest tier → use it anyway and mark the resolution as `budget_critical`
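Stripped of library internals, the heuristic above can be sketched as follows. This is an illustrative reimplementation, not cycgraph's actual code: the tier names, reasons, and the 50% headroom rule come from this page, while `resolveTier` and the per-tier cost numbers are made up for the sketch.

```ts
type ModelTier = 'high' | 'medium' | 'low';
type Reason = 'preferred' | 'budget_downgrade' | 'budget_critical';

// Illustrative per-call cost estimates (USD) per tier; the real resolver
// derives these from token budgets and per-model pricing.
const estimatedCostUsd: Record<ModelTier, number> = {
  high: 0.2,
  medium: 0.05,
  low: 0.01,
};

const nextTierDown: Record<ModelTier, ModelTier> = {
  high: 'medium',
  medium: 'low',
  low: 'low', // already at the floor
};

function resolveTier(
  preferred: ModelTier,
  remainingBudgetUsd: number | undefined,
): { tier: ModelTier; reason: Reason } {
  // No budget configured: always honor the preference.
  if (remainingBudgetUsd === undefined) return { tier: preferred, reason: 'preferred' };
  // Plenty of headroom: estimated cost is under 50% of what remains.
  if (estimatedCostUsd[preferred] < remainingBudgetUsd * 0.5) {
    return { tier: preferred, reason: 'preferred' };
  }
  // Already at the cheapest tier: use it anyway, but flag the budget as critical.
  if (preferred === 'low') return { tier: 'low', reason: 'budget_critical' };
  // Otherwise step down exactly one tier.
  return { tier: nextTierDown[preferred], reason: 'budget_downgrade' };
}
```

Note that the step-down is a single hop: a `high` preference under pressure yields `medium`, never `low` directly.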
Each resolution produces one of three reasons:

| Reason | Meaning |
|---|---|
| `preferred` | The agent got its requested tier — budget is healthy |
| `budget_downgrade` | Stepped down one tier to conserve budget |
| `budget_critical` | Forced to the lowest tier — budget is nearly exhausted |
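These reasons map naturally onto alert severities in monitoring code. A minimal sketch — the reason names are from the table above, while `logLevelFor` and the severity choices are this sketch's own:

```ts
type ModelResolutionReason = 'preferred' | 'budget_downgrade' | 'budget_critical';

function logLevelFor(reason: ModelResolutionReason): 'info' | 'warn' | 'error' {
  switch (reason) {
    case 'preferred':
      return 'info'; // budget healthy, nothing to do
    case 'budget_downgrade':
      return 'warn'; // output quality may drop; consider raising the budget
    case 'budget_critical':
      return 'error'; // workflow is nearly out of budget
  }
}
```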
Listening to resolution events

The runner emits `model:resolved` stream events so you can observe every resolution decision:

```ts
for await (const event of runner.stream()) {
  if (event.type === 'model:resolved') {
    console.log(
      `[${event.node_id}] ${event.reason}: ${event.original_model} → ${event.resolved_model}` +
        (event.remaining_budget_usd !== undefined
          ? ` ($${event.remaining_budget_usd.toFixed(4)} remaining)`
          : '')
    );
  }
}
```

The `ModelResolvedEvent` includes:
| Field | Type | Description |
|---|---|---|
| `reason` | `ModelResolutionReason` | Why this model was chosen |
| `resolved_model` | `string` | The concrete model that will be used |
| `original_model` | `string` | The agent's static fallback model |
| `preference` | `ModelTier` | The agent's declared capability tier |
| `remaining_budget_usd` | `number \| undefined` | Budget remaining at resolution time |
Cost estimation

The resolver estimates call cost before execution using conservative token budgets:

| Tier | Estimated Input Tokens | Estimated Output Tokens |
|---|---|---|
| `high` | 4,600 | 2,300 |
| `medium` | 2,300 | 1,150 |
| `low` | 1,150 | 575 |
These include a ~15% headroom buffer. If the agent uses Anthropic extended thinking (`provider_options.anthropic.thinking.budgetTokens`), those tokens are added to the input estimate.
Unknown models are assigned a conservative fallback cost of $0.05 per call (fail-closed).
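As a rough illustration of the arithmetic for a `high`-tier call: the token budgets come from the table above, but the per-token prices and thinking budget below are hypothetical — real pricing comes from each provider's rate card.

```ts
// Hypothetical pricing: $3 per million input tokens, $15 per million output tokens.
const inputUsdPerToken = 3 / 1_000_000;
const outputUsdPerToken = 15 / 1_000_000;

// Token budgets for the `high` tier (headroom already included).
const inputTokens = 4_600;
const outputTokens = 2_300;

// An Anthropic extended-thinking budget, if set, is added to the input estimate.
const thinkingBudgetTokens = 10_000;

const estimatedCostUsd =
  (inputTokens + thinkingBudgetTokens) * inputUsdPerToken +
  outputTokens * outputUsdPerToken;

console.log(estimatedCostUsd.toFixed(4)); // ≈ 0.0783 USD per call
```

It is this per-call estimate that the resolver compares against 50% of the remaining budget.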
Custom resolvers

You can replace the default resolver with any function matching the `ModelResolver` signature:

```ts
import type { ModelResolver } from '@cycgraph/orchestrator';

const myResolver: ModelResolver = (preference, provider, remainingBudgetUsd) => {
  // Your custom logic here
  // Return ModelResolutionResult or null to fall back to config.model
  return { reason: 'preferred', model: 'my-custom-model', tier: preference };
};
```

Complete example
```ts
import {
  GraphRunner,
  InMemoryAgentRegistry,
  InMemoryPersistenceProvider,
  createProviderRegistry,
  configureProviderRegistry,
  configureAgentFactory,
  defaultModelResolver,
  createGraph,
  createWorkflowState,
} from '@cycgraph/orchestrator';
import type { ModelTierMap } from '@cycgraph/orchestrator';

// 1. Set up providers (wired globally)
const providers = createProviderRegistry();
configureProviderRegistry(providers);

// 2. Define the tier map
const tierMap: ModelTierMap = {
  high: { anthropic: 'claude-opus-4-20250514' },
  medium: { anthropic: 'claude-sonnet-4-20250514' },
  low: { anthropic: 'claude-haiku-4-5-20251001' },
};

// 3. Register agents (wire registry globally)
const registry = new InMemoryAgentRegistry();

const researcherId = registry.register({
  name: 'Researcher',
  model: 'claude-sonnet-4-20250514',
  model_preference: 'high',
  provider: 'anthropic',
  system_prompt: 'You research topics thoroughly.',
  tools: [],
  permissions: { read_keys: ['goal'], write_keys: ['research'] },
});

const writerId = registry.register({
  name: 'Writer',
  model: 'claude-sonnet-4-20250514',
  model_preference: 'medium',
  provider: 'anthropic',
  system_prompt: 'You write clear, concise summaries.',
  tools: [],
  permissions: { read_keys: ['research'], write_keys: ['summary'] },
});

configureAgentFactory(registry);

// 4. Build the graph
const graph = createGraph({
  name: 'Budget-Aware Research',
  description: 'Research a topic, then summarize it under a budget.',
  nodes: [
    { id: 'research', type: 'agent', agent_id: researcherId, read_keys: ['goal'], write_keys: ['research'] },
    { id: 'write', type: 'agent', agent_id: writerId, read_keys: ['research'], write_keys: ['summary'] },
  ],
  edges: [{ source: 'research', target: 'write' }],
  start_node: 'research',
  end_nodes: ['write'],
});

// 5. Build state and run with the resolver
const persistence = new InMemoryPersistenceProvider();
const state = createWorkflowState({
  workflow_id: graph.id,
  goal: 'Research and summarize quantum computing',
  budget_usd: 0.50,
});

const runner = new GraphRunner(graph, state, {
  modelResolver: defaultModelResolver(tierMap),
  persistStateFn: async (s) => persistence.saveWorkflowSnapshot(s),
});

for await (const event of runner.stream()) {
  if (event.type === 'model:resolved') {
    console.log(`${event.node_id}: ${event.reason} → ${event.resolved_model}`);
  }
}
```

Limitations
- Architect unaware — the Workflow Architect does not yet generate graphs with `model_preference` set; you must configure it via the registry
- Single-step lookahead — the resolver estimates cost for one call at a time, not the remaining workflow
Security

- Budget is read only from top-level `WorkflowState` fields (`budget_usd`, `total_cost_usd`), never from `memory` — this prevents agents from manipulating their own resolution by writing fake budget values
- The tier map is frozen at construction time and cannot be mutated at runtime
- All resolver-internal metadata uses `_`-prefixed keys for bookkeeping
Next steps

- Cost & Budget Tracking — set budgets and monitor spending
- Custom LLM Providers — register providers referenced in your tier map
- Agents — full agent configuration reference
- Streaming — consume `model:resolved` events in real time