
Workflow State

The WorkflowState is the single source of truth for a running workflow. Every node reads from it and writes to it, and the engine persists it after each step for crash recovery.

import { createWorkflowState } from '@cycgraph/orchestrator';

// `graph` is the graph definition this run belongs to, defined elsewhere.
const state = createWorkflowState({
  workflow_id: graph.id,
  goal: 'Research and summarize quantum computing',
  constraints: ['Under 500 words'],
  max_execution_time_ms: 120_000, // 2 minutes; overrides the 1h default
});

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| workflow_id | string (UUID) | required | Graph definition this run belongs to. |
| run_id | string (UUID) | auto-generated | Unique identifier for this execution. |
| goal | string | required | High-level objective for the workflow. |
| constraints | string[] | [] | Rules the workflow must respect. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| status | WorkflowStatus | 'pending' | Current lifecycle status. |
| current_node | string | | Node currently being executed. |
| iteration_count | number | 0 | Total reducer dispatches so far (loop guard). |
| max_iterations | number | 50 | Hard cap; the run fails if exceeded. |
| started_at | Date | | When run() was first invoked. |
| max_execution_time_ms | number | 3600000 (1h) | Wall-clock timeout for the entire run. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| retry_count | number | 0 | Retries on the current node so far. |
| max_retries | number | 3 | Maximum retries before the node fails permanently. |
| last_error | string | | Error message from the most recent failure. |
| compensation_stack | CompensationEntry[] | [] | Stack of typed compensating actions for saga rollback. Each entry has action_id and compensation_action: { type, payload }. |
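
To make the saga rollback concrete, here is a minimal sketch of how a compensation_stack could be unwound. Only the entry shape (action_id and compensation_action: { type, payload }) comes from the table above. The rollback function, the handler map, and the last-in-first-out order are assumptions for illustration, not the orchestrator's actual implementation.

```typescript
// Entry shape taken from the field table; the payload type is unknown here.
interface CompensationEntry {
  action_id: string;
  compensation_action: { type: string; payload: unknown };
}

// Hypothetical handler: undoes one previously completed action.
type CompensationHandler = (payload: unknown) => void;

// Unwind the stack from the most recent entry to the oldest (assumed LIFO order).
// Returns the action_ids that were compensated, in the order they ran.
function rollback(
  stack: CompensationEntry[],
  handlers: Record<string, CompensationHandler>,
): string[] {
  const compensated: string[] = [];
  for (const entry of [...stack].reverse()) {
    const handler = handlers[entry.compensation_action.type];
    if (!handler) {
      throw new Error(
        `No handler for compensation type "${entry.compensation_action.type}"`,
      );
    }
    handler(entry.compensation_action.payload);
    compensated.push(entry.action_id);
  }
  return compensated;
}
```

Undoing the newest action first mirrors how a stack is normally consumed, so later steps that depend on earlier ones are reverted before their prerequisites.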

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| waiting_for | WaitingReason | | Why the workflow is paused (e.g. 'human_approval'). |
| waiting_since | Date | | When the workflow entered the waiting state. |
| waiting_timeout_at | Date | | Deadline after which the wait times out. |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| total_tokens_used | number | 0 | Cumulative tokens consumed across all LLM calls. |
| max_token_budget | number | | If set, the run fails when token usage exceeds this. |
| total_cost_usd | number | 0 | Cumulative estimated cost in USD. |
| budget_usd | number | | Per-run cost budget (run fails when exceeded). |

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| memory | Record<string, unknown> | {} | Shared key-value store. See Memory below. |
| visited_nodes | string[] | [] | Node IDs visited in execution order. |
| supervisor_history | object[] | [] | Routing decisions made by supervisor nodes (for debugging). |
| created_at | Date | now | When this run was created. |
| updated_at | Date | now | Last state mutation timestamp. |
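
Several of the fields above act as run-level limits (max_iterations, max_execution_time_ms, max_token_budget, budget_usd). The sketch below shows one way such limits could be checked. The field names come from the tables, but checkLimits, the RunLimits subset, the order of the checks, and the strict "greater than" comparisons are assumptions rather than the engine's documented behavior.

```typescript
// Subset of WorkflowState fields relevant to limits (names from the field tables).
interface RunLimits {
  iteration_count: number;
  max_iterations: number;
  started_at?: Date;
  max_execution_time_ms: number;
  total_tokens_used: number;
  max_token_budget?: number;
  total_cost_usd: number;
  budget_usd?: number;
}

type LimitViolation = 'max_iterations' | 'timeout' | 'token_budget' | 'cost_budget';

// Hypothetical helper: returns the first exceeded limit, or null if the run may continue.
function checkLimits(state: RunLimits, now: Date = new Date()): LimitViolation | null {
  if (state.iteration_count > state.max_iterations) return 'max_iterations';
  if (
    state.started_at !== undefined &&
    now.getTime() - state.started_at.getTime() > state.max_execution_time_ms
  ) {
    return 'timeout';
  }
  // Optional budgets only apply when they are set.
  if (state.max_token_budget !== undefined && state.total_tokens_used > state.max_token_budget) {
    return 'token_budget';
  }
  if (state.budget_usd !== undefined && state.total_cost_usd > state.budget_usd) {
    return 'cost_budget';
  }
  return null;
}
```

Passing the current time as a parameter keeps the check deterministic, which makes it easy to exercise in tests.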

The status field tracks the lifecycle of a workflow through the transitions below. The terminal states (completed, failed, cancelled, timeout) are final: no transition leads out of them.

stateDiagram-v2
    direction LR
    pending --> scheduled
    scheduled --> running
    running --> completed
    running --> waiting
    running --> retrying
    waiting --> running
    retrying --> running
    retrying --> failed
    running --> cancelled
    running --> timeout
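
The diagram above can also be read as a lookup table. The following sketch encodes exactly the edges drawn in the diagram. The WorkflowStatus type name appears in the field tables, but its members are inferred from the diagram, and TRANSITIONS, canTransition, and isTerminal are hypothetical helpers rather than part of the library.

```typescript
// Status values inferred from the diagram (assumed to match WorkflowStatus).
type WorkflowStatus =
  | 'pending' | 'scheduled' | 'running' | 'waiting' | 'retrying'
  | 'completed' | 'failed' | 'cancelled' | 'timeout';

// Each status maps to the statuses it may move to; terminal states map to [].
const TRANSITIONS: Record<WorkflowStatus, readonly WorkflowStatus[]> = {
  pending: ['scheduled'],
  scheduled: ['running'],
  running: ['completed', 'waiting', 'retrying', 'cancelled', 'timeout'],
  waiting: ['running'],
  retrying: ['running', 'failed'],
  completed: [],
  failed: [],
  cancelled: [],
  timeout: [],
};

// True if the diagram contains an edge from `from` to `to`.
function canTransition(from: WorkflowStatus, to: WorkflowStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// True if no edge leaves `status`.
function isTerminal(status: WorkflowStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

For example, canTransition('retrying', 'failed') is true, while canTransition('completed', 'running') is false because completed is terminal.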

The memory object is the primary channel for exchanging data between nodes. It’s an arbitrary key-value store: you define the keys based on your workflow’s needs. Agents write to it through their text output, which the orchestrator automatically routes to the node’s write key. Agents that need to write structured data to multiple keys can explicitly declare the save_to_memory tool. Agents read from memory through their filtered state view, which is controlled by read_keys on the node.
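
As a rough illustration of this read and write flow, the sketch below filters memory by read_keys and routes an agent's text output to a write key. Only read_keys is named in the text. The NodeConfig shape, the write_key property name, and both helper functions are assumptions made for the example.

```typescript
type Memory = Record<string, unknown>;

// Hypothetical node config: read_keys is from the docs, write_key is an assumed name.
interface NodeConfig {
  read_keys: string[];
  write_key: string;
}

// Build the view an agent would see: only the keys listed in read_keys.
function filteredView(memory: Memory, node: NodeConfig): Memory {
  const view: Memory = {};
  for (const key of node.read_keys) {
    if (Object.prototype.hasOwnProperty.call(memory, key)) {
      view[key] = memory[key];
    }
  }
  return view;
}

// Route the agent's text output to the node's write key without mutating the input.
function routeOutput(memory: Memory, node: NodeConfig, output: string): Memory {
  return { ...memory, [node.write_key]: output };
}

// Example: a summarizer node that reads research notes and writes a summary.
const summarizer: NodeConfig = { read_keys: ['research_notes'], write_key: 'summary' };
const before: Memory = { research_notes: 'Qubits can be in superposition.', draft_id: 'd-1' };
const view = filteredView(before, summarizer); // contains research_notes only
const after = routeOutput(before, summarizer, 'Short summary of the notes.');
```

Returning a new object from routeOutput keeps each step's state immutable, which fits a state that is persisted after every step.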

  • Use descriptive keys: research_notes is better than data or result
  • Reference, don’t store: avoid large blobs in memory; store them externally and keep a reference
  • Keep it flat: deeply nested objects are harder to debug

| Layer | Scope | Persistence | Purpose |
| --- | --- | --- | --- |
| Graph State | Shared across all nodes | Persisted after every step | Source of truth: goal, results, artifacts |
| Thread Context | Local to a single agent | Ephemeral | Raw LLM conversation for the current agent |

Graph State is the memory object. It’s persisted after every node execution, enabling crash recovery and time-travel debugging.

Thread Context is the raw LLM conversation history within a single agent execution. Each agent has its own thread, so agents don’t see each other’s raw messages. The orchestrator automatically captures the agent’s text output and routes it to the appropriate write key; the thread itself is then discarded.

Actions dispatched to the reducer are typed by a discriminated union, ActionTypeSchema. The valid action types are:

| Action Type | Purpose |
| --- | --- |
| update_memory | Write key-value pairs to the memory object |
| set_status | Transition the workflow status |
| goto_node | Override the next node in the graph |
| handoff | Transfer control to another agent/workflow |
| request_human_input | Pause for human-in-the-loop approval |
| resume_from_human | Inject human response and resume |
| merge_parallel_results | Combine results from parallel node execution |

Invalid action types are rejected at parse time via Zod validation. Internal engine actions (prefixed with _, such as _fail, _init, _budget_exceeded) bypass this validation and are reserved for the engine.
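
The parse-time rejection described above can be approximated without any dependencies. The sketch below is a plain-TypeScript stand-in for the Zod-based check, not the library's ActionTypeSchema. The action type names come from the table, while parseAction, the Action shape, and the untyped payload are assumptions. Because the per-action payload shapes are not documented here, only the type tag is validated.

```typescript
// Public action types from the table above.
const ACTION_TYPES = [
  'update_memory',
  'set_status',
  'goto_node',
  'handoff',
  'request_human_input',
  'resume_from_human',
  'merge_parallel_results',
] as const;

type ActionType = (typeof ACTION_TYPES)[number];

// Hypothetical action shape: the payload is left untyped in this sketch.
interface Action {
  type: ActionType;
  payload?: unknown;
}

// Validate untrusted input and throw on anything that is not a public action.
// Engine-internal actions (prefixed with "_") never pass through this function.
function parseAction(input: unknown): Action {
  if (typeof input !== 'object' || input === null) {
    throw new Error('Action must be an object');
  }
  const candidate = input as { type?: unknown; payload?: unknown };
  const type = candidate.type;
  if (typeof type !== 'string' || (ACTION_TYPES as readonly string[]).indexOf(type) === -1) {
    throw new Error(`Invalid action type: ${String(type)}`);
  }
  return { type: type as ActionType, payload: candidate.payload };
}
```

A call such as parseAction({ type: 'goto_node', payload: { node: 'review' } }) returns a typed Action, while parseAction({ type: 'delete_everything' }) throws.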

Data entering the system from external tools (web search, file reads) is flagged as tainted. Taint propagates automatically — if a node reads tainted data and writes to state, the output key inherits the taint flag. This lets downstream nodes make trust decisions about their inputs.
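
A minimal sketch of this propagation rule follows. It assumes taint is tracked as a set of memory keys. The TrackedState shape, the writeWithTaint helper, and the choice to clear the flag when a key is overwritten with untainted data are all assumptions, since the text does not describe how the orchestrator stores taint.

```typescript
type Memory = Record<string, unknown>;

// Hypothetical state shape: `tainted` holds the memory keys currently flagged.
interface TrackedState {
  memory: Memory;
  tainted: Set<string>;
}

// Write `value` under `writeKey` for a node that read `readKeys`.
// The output is tainted if it came from an external tool or if any input was tainted.
function writeWithTaint(
  state: TrackedState,
  readKeys: string[],
  writeKey: string,
  value: unknown,
  fromExternalTool: boolean = false,
): TrackedState {
  const inheritsTaint = fromExternalTool || readKeys.some((key) => state.tainted.has(key));
  const tainted = new Set(state.tainted);
  if (inheritsTaint) {
    tainted.add(writeKey);
  } else {
    // Assumption: overwriting a key with clean data clears its flag.
    tainted.delete(writeKey);
  }
  return { memory: { ...state.memory, [writeKey]: value }, tainted };
}

// Example: web search results are tainted, and a summary derived from them inherits it.
const start: TrackedState = { memory: { goal: 'Summarize' }, tainted: new Set<string>() };
const searched = writeWithTaint(start, ['goal'], 'search_results', '<html>...</html>', true);
const summarized = writeWithTaint(searched, ['search_results'], 'summary', 'Short text');
const planned = writeWithTaint(summarized, ['goal'], 'plan', 'Step 1');
```

A downstream node could then consult the flag for a key, for example to require human approval before acting on a tainted summary.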

  • Agents — how agents read and write state
  • Nodes — node types and configuration