Durable Jobs

Use this page when your agent work can run for minutes to hours and you need deterministic lifecycle control with verifiable evidence.

What A Durable Job Is

A durable job is a checkpointed execution record managed locally by Gait with explicit lifecycle commands:

  • submit
  • status
  • checkpoint add|list|show
  • pause
  • approve
  • resume
  • cancel
  • inspect

The job surface is for runtime control and evidence, not prompt orchestration.

When To Use This

  • multi-step agent workflows can fail mid-run and must resume deterministically
  • human approvals are required before continuation
  • operators need inspectable state transitions and stable stop reasons
  • CI or incident workflows need portable evidence from job state

When Not To Use This

  • tasks are short-lived and retries are trivial
  • no Gait CLI/artifact path is available in the runtime
  • you only need hosted traces and dashboards without local enforcement or artifact verification

Minimal Lifecycle

gait job submit --id job_1 --json
gait job checkpoint add --id job_1 --type progress --summary "step 1 complete" --json
gait job pause --id job_1 --json
gait job approve --id job_1 --actor reviewer_1 --reason "validated input" --json
gait job resume --id job_1 --actor worker_1 --reason "continue after approval" --json
gait job inspect --id job_1 --json
gait job status --id job_1 --json
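
The same sequence can be driven from a script. The following is a minimal Python sketch, not part of Gait: it only assembles the command lines shown above as argument lists and, when asked, runs them in order. The helper names (job_cmd, lifecycle, run_all) are hypothetical, and the sketch assumes a gait binary on PATH that exits non-zero when a step fails.

```python
import subprocess
from typing import Callable, List, Sequence


def job_cmd(action: Sequence[str], job_id: str, **flags: str) -> List[str]:
    """Build one `gait job ...` argv. Only flags shown on this page are used."""
    argv = ["gait", "job", *action, "--id", job_id]
    for name, value in flags.items():
        argv += [f"--{name}", value]
    return argv + ["--json"]


def lifecycle(job_id: str, reviewer: str, worker: str) -> List[List[str]]:
    """Return the minimal lifecycle as an ordered list of argvs."""
    return [
        job_cmd(["submit"], job_id),
        job_cmd(["checkpoint", "add"], job_id,
                type="progress", summary="step 1 complete"),
        job_cmd(["pause"], job_id),
        job_cmd(["approve"], job_id, actor=reviewer, reason="validated input"),
        job_cmd(["resume"], job_id, actor=worker,
                reason="continue after approval"),
        job_cmd(["inspect"], job_id),
        job_cmd(["status"], job_id),
    ]


def run_all(commands: List[List[str]],
            runner: Callable = subprocess.run) -> List[str]:
    """Run each command in order and collect stdout.

    check=True makes subprocess.run raise on a non-zero exit, so the
    sequence stops at the first failing step (assumed failure convention).
    """
    outputs = []
    for argv in commands:
        done = runner(argv, check=True, capture_output=True, text=True)
        outputs.append(done.stdout)
    return outputs
```

Calling run_all(lifecycle("job_1", "reviewer_1", "worker_1")) would issue the seven commands above in order. The runner parameter exists so the sequence can be exercised in tests without a gait binary installed.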

Artifact And Verification Path

Durable jobs produce state under the job root (default ./gait-out/jobs) and can be promoted to a pack:

gait pack build --type job --from job_1 --json
gait pack verify ./gait-out/pack_job_1.zip --json
gait pack inspect ./gait-out/pack_job_1.zip --json
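
In CI, these three commands are often wrapped in a single pass/fail gate. The sketch below is one way to do that in Python, under stated assumptions: the function names (pack_commands, evidence_gate) are hypothetical, a zero exit code is assumed to mean success, and the pack path is passed in explicitly rather than derived, because this page does not fully specify the naming rule.

```python
import subprocess
from typing import Callable, List


def pack_commands(job_id: str, pack_path: str) -> List[List[str]]:
    """Build, verify, then inspect, using the commands shown above."""
    return [
        ["gait", "pack", "build", "--type", "job", "--from", job_id, "--json"],
        ["gait", "pack", "verify", pack_path, "--json"],
        ["gait", "pack", "inspect", pack_path, "--json"],
    ]


def evidence_gate(job_id: str, pack_path: str,
                  runner: Callable = subprocess.run) -> bool:
    """Return True only if every step exits 0 (assumed success convention)."""
    for argv in pack_commands(job_id, pack_path):
        result = runner(argv, capture_output=True, text=True)
        if result.returncode != 0:
            return False  # stop at the first failing step
    return True
```

For the example job, evidence_gate("job_1", "./gait-out/pack_job_1.zip") would run all three steps and report whether each one succeeded.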

Portable evidence outputs:

  • job lifecycle state/events under ./gait-out/jobs
  • the pack archive, for example ./gait-out/pack_job_1.zip (PackSpec v1 envelope)
  • deterministic verify/inspect JSON for CI, incident handoff, and audits
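
Deterministic JSON output is easy to compare across CI runs or between teams. As an illustration only, the following Python sketch reduces any JSON document to a stable fingerprint by canonicalizing key order and whitespace before hashing. The function name fingerprint is hypothetical, and nothing here depends on specific fields in Gait's output, which this page does not list. The comparison is only meaningful if the output carries no run-specific values such as timestamps.

```python
import hashlib
import json


def fingerprint(json_text: str) -> str:
    """Hash a JSON document independent of key order and whitespace."""
    doc = json.loads(json_text)
    # sort_keys plus compact separators gives one canonical rendering
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def same_evidence(first: str, second: str) -> bool:
    """True when two JSON outputs carry identical content."""
    return fingerprint(first) == fingerprint(second)
```

A CI step could store the fingerprint of a known-good verify output and fail the build when a later run produces a different one.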

How This Differs From Checkpoint/Observability Tools

| Dimension | Gait durable jobs | LangChain/LangFuse-style checkpoint and observability stacks |
| --- | --- | --- |
| Primary role | runtime control + evidence contract | orchestration tracing, hosted observability, debugging UX |
| Enforcement boundary | tool boundary with fail-closed non-execute rule | usually orchestration-time controls, not portable side-effect enforcement contract |
| Artifact portability | signed/offline-verifiable packs and traces | service-backed trace state, often not cryptographically portable by default |
| CI regression loop | first-class regress fixture and stable exit semantics | typically custom harnesses around exported traces |
| Offline operation | core verify/diff/regress/job operations run locally | hosted components commonly required for full feature set |

This is a complementary model: teams can keep hosted observability while using Gait for enforceable runtime boundaries and deterministic evidence.

Better Fit Vs Not Necessary

Better fit:

  • regulated or high-risk tool execution
  • production incident-to-regression loops
  • multi-team workflows requiring independently verifiable evidence artifacts

Not necessary:

  • experimentation with no external side effects
  • prototypes where deterministic replay and auditability are out of scope

Integration Anchors

  • CLI entrypoint: cmd/gait/job.go
  • runtime implementation: core/jobruntime/
  • pack conversion/verification: core/pack/
  • representative adapter path: examples/integrations/openai_agents/quickstart.py