Multi-step orchestration that ships

AI Cascade & Agent Systems

Production multi-step AI systems — agents that use tools, cascades that route easy queries to small models and hard ones to larger ones, MCP servers that expose your data to AI clients. With observability, cost guardrails, and the discipline to keep them honest.

LangGraph · Vercel AI SDK · MCP · OpenTelemetry · Temporal · Inngest · Postgres

The honest version of "AI agents in production" in 2026: most work is still cascades, not autonomous loops. A classifier decides whether the query is simple, a small fast model handles roughly 80% of cases, a larger model handles most of the remaining 20%, and a human gets routed the 1% or so that need judgement. The agent isn't out-thinking you; it's plumbing that handles cost and quality together.

What I build

Cost-aware cascades — classification step routes to model tiers (Haiku → Sonnet → Opus, or Flash → Pro, or local-llama → frontier). Cascade decisions cached. Cost-per-query budgets enforced.
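To make the routing idea concrete, here is a minimal sketch of a cascade router, assuming three hypothetical tiers with made-up per-query costs and a word-count heuristic standing in for the real classifier. The names `CascadeRouter`, `Decision`, `classify`, and the prices are all illustrative assumptions, not a real API or real pricing.

```python
from dataclasses import dataclass, field

# Hypothetical tiers, cheapest first, with made-up per-query costs in USD.
TIER_ORDER = ["small", "medium", "large"]
TIER_COST_USD = {"small": 0.001, "medium": 0.01, "large": 0.05}


def classify(query: str) -> str:
    """Stand-in classifier: treats longer queries as harder.

    A real cascade would call a small model or a trained classifier here.
    """
    words = len(query.split())
    if words <= 8:
        return "small"
    if words <= 30:
        return "medium"
    return "large"


@dataclass(frozen=True)
class Decision:
    tier: str         # tier the query is actually sent to
    cache_hit: bool   # True if the routing decision came from the cache
    downgraded: bool  # True if the budget forced a cheaper tier


@dataclass
class CascadeRouter:
    budget_per_query_usd: float
    _cache: dict = field(default_factory=dict)

    def route(self, query: str) -> Decision:
        # Normalise case and whitespace so trivially different queries share a key.
        key = " ".join(query.lower().split())
        cache_hit = key in self._cache
        if not cache_hit:
            self._cache[key] = classify(query)
        wanted = TIER_ORDER.index(self._cache[key])

        # Enforce the budget by stepping down to the best tier that fits.
        # In this sketch the smallest tier is always allowed.
        idx = wanted
        while idx > 0 and TIER_COST_USD[TIER_ORDER[idx]] > self.budget_per_query_usd:
            idx -= 1
        return Decision(TIER_ORDER[idx], cache_hit, downgraded=idx < wanted)
```

Downgrading on a budget breach is one possible policy. Rejecting the query or escalating it to a human are equally reasonable choices, depending on how costly a weak answer is.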

Tool-using agents — agent loops with a tool registry, retry semantics, max-step termination, and audit logs of every tool call. Built for use-cases that genuinely need multi-step reasoning, not bolted onto things that don't.
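The loop described above can be sketched in a few dozen lines. This is an illustrative outline, not a production implementation: `policy` is a hypothetical stand-in for the model call, and the action format (`{"type": "tool", ...}` or `{"type": "final", ...}`) is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class AgentResult:
    answer: Optional[str]
    stop_reason: str  # "final" or "max_steps"
    audit_log: List[Dict[str, Any]] = field(default_factory=list)


def run_agent(
    policy: Callable[[List[Dict[str, Any]]], Dict[str, Any]],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 5,
    max_retries: int = 2,
) -> AgentResult:
    """Run a tool-using loop with retries, a step cap, and an audit log."""
    history: List[Dict[str, Any]] = []
    audit_log: List[Dict[str, Any]] = []

    for step in range(max_steps):
        action = policy(history)  # hypothetical model call
        if action["type"] == "final":
            return AgentResult(action["answer"], "final", audit_log)

        name, args = action["tool"], action["args"]
        if name not in tools:
            # Unknown tools are logged and reported back, never executed.
            result: Any = {"error": f"unknown tool: {name}"}
            audit_log.append({"step": step, "tool": name, "args": args,
                              "attempt": 1, "ok": False, "error": "unknown tool"})
        else:
            for attempt in range(1, max_retries + 2):
                try:
                    result = tools[name](**args)
                    audit_log.append({"step": step, "tool": name, "args": args,
                                      "attempt": attempt, "ok": True})
                    break
                except Exception as exc:  # every failed attempt is audited
                    result = {"error": str(exc)}
                    audit_log.append({"step": step, "tool": name, "args": args,
                                      "attempt": attempt, "ok": False,
                                      "error": str(exc)})
        history.append({"tool": name, "result": result})

    # Max-step termination: stop rather than loop forever.
    return AgentResult(None, "max_steps", audit_log)
```

The step cap and the audit log are the parts that matter most in practice: the first bounds cost, the second makes every tool call reviewable after the fact.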

MCP servers — Model Context Protocol servers exposing your databases, APIs, or workflows as tools that any MCP-compatible AI client can use. Authenticated, rate-limited, observable.

Observability — every step traced with OpenTelemetry, LLM spans showing prompt/output/cost/latency, sampled traces drilled into Honeycomb / Langfuse. The "why is this slow?" question becomes a 5-minute investigation.
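As a rough illustration of what a per-step span carries, here is a standard-library stand-in. It is deliberately not the OpenTelemetry API: `Tracer`, `Span`, and the attribute name `llm.cost_usd` are hypothetical, chosen only to show the shape of the data that makes the "why is this slow?" question answerable.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class Span:
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    duration_ms: float = 0.0


class Tracer:
    """Collects one span per step; a real setup would export these instead."""

    def __init__(self) -> None:
        self.spans: List[Span] = []

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        current = Span(name)
        start = time.perf_counter()
        try:
            yield current
        finally:
            # Record latency even if the step raised.
            current.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(current)


def total_cost_usd(tracer: Tracer) -> float:
    """Sum the (hypothetical) cost attribute across all recorded spans."""
    return sum(s.attributes.get("llm.cost_usd", 0.0) for s in tracer.spans)


def slowest(tracer: Tracer, n: int = 1) -> List[Span]:
    """Return the n slowest spans, slowest first."""
    return sorted(tracer.spans, key=lambda s: s.duration_ms, reverse=True)[:n]
```

With spans like these attached to every classification, model call, and tool call, finding the slow or expensive step becomes a sort rather than a guess.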

Operational discipline — eval suites covering the cascade end-to-end, regression detection on routing decisions, kill-switches and circuit breakers, gradual rollout via feature flags.
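A kill-switch and circuit breaker can be as small as the sketch below. The thresholds, the `killed` flag, and the injectable `clock` are assumptions made for illustration; a production version would typically keep this state somewhere shared, such as a feature-flag service or a database.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Opens after repeated failures; a manual kill-switch overrides everything."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None
        self.killed = False  # manual kill-switch, flipped by an operator

    def allow(self) -> bool:
        """Return True if a call to the guarded step may proceed."""
        if self.killed:
            return False
        if self._opened_at is None:
            return True  # closed: normal operation
        # Open: block until the cool-down has passed, then allow trial calls.
        # This sketch lets calls through until the next result is recorded.
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None  # close the circuit again

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial
```

A caller checks `allow()` before each model or tool call and falls back (to a cheaper tier, a cached answer, or a human) when it returns False.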

What I steer you away from

Wiring a 12-tool agent loop onto a problem that's a 200-line script. Multi-agent "swarm" architectures because someone read a paper. Autonomous agents that touch production data without human-in-the-loop checkpoints. Cascades fail less spectacularly than autonomous agents, and ship years sooner.