The hard part of AI features in production is not the model call. It's everything around it — the eval suite that tells you when output quality regresses, the cost dashboard that catches the request loop spending $80/hour, the fallback when the upstream model is down, the structured retry on rate limits.
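
To make the last two items concrete, here is a minimal sketch of a structured retry on rate limits combined with a fallback to a second provider. It assumes each provider is wrapped in a plain callable that takes a prompt and returns text; `RateLimitError`, `UpstreamDownError`, `call_with_retry`, and `complete` are hypothetical names for illustration, not any vendor SDK's API.

```python
import random
import time
from typing import Callable, Optional, Sequence


class RateLimitError(Exception):
    """Hypothetical: raised by a provider wrapper on a rate-limit response."""


class UpstreamDownError(Exception):
    """Hypothetical: raised when the provider is unreachable or erroring."""


def call_with_retry(
    call: Callable[[str], str],
    prompt: str,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry only on rate limits, using exponential backoff with jitter."""
    for attempt in range(max_attempts):
        try:
            return call(prompt)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted; let the caller decide what to do
            # Base delay doubles each attempt; jitter spreads out retry bursts.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    raise ValueError("max_attempts must be at least 1")


def complete(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try providers in order; move on when one is down or stays rate-limited."""
    last_error: Optional[Exception] = None
    for call in providers:
        try:
            return call_with_retry(call, prompt, sleep=sleep)
        except (RateLimitError, UpstreamDownError) as exc:
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

Passing `sleep` as a parameter is a design choice that keeps the backoff testable without real waiting. A production version would typically also log each retry and fallback so the cost dashboard can see them.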

What I build

Customer-facing copilots — chat interfaces grounded in your product's docs and data, with streaming UI, retries, citations, and refusal patterns that don't embarrass you.

Document RAG — ingestion pipeline (PDFs, HTML, source-of-truth APIs), chunking strategy, embedding store (pgvector for most cases, Pinecone for scale), retrieval with reranking, evals on retrieval quality.

Internal agents — workflow automation backed by tool-using LLMs. Triage, summarisation, document drafting, customer-support classification. With logged decisions, human-in-the-loop checkpoints, and rollback discipline.

Eval infrastructure — golden sets, model-graded evals, regression detection. The thing that distinguishes "we're using AI" from "our AI quality is improving over time".
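
As a rough illustration of the golden-set and regression-detection idea, the sketch below scores a generator against a small set of fixed cases and flags a drop against a stored baseline. The names (`GoldenCase`, `score`, `run_suite`, `has_regressed`) and the phrase-matching metric are assumptions chosen for brevity; a real suite would more likely use model-graded or task-specific scoring.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class GoldenCase:
    """One fixed prompt plus the phrases a good answer must mention."""
    prompt: str
    must_contain: Tuple[str, ...]


def score(output: str, case: GoldenCase) -> float:
    """Fraction of required phrases found in the output (case-insensitive)."""
    if not case.must_contain:
        return 1.0
    text = output.lower()
    hits = sum(1 for phrase in case.must_contain if phrase.lower() in text)
    return hits / len(case.must_contain)


def run_suite(generate: Callable[[str], str], cases: Sequence[GoldenCase]) -> float:
    """Mean score over the golden set; `generate` wraps the model under test."""
    if not cases:
        raise ValueError("golden set is empty")
    return sum(score(generate(c.prompt), c) for c in cases) / len(cases)


def has_regressed(current: float, baseline: float, tolerance: float = 0.02) -> bool:
    """True when the current score falls below the baseline by more than tolerance."""
    return current < baseline - tolerance
```

Run in CI against every prompt or model change, a check like `has_regressed(run_suite(generate, cases), baseline)` turns "quality feels worse" into a failing build.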

What I steer you away from

A 47-tool LangChain Frankenstein when a 200-line script would do. Vector DBs when full-text search would have worked. Custom fine-tunes before you've nailed the prompt and retrieval layer. The AI feature gold-rush is full of ten-figure mistakes; my job is to make sure you don't ship one.

AI Integration & RAG

Adjacent services.

Cloud & DevOps Engineering

Platform Engineering

Kubernetes & Container Orchestration