AI Integration & RAG
Real AI features: customer support copilots, document RAG, contract analysis, internal agents that automate work. With proper evals, fallback strategies, observability, and per-tenant cost guardrails.
The hard part of AI features in production is not the model call. It's everything around it — the eval suite that tells you when output quality regresses, the cost dashboard that catches the request loop spending $80/hour, the fallback when the upstream model is down, the structured retry on rate limits.
What I build
Customer-facing copilots — chat interfaces grounded in your product's docs and data, with streaming UI, retries, citations, and refusal patterns that don't embarrass you.
Document RAG — ingestion pipeline (PDFs, HTML, source-of-truth APIs), chunking strategy, embedding store (pgvector for most cases, Pinecone for scale), retrieval with reranking, evals on retrieval quality.
Internal agents — workflow automation backed by tool-using LLMs. Triage, summarisation, document drafting, customer-support classification. With logged decisions, human-in-the-loop checkpoints, and rollback discipline.
Eval infrastructure — golden sets, model-graded evals, regression detection. The thing that distinguishes "we're using AI" from "our AI quality is improving over time".
What I steer you away from
A 47-tool LangChain Frankenstein when a 200-line script would do. Vector DBs when full-text search would have worked. Custom fine-tunes before you've nailed the prompt and retrieval layer. The AI feature gold-rush is full of ten-figure mistakes; my job is to make sure you don't ship one.
Adjacent services.
Cloud & DevOps Engineering
Production cloud environments designed deliberately — resilient, cost-aware, and ready for the day you actually need them.
Internal developer platformsPlatform Engineering
Self-service platforms that turn 'open a ticket and wait three days' into 'open a PR and ship in fifteen minutes'.
EKS · GKE · AKS · self-hostedKubernetes & Container Orchestration
Production-grade Kubernetes — clusters that scale, upgrade cleanly, and don't wake people up.