AI Prompt Engineering
Production-grade prompt engineering: structured prompting, schema-constrained outputs, retrieval-grounded answers, eval harnesses, and the regression discipline that distinguishes a working AI feature from a hopeful one.
The thing nobody tells you about prompt engineering is that the prompt isn't the artefact. The eval is. A prompt that scores 92% on a 200-example eval set is a thing you can defend, ship, and improve. A prompt that "feels good" is a feature about to regress the next time the upstream model updates.
What I do
Prompt design — for the actual problem you're solving, not a generic 'helpful assistant'. Structured prompting with system / role / context layering. Schema-constrained outputs where the answer feeds another system. Retrieval-grounded answers when freshness matters.
Eval harness — golden sets that represent the real distribution of your inputs. Model-graded evals for soft criteria (helpfulness, tone). Regression suite that runs on every prompt change. Score deltas surfaced in PR review.
Multi-model portability — design prompts that survive a swap from GPT-4 to Claude to Gemini. The eval tells you whether the new model is actually better for your use case.
Prompt-injection audit — adversarial test set, system-prompt leak detection, refusal patterns. The basics most teams skip.
Documentation — a prompt library your team can read, extend, and improve. Versioned. Tested. Owned.
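To make the eval-harness idea concrete, here is a minimal sketch of a golden-set score and a regression gate of the kind that could surface a score delta in PR review. Everything in it is an illustrative assumption rather than a fixed method: the names Example, score, and regression_gate are hypothetical, exact-match scoring stands in for whatever metric fits your task, and the two stub functions stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Example:
    """One golden-set item: an input and the answer we expect (hypothetical shape)."""
    prompt_input: str
    expected: str


def score(generate: Callable[[str], str], golden: list[Example]) -> float:
    """Fraction of golden examples answered correctly.

    Uses case-insensitive exact match after trimming whitespace; a real
    harness might swap in a model-graded check for soft criteria.
    """
    if not golden:
        raise ValueError("golden set is empty")
    hits = sum(
        generate(ex.prompt_input).strip().lower() == ex.expected.strip().lower()
        for ex in golden
    )
    return hits / len(golden)


def regression_gate(
    baseline: float, candidate: float, tolerance: float = 0.01
) -> tuple[bool, float]:
    """Return (passed, delta). Fails if the score drops by more than `tolerance`."""
    delta = candidate - baseline
    return delta >= -tolerance, delta


# --- Illustrative usage with stubbed "models" (assumptions, not real API calls) ---
GOLDEN = [
    Example("2+2", "4"),
    Example("capital of France", "Paris"),
    Example("3*3", "9"),
    Example("opposite of hot", "cold"),
]

_OLD_ANSWERS = {"2+2": "4", "capital of France": "paris", "3*3": "9"}
_NEW_ANSWERS = {**_OLD_ANSWERS, "opposite of hot": " Cold "}


def old_prompt(text: str) -> str:
    """Stub for the currently shipped prompt; misses one golden example."""
    return _OLD_ANSWERS.get(text, "unknown")


def new_prompt(text: str) -> str:
    """Stub for the candidate prompt under review."""
    return _NEW_ANSWERS.get(text, "unknown")


baseline_score = score(old_prompt, GOLDEN)
candidate_score = score(new_prompt, GOLDEN)
passed, delta = regression_gate(baseline_score, candidate_score)
```

In CI, a gate like this would typically run on every prompt change and fail the build when passed is False, with delta reported alongside the diff. The tolerance is a judgment call: too tight and noisy evals block every change, too loose and slow regressions slip through.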
When this is the right engagement
You have an AI feature in production (or about to be) and the quality is inconsistent. Or you're picking between models and don't have data to decide. Or you've been burned by a regression after a model update and never want to be again.
Adjacent services.
Cloud & DevOps Engineering
Production cloud environments designed deliberately — resilient, cost-aware, and ready for the day you actually need them.
Platform Engineering (internal developer platforms)
Self-service platforms that turn 'open a ticket and wait three days' into 'open a PR and ship in fifteen minutes'.
Kubernetes & Container Orchestration (EKS · GKE · AKS · self-hosted)
Production-grade Kubernetes — clusters that scale, upgrade cleanly, and don't wake people up.