I start by mapping manual business systems and shadowing the people who use them every day, surfacing where time, cost, and cognitive load pile up.
Design how models, tools, interfaces, and business logic interact. I focus early on what makes the system viable, validating data flow, model behavior, and measurable impact before investing in scale.
Build ways to test, measure, and improve model behavior. I design measurement systems that learn, blending automated evaluation with curated gold-set reviews so both the model and the business improve with every release.
Embed AI into workflows businesses can actually depend on. Once performance and ROI are proven, I design the self-improvement, onboarding, and monitoring systems that compound results.
I deploy AI in messy real-world environments, then turn those learnings into scalable infrastructure. The work below spans enterprise conversational AI deployments, the evaluation and platform systems I built from what I learned there, and the AI-native development practice that ties it all together.

Built a closed-loop evaluation system where simulated users test agents via LiveKit, and LLM judges score for goal satisfaction, quality, and naturalness. 150+ automated evals replaced 2-week manual QA cycles, enabling 3x MoM call-volume growth without regression.
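As a rough illustration only, and not the production system, the scoring half of a loop like this could be sketched as follows. The three dimensions come from the description above; the 0 to 1 score scale, the 0.7 pass threshold, the 2-point regression tolerance, and every name in the code are assumptions made for the example. The simulated calls and the LLM judge are left out and replaced with hand-written scores.

```python
from dataclasses import dataclass
from statistics import mean

# Judged dimensions named in the description; the 0-1 scale is an assumption.
DIMENSIONS = ("goal_satisfaction", "quality", "naturalness")


@dataclass
class EvalResult:
    scenario: str   # hypothetical scenario name
    scores: dict    # dimension -> judge score in [0, 1]


def passed(result, threshold=0.7):
    """A scenario passes only if every judged dimension meets the threshold."""
    return all(result.scores[d] >= threshold for d in DIMENSIONS)


def summarize(results, threshold=0.7):
    """Return the overall pass rate and the mean score per dimension."""
    pass_rate = sum(passed(r, threshold) for r in results) / len(results)
    means = {d: mean(r.scores[d] for r in results) for d in DIMENSIONS}
    return pass_rate, means


def regressed(current_pass_rate, baseline_pass_rate, tolerance=0.02):
    """Flag a release whose pass rate falls more than `tolerance` below baseline."""
    return current_pass_rate < baseline_pass_rate - tolerance


# Hand-written scores standing in for judge output (illustrative data only).
results = [
    EvalResult("book_appointment",
               {"goal_satisfaction": 0.9, "quality": 0.8, "naturalness": 0.75}),
    EvalResult("reschedule",
               {"goal_satisfaction": 0.6, "quality": 0.8, "naturalness": 0.9}),
]
pass_rate, means = summarize(results)
```

Requiring every dimension to clear the threshold, rather than averaging them, keeps a natural-sounding call that misses the user's goal from counting as a pass. Comparing the pass rate against the previous release is one simple way to gate a release on "no regression".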
Took Goodcall's agent creation from a one-week growth hack (200+ auto-generated agents, 8-figure pipeline) to a live keynote demo to a full self-serve agent development studio with API integrations, phone deployment, and built-in evals.
Forward-deployed AI agents into high-stakes environments: voice booking for the largest privately owned salon chain in the US (50% to 85%+ success rate) and breakdown call routing for the largest US trucking company (abandonment cut from 25% to 8%).
Developed a prototype-to-production pipeline where working code becomes LLM-generated specs, visual references ship to Storybook, and coding agents handle production implementation. This is the practice behind 3.9B tokens and 27K+ prompts.
The biggest opportunity in AI isn’t replacing software; it’s automating the operational systems that power real businesses.
Most organizations don’t need more AI features. They need systems that reliably complete work in messy, real-world environments. That’s the work I focus on.
My foundation in UX and product design gives me an edge most AI builders don’t have: I think in workflows, interfaces, and user behavior, not just model architecture. I work across model behavior, system design, and product craft to move AI from prototype to production. Whether mapping workflows, designing evaluation loops, or building scalable design systems, I help teams go from 0→1 and 1→N with speed and clarity.
Operating principles
I value alignment, but not at the expense of momentum. I gather diverse input early, then make decisions visibly, documenting the “why” so teams can move fast and focus on execution.
I've learned the languages of design, data, and engineering well enough to fill gaps wherever the team needs it most. That range lets me unblock dependencies, ask smarter questions, and get my hands dirty translating ideas into working systems at speed.
I treat prototypes as experiments, not deliverables. The goal isn’t to sell an idea; it’s to expose reality early, validate assumptions, and save engineering cycles downstream.
I do my best work hands-on, pairing with engineers to prototype live systems and with end-users to observe them in action. I design feedback loops that capture both qualitative insight and quantitative signal, turning every release into a learning cycle.
Teams find energy in visible progress and in getting things into the hands of end-users. I help maintain velocity with short feedback loops, fast prototypes, and early evidence of impact.