Live code experiments — things I’m building to learn, explore, and share about ML, AI, and product development. Each project has a live demo and a case study with the evaluation work behind it.
D&D RAG Content Generator
A production RAG system that generates original Dungeons & Dragons content — weapons, NPCs, artifacts, locations, and monsters — grounded in a 1,006-document Forgotten Realms lore corpus. A six-phase evaluation framework testing one variable at a time improved retrieval precision by 85% over the baseline while cutting per-query token cost by 65%.
ChromaDB · Claude Haiku · DALL-E 3 · FastAPI · Next.js · DigitalOcean
Federal Regulation RAG
A production RAG system over the U.S. Code of Federal Regulations — Titles 7, 21, and 42 (Agriculture, Food & Drugs, Public Health), 85,351 sections. Each query returns a plain-English explanation, a legal-language synthesis with verbatim quotes, and structured CFR citations. Eleven experiments, one metric-audit plot twist, and an inference-time confidence signal that reliably flags “I don’t know.”
PostgreSQL + pgvector · OpenAI embeddings · Claude Haiku 4.5 · FastAPI · Next.js · DigitalOcean
Building — data collection in progress
Hyperlocal Weather Prediction
A backyard weather station running live, collecting the training data needed to fit a hyperlocal forecasting model. The station is online; the ML work is paused until there is enough data to fit and evaluate cleanly. Case study and live demo will land when the model does.

