Alex Goldhoorn

Technical Articles

Technical writings on data science, AI evaluation, simulation, and logistics optimization.

LLM Evaluation Framework

When LLMs Meet Structured Data: The Evaluation Challenge

How we built an evaluation framework for LLM agents at Meight. Extracting structured shipping data from documents taught us that evaluation needs both strict metrics (for production readiness) and LLM-as-a-judge scoring (for semantic correctness).

Read Article →

LLM Riddles Evaluation

System 1 vs System 2: Testing LLMs with Riddles

An experimental evaluation of how 8 models (6 cloud, 2 local) perform on logic puzzles, revealing the gap between pattern matching and first-principles reasoning. Includes complete raw model responses.

🎯 Try Interactive Challenge → | Read Full Analysis → | 📝 Raw Outputs →

Glovo Delivery Simulation

How to Simulate a Global Delivery Platform

A deep dive into building a large-scale discrete-event simulation system for Glovo's global delivery network, covering architecture decisions, performance optimization, and real-world validation.

Read on Medium →

Contact: alex (at) goldhoorn.net