The 100-Year Lead: What Baseball Teaches Us About the Future of AI

PyMC Labs
Chris is the creator of PyMC and an Adjunct Associate Professor at the Vanderbilt University Medical Center, with 20 years of experience as a data scientist in academia, industry, and government, including 7 years in pro baseball research with the Philadelphia Phillies, New York Yankees, and Milwaukee Brewers.
He is interested in computational statistics, machine learning, Bayesian methods, and applied decision analysis. He hails from Vancouver, Canada and received his Ph.D. from the University of Georgia.

Delphina
Hugo Bowne-Anderson is an independent data and AI consultant with extensive experience in the tech industry. He is the host of Vanishing Gradients, an industry podcast exploring developments in data science and AI. Previously, Hugo served as Head of Developer Relations at Outerbounds and held roles at Coiled and DataCamp, where his work in data science education reached over 3 million learners. He has taught at Yale University, Cold Spring Harbor Laboratory, and conferences like SciPy and PyCon, and is a passionate advocate for democratizing data skills and open-source tools.
Key Takeaways
Baseball leads data and AI by a decade. Pay attention now. Baseball has been ahead of industry on data work for more than 100 years, from F.C. Lane's linear models in the early 1900s to Bill James to Moneyball to today. The techniques Fonnesbeck talks about in this conversation (process-based metrics, Bayesian hierarchical models, causal inference, integrating expert knowledge into probabilistic models) are what baseball front offices are running in 2026. If the pattern holds, this is where data and AI work in industry is heading next. If you want to stay relevant, learn these now.
Measure decision quality, not outcomes. Modern baseball cares less about what happened (outcome) and more about what should have happened given the inputs (process). With granular sensor data you can evaluate the quality of every pitch and swing independently of whether it produced a hit. AI teams should be making the same move in eval: judge whether the system made the right call at each step, not whether the final answer happened to land.
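The outcome-vs-process move can be made concrete with a toy "expected hits" metric. This is a hedged sketch: the lookup table of hit probabilities below is invented for illustration, not real Statcast data, and real expected stats condition on continuous launch speed and angle rather than two booleans.

```python
# Process metric sketch: score each batted ball by what it *should* have
# produced (expected hit probability given contact quality), not by whether
# it happened to fall in. The probabilities below are illustrative only.
XBA_TABLE = {  # (hard_hit, line_drive) -> assumed hit probability
    (True, True): 0.65,
    (True, False): 0.35,
    (False, True): 0.45,
    (False, False): 0.15,
}

def expected_hits(batted_balls):
    """Sum expected-hit probabilities over (hard_hit, line_drive) pairs."""
    return sum(XBA_TABLE[b] for b in batted_balls)

# Two barreled line drives can both be caught while two bloops both fall in;
# outcome stats reward the bloops, the process metric ranks the contact:
print(expected_hits([(True, True), (True, True)]))    # barrels
print(expected_hits([(False, False), (False, False)]))  # bloops
```

The same pattern applies to AI evals: score each step of the system against what the right call was given its inputs, not against whether the final answer happened to land.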
Share strength across a population to handle small samples. High-stakes decisions often rely on limited data, like a rookie's first few appearances. Bayesian hierarchical models let you combine a strong prior (historical data on similar players) with a small sample (current performance) to get reliable predictions without overfitting to a hot streak. The same pattern applies anywhere you need a prediction for a new user, customer, or product with little individual data.
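The shrinkage idea can be sketched with a conjugate Beta-Binomial update, the simplest special case of the hierarchical approach Fonnesbeck describes (a full model would learn the prior from the player population, e.g. in PyMC). The prior mean and pseudo-count below are illustrative assumptions, not real league figures.

```python
# Hedged sketch: pull a rookie's small-sample batting average toward a
# league-wide prior with a conjugate Beta-Binomial update.

def shrunk_average(hits, at_bats, prior_mean=0.260, prior_strength=300):
    """Posterior mean batting average under a Beta prior.

    prior_strength acts like pseudo-at-bats: the larger it is, the harder
    a small sample is pulled toward the league mean. Both defaults are
    made-up illustrative values.
    """
    alpha = prior_mean * prior_strength        # prior pseudo-hits
    beta = (1 - prior_mean) * prior_strength   # prior pseudo-outs
    return (alpha + hits) / (alpha + beta + at_bats)

# A 9-for-20 hot streak (.450 raw) gets pulled most of the way back:
print(round(shrunk_average(9, 20), 3))  # -> 0.272
# A full season of the same rate moves the estimate much further:
print(round(shrunk_average(270, 600), 3))
```

Swap "at-bats" for "sessions" or "purchases" and the same arithmetic gives you a defensible estimate for a new user or product with little individual data.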
Causal modeling is what you need when you are going to act on the model. A pitcher throwing a specific pitch might correlate with strikeouts, but that doesn't mean the pitch causes them. Fonnesbeck argues the next leap in analytics is asking the counterfactual: would a random pitcher who started throwing this pitch get more strikeouts? Same question applies to any intervention you might run, including adding a new product feature or marketing channel. Correlation is enough when you only want to predict. The moment you want to change something, you need causal inference.
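The correlation-vs-causation gap is easy to see in simulation. This is a hedged toy sketch, not Fonnesbeck's method: the data-generating process (pitcher "skill" confounds both pitch adoption and strikeouts, with a true causal effect of +1) and the stratification-based adjustment are assumptions chosen for illustration.

```python
import random

random.seed(0)
# Toy confounded world: skilled pitchers both adopt the new pitch more often
# and strike out more batters regardless of it. True pitch effect is +1.
data = []
for _ in range(10000):
    skill = random.random()                    # confounder
    throws_pitch = random.random() < skill     # skilled pitchers adopt it
    strikeouts = 10 * skill + (1 if throws_pitch else 0) + random.gauss(0, 1)
    data.append((skill, throws_pitch, strikeouts))

def mean(xs):
    return sum(xs) / len(xs)

# Naive comparison mixes the pitch's effect with skill and overstates it:
naive = (mean([k for s, t, k in data if t])
         - mean([k for s, t, k in data if not t]))

# Stratifying on the confounder recovers roughly the +1 causal effect:
strata = []
for lo in [i / 10 for i in range(10)]:
    treated = [k for s, t, k in data if t and lo <= s < lo + 0.1]
    control = [k for s, t, k in data if not t and lo <= s < lo + 0.1]
    if treated and control:
        strata.append(mean(treated) - mean(control))
adjusted = mean(strata)
print(round(naive, 2), round(adjusted, 2))
```

The naive estimate answers "do pitchers who throw this pitch strike out more batters?"; the adjusted one approximates the counterfactual "what if a comparable pitcher started throwing it?", which is the question you need answered before intervening.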
At enough scale, the bottleneck shifts from modeling to engineering. Baseball's move from radar (TrackMan) to high-frequency optical cameras (Hawkeye) took the data from discrete events to six or seven terabytes per game of skeletal pose points and pitch trajectories. At that scale the hard problem stops being "what model do I fit" and becomes "how do I store, process, and serve this data." Same shift hits any AI team the moment they start logging full system traces and tool calls instead of summaries.
Integrate uncertainty through the model, then collapse it at the decision. The GM wants a list of players to draft, so in the end you still hand over a single answer. The right way to get there is to carry the full distribution through the model and collapse it only at the last step, when the decision is made, rather than working with point estimates throughout. AI engineers shipping product recommendations face the same problem.
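A minimal sketch of "collapse at the decision": keep each player's projection as posterior samples and apply the decision rule to the samples, not to a pre-collapsed point estimate. The player names, projected-value distributions, and the 1.5-win floor threshold are all invented for illustration.

```python
import random

random.seed(1)
# Posterior samples of projected value (e.g. wins) for two hypothetical
# players; in practice these would come from a fitted Bayesian model.
posterior = {
    "Player A": [random.gauss(2.0, 0.3) for _ in range(5000)],  # safe veteran
    "Player B": [random.gauss(2.2, 1.5) for _ in range(5000)],  # risky upside
}

def prob_above(samples, threshold):
    """Probability the player's value exceeds a floor, from samples."""
    return sum(s > threshold for s in samples) / len(samples)

# Different decision rules collapse the same distributions differently:
best_expected = max(posterior, key=lambda p: sum(posterior[p]) / len(posterior[p]))
best_floor = max(posterior, key=lambda p: prob_above(posterior[p], 1.5))
print(best_expected)  # highest expected value
print(best_floor)     # highest chance of clearing the floor
```

Had the distributions been collapsed to means upstream, the floor-seeking decision rule could never have been applied; carrying samples to the end keeps every decision rule on the table.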
Simple baselines still compete. Tom Tango's Marcel the monkey model (fixed weights, regression to the mean) still holds up against sophisticated projections decades later. Build the dumb baseline first.
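The flavor of Marcel fits in a few lines. This is a simplified sketch: the 5/4/3 season weights and regression toward the league mean follow Tango's published description, but the exact ballast size and league rate here are illustrative, and the real system also includes an age adjustment.

```python
def marcel_projection(rates, pas, league_rate=0.330, ballast=1200):
    """Simplified Marcel-style projection (no age adjustment).

    rates/pas: per-season stat rates and plate appearances, most recent
    season first. Weights the last three seasons 5/4/3, then regresses
    toward the league mean as if adding `ballast` league-average PAs.
    """
    weights = [5, 4, 3]
    wpa = [w * pa for w, pa in zip(weights, pas)]
    weighted_rate = sum(w * r for w, r in zip(wpa, rates)) / sum(wpa)
    total = sum(wpa)
    return (weighted_rate * total + league_rate * ballast) / (total + ballast)

# A hypothetical hitter trending up over three full seasons:
proj = marcel_projection([0.360, 0.345, 0.310], [600, 580, 550])
print(round(proj, 3))  # -> 0.341
```

The point is not that this beats a modern model; it is that any modern model has to beat *this* before it earns its complexity, which is why the dumb baseline comes first.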
Causal tooling lags behind Bayesian tooling. Bayesian methods have mature off-the-shelf libraries. Causal inference doesn't. If you want to do counterfactuals on observational data, you are mostly writing it from scratch. Methods at the frontier are always like this. The teams that get there first have an edge.
Build processes to integrate expert information into your models. Bayesian methods give you two natural places to bring in domain experts: prior elicitation (using their knowledge to inform the model upfront) and validation (sniff-testing the outputs). Fonnesbeck used both at the Yankees. For AI teams, the lesson is to design your modeling process so seasoned domain experts have a way in at both ends, not just at review time.
You can read the full transcript here.
00:00 HawkEye Cameras and the Sensor Revolution
00:38 Why Sabermetrics Is a Leading Indicator for AI Builders
04:33 What Sabermetrics Actually Is
05:56 150 Years of Observational Data Science
07:14 DIPS and Separating Signal From Noise
08:24 Moneyball as Cultural Integration
10:25 Why Baseball First: Markovian States and Discrete Events
13:54 Clock-Like Systems, Games, Sports, Chaos
15:08 "We've Solved Baseball" and Engineering Elite Pitchers
16:49 Win Probability and Shorter Time Horizons
17:45 Predicting Populations vs Predicting a Player
21:17 Wins Above Replacement: Putting a Price on Performance
23:38 Learn From Negative Results, Not Just Wins
24:46 Inside a Modern MLB R&D Department
26:16 Six Terabytes per Game: The HawkEye Data Firehose
27:18 From Outcome Stats to Process Stats (the "x" Metrics)
29:15 Automated Balls and Strikes as a Game-Theory Problem
33:21 Small Samples, Strong Priors: How Baseball Forecasts
35:39 What Bayesian Thinking Actually Means
39:06 Making Assumptions Explicit and Sensitivity Analysis
41:00 Hierarchical Models Belong Everywhere, Not Just Baseball
41:56 Origin Story: How PyMC Got Built
43:00 Books to Read: Silver, Tetlock, and "The Book"
45:25 Why Causal Inference Is the Next Frontier
49:33 Why Bayesian and Causal Thinking Haven't Diffused
51:46 Breaking Into Sports Analytics: What to Learn First
53:52 Your GitHub Shows Thinking, Not Just Code
55:09 Bayesian Thinking Is the Original Generative Modeling
Links From The Show
Transcript