This is a placeholder draft. The full essay is in a Notion doc that I haven't been brave enough to publish yet.
The question
Why does KBO's quoted park factor for a given stadium swing 8–12 points year over year? Is the park changing, or is the estimator?
The setup
A standard park factor is the ratio of run-scoring in a team's home games to run-scoring in its away games. With ~70 home games per team and ~10 teams, you get a sample that feels large but is dwarfed by year-to-year offensive variance. The naive estimator picks up that noise and reports it as if it were a park trait.
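To make the size of that noise concrete, here is a rough sketch, assuming the quoted figure is the common basic park factor (runs scored plus runs allowed per game, home over away, scaled so 100 is neutral). The symbols are mine, not from any official KBO definition: RS and RA are runs scored and allowed, G is games, subscripts H and A are home and away, and c is the coefficient of variation of total runs per game. The standard error is a delta-method approximation that treats games as independent.

```latex
\mathrm{PF} = 100 \times \frac{(RS_H + RA_H)/G_H}{(RS_A + RA_A)/G_A}
\qquad
\mathrm{SE}(\log \mathrm{PF}) \approx c \, \sqrt{\frac{1}{G_H} + \frac{1}{G_A}}
```

With G_H = G_A = 70 and an illustrative c = 0.45 (an assumption for the sake of the arithmetic, not a measured KBO value), this gives 0.45 × sqrt(2/70) ≈ 0.076, or roughly 7 to 8 points on the 100 scale from sampling noise alone in a single season.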
The fix (sketch)
Partial pooling across years per park, with an informative prior set by the league mean and a between-year variance term. The shrinkage is dramatic for parks with mild deviations and gentle for the obvious outliers.
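    library(brms)

    # Poisson model of runs per game: the intercept is the league mean (log scale),
    # (1 | park) is the persistent park effect, (1 | season:park) the year-to-year wobble.
    fit <- brm(
      runs ~ 1 + (1 | park) + (1 | season:park),
      data = games,  # assumed: one row per game with columns runs, park, season
      family = poisson(),
      # half-normal on both group-level SDs (brms bounds "sd" parameters at zero)
      prior = prior(normal(0, 0.2), class = "sd")
    )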
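How hard this model pulls a park toward the league mean can be sketched with the usual normal-normal approximation on the log scale. This is a simplification of the Poisson model above, not what brms computes exactly, and the symbols are my own labels: u_p is the persistent park effect from `(1 | park)`, τ_park and τ_year are the standard deviations of the `(1 | park)` and `(1 | season:park)` terms, σ_samp² is the within-season sampling variance of a park's observed log deviation, S is the number of seasons observed, and ȳ_p is that park's average observed log deviation from the league mean.

```latex
\hat{u}_p \approx w \, \bar{y}_p,
\qquad
w = \frac{\tau_{\text{park}}^2}{\tau_{\text{park}}^2 + \left(\tau_{\text{year}}^2 + \sigma_{\text{samp}}^2\right)/S}
```

Under this approximation the weight w depends on the variance components and the number of seasons, not on how extreme the park is. A park with a mild deviation therefore ends up close to the league mean, while a true outlier keeps most of its distance in absolute terms. The weight also grows toward 1 as S increases, which is why pooling across years helps.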
What I want to write next
- The handedness-split version
- Whether the "humidor era" effect survives the shrinkage
- How much of the Jamsil low-offense reputation is the park vs. the rosters