Resource type
Date created
2017-11-14
Authors/Contributors
Author: Bailey, Sarah Reid
Abstract
We consider new baseball data from Statcast, which includes launch angle, launch velocity, and hit distance for batted balls in Major League Baseball during the 2015 and 2016 seasons. Using logistic regression, we train two models on 2015 data to estimate the probability that a player will get a hit on each of their 2015 at-bats. For each player, we sum these predictions and divide by their total at-bats to predict their 2016 batting average. We then use linear regression to express actual 2016 batting averages as a linear combination of the 2016 Statcast predictions and the 2016 PECOTA predictions. When using this procedure to obtain 2017 predictions, we find that the combined prediction performs better than PECOTA alone. This information may be used to make better predictions of batting averages for future seasons.
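The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names, the choice of launch angle and launch velocity as the only features, and all coefficient values are assumptions made for illustration; the abstract does not specify the features or fitted coefficients of either model. The sketch also assumes every at-bat has a recorded launch angle and velocity.

```python
import math
from collections import defaultdict


def hit_probability(launch_angle, launch_velocity, coef):
    """Logistic model for P(hit) on one at-bat.

    coef = (intercept, angle_weight, velocity_weight); these are
    hypothetical placeholders, not values from the thesis.
    """
    b0, b1, b2 = coef
    z = b0 + b1 * launch_angle + b2 * launch_velocity
    return 1.0 / (1.0 + math.exp(-z))


def predicted_averages(at_bats, coef):
    """Sum per-at-bat hit probabilities and divide by at-bats, per player.

    at_bats: iterable of (player, launch_angle, launch_velocity) tuples.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for player, angle, velocity in at_bats:
        total[player] += hit_probability(angle, velocity, coef)
        count[player] += 1
    return {player: total[player] / count[player] for player in total}


def combine(statcast_pred, pecota_pred, weights):
    """Linear combination of the two predictions.

    weights = (intercept, statcast_weight, pecota_weight); in practice
    these would be fitted by linear regression on a prior season.
    """
    a, b, c = weights
    return a + b * statcast_pred + c * pecota_pred
```

In use, the coefficients passed to `hit_probability` would come from a logistic regression fitted on one season, and the weights passed to `combine` from a linear regression of actual batting averages on the two sets of predictions.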
Document
Identifier
etd10438
Copyright statement
Copyright is held by the author.
Scholarly level
Member of collection
Download file | Size |
---|---|
etd10438_SBailey.pdf | 493.34 KB |