Predicting Baseball Pitcher Efficacy Using Physical Pitch Characteristics
DOI:
https://doi.org/10.58445/rars.95
Keywords:
baseball, sabermetrics, sports statistics, machine learning, statistical analysis, regression, neural network, pitcher
Abstract
The efficacy of baseball pitchers can be predicted from prior pitching data using machine learning models. Previous machine learning work on baseball has primarily focused on predicting the outcomes of games and of individual pitches. This paper is the first work to use sixteen game-independent features, which describe a pitcher’s set of thrown pitches, to predict multiple pitcher efficacy metrics, such as walks plus hits per inning pitched (WHIP), batting average against (BAA), and fielding independent pitching (FIP). We hypothesized that these sixteen “physical features,” which are measured by sensors, can explain more than 50% of the variance in pitcher efficacy. We used a neural network model to predict the efficacy metrics from all sixteen features, and linear regression models to analyze the individual impact of each feature on each metric. Both the neural network and linear regression models indicated that the “ballFrequency” feature was the most impactful in predicting a pitcher’s WHIP. For BAA and FIP, the linear regression models showed that no individual feature was impactful; however, the neural network model improved the prediction of both metrics. Our evaluations did not support the hypothesis: the models accounted for less than 50% of the variance in the pitcher efficacy metrics. Professional scouts can still use the results of our individual feature analysis to help select pitchers who have never played a game at the professional level.
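For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of their conventional sabermetric definitions, together with the coefficient of determination (R²) that is commonly used to express "variance explained." The abstract does not state the exact formulas used in the paper, so the function names, the default FIP constant, and the reading of the 50% threshold as R² > 0.5 are all assumptions made for illustration.

```python
def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched (conventional definition)."""
    return (walks + hits) / innings_pitched


def baa(hits_allowed, at_bats_faced):
    """Batting average against: hits allowed divided by at-bats faced."""
    return hits_allowed / at_bats_faced


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched,
        constant=3.10):
    """Fielding independent pitching (conventional definition).

    `constant` is a league- and season-specific scaling term; the
    default of 3.10 is an illustrative assumption, not a value taken
    from the paper.
    """
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings_pitched + constant


def r_squared(actual, predicted):
    """Fraction of variance in `actual` explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def supports_hypothesis(actual, predicted, threshold=0.5):
    """Assumed reading of the hypothesis: the model explains > 50% of variance."""
    return r_squared(actual, predicted) > threshold
```

Under this reading, a model whose predictions are no better than always guessing the mean has R² = 0, and the hypothesis would require R² above 0.5 for each efficacy metric.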
Versions
- 2022-12-23 (4)
- 2022-12-23 (3)
- 2022-12-23 (2)
- 2022-12-13 (1)
License
Copyright (c) 2022 Tejas Oberoi
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.