NCAA I Pitcher, Lafayette College
University of Chicago MBA ’18
In a previous post we clustered 2017 MLB starting pitchers into four mutually exclusive groups based on their pitching style. The groups were: Elite Power, Finesse, Fastball, and Off-Speed. With these groups defined, some natural questions arise.
- Do umpires systematically call balls and strikes differently for these groups?
- Do some teams hit certain pitcher types better than others? (e.g., a good fastball-hitting team)
- Are some pitcher types more durable than others?
We at Lokation Nation decided to look into the first question. Our hypothesis is that pitchers who command the strike zone get marginal pitches called strikes more often than pitchers with less command. Anyone who watched the Braves in the 1990s is familiar with this potential bias; however, there has been no extensive analysis of strike zones for different pitcher types. We find that Finesse pitchers have a larger strike zone than Elite Power pitchers, particularly down in the zone.
Finesse pitchers can benefit from an expanded strike zone by commanding the lower boundaries of the zone. Consistently pitching to and testing the confines of the strike zone will earn a pitcher called strikes and induce swings on marginal pitches that are more difficult to lift for home runs.
Other pitcher types should take notice of this finding too. By mimicking Finesse pitchers (i.e. commanding the corners of the zone) they too can expand the limits of the strike zone.
To evaluate the size of the strike zone for different pitcher types, we piggy-backed on our previous clustering work, using the same dataset and pitcher groupings. We are only interested in pitches on which the batter did not swing, because we want to evaluate called balls and strikes. After filtering to called balls and strikes, between 9,000 and 12,000 pitches remain for each pitcher type.
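The filtering step described above can be sketched in a few lines of Python. This is only an illustration: the field names (`pitcher_type`, `description`) and the outcome labels are assumptions made for the example, not the actual columns of the dataset used in this post, and the sample records are invented.

```python
# Hypothetical outcome labels for pitches the batter did not swing at.
CALLED_OUTCOMES = {"ball", "called_strike"}

def called_pitches(pitches):
    """Keep only taken pitches, i.e. called balls and called strikes."""
    return [p for p in pitches if p["description"] in CALLED_OUTCOMES]

def count_by_group(pitches):
    """Count called pitches for each pitcher type."""
    counts = {}
    for p in called_pitches(pitches):
        counts[p["pitcher_type"]] = counts.get(p["pitcher_type"], 0) + 1
    return counts

# Invented sample records, for illustration only.
sample = [
    {"pitcher_type": "Finesse", "description": "called_strike"},
    {"pitcher_type": "Finesse", "description": "swinging_strike"},
    {"pitcher_type": "Elite Power", "description": "ball"},
    {"pitcher_type": "Elite Power", "description": "foul"},
    {"pitcher_type": "Finesse", "description": "ball"},
]
print(count_by_group(sample))
```

Swings, fouls, and balls in play drop out, so only the umpire's decisions remain in each group's count.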
I decided to borrow a methodology employed by Jim Albert to determine what the “true” strike zone is in the Major Leagues. The methodology involves building a binary classification model that takes inputs from the data and outputs an estimated probability of the pitch being called a strike. In this instance, I chose a logistic generalized additive model (GAM), which allows a lot of flexibility in fitting the data but is still easy to interpret. The only inputs to the model are the vertical and horizontal location of the pitch when it crosses the plate. The model’s form is below, where s is a smooth, continuous function of the vertical and horizontal coordinates.
Log [Prob (Strike) / Prob (Ball)] = s(Vertical Location, Horizontal Location) + Constant
I built a separate model for each pitcher type on all the data available. Fitting separate models allows an identically located pitch to have a different probability of being called a strike for each pitcher type. From there I plotted a few contours showing the pitch locations that have specific probabilities of being called a strike. Holding the strike probability constant and comparing across pitcher types, one can determine which pitcher types enjoy a larger strike zone.
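To make the per-group modeling step concrete, here is a rough Python sketch. It is not the model used in this post: a quadratic surface in the two coordinates stands in for the GAM's smooth function, it is fit by plain logistic regression with damped Newton steps, and the pitches are simulated from made-up elliptical zones. Every function name and every number below is an assumption for illustration.

```python
import numpy as np

def basis(x, z):
    """Quadratic surface in (x, z): a crude stand-in for the smooth s(x, z)."""
    zc = z - 2.5  # center height (ft) to keep the fit well conditioned
    return np.column_stack([np.ones_like(x), x, zc, x**2, zc**2, x * zc])

def log_lik(X, y, beta):
    eta = np.clip(X @ beta, -30, 30)
    return np.sum(y * eta - np.log1p(np.exp(eta)))

def fit_logistic(X, y, n_iter=30):
    """Maximize the logistic log-likelihood with damped Newton steps."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ beta, -30, 30)))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None]) + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hessian, X.T @ (y - p))
        t = 1.0
        # Halve the step until the log-likelihood does not decrease.
        while t > 1e-4 and log_lik(X, y, beta + t * step) < log_lik(X, y, beta):
            t /= 2
        beta = beta + t * step
    return beta

def zone_area(beta, level=0.5, n=201):
    """Approximate area (sq ft) of the region where P(strike) >= level."""
    xs = np.linspace(-2.0, 2.0, n)
    zs = np.linspace(0.5, 4.5, n)
    gx, gz = np.meshgrid(xs, zs)
    eta = np.clip(basis(gx.ravel(), gz.ravel()) @ beta, -30, 30)
    p = 1.0 / (1.0 + np.exp(-eta))
    cell = (xs[1] - xs[0]) * (zs[1] - zs[0])
    return float(np.sum(p >= level) * cell)

def simulate(rng, n, half_width, half_height):
    """Synthetic called pitches from an elliptical 'true' zone (invented)."""
    x = rng.uniform(-2.0, 2.0, n)
    z = rng.uniform(0.5, 4.5, n)
    logit = 3.0 * (1.0 - (x / half_width) ** 2 - ((z - 2.5) / half_height) ** 2)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
    return x, z, y

rng = np.random.default_rng(0)
# Invented zone half-widths and half-heights in feet, not estimates from real data.
groups = {"Finesse": (0.9, 1.2), "Elite Power": (0.9, 1.0)}
areas = {}
for name, (hw, hh) in groups.items():
    x, z, y = simulate(rng, 6000, hw, hh)
    areas[name] = zone_area(fit_logistic(basis(x, z), y))
print(areas)
```

The area inside the 50% contour gives a single number per pitcher type, which is one way to summarize "who has the bigger zone." A real GAM would replace `basis` with penalized splines so the contour is not forced to be an ellipse.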
Below are the contour plots for each pitcher type against a right-handed batter. A pitch on the light blue line has a 90% chance of being called a strike. A pitch inside the light blue line would have a higher probability of being called a strike. A pitch on the darker blue line has a 50% chance of being called a strike. The red rectangle is the rulebook definition of the strike zone (unadjusted for batter height).
The first takeaway from these plots is that umpires’ strike zones vary from the technical rulebook definition of the strike zone. Umpires tend to be more generous to pitchers on outside pitches and, to a varying extent, on low pitches. Umpires are less willing to call strikes on pitches in the upper corners of the strike zone. These findings align with what observant fans may have noticed.
Comparing across pitcher types, Finesse pitchers have the largest area inside the dark blue shape. This means that a pitch with a 50-50 chance of being called a strike from an Elite Power pitcher would have a better chance of being called a strike if thrown by a Finesse pitcher. In fact, Elite Power pitchers have the smallest empirical called strike zone. The strike zone differences between Finesse and Elite Power pitchers are statistically significant at the 95% confidence level, meaning that differences this large would be expected less than 5% of the time from random variation (chance or luck) alone if the two groups truly shared the same zone.
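For readers who want to see what a significance check of this kind can look like, here is a small sketch of a permutation test. To be clear, this post does not specify which test was used, so this is a swapped-in illustration rather than the actual procedure: it compares called-strike rates on a set of borderline pitches, and the counts are invented.

```python
import random

def strike_rate(calls):
    """Share of pitches called strikes (calls are 1 for strike, 0 for ball)."""
    return sum(calls) / len(calls)

def permutation_p_value(calls_a, calls_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in called-strike rate."""
    rng = random.Random(seed)
    observed = abs(strike_rate(calls_a) - strike_rate(calls_b))
    pooled = list(calls_a) + list(calls_b)
    n_a = len(calls_a)
    extreme = 0
    for _ in range(n_perm):
        # Shuffle the group labels: under the null, labels are exchangeable.
        rng.shuffle(pooled)
        diff = abs(strike_rate(pooled[:n_a]) - strike_rate(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Invented counts of borderline pitches: 55% vs. 45% called strikes.
finesse = [1] * 330 + [0] * 270
elite_power = [1] * 270 + [0] * 330
p_value = permutation_p_value(finesse, elite_power)
print(p_value < 0.05)
```

The p-value answers a narrow question: if pitcher type had no effect on the call, how often would shuffled labels produce a gap at least as large as the one observed?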
These contours do not offer any explanations for discrepancies across different pitcher types. It is up to the analyst to make inferences into why Finesse pitchers get lower pitches called strikes at higher rates than Elite Power pitchers. One way to make such inferences is to look at the types of attributes that go into defining these pitcher types. The differentiating attributes of Finesse pitchers are that they “Nibble” more (i.e. pitch to the corners), pitch behind in the count more, and don’t have high fastball velocity. We posit that the expanded Finesse pitcher strike zone is most associated with commanding the corners of the strike zone, “Nibbling”.
It is possible that there are confounding factors that could lead to the same outcome. For instance, perhaps the strike zones differ because umpires loosen up in certain counts and Finesse pitchers are in those counts more often. Jim Albert’s analysis showed that umpires’ strike zones contract when there are two strikes on the batter. Finesse pitchers are behind in the count more often than Elite Power pitchers (though the difference is small), so this confounding story is plausible. The effects of count on the strike zone should be explored further in the future.
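The confounding story above can be illustrated with a toy example. The numbers below are invented so that both pitcher types get exactly the same calls within each count situation, yet their overall called-strike rates differ because they throw a different share of pitches with two strikes. The field names are hypothetical.

```python
from collections import defaultdict

def strike_rates(pitches, key):
    """Called-strike rate for each value of key(pitch)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [called strikes, pitches]
    for p in pitches:
        cell = totals[key(p)]
        cell[0] += p["called_strike"]
        cell[1] += 1
    return {k: strikes / n for k, (strikes, n) in totals.items()}

def make(pitcher_type, two_strikes, n, n_called):
    """Build n synthetic borderline pitches, n_called of them called strikes."""
    return [
        {"pitcher_type": pitcher_type, "two_strikes": two_strikes,
         "called_strike": 1 if i < n_called else 0}
        for i in range(n)
    ]

# Invented data: same rate within each count state, different count mix.
pitches = (
    make("Finesse", False, 80, 40) + make("Finesse", True, 20, 6)
    + make("Elite Power", False, 50, 25) + make("Elite Power", True, 50, 15)
)

overall = strike_rates(pitches, key=lambda p: p["pitcher_type"])
by_count = strike_rates(pitches, key=lambda p: (p["pitcher_type"], p["two_strikes"]))
print(overall)
print(by_count)
```

If the gap between pitcher types shrinks or disappears once pitches are compared within the same count, count is doing some of the work. Adding count as an input to the strike zone model would be a more complete version of the same check.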
About the Blogger:
Brad Woodfield holds a BA in math and econ from Lafayette College (NCAA I), where he pitched. He closed the 2007 Patriot League Championship game and pitched in the 2007 NCAA Charlottesville Regional. Brad is currently an MBA candidate at the University of Chicago’s Booth School of Business with a concentration in econometrics and classes in sports analytics. Brad has teamed up with Lokation Nation on the data and analytics side to provide studies and information that let the baseball community view and fairly judge the results of pitching command and location in relation to velocity. We will be exploring a variety of hypotheses, topics, and theories to help us better understand how command and location may be a bit underrated in terms of what makes pitchers truly successful. Brad brings a unique perspective to analytics because he played NCAA I baseball and has studied analytics at a high level.