The Frictional Cost of a Call to the Bullpen

Photo from wikimedia.org

Post by Eli Shayer and Scott Powers

It is well known that a starter’s performance tails off as he pitches deeper into a game. This drop-off in results is attributed to facing the same batters multiple times, pitcher fatigue, and inconsistencies in mechanics. In this work, we examine reliever performance to see if there is an analogous effect.

Our study uses wOBA, a statistic developed by Tom Tango that measures the contribution of plate appearance results toward run creation, in units of runs. When assessing a pitcher’s quality, a low wOBA allowed indicates strong performance by the pitcher, while a high wOBA indicates the opposite. Expected wOBA is derived from the season wOBA of the batter. The figure below shows the difference between observed and expected wOBA for relievers, as a function of the number of batters faced, based on all batters faced by MLB relievers from 2000 to 2013.

For example, the value 1 along the x-axis corresponds to the first batter faced by relievers. The value 2 along the x-axis corresponds to the second batter faced by relievers, and so on. Because pitcher and batter handedness have a significant impact on the result, we split the results into a separate curve for each possible handedness pairing. The “All Handedness” curve is the unweighted average of the four other curves.
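As a rough illustration of how curves like these could be built, the sketch below averages observed minus expected wOBA at each batter-faced number for every handedness pairing, then takes the unweighted mean of those curves. The record layout (the keys `pitcher_hand`, `batter_hand`, `bf_number`, `woba_value`, `expected_woba`) is an assumption made for this example, not the format of the data used in the study.

```python
from collections import defaultdict


def woba_gap_by_batter_faced(plate_appearances):
    """Mean (observed - expected) wOBA per handedness pairing and BF number.

    Each plate appearance is a dict with hypothetical keys:
    pitcher_hand, batter_hand, bf_number, woba_value, expected_woba.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for pa in plate_appearances:
        pairing = pa["pitcher_hand"] + "v" + pa["batter_hand"]
        bf = pa["bf_number"]
        sums[pairing][bf] += pa["woba_value"] - pa["expected_woba"]
        counts[pairing][bf] += 1
    return {
        pairing: {bf: sums[pairing][bf] / counts[pairing][bf] for bf in sums[pairing]}
        for pairing in sums
    }


def all_handedness_curve(curves):
    """Unweighted average of the per-pairing curves at each BF number."""
    bf_numbers = set().union(*(curve.keys() for curve in curves.values()))
    combined = {}
    for bf in sorted(bf_numbers):
        values = [curve[bf] for curve in curves.values() if bf in curve]
        combined[bf] = sum(values) / len(values)
    return combined
```

Averaging the curves rather than pooling all plate appearances keeps the most common pairing from dominating the combined line.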

ShayerPowers1

After the fourth or fifth batter faced (BF), results fluctuate greatly due to insufficient sample size, but all curves show the same pattern at the beginning: On average, the wOBA of the first BF is 10 wOBA points higher, relative to expectation, than the wOBA of the second BF. This difference of 10 wOBA points scales to a difference of about 0.37 runs per 9 innings (because the average number of batters faced per 9 innings is about 37).
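The conversion in the paragraph above is simple arithmetic, sketched here for concreteness. It follows the post's own approximation, treating a wOBA difference as a per-batter run difference and using about 37 batters faced per 9 innings; the function and constant names are made up for this example.

```python
WOBA_POINT = 0.001  # one "point" of wOBA, e.g. .320 vs .321


def runs_per_nine(woba_points, batters_per_nine=37):
    """Scale a wOBA gap (in points) to runs per 9 innings.

    Follows the post's approximation: the wOBA gap is treated as a
    per-batter run value and multiplied by batters faced per 9 innings.
    """
    return woba_points * WOBA_POINT * batters_per_nine


print(round(runs_per_nine(10), 2))
```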

Our proposed explanation of this frictional substitution cost is that pitchers need to get a feel for their pitches and throw at full effort before they are completely game-ready. Warm-up pitches in the bullpen do not appear to sufficiently prepare relievers for appearing in the game, and they pay the price when facing their first batter.

What kind of relievers struggle most against the first batter faced?

While we account for batter skill by comparing results against expected results for the batter (and in doing so adjust for year), the above results do not account for pitcher ability. Pitchers who face more batters than average are over-represented, relatively, against the fifth BF, while pitchers with fewer BFs than average are relatively over-represented against the first batter faced.

To account for this source of bias, we define a reliever’s type based on the number of batters he faces in an average outing. The three categories were < 3.5 BF, 3.5 – 4 BF, and > 4 BF. These categories were derived from the distribution of average number of batters faced, which was centered at 3.5 – 4 BF, with long tails on either end. Using the same model as above, we made a similar graph for each category of reliever, included in the figure below.
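A minimal sketch of this grouping, assuming outings arrive as (pitcher, batters faced) pairs. The post does not say how the boundary values 3.5 and 4 were assigned, so putting both in the middle category is an assumption of this example.

```python
def reliever_type(avg_bf):
    """Map average batters faced per outing to one of the three categories.

    Boundary handling (3.5 and 4.0 both fall in the middle group) is an
    assumption; the post does not specify it.
    """
    if avg_bf < 3.5:
        return "< 3.5 BF"
    if avg_bf <= 4.0:
        return "3.5 - 4 BF"
    return "> 4 BF"


def classify_relievers(outings):
    """outings: iterable of (pitcher_id, batters_faced) pairs, one per outing."""
    totals = {}
    for pitcher_id, batters_faced in outings:
        bf_sum, n_outings = totals.get(pitcher_id, (0, 0))
        totals[pitcher_id] = (bf_sum + batters_faced, n_outings + 1)
    return {
        pitcher_id: reliever_type(bf_sum / n_outings)
        for pitcher_id, (bf_sum, n_outings) in totals.items()
    }
```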

ShayerPowers2

At this level of granularity, the sample sizes are small enough that noise masks the signal. In none of the three graphs above is there a clear trend. However, one important observation is that the three groups of relievers do not differ significantly in overall performance across the first five batters faced. So we have assuaged concerns that the observed first batter effect may be due to sampling bias.

How do relievers struggle against the first batter faced?

To try to understand exactly how reliever performance changes as they face more batters, we broke down the distribution of results for each number of batters faced. In the table that follows, we have found the percentage of plate appearances that end in each result.

ShayerPowers3
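The percentages in a table like the one above could be tabulated along these lines. This is only a sketch: the input format (pairs of batter-faced number and result label) and the result labels themselves are assumptions for illustration.

```python
from collections import Counter, defaultdict


def outcome_percentages(plate_appearances):
    """Percentage of plate appearances ending in each result, per BF number.

    plate_appearances: iterable of (bf_number, outcome) pairs, where outcome
    is a label such as "single" or "home run" (labels are hypothetical).
    """
    counts = defaultdict(Counter)
    for bf_number, outcome in plate_appearances:
        counts[bf_number][outcome] += 1
    table = {}
    for bf_number, outcome_counts in counts.items():
        total = sum(outcome_counts.values())
        table[bf_number] = {
            outcome: 100.0 * n / total for outcome, n in outcome_counts.items()
        }
    return table
```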

The data in the table demonstrate that the mechanism for relievers performing worse against their first batter faced is a high level of power. The first batter of a reliever’s appearance hits fewer singles than typical, but makes up for it by hitting an above-average proportion of doubles, triples, and home runs. Additionally, the peril of leaving a reliever in too long is clear when comparing the first few batters faced to the last few batters faced in the table. In fact, the first batter effect is overtaken by a reliever fatigue effect by the 7th batter, at which point reliever performance steadily worsens and falls below the level of performance against the first batter.

What about the first batter faced in subsequent innings?

The final aspect of our analysis was to ask whether there is a first batter effect for each inning similar to the one we found for each appearance. Because we already know that the first batter effect exists in the first inning of an appearance, we separated that effect from any potential first-batter-of-the-inning effect. This analysis therefore looked exclusively at plate appearances by relievers who came out of the dugout after recording the final out of the previous inning. Pooling all innings other than the first of each appearance by number of batters faced in the inning produces the figure below.
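One way to isolate those plate appearances is sketched below, under the assumption that each relief appearance is available as an ordered list of inning numbers, one per plate appearance. Batters in the inning where the reliever entered are dropped, and batters in every later inning are numbered from 1 within that inning. The function name and input layout are hypothetical.

```python
def later_inning_batter_numbers(pa_innings):
    """Number the batters within each inning after the entry inning.

    pa_innings: inning number of each plate appearance in one relief
    appearance, in chronological order. Returns (inning, batter_in_inning)
    pairs, skipping the inning in which the reliever entered the game.
    """
    if not pa_innings:
        return []
    entry_inning = pa_innings[0]
    numbered = []
    current_inning, batter_in_inning = None, 0
    for inning in pa_innings:
        if inning == entry_inning:
            continue  # first inning of the appearance is analyzed separately
        if inning != current_inning:
            current_inning, batter_in_inning = inning, 0
        batter_in_inning += 1
        numbered.append((inning, batter_in_inning))
    return numbered
```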

ShayerPowers4

The graph doesn’t show any notable patterns in the first several batters faced. There doesn’t appear to be an analogous first batter effect, and if anything the data show the opposite result. There is an oddly consistent result in the fifth and sixth batters faced, which is a source of intrigue. Otherwise, the data don’t show an effect on the first batter of an inning, other than the first batter of the appearance as a whole.

Conclusion

We have shown that relievers struggle against the first batter they face, relative to expectation. Data were insufficient to identify which types of relievers suffered from this effect most, but we were able to identify that the reason for the increased wOBA of the first batter faced is an increase in power numbers. That is, the proportion of doubles, triples, and home runs against the first BF is higher than would otherwise be expected when relievers enter a ballgame.

Intuitively, these results make sense. A reliever who has just entered the game could not be described as being “in rhythm.” These results suggest that there is an increased risk of such a reliever throwing a mistake pitch, resulting in extra bases. Perhaps, on average, the time spent warming up in the bullpen is insufficient for a reliever to be “game ready.”

The frictional cost we observed is the equivalent of a difference of about 0.37 runs in ERA. So while much has been made of the value of using relievers, this effect is something that managers need to take into account when they are managing their bullpens.

Something that we did not explore is whether relievers struggle more against the first batter faced when they have more or less forewarning that they will enter the game. This preparedness may be difficult to measure, but a possible surrogate would be an indicator of whether the reliever entered mid-inning. We leave this to future work.

Eli Shayer is an undeclared freshman from Anchorage, Alaska. He misses having snow available for cross country skiing.

Scott Powers is a PhD student in statistics and an analytics consultant to the Oakland Athletics. He plays catcher for the club baseball team and setter for the club volleyball team.

Contact Eli at eshayer ‘at’ stanford.edu and Scott at sspowers ‘at’ stanford.edu

Examining MLB Postseason Cluster Luck: or, Why the Playoffs Might Be a Crapshoot

Photo from wikimedia.org

Post by Vihan Lakshman

What role does luck play in baseball success? As one of the pioneering sports in quantitative analysis, our national pastime is now understood, in many respects, as a finely tuned game of numbers. But does that tell the whole story?

Many prominent baseball figures, including Billy Beane, have described the MLB playoffs as a “crapshoot,” a roll of the dice that throws regular season success out the window. As Beane puts it, the teams who make the playoffs undoubtedly deserve to be there following a marathon 162-game regular season, but pure luck might be the ultimate factor behind who finally ends up hoisting the World Series trophy.

To explore this idea of postseason luck in more detail, we can examine the “cluster luck” of teams in the regular season and the postseason. First coined by Joe Peta in his book Trading Bases, cluster luck provides a numerical measure of a team’s fortune in stringing together hits.

Jonah Keri of Grantland explains the phenomenon of cluster luck with an example: “Say a team tallies nine singles in one game. If all of those singles occur in the same inning, the team would likely score seven runs; if each single occurs in a different inning, however, it’d likely mean a shutout.”

As a further example of very unfortunate cluster luck, consider this box score from Baseball-Reference from a 2005 meeting between Minnesota and Kansas City, in which the Twins tied a 1969 MLB record for the most hits in a game without a run.

Vihan1

Thus, if we use cluster luck as a tool to measure the respective fortunes of MLB teams in the regular season and the postseason, we might be able to shed some light on whether the playoffs are indeed a crapshoot, or whether there is, in fact, a correlation between regular season and postseason cluster luck, which would suggest that cluster luck may not be luck at all.

While the idea behind cluster luck may make intuitive sense, there is no clear-cut, standard method of calculating how well a team bunches hits together. In this analysis, I used the base runs formula, a model for predicting scoring that is widely considered one of the most accurate sabermetric run estimators. For all playoff teams between 2007 and 2014, I calculated each club’s regular season and postseason luck by determining its predicted run total from the base runs formula and subtracting that from the actual number of runs scored. A negative number indicates that a team scored fewer runs than expected and is hence “unlucky,” while a positive number denotes “good luck” and specifies by how many runs a team exceeded the base runs prediction.
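To make the calculation concrete, here is a minimal sketch of cluster luck as actual runs minus base runs. Base runs has several published variants and the post does not say which one was used; the coefficients below are one commonly cited simple version and should be read as an assumption, as should the function names and the sample team totals.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs with a simple base runs variant: A * B / (B + C) + D.

    The coefficients are one commonly cited simple version (an assumption;
    the post does not specify which variant was used).
    """
    a = h + bb - hr                                        # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02    # advancement
    c = ab - h                                             # outs
    d = hr                                                 # automatic runs
    return a * b / (b + c) + d


def cluster_luck(actual_runs, ab, h, tb, hr, bb):
    """Positive means the team scored more runs than base runs predicts."""
    return actual_runs - base_runs(ab, h, tb, hr, bb)


# Hypothetical season totals, for illustration only.
print(round(cluster_luck(720, ab=5500, h=1400, tb=2200, hr=160, bb=500), 1))
```

Because the estimate depends only on a team's totals and not on when its hits occurred, the gap between actual and estimated runs serves as a proxy for how favorably those hits were sequenced.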

In examining the World Series winners from 2007 to 2014, we see that the vast majority of teams enjoyed positive cluster luck in the postseason.

Vihan2

Perhaps what’s more surprising about this list is the overwhelming amount of negative cluster luck during the regular season, most notably on the part of the 2009 Yankees who finished at the bottom of MLB in regular season luck. This phenomenon can likely be explained by considering that teams who manage to win games in spite of bad luck might be the most talented. In addition, this table of World Series winners provides our first bit of evidence that there may not be a correlation between regular season and postseason cluster luck, affirming the theory of the playoffs as a crapshoot.

To test this idea in further detail, I conducted a simple linear regression examining postseason cluster luck versus regular season luck.

Vihan3

Under the null hypothesis that the true slope of our linear regression is 0, we use a two-sided t-test to obtain a p-value of 0.6201, which is greater than our significance level of 0.1. Therefore, we cannot reject our null hypothesis and cannot conclude anything further about the relationship between postseason and regular season luck.

In our regression, we obtained an R² value of 0.003987, suggesting that regular season cluster luck explains virtually none of the variance in postseason luck.
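For readers who want to reproduce this kind of test, the sketch below computes the slope, intercept, R², and the t statistic for the null hypothesis of zero slope using only the standard library. It is not the code behind the numbers above, and the input values are made up. The two-sided p-value would come from a t distribution with n − 2 degrees of freedom, which is left out here because the standard library has no t distribution.

```python
import math


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y on x.

    Returns (slope, intercept, r_squared, t_stat), where t_stat tests the
    null hypothesis that the true slope is 0. Requires at least 3 points,
    some variation in both x and y, and a fit that is not perfect
    (otherwise the standard error is 0).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    residual_ss = syy - slope * sxy            # sum of squared residuals
    slope_se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, r_squared, slope / slope_se


# Made-up values standing in for regular season (x) and postseason (y) luck.
slope, intercept, r2, t_stat = simple_linear_regression(
    [1, 2, 3, 4, 5], [2, 1, 4, 3, 5]
)
print(slope, intercept, r2, t_stat)
```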

Ultimately, we found no evidence of a relationship between a team’s luck in the regular season and in the playoffs, which is what one would expect if it were truly luck. Although we cannot conclude that no relationship exists, there might in fact be something to the intuitive notion that the playoffs are a crapshoot. Whether this news is comforting to perennial playoff disappointments like the A’s, I can’t say, but how large a role luck plays in determining legacies in sports is a fascinating question and definitely deserving of further exploration.

Vihan Lakshman is a junior from Savannah, GA studying mathematics. He also writes about football for The Stanford Daily and broadcasts sports for KZSU student radio. In his free time, he loves playing intramural sports and hopelessly rooting for the Atlanta Falcons to return to the Super Bowl.

Contact Vihan at vihan ‘at’ stanford.edu