In Part 1 of this series, we looked at the weaknesses of using ERA as a statistic to measure pitcher performance. It turns out there are a number of factors that have a substantial impact on a pitcher’s ERA that the pitcher doesn’t control. Let’s take a look at a few of the statistics that improve on ERA.
Isolating the Pitcher’s Contribution
Suppose we pare back things for which we hold the pitcher accountable. Pitchers do have significant control over strikeouts, although catchers play a role with calling pitches and pitch framing. Umpire strike zones matter, too. But pitcher performance plays an overwhelming role in strikeouts. The same is true for walks and hit batters. Let’s give the pitcher credit and blame for those three outcomes.
For the moment, let’s assume home runs are something the pitcher controls. The number of home runs surrendered is unrelated to defense or relief pitchers or sequencing or official scorers, although it does depend on park factors. But let’s assume for now that home runs belong in the bucket of stuff the pitcher controls.
We need a statistic that evaluates pitchers on those outcomes. Just count up home runs, walks, HBP and strikeouts. Those are standard box score stats. Nothing fancy. Figure out a weighting for each that reflects the known data on contribution to runs scored. To help with familiarity, use a formula that puts our stat on the same scale as ERA with 4.00 about average, 5.00 and above lousy, 3.50 good, and below 3.00 outstanding.
What we described is Fielding Independent Pitching (FIP), which is one of a group of similar statistics referred to as ERA Estimators. FIP has been popularized by the site FanGraphs and used as a basis for their WAR calculations.
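As a rough sketch of the recipe just described, the commonly published FIP formula weights home runs by 13, walks plus hit batters by 3 and strikeouts by 2, divides by innings pitched, and adds a constant that puts the result on the ERA scale. The constant is recalculated each season so that league FIP matches league ERA; the 3.10 used here is only a placeholder, and the season line in the example is made up.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from box score totals.

    `constant` is a placeholder; the real value changes by season
    so that league-average FIP equals league-average ERA.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line, invented for illustration only.
example_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)
print(round(example_fip, 2))
```

Notice what is missing from the inputs: hits, runs and earned runs never enter the calculation.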
FIP – Evaluating on Strikeouts, Walks and Home Runs
Fielding Independent Pitching measures how the pitcher actually pitched. The pitcher gave up those home runs and walks. He struck out those batters.
It’s more crucial to list the factors that don’t influence FIP. It doesn’t look at the number of runs scored or whether they were earned. It doesn’t include hits in the formula. Remember from yesterday’s post that hits have a huge component of randomness and are affected by batter skill and defense.
FIP does a better job of isolating what the pitcher controls in his performance than does ERA. FIP does not depend on official scorers’ decisions, or a shortstop’s range, or a left fielder’s arm strength, or the effectiveness of relief pitchers, or the sequence of events, or whether soft fly balls fall in as hits.
Research shows a pitcher’s FIP is a better predictor of how many runs he’ll give up in the future than his ERA is. Think of FIP as what a pitcher’s ERA would be assuming average defense, average bullpen and average luck.
A pitcher’s FIP is more stable than ERA from year to year, which is another indication it better reflects actual pitcher talent. If a pitcher has a long enough career, his ERA usually converges to his FIP. Of pitchers with at least a thousand innings pitched, 75% had an ERA within 0.2 of their FIP.
FIP isn’t perfect. It doesn’t account for that small part of batted balls that the pitcher does control. It includes home runs, even though those are influenced by park factors. But if you’re looking for a better measure of pitching performance, it’s a good place to start.
xFIP – Evaluating on Strikeouts, Walks and Fly Balls
Let’s go back to home runs. After years of study, we’ve learned that pitchers surrender one home run for every 10-12 fly balls they allow. That stat is expressed as the ratio HR/FB. For many years, HR/FB remained near 10%. In the past three seasons, the number jumped to around 12.5%.
Pitchers do have a degree of control over the number of fly balls they give up. If, as the data indicates, home runs are a reasonably consistent percentage of fly balls, the number of home runs a pitcher gives up is a function of his fly ball percentage (FB%).
Let’s say we wanted a version of FIP that “normalizes” home runs across luck and stadium dimensions. The way to do that would be to remove HR from the equation and replace it with an estimate of how many home runs the pitcher’s fly balls would produce at the league-average HR/FB rate.
That statistic is called xFIP where the “x” stands for “expected.” FIP counts how many home runs a pitcher gives up. xFIP estimates how many home runs a pitcher should give up assuming average luck and stadium size. It works essentially the same way FIP does. Pitchers control strikeouts, walks, hit batters and fly ball percentage. The formula is scaled to ERA. You can find xFIP at FanGraphs.
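A minimal sketch of that substitution, assuming the commonly published form of xFIP: actual home runs are swapped for fly balls allowed multiplied by the league HR/FB rate, and everything else matches FIP. The league rate and the scaling constant below are placeholders, and the season line is made up.

```python
def xfip(fb, bb, hbp, k, ip, lg_hr_per_fb=0.10, constant=3.10):
    """Expected FIP: like FIP, but with home runs estimated from fly balls.

    `lg_hr_per_fb` and `constant` are placeholder values; both vary by season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    expected_hr = fb * lg_hr_per_fb  # home runs at a league-average HR/FB rate
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: 180 fly balls allowed in a 12.5% HR/FB league.
example_xfip = xfip(fb=180, bb=50, hbp=5, k=200, ip=200.0, lg_hr_per_fb=0.125)
print(round(example_xfip, 2))
```

Two pitchers with identical strikeouts, walks, hit batters and fly balls get the same xFIP here, no matter how many of those fly balls actually left the park.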
Why is xFIP important?
Over a season, the number of home runs an individual pitcher gives up varies quite a bit and might even diverge from league average over the duration of an entire year. Eventually the pitcher will move back toward league average. But an unusually high or low HR/FB for certain stretches may not be a good indicator of his true talent.
Studies show that xFIP is a better predictor of future pitching than FIP. Both are better than ERA.
SIERA – Adding Back Some Pitcher Skills
Let’s return to that small amount of influence pitchers have on batted balls and try to factor that into an ERA estimator.
Here is the raw data: Pitchers with greater velocity and more strikeouts also generate more poor contact and more double plays per ground ball. Pitchers with higher walk rates give up more runs than would be supposed by straight linearity. Pitchers with higher ground ball rates have lower out rates than fly ball pitchers.
A formula that takes all of that into account is more complicated than the one for FIP or xFIP. But it is still based on what the pitcher controls.
This statistic is called SIERA, which stands for Skill-Interactive ERA. You can find it at FanGraphs.
SIERA assumes the pitcher has average luck, defense, sequencing, park factors and home runs. It incorporates strikeouts, walks, HBP and FB% as things under the pitcher’s control. What SIERA adds to xFIP is an attempt to model the small fraction of batted balls that the pitcher can influence.
Studies show that SIERA is a better predictor of future pitching than xFIP, FIP and ERA.
DRA
In 2015, the folks at Baseball Prospectus (a historic and tremendous baseball site) introduced their own stylized pitching statistic. It’s called Deserved Runs Average (DRA). DRA is a “mixed model” because like ERA it weights all batting events, including hits, but normalizes ERA in many, many ways. DRA controls for the stadium, temperature, quality of opposing batter, pitching on the road, defense, pitch count, catcher framing, umpire strike zone, number of runners on base, number of outs, base runner speed and more. It’s also scaled the same as ERA.
Statcast “Expected” Stats
MLB’s Trackman system now gives us batted ball data, such as exit velocity and launch angle, for every play. Using that, it’s possible to develop new measures of how the pitcher performed. MLB’s Statcast Search page contains several new statistics that look at every hit ball a pitcher gives up.
Based on exit velocity and launch angle, it’s possible to formulate an expectation for how many hits and extra-base hits the pitcher should have given up. Examples include expected batting average (xBA), expected slugging percentage (xSLG) and expected weighted on-base average (xwOBA). You can find them at the Baseball Savant website operated by MLB. They evaluate pitchers but are scaled to hitting stats, not ERA.
These new stats share similarities with FIP, xFIP and SIERA in that they assume average defense, bullpens, sequencing and, to a certain degree, luck.
But there is an important difference between the ERA Estimators and the new Statcast Expected Stats. Expected Stats give the pitcher 100% credit for the batted ball profile he surrendered. If a pitcher gives up more hits with a powerful angle-velocity combination, Expected Stats attribute it entirely to pitcher performance. But we know pitchers control far less of the variance in batted-ball profiles than that.
The Statcast Expected Stats give additional insight into actual pitcher performance and eliminate much of the noise that makes ERA an unreliable measure. But assuming that pitchers have complete control over batted balls will lead you down a questionable path.
“Minus” Stats
It is possible to adjust certain statistics for park effects. The convention among baseball statisticians is to put a minus sign at the end and scale the statistics so that 100 is league average. You can find ERA-, FIP- and xFIP-. Every point below 100 is one percentage point better than average. For example, a pitcher with a FIP- of 90 is 10 percent better than average, taking into account ballpark.
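A rough sketch of how a minus stat can be computed, assuming one published form of the park adjustment in which the park factor is expressed on a scale where 100 is neutral. The function name and the inputs are illustrative, and real implementations may handle league and park factors differently.

```python
def minus_stat(value, league_value, park_factor=100.0):
    """Scale an ERA-type stat so that 100 is league average.

    `park_factor` is assumed to be on a 100-is-neutral scale; values
    above 100 mean a hitter-friendly home park.
    """
    if league_value <= 0:
        raise ValueError("league value must be positive")
    # Shift the raw value down for hitter-friendly parks, up for pitcher-friendly ones.
    adjusted = value + (value - value * park_factor / 100.0)
    return 100.0 * adjusted / league_value


# Hypothetical pitcher with a 3.60 FIP in a league with a 4.00 FIP.
neutral_park = minus_stat(3.60, 4.00)           # about 90
hitter_park = minus_stat(3.60, 4.00, 105.0)     # about 85.5
print(round(neutral_park, 1), round(hitter_park, 1))
```

The same raw FIP grades out better once a hitter-friendly home park is taken into account.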
About WHIP
The statistic WHIP stands for Walks plus Hits per Inning Pitched. It measures how many base runners a pitcher allows per inning. Because it’s a non-traditional baseball acronym, people often assume WHIP is a new-fangled sabermetric stat when that isn’t the case.
WHIP was a term invented by the guys who came up with the first fantasy baseball league in 1979. So it’s a made-up fantasy baseball stat.
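For reference, WHIP is simple enough to compute by hand. The short sketch below also converts box score innings notation, where the digit after the decimal point counts outs rather than tenths (so "6.1" means six and one-third innings). The helper names and the sample numbers are illustrative.

```python
def innings_to_float(ip_notation):
    """Convert box score innings such as '6.1' or '200.2' to true innings."""
    whole, _, outs = ip_notation.partition(".")
    outs = int(outs) if outs else 0
    if outs not in (0, 1, 2):
        raise ValueError("the digit after the decimal must be 0, 1 or 2 outs")
    return int(whole) + outs / 3


def whip(walks, hits, ip_notation):
    """Walks plus hits per inning pitched."""
    innings = innings_to_float(ip_notation)
    if innings <= 0:
        raise ValueError("innings pitched must be positive")
    return (walks + hits) / innings


# Hypothetical season line: 50 walks and 170 hits over 200 innings.
example_whip = whip(50, 170, "200.0")
print(round(example_whip, 2))
```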
WHIP does offer a certain snapshot of pitcher performance. Walks are an important way to evaluate pitchers. Of course, plenty of other statistics measure walks. The second half of the WHIP equation is Hits. Assigning the number of hits given up to the pitcher is a problematic and inaccurate way to measure the pitcher.
Defense, luck and hitter talent play an overwhelming role in Hits. An analyst looking to mitigate that variance would avoid using WHIP to analyze pitchers in favor of the stats described above. In that sense, WHIP is more of an anti-modern stat than a modern one.
Conclusion
In the first two parts of this series, we’ve looked at ERA, ERA estimators and Statcast Expected Stats as ways to measure pitching.
But there are new, granular ways to evaluate pitchers, many of which are at the cutting edge of thinking and based on brand new technology. Those metrics examine and measure the pitcher’s arsenal, individual pitches and outcomes. We’ll cover them in Part 3.
Televised games started showing a few updated hitting stats other than just batting average. Do you think pitching stats like FIP or SIERA will be shown along with ERA?
I read at one point that team OPS was the stat with the highest correlation to final standings. A friend who has a job in an MLB office tells me that wOBA is the primary comparison between hitters used by teams.
Generally, Google can help you out here. Since not all of these statistics are created/maintained by the same group, the formulas and research are scattered among the creators’ separate sites.
My favorite statistics site, Fangraphs, has an amazing glossary that not only dives into the formula for each of these stats, but also has examples for how to use the statistics to evaluate players, as well as tables that list what an ‘Excellent’, ‘Above Average’, ‘Average’, ‘Below Average’, and ‘Poor’ example of each statistic can be for a given season.
As I read the two articles I got dizzy from my spinning head with all the new (to me) acronyms. While it would be easier to stick to the traditional terms I immediately realized the inevitable that I needed to re-educate myself since new age broadcasters are going to be using the new analytical terminology especially since it seemingly demonstrates more accurately a player’s performance. Then I read WV’s comment and it confirmed my thoughts. I have put a glossary of acronyms in the note section of my mobile device for quick access.
Now, if the Reds can avoid being down by 5 after three, it should be an interesting season. While I say that tongue in cheek, the past few years have been frustrating to me as a fan for the game to be over before it started. Can’t imagine how that played on the psyche of the position players.
In my opinion it’s hard to go wrong with a pitcher’s WHIP. I like to look at batting average against and HRs allowed as well. Of course HRs can be subjective. How many rockets to the wall did Alex Wood allow last year? A towering 375 ft fly ball to left-center at night in LA is just another out, but the same swing in GABP would probably be 8 rows back in the cheap seats.
I have asked numerous times through social media if there were stats on the number of first-to-third-row HRs hit in GABP, with no results. Can anyone help? I maintain it will forever be an issue when it comes to signing FA pitchers. Other organizations have restructured their outfields, so is it worth considering?
It’s easy to download the stats from FanGraphs and put them into Excel. From there, you can run a quick correlation. I did that with WAR and wRC+, etc. for fun a while back. It was easy and also informative.
Current pitching staff analysis based on the last two seasons’ SIERA: 4.14 for the starting staff and 3.75 for the bullpen = 3.98 projected staff earned runs given up per 9 innings. The Pirates gave up 693 runs last season with a 4.00 staff ERA.
1. Wood 17% of starter innings (3.74 SIERA over last two seasons)
2. Castillo 17% of starter innings (3.77 SIERA over last two seasons)
3. Gray 15% of starter innings (4.18 SIERA over last two seasons)
4. Roark 15% of starter innings (4.35 SIERA over last two seasons)
5. DeSclafani 15% of starter innings (3.96 SIERA over last two seasons)
6-8. 21% of starter innings, split equally, at 4.72: Reed (4.29 career SIERA), R. Stephenson (5.15 career SIERA), T. Mahle (4.72 career SIERA). (Last season 23% of our starts came from pitchers 6-10; I used 21%.)
Relievers’ two-year SIERA = 3.75 at 41% of all innings pitched.
1. Iglesias 13% of reliever innings (3.24 SIERA)
2. J. Hughes 13% of reliever innings (3.56 SIERA)
3. D. Hernandez 13% of reliever innings (3.37 SIERA)
4. M. Lorenzen 13% of reliever innings (4.22 SIERA)
5. A. Garrett 10% of reliever innings (3.49 SIERA)
6. M. Bowman 10% of reliever innings (3.87 SIERA)
7. Z. Duke 10% of reliever innings (4.00 SIERA)
8. S. Romano, C. Reed, J. Stephens, M. Wisler: average of 3.77, 1.98, 4.23 and 6.58 (career bullpen) = 4.14
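The arithmetic in the projection above can be checked with a short innings-weighted average. This is only a sketch of that calculation: the shares and SIERA figures are taken from the lists above, the last reliever group is assumed to cover the remaining 18% of bullpen innings (the comment does not state its share), and starters are assumed to cover the 59% of innings not assigned to relievers.

```python
def weighted_average(pairs):
    """Average of values weighted by share; `pairs` is a list of (share, value)."""
    total_share = sum(share for share, _ in pairs)
    return sum(share * value for share, value in pairs) / total_share


# (share of starter innings, SIERA) from the list above.
starters = [(0.17, 3.74), (0.17, 3.77), (0.15, 4.18),
            (0.15, 4.35), (0.15, 3.96), (0.21, 4.72)]

# (share of reliever innings, SIERA); the 0.18 share for the last group
# is an assumption, chosen so the shares add up to 100%.
relievers = [(0.13, 3.24), (0.13, 3.56), (0.13, 3.37), (0.13, 4.22),
             (0.10, 3.49), (0.10, 3.87), (0.10, 4.00), (0.18, 4.14)]

starter_siera = weighted_average(starters)
reliever_siera = weighted_average(relievers)

# Relievers at 41% of all innings, starters assumed to take the other 59%.
staff_siera = weighted_average([(0.59, starter_siera), (0.41, reliever_siera)])

print(round(starter_siera, 2), round(reliever_siera, 2), round(staff_siera, 2))
```

Under those assumptions the numbers reproduce the 4.14, 3.75 and 3.98 quoted in the comment.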
I enjoyed this article a lot. More please. Feel free to do threads like these on more sabermetrics :). This is what makes RLN legit.