“Luck” is a very divisive word among baseball fans of late.  The word gets thrown around a lot when discussing common statistical measures, such as Batting Average on Balls In Play (BABIP) for hitters or Left On Base Percentage (LOB%) for pitchers.  What does “luck” really mean, though, in the context of hitting and batted balls?

The way I see it, “luck” is just a stand-in phrase for a much longer explanation about the variance of batted balls that come off the bats of major league hitters.  I think all fans inherently understand that Hitter A is more likely to get a hit if he hits a screaming line drive than Hitter B if he hits a lazy fly ball.  So, if Hitter A makes an out on the liner and Hitter B gets a hit off the can-of-corn, it’s much easier to say Hitter A was unlucky and Hitter B was lucky.

So, how does one measure if a hitter is getting lucky or unlucky over a given period of time?  If you watch enough of a hitter’s at-bats, you can generally tell how lucky or unlucky a hitter is getting.   However, few of us can watch every at-bat of an entire team attentively enough to get an accurate picture.  Even if we can, we are likely to fall prey to some kind of bias.

Because of this reality, we have eXpected Batting Average on Balls In Play (xBABIP)!  Not only do we generally know the things that help a BABIP, but we also know what things can hurt a BABIP.  So with a little math, we can come up with a multi-variable equation for what a hitter’s BABIP should be, given those variables.  Then we compare what BABIP should be versus what it is and we have our luck factor!
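For the curious, here is a minimal sketch of that last comparison.  It assumes the standard BABIP definition, (H - HR) / (AB - K - HR + SF).  The function names and the stat line are made up for illustration, and the xBABIP value is simply passed in rather than calculated.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Standard BABIP: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


def luck_points(actual_babip, expected_babip):
    """BABIP minus xBABIP in 'points' (thousandths); positive means lucky."""
    return round((actual_babip - expected_babip) * 1000)


# Hypothetical stat line, not a real player's numbers.
sample = babip(hits=32, home_runs=5, at_bats=105, strikeouts=25, sac_flies=0)
print(f"BABIP {sample:.3f}, luck {luck_points(sample, 0.300):+d} points")
```

A positive number means the hitter is beating his expectation; a negative number means the hits aren’t falling.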

The xBABIP equation I’ll be using is a modified version of what is found here, which has been refined and modified a few times by a few different folks.  The modification I’m making is replacing the Bill James Speed Score (Spd) with FanGraphs Base Running Runs Above Average (BsR).  Spd is an older formula that only takes into account things like triples rate, steal rate, and stolen base success rate.  I probably don’t have to tell you that triples rate is highly variable and can depend as much on your park as on your speed.  So, BsR gives a more balanced approach, with more inputs that more quickly estimate your overall speed as it relates to getting hits.  I took BsR, scaled it to a per-150-games baseline, and then inserted it into the existing xBABIP equation in such a way as to keep the relative impact of the speed component on xBABIP the same.
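The per-150-games scaling is just a rate conversion.  A minimal sketch, with a hypothetical function name and made-up inputs (the further rescaling that keeps the speed component’s relative impact the same is not shown here):

```python
def bsr_per_150(bsr, games_played):
    """Scale cumulative BsR to a per-150-games rate."""
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return bsr * 150 / games_played


# Hypothetical runner: 2.0 BsR through 40 games.
print(f"{bsr_per_150(2.0, 40):.1f} BsR per 150 games")
```

Putting everyone on the same games baseline keeps a part-time player’s speed input comparable to a regular’s.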

In addition to foot speed (which has, admittedly, a small impact), what else goes into xBABIP?  Line drive rate, fly ball rate, hard-hit rate, infield fly ball rate (popups), and opposite-field hit rate.  All of these things correlate either positively or negatively with BABIP.
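To make the shape of a multi-variable equation like this concrete, here is a rough sketch of a linear combination of those inputs.  Everything numeric in it is a placeholder: the weights, the intercept, and the hitter’s rates are made up for illustration and are NOT the coefficients of the equation used in this article.  The signs only follow the usual intuition (an assumption on my part) that line drives, hard contact, opposite-field contact, and speed help, while fly balls and popups hurt.

```python
def xbabip(rates, weights, intercept):
    """Linear xBABIP sketch: intercept + sum(weight * input)."""
    missing = set(weights) - set(rates)
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    return intercept + sum(weights[name] * rates[name] for name in weights)


# PLACEHOLDER weights and intercept, invented for illustration only.
placeholder_weights = {
    "ld_rate": 0.40,      # line drives help
    "fb_rate": -0.10,     # fly balls hurt
    "hard_rate": 0.20,    # hard contact helps
    "iffb_rate": -0.30,   # popups hurt
    "oppo_rate": 0.05,    # using the opposite field helps
    "bsr_per_150": 0.001, # speed helps a little
}

# Hypothetical hitter, not a real player's numbers.
hitter = {
    "ld_rate": 0.22, "fb_rate": 0.35, "hard_rate": 0.33,
    "iffb_rate": 0.08, "oppo_rate": 0.25, "bsr_per_150": 2.0,
}

print(f"xBABIP (placeholder model): {xbabip(hitter, placeholder_weights, 0.20):.3f}")
```

Swapping in a real set of fitted coefficients is just a matter of changing the weights dictionary and the intercept.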

So, why am I writing about this?  Because we want to see what the Reds are doing, of course!  Without further delay, here is a chart!  We love charts, right?


Currently, Adam Duvall has been the luckiest Red in terms of BABIP-xBABIP difference, at 40 points above expectation.  Watching his at-bats, we can see why this is.  In the last few weeks, he’s had a few bloopers and a few broken-bat bleeders find holes for base hits.  This, coupled with him generally hitting the ball hard, gives him a high-ish BABIP.

Currently, Joey Votto is the most unlucky Red in terms of BABIP-xBABIP difference, at 79 points below expectation.  Again, this shouldn’t be a surprise to anyone who has watched most of his at-bats.  He’s routinely hitting the ball hard, but it has been going right at people quite often.  For example, in the last 2 games, he’s hit a 105 mph and a 103 mph line drive right at Lonnie Chisenhall in right field.  A few feet either way and those are doubles, a few degrees more loft and they are homers, a few degrees less loft and they fall in for a single.  Plain old bad luck.

Jay Bruce is the poster child for normalcy.  He’s running a well above average BABIP and a well above average xBABIP, and the two are almost identical.  He’s hitting the ball hard, hitting more line drives than he usually does, and hitting fewer fly balls than he usually does.  Add all that together and it’s a good indication that Bruce should be running a higher AVG than usual, and he is.   The shift doesn’t seem to be hurting Bruce as much this year, likely due to the increase in line drives.

Billy Hamilton is an interesting case.  His speed scores are off the charts, which would seem to suggest he should be getting a lot of infield hits and bunt singles.  This isn’t the case because of Billy’s somewhat unique profile.  He generally can’t hit the ball hard enough to keep infielders honest, so they play far enough in to effectively remove bunt singles and infield hits from Billy’s profile.  Billy, no matter how long he plays, will likely always under-perform his xBABIP.

Also interesting to note is that Tucker Barnhart has the highest xBABIP of any Red.   This is due, mostly, to the fact that Tucker is destroying the baseball to the tune of a 36.9 Hard%, and a 29.2 LineDrive%.   The Hard% is 2nd on the team to Votto, and the LineDrive% is first on the team.  He’s also not popped up a single time.

Some of you may be thinking a 119-point spread is pretty large (from +40 to -79) and you’d be correct.  BABIP is one of those stats that takes a long time to reach any level of stability.  Many hundreds of balls in play over thousands of at-bats are required before we can be confident in a player’s true-talent BABIP.
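A back-of-the-envelope way to see why the stabilization takes so long: if we pretend (and this is a simplification, since real batted balls vary in quality) that every ball in play is an independent coin flip with the same hit probability, the standard deviation of observed BABIP is sqrt(p(1-p)/n).  The function name and the .300 true-talent figure below are assumptions for illustration.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of observed BABIP over n balls in play."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# Assumed true-talent BABIP of .300 at several sample sizes.
for n in (50, 100, 500, 2000):
    se_points = babip_standard_error(0.300, n) * 1000
    print(f"{n:>5} balls in play: about +/- {se_points:.0f} points (1 SD)")
```

Under that simplified model, one standard deviation is roughly 46 points at 100 balls in play, so big early-season gaps between BABIP and xBABIP are not shocking on their own.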

So, let’s look at these same players, but over their entire careers and see how much we should actually trust our friend Mr. xBABIP.  The following chart shows career BABIP and career xBABIP, sorted by career xBABIP.


If we take the four players with the largest sample size (Phillips, Votto, Bruce, and Cozart), we see that their career BABIPs are all within 13 points of their career xBABIPs.  This should give us a pretty good feeling that xBABIP, even on a partial-season sample, is a decent estimate of what reality should be.

The other hitters on the list have a larger variance due to the fact that they simply haven’t had enough at-bats for their BABIP to begin to stabilize and become a true depiction of talent level.  In my pre-season analysis of Eugenio Suarez, I presented why I think he’ll be a BABIP over-performer.

Of particular note, I think, is that Votto’s .355 career BABIP is 4th in MLB history behind Ty Cobb (.379), Rogers Hornsby (.365), and Rod Carew (.359). I am only counting people who started their careers after 1900 and accumulated at least 4000 PA.  If I lower the threshold a bit to 1500 PA, Christian Yelich becomes 2nd in MLB history at .366.  Watch out for that kid. He can hit.  Mike Trout and Starling Marte (!) also sneak into the top 10 all-time.

If you view BABIP as an overall measure of how well a hitter strikes the ball (contact quality) and how well the hitter uses the entire field and mixes up his ball-in-play types (LD, FB, GB), not many have ever been better than Votto.  It is unfortunate that he’s had a poor start to the season, but I still appreciate watching him go at his craft, spraying hard-hit liners right at Lonnie Chisenhall.

So why did I write this article?  I don’t really know.  Seemed like a fun topic and an excuse for me to do a little math.

What should the take-away be? Probably something like “don’t assume a high (or low) BABIP means a hitter is getting lucky or unlucky without first looking at their peripherals.”  Also, something like “Duvall is getting a bit lucky, Bruce and Cozart are striking the ball very well right now, and most of the other Reds are getting a bit unlucky.”   Maybe you already knew that intuitively, but now you know it mathematically!

BABIP figures courtesy of FanGraphs.

Note: A lot of work has been done recently on creating an xBABIP equation using only inputs from StatCast.  I haven’t had the time to fully explore this, but using actual results seems promising.