Baseball continues to fascinate me. Each year, we learn more about how to properly evaluate what we see with our eyes. Still, we often struggle to understand just how well certain players perform in a given season. And I’ve struggled with something all year when evaluating the Reds players. We’ve come a long way in accurately measuring a player’s offensive value in terms of run scoring, but we still labor to properly evaluate defense.

Defense is hard to evaluate because of all the variables involved. For instance, suppose a ball is hit three steps to the shortstop’s right a total of fifteen times. One shortstop fields the ball ten times and makes ten outs. Another shortstop fields the same ball fourteen times, makes two throwing errors, but gets twelve outs. The latter player has more errors but actually converted more plays into outs. We don’t like the errors, but the two extra outs mean quite a bit.
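For the numbers-inclined, here is a minimal sketch of that comparison in Python. The function and variable names are made up for this illustration, and traditional fielding percentage is simplified here to outs divided by outs plus errors, counting only the balls each shortstop reached.

```python
# Numbers come from the two-shortstop illustration above; all names are hypothetical.
CHANCES = 15  # balls hit three steps to the shortstop's right


def out_conversion_rate(outs, chances):
    """Share of all chances that ended in an out."""
    return outs / chances


def fielding_pct(outs, errors):
    """Simplified fielding percentage: only counts balls the fielder reached."""
    return outs / (outs + errors)


shortstop_a = {"outs": 10, "errors": 0}  # fielded 10, no errors
shortstop_b = {"outs": 12, "errors": 2}  # fielded 14, two throwing errors

for name, ss in (("A", shortstop_a), ("B", shortstop_b)):
    fpct = fielding_pct(ss["outs"], ss["errors"])
    rate = out_conversion_rate(ss["outs"], CHANCES)
    print(f"Shortstop {name}: fielding pct {fpct:.3f}, out conversion {rate:.3f}")
```

Shortstop A looks perfect by fielding percentage, yet Shortstop B turns 80% of those fifteen balls into outs while A turns only about 67% of them into outs.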

And that is the essence of evaluating defense. The best defensive players convert more plays into outs. So, if there were some way to measure how often players of a certain position make a particular play, we could determine which players make more outs than others.

Currently, the two most popular systems for evaluating defense on an advanced level are Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR). These two systems work similarly, with some slight differences.

I’m not going into all the math. I don’t like math. Me teach writing. Numbers bad. Words good. So, I’ll give you the basic idea of how these systems work.

Both systems measure how many runs above or below average a player is worth based on position. As we will see in a minute, Todd Frazier has a DRS of 6, meaning that he has saved six more runs this season than the average third baseman. UZR has him at 6.9, which means essentially the same thing only with decimals.

On these scales, 0 is average. As a rule of thumb, Fangraphs lists the following tiers for DRS, and UZR uses the exact same scale:


Both systems use human evaluators who watch EVERY PLAY of every game. These evaluators identify where a ball is hit and calculate how long it takes the ball to reach that area of the field. Based on years of analyzing video and collecting data, evaluators can calculate how often a fielder makes a given play.

So if a ball is hit down the line, we can calculate how quickly that ball gets into that zone (DRS uses timer data) and find out from years of data that the third baseman makes that play 20% (made up number for the illustration) of the time.

If Todd Frazier makes that play, he is credited with 0.80 points (1.00 minus 0.20). If he fails to make that play, he is credited with -0.20 points. It doesn’t matter if the player had to dive to make the play, or if he had enough range to make the play look routine. Either way, the play is scored the same because the ball was hit in the same area and got there in a similar time frame.
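Here is a rough sketch of that scoring rule in Python, again for anyone who likes to see it spelled out. The function names and the sample plays are invented for illustration, and this covers only the per-play credit, not the conversion to runs that DRS and UZR perform afterward.

```python
def play_credit(made, league_rate):
    """Credit for one play, given how often an average fielder makes it.

    A made play earns 1 - league_rate; a missed play costs league_rate.
    """
    return (1.0 - league_rate) if made else -league_rate


def total_credit(plays):
    """Sum the credits over a list of (made, league_rate) tuples."""
    return sum(play_credit(made, rate) for made, rate in plays)


def expected_credit_for_average_fielder(league_rate):
    """An average fielder makes the play league_rate of the time."""
    return (league_rate * play_credit(True, league_rate)
            + (1.0 - league_rate) * play_credit(False, league_rate))


# Made-up sample: two tough balls down the line (20%), one routine play (90%),
# and one coin-flip play (50%).
sample = [(True, 0.20), (False, 0.20), (True, 0.90), (False, 0.50)]
print(round(total_credit(sample), 2))
print(round(expected_credit_for_average_fielder(0.20), 2))
```

Under this rule an average fielder nets out to zero over the long run, which is why 0 sits in the middle of the scale: plus and minus totals measure plays made beyond or short of what an average player at the position would make.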

Evaluators then use a lot of math and historical data to turn this information into the theoretical runs saved that both systems report. While this system makes up a large part of the evaluation, there are other factors. An outfielder also creates outs with his arm. Not only do outfielders throw out runners, but runners will run less frequently on a player who has a good arm, and keeping a runner from advancing a base has value. Errors, how often bunts are turned into outs, the ability to turn double plays, and other factors count as well. You can read more detailed information about UZR here and DRS here.

These systems are great. They take a little of the subjectivity out of defensive evaluation, even though people are still involved in the process. But in spite of their similarities, these two systems don’t always agree. In regards to the Reds, their key players have generally performed well in these two metrics. Here is some data.


This data exemplifies my confusion with these metrics: we can immediately see some discrepancies. In this case, three players have significantly different results. Billy Hamilton rates eight runs better by UZR. That’s a ton of value. If Hamilton is closer to a seven run defender, he isn’t even a starter. Certainly, not a very good one. If you are wondering why Hamilton’s fWAR (2.0) is so much bigger than his bWAR (0.8), it’s largely because Fangraphs uses UZR in their WAR calculation while Baseball Reference does not.

Brandon Phillips has dazzled with his defensive prowess once again this season. Yet, DRS has him as a very good defender while UZR has him as a slightly above average defender.  The DRS system suggests that Phillips’ range hasn’t deteriorated much at all. UZR has Phillips’ range declining quite a bit over the last two seasons.

I can’t help but believe Phillips is closer to a very good defender than an average one. By DRS, he ranks as the seventh best defensive second baseman this season. That seems about right to me, maybe a few spots higher. UZR doesn’t have Phillips in the top ten. Hard to believe that guy isn’t a top ten defender at his position.

Jay Bruce also shows about a five-run difference between the two measures. Just like with Phillips, DRS rates Bruce more highly than UZR. By these measures, Bruce is either an above-average, top-ten right fielder or an averagish corner outfielder. That’s a pretty big difference.

Bruce had an awful defensive series in Philadelphia earlier this season, but besides that, he has been solid defensively. Quite frankly, I think Bruce is still an above-average defender. He may have lost some range since knee surgery, but he still possesses a strong arm that keeps players from getting extra bases. Bruce has had a strong, if unspectacular defensive season.

The Reds haven’t done much right this year. But they have played strong defense. While these metrics are likely the best indicators of defense we have, no one is really convinced of their supreme accuracy, so take them with a grain of salt. But based on what we’ve seen this year and the metrics, these three all seem like pretty good defenders, with Hamilton being elite, Phillips still playing at a high level into his mid-30s, and Bruce returning to above-average form.


13 Responses

  1. jessecuster44

    Who has time to figure this stuff out? I can’t imagine watching every play of every game, and then cross-referencing it with other games.

    Bravo for figuring out the metric, then establishing its credibility. Pretty cool.

    • gaffer

      We all knew one of those guys in High School who would do that.

  2. gaffer

    The better way to look at this is probably to compare among the options you have. Cozart is clearly way better than Suarez defensively. Since the other guys are not going anywhere soon, it’s not like there are any decisions that can be made based on these numbers. We all know this team is generally a good defensive team.

    • gaffer

      Looking at the definitions of DRS/UZR, I think the explanations make some of the differences more clear. In short, they make a lot of adjustments in DRS that remove a lot of potential data (potentially biased data) in the park adjustment factors and also ignore defensive positioning. I might suggest looking at UZR as the “potential” defensive impact of a player and DRS more as the impact if a player played on a theoretically equal playing field. Hence, DRS would be good to look at when trading for a guy moving to a new field, while UZR is what impact “your guy” actually had for you on your field. Just a thought.

  3. Doug Gray

    Both are still a bit unreliable. Generally speaking, you want about 3 years’ worth of defensive data to give you a good idea of what kind of a defender you’re talking about. The sample size matters because there just aren’t enough defensive plays that are difference makers. A lot of the plays made are ones made by anyone. Very few plays are made by only a few guys, and those plays aren’t created equally. We need very large sample sizes to work with in order to get a stabilization of the sample.

    Secondly, when you need such a large sample, you need to account for things for some players and not for others. Using a 3-year sample for Jay Bruce right now doesn’t tell the story given that we know he was playing on a bum knee for a month last year, then came back far earlier than he should have from the surgery to fix it. His defensive numbers were terrible last year, but we also know that it’s probably not something bothering him this year.

    Third: What exactly are you measuring? These are measuring what happened, but not necessarily measuring the skill of a player. Extreme shifting aside, teams are positioning players a few steps one way or the other on nearly every play. The player isn’t making that determination; the coaching staff is. A team with a better understanding/better scouting is likely going to improve the defender versus a team that isn’t as good at it, and it has nothing at all to do with the ability of the player. But the numbers will reflect that the player is more valuable. And while technically that is correct to say, it’s not always the right question to be asking. Is the question who is the better defender or who made the most plays? I’d argue that you want to know who the better defender is because that’s going to be team independent, while who made the most plays may not be.

    There’s a whole lot that goes into defensive stats. While they have some value, they aren’t nearly as valuable to look at as hitting stats. There’s simply not enough information and way too much variation in what is happening or what could happen.

    Hopefully the new statcast system that tracks the movement of the players and the ball velocity, landing spots, time it took to get there, route efficiency – all of that stuff, can clear this up for us and give us true defensive values. Unfortunately, at least for now, MLB isn’t making that data publicly available for fans to take a crack at. So we are left with stuff that is ok to look at, but entirely incomplete.

    • lwblogger2

      Wow, you stated my feelings on defensive metrics about 1000x better than I could. The defensive metrics are far better than simply using FLD% but they have a long way to go in telling the whole story. There are simply too many variables.

      I think when that Statcast data becomes available, we’ll see more reliable defensive metrics. I’ve already been very impressed with the arm-strength calculations I’ve seen and with batted-ball speeds and observed player reaction times. There is some great stuff for defensive metrics that should be on the horizon.

  4. doctor

    What I like about having two systems like this is that they can help confirm perception, i.e. Frazier is good. When they differ, then it seems you go to the eye test on which one to trust, like for BP. He still seems to be pretty good with the glove. I like some of the other stats, like plays out-of-zone, which lead to the Zone Rating system.

    Like Doug elaborately spelled out above, the D metrics have work to get refined. It feels at times like they give too much weight to defensive players.

  5. james garrett

    On a completely different topic, I just saw where Latos was DFA by the Dodgers.

  6. Carl Sayre

    I enjoy this site in no small part due to the younger, more technically savvy people on here teaching me a different way to measure the game. These metrics are a) quite a bit beyond my ability and b) a bit ridiculous. There have been some comments about why they are different because of the different criteria used. This is one place that the old “eye test” is probably still more reliable, though I will start checking the numbers to see if they concur with my eye. I don’t need their numbers to tell me BP is near the top at his position, maybe a top 5, but I’ve also seen him miss a few more than I would have liked. That tells me through the “eye test” he is still a h………..eck of a 2nd baseman but yeah even he has lost a step.