If you were watching last night’s FSO broadcast in the bottom of the 2nd inning, you saw this graphic regarding Matt Harvey’s ERA:

Thom Brennaman said: “The simple truth is he (Harvey) has gotten healthier as we have moved forward and the numbers are starting to show it.”

Let’s take a closer look at the chart and what it attempts to prove. The first bar on the left (blue) shows Harvey’s time with the Mets. The subsequent columns are the month-by-month values of Harvey’s ERA. Each month represents three (Sept) to five (June, July, August) starts. Harvey’s red-column May numbers include four starts for the Reds.

Sure, that May column is lower than June, July and August, so it doesn’t fit the narrative. But the broadcasters explained it away as being not representative because Harvey only pitched a couple innings in a few of those games. In fact, Harvey threw four innings in his first two May starts for the Reds. The reason his ERA is low for May, though, is because of a 6-inning start where he gave up one earned run. So, yeah. But we have narratives to keep stitched together, so let’s ignore that little issue, as the FSO guys asked. Here’s the same chart in a format we can repeat.

Also in service to the broader narrative, you have to look past the fact that July’s ERA was only .03 runs better than June’s. That’s less than the effect of one-third of an inning, and trivial. But again, we’ve got narratives at stake here.

To be sure, if you look at July, August and September, there is, indeed, a “downward trend in the ERA department” (Jeff Brantley).

[If you know last night didn’t go too well for Harvey, you’re rightly wondering how that affects these numbers. We’ll get back to that in a minute.]

The purpose of this post is to take a look at the way arbitrary endpoints can affect support for what one is trying to prove. FSO chose endpoints based on when we turn the page of our Gregorian calendar, introduced in 1582 by Pope Gregory XIII, who was not known as a big fan of baseball.

Suppose, instead, we divided Harvey’s time with the Reds into four equal blocks of 5-6 starts each. Here’s what the ERA graphic would have looked like.

Hmm. Not terribly consistent with “as he has gotten healthier the numbers are starting to show it.” Instead, by these endpoints, Harvey’s performance looks like a random walk, not a trend.

How about dividing Matt Harvey’s time with the Reds into two equal periods, each with 11 starts? That’s as large a sample size as we can create for comparison.

Oops. Harvey’s first 11 starts were quite a bit better than his last 11 starts, based on ERA. You can see why FSO didn’t choose these data piles.

[Reminder: All we’re doing is changing the endpoints for each block of time. This is the same underlying data, with the same metric (ERA) that FSO used. They chose to divide it month-by-month, which supported what they wanted to say. But so far, every other way we’ve divided up the data has failed to back up their claim.]
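
If you want to try the regrouping exercise yourself, here is a minimal Python sketch of the mechanics. The game log is made up for illustration (it is NOT Harvey’s actual line), and the names `starts`, `era` and `split_era` are hypothetical. All it shows is that the same list of starts, cut into different numbers of consecutive blocks, produces a different set of ERAs each time.

```python
# Hypothetical game log, NOT Harvey's real starts:
# each entry is (outs recorded, earned runs allowed), in date order.
starts = [(18, 1), (12, 4), (15, 3), (21, 2), (16, 5), (18, 3), (12, 6), (19, 2)]


def era(games):
    """Earned runs per nine innings (27 outs) over a group of starts."""
    outs = sum(o for o, _ in games)
    runs = sum(r for _, r in games)
    return 27 * runs / outs


def split_era(games, n_blocks):
    """ERA for each of n_blocks consecutive, near-equal blocks of starts."""
    size, extra = divmod(len(games), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        # The first `extra` blocks get one additional start each.
        end = start + size + (1 if i < extra else 0)
        blocks.append(round(era(games[start:end]), 2))
        start = end
    return blocks


print("Four blocks:", split_era(starts, 4))
print("Two blocks: ", split_era(starts, 2))
print("Overall:    ", split_era(starts, 1))
```

Nothing about the pitching changes between those three lines of output. Only the cut points do.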

Finally, what if we used the popular “first-half, second-half” scheme, with the All-Star Game serving as the dividing line?

Ugh. That’s really narrative busting. Harvey’s ERA has been far worse after the ASG. And remember, we aren’t even including his latest start.

The point should be obvious by now. The “simple truth” advanced by the guys on FSO using ERA was based entirely on the dates they chose for the data piles and those start and end points are completely arbitrary.

What about last night?

Here’s where the narrative takes a real beating. Giving up 7 earned runs in 5.1 innings won’t help one’s ERA. Even FSO’s Month-by-Month endpoints no longer support their “numbers show Harvey has gotten better” claim.
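
The damage from that one start is easy to put a number on. ERA is earned runs per nine innings, and box-score innings like 5.1 mean five and one-third, not five and one-tenth. A small sketch, with hypothetical helper names `ip_to_outs` and `era`:

```python
def ip_to_outs(ip: str) -> int:
    """Convert box-score innings ("5.1" = 5 1/3 innings) to outs recorded."""
    whole, _, thirds = ip.partition(".")
    return int(whole) * 3 + int(thirds or 0)


def era(earned_runs: int, ip: str) -> float:
    """Earned runs per nine innings; nine innings is 27 outs."""
    return 27 * earned_runs / ip_to_outs(ip)


# Last night's line from the post: 7 earned runs in 5.1 innings.
print(era(7, "5.1"))
```

That works out to an ERA of roughly 11.8 for the night, which is why adding it to any bucket of three to five starts moves that bucket so much.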

Obviously then, throwing that 7-spot into the last column is going to make all the other endpoint schemes worse for the back end, too. Here’s the Four Block example:

The First-Half/Second-Half split really, really becomes contrary to the Harvey Happy Ending bedtime fairy tale.

The bottom line: Matt Harvey’s ERA with the Reds has been 4.46. The NL average for starters is 4.02. That’s 0.44 runs per nine innings worse than average, a huge gap. Are you sure you want to offer him a big extension?

Now, what if FSO had used a metric other than ERA?

ERA is laughably inaccurate in conveying pitching performance over such short periods of time. Runs are clunky in relation to how someone pitches (one swing can = three runs). Sequencing has a huge effect. Defensive performances can be variable. Sometimes relievers strand inherited runners, sometimes they don’t. Sometimes you pitch well and are unlucky and vice versa. ERA is fraught with misleading implications over sample sizes even of half-seasons, let alone individual months.

Suppose we just look at a pitcher’s strikeouts, walks and home runs?

Some analysts believe those are the variables over which the pitcher has the most control. That’s the statistic called Fielding Independent Pitching, or FIP. It’s a lot less clunky, isn’t affected by sequencing, defense, luck (much) or relievers. Maybe it will show something more reliable about Matt Harvey’s performance with the Reds.
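
For reference, the commonly published form of FIP weights home runs, walks (plus hit batters) and strikeouts, divides by innings, and adds a league constant so the result sits on the ERA scale. The sketch below assumes that form. The constant of 3.10 is a placeholder assumption (the real value shifts a little each season), the function name `fip` is hypothetical, and the sample line is invented rather than taken from Harvey’s record.

```python
def fip(hr, bb, k, outs, hbp=0, constant=3.10):
    """Fielding Independent Pitching in its commonly published form.

    constant=3.10 is an assumed placeholder; the league value varies by year.
    """
    innings = outs / 3
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Invented sample line: 30 innings (90 outs), 3 HR, 10 BB, 30 K.
print(round(fip(hr=3, bb=10, k=30, outs=90), 2))
```

Notice what is missing from the inputs: hits on balls in play, runs, and anything the bullpen or defense did. That is why it is less sensitive to sequencing and luck than ERA.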

It turns out that using FIP, Matt Harvey has pitched better if we use Month-by-Month endpoints, or Four Blocks. But looking at Two Blocks or First/Second Halves, Harvey’s FIP has been higher later. That’s with or without last night. For example, his first-half FIP is 3.97 and second-half is 4.79. The NL average is 4.10.

Finally, what if we look at just Harvey’s strikeouts and walks?

Some analysts think that’s an even better way to measure a pitcher because home runs can be random (a function of fly balls, not pitching quality per se). To look mostly at a pitcher’s Ks and BBs, we can use xFIP, a metric like FIP (and likewise scaled to ERA) that replaces actual home runs with the number expected from the pitcher’s fly balls at a league-average rate.
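
As usually published, xFIP is the FIP formula with one swap: actual home runs are replaced by fly balls allowed times the league-average home-run-per-fly-ball rate. Here is a hedged sketch of that definition. The 3.10 constant and the 10% league HR/FB rate are placeholder assumptions, the function name `xfip` is hypothetical, and the sample line is invented.

```python
def xfip(fly_balls, bb, k, outs, hbp=0, lg_hr_per_fb=0.10, constant=3.10):
    """xFIP: FIP with home runs replaced by an expected total.

    lg_hr_per_fb and constant are assumed placeholders; both vary by season.
    """
    innings = outs / 3
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / innings + constant


# Invented sample line: 30 innings (90 outs), 40 fly balls, 10 BB, 30 K.
print(round(xfip(fly_balls=40, bb=10, k=30, outs=90), 2))
```

Because the home run term no longer depends on how many balls actually left the yard, a pitcher’s xFIP moves almost entirely with his strikeouts, walks and fly-ball count.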

The xFIP data doesn’t much conform to the narrative that Matt Harvey is getting better, either. But it’s closer. The First-Second Half numbers (4.39 down to 3.99) are back in the correct direction. So are the Four Block and Two Block data. Month-by-month is still up and down, though. So the broader point remains: what you prove still depends on the endpoints you choose.

The main moral of this story is: Be skeptical of advocates using selective endpoints, especially when the sample size is small and the metric is a noisy one like ERA.

Yes, choosing endpoints is unavoidable. We’ve all at times done it selectively. And I’m not saying FSO chose month-by-month knowing it was the only way the data could be grouped to prove their point. They probably just didn’t look past the first split data they found. Those “simple truths,” though, are often simple but not true.

9 Responses

  1. J

    Numbers-schmumbers. If they like the guy and he likes them, then he deserves a long-term contract.

  2. BigRedMike

    This is a great summation and it is clear that the front office has given FSO some ideas to promote the Reds re-signing Harvey.

    Harvey is not a pitcher the Reds should consider giving a contract to. He is just not that good and the Reds have younger cost controlled pitchers that can produce the same and possibly better results.

  3. Aaron Bradley

    Pitching is very random and luck oriented when it comes to starters… I think the Reds were smart in stockpiling young arms; they have just been unlucky or bad at developing them, but even so, some of them are starting to show positive results… stick with that, spend money on upgrading bullpen and offense… experiment more with the bullpen like how Tampa Bay is doing (which Thom admonished and called Tampa bush league … oops, Tampa turned it around once they started doing that, Thom!) Continue investing in scouting, coaching, development, etc. And be more proactive in trades… identify talent and go get it… don’t just sit back and wait for ridiculous deals to fall in your lap.

  4. roger garrett

    Could have found out more if the 43 starts Harvey and Homer got had gone to somebody else, but well. They are just promoting the signing of Harvey on TV because, well, they were told to do it. They know it’s a mistake. Well, I think they do, don’t they?

  5. eric3287

    Without selective endpoints, though, the front office can’t claim the 20-9 record from June 10 – July 14 is evidence that the rebuild is over and winning days are nigh upon us.

  6. David

    That right there sounds depressing and a lot like treading water in the middle of the ocean. I think they have no clue about what they should really do.

    Makes me mighty sad, but on the other hand, it’s just baseball.

  7. jay johnson

    I would love to know Harvey’s numbers if you take off the last inning or partial inning pitched. There were numerous games where he was pulled by interim Jim and the pen gave up his runners that were on base when he left the game. I think that his numbers would be quite different.
    He is a pro. Something rare for a Reds starting pitcher. Sign him and watch him return to his former self.

  8. Brandon Bowling


    The only part of that that I take issue with is Peraza. I think he has had a good year at the plate, and his defense might not be great, but he is so young that I think he could be a much better shortstop in the future. With his speed, I also wonder if he couldn’t play CF as well…if we had another option at SS.

    I am completely on board with you on the other points. I think signing Harvey and Hamilton would be an absolute mistake. I can see how you might sign Scooter to a 3 year deal, if you are willing to move him around the field in such a way so as not to block another younger, good player.