Thursday, March 31, 2016

Starling Marte, Defense and the Limitations of WAR

As we get ready for the opening of the 2016 baseball season, I thought I'd share some thoughts on advanced metrics, on what the sabermetric community knows and what it is still trying to figure out, through the lens of Starling Marte’s statistics. This isn’t exactly breaking new ground, but I still think Marte’s numbers specifically are worth looking at.

For years the Holy Grail of sabermetrics has been to find a single number that could neatly summarize a player’s season and provide context for that season versus other seasons from the same player, different players and different eras. The number is designed to take into account every aspect of a player's performance--offense, defense/fielding and base running. The term commonly used for this is WAR, or Wins Above Replacement.

Talking effectively about WAR requires considerable time and effort, given how much work goes into computing this number, how many factors and metrics it takes into account, and the fact that there are several versions of it available to the public. Please keep that in mind as we gloss over many of the details. Currently there are three main forms of WAR, all of which calculate the metric differently.

The thing is, when we talk about the value of position players, we probably know 95% of what we will ever be able to ascertain about a hitter’s contribution in the batter’s box. There are various offensive numbers, such as Weighted Runs Created (wRC), On-base Plus Slugging Plus (OPS+) and True Average (TAv), which all do a good job of capturing a player’s offensive contribution in a single statistic by analyzing the counting and rate stats and using a formula to distill them down to a single number.

The problem is that when it comes to defense (and to a lesser degree base running), it’s hard to tell exactly what we know and how accurate the data is. We can see this clearly by looking at Marte’s numbers during his three full seasons in the majors. The table below compares fWAR, the version supplied by Fangraphs; bWAR, the version found on Baseball-Reference; and WARP, the version provided by Baseball Prospectus.

Starling Marte   fWAR   bWAR   WARP
2013             4.8    5.4    2.7
2014             4.4    5.1    3.3
2015             3.6    5.4    2.5

Almost the entire difference among these three numbers each year comes from the different defensive components used by the three methods. bWAR, which clearly views Marte most favorably in all three seasons, shows Marte leading all left fielders by a large margin in defensive runs saved in 2015, leading to a strong positive defensive contribution to his overall bWAR figure. fWAR looks at Marte’s 2015 defensive season favorably, but has a handful of players ranked ahead of him. WARP calculates Marte as having a negative Fielding Runs Above Average (FRAA) in 2015 (and in his other two seasons as well). The reason for these discrepancies is that each uses a different method to come up with its defensive valuation. They are all evaluating the same player, looking at the same plays, but coming up with surprisingly disparate results.
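To put a number on how far apart the three versions are, here is a minimal Python sketch that computes the gap between the highest and lowest WAR figure for each season. The values come straight from the table above; the variable and function names are just illustrative, not anything the three sites publish.

```python
# WAR figures copied from the table above, keyed by season.
marte_war = {
    2013: {"fWAR": 4.8, "bWAR": 5.4, "WARP": 2.7},
    2014: {"fWAR": 4.4, "bWAR": 5.1, "WARP": 3.3},
    2015: {"fWAR": 3.6, "bWAR": 5.4, "WARP": 2.5},
}


def war_spread(versions):
    """Return the gap between the highest and lowest WAR value, to one decimal."""
    values = versions.values()
    return round(max(values) - min(values), 1)


# Spread per season (hypothetical helper output, in wins).
spreads = {season: war_spread(versions) for season, versions in marte_war.items()}

for season, spread in spreads.items():
    print(f"{season}: highest and lowest WAR differ by {spread} wins")
```

In every season the gap is well over a win and a half, which is a big disagreement when a single win is already a meaningful unit.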

There are complex calculations that go into these numbers, and this is probably an oversimplification, clear as the differences are, but it illustrates how the different methods of evaluating defense can come up with dramatically different views on a player. And although it doesn't impact the analysis of Marte, there is a valid argument that the massive increase in the use of defensive shifting has actually made defensive metrics less reliable over the last five years. Hopefully the new data being recorded by Statcast and Trackman, which track exact player positioning and movement, will allow us to better analyze a player’s defensive capabilities going forward.

And then there is the question of the “eye test” and whether that is a reliable way to evaluate a player. The anecdote of watching a player play an entire season and knowing that if a .270 hitter gets one more hit every two weeks he becomes a .300 hitter is instructive. Could you tell the difference? Of course the answer for virtually all of us, unless we are recording the information, is no.
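The arithmetic behind that anecdote is easy to check. The sketch below assumes a hitter with 500 at-bats over a 26-week season; both numbers are my own round-number assumptions, not anything from a specific player.

```python
# Assumed round numbers for illustration only.
at_bats = 500        # assumed at-bats in a full season
season_weeks = 26    # assumed length of the regular season in weeks

hits_270 = round(0.270 * at_bats)   # hits needed to bat .270
hits_300 = round(0.300 * at_bats)   # hits needed to bat .300

# Extra hits that separate the two hitters over the whole season.
extra_hits = hits_300 - hits_270

# How often one of those extra hits shows up.
weeks_per_extra_hit = season_weeks / extra_hits

print(f"Extra hits over the season: {extra_hits}")
print(f"About one extra hit every {weeks_per_extra_hit:.1f} weeks")
```

Under those assumptions the gap works out to roughly one extra hit every two weeks, which is far too small a difference to pick up just by watching.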

Having said that, I believe evaluating a player’s defensive skill set is a bit different. I think that if you watch a player play 150 games in a season you will have a very good understanding of his defensive strengths and weaknesses. But translating that understanding into useful data and comparing it to players playing different positions is a whole different problem. Having watched Starling Marte throughout his career, I can give you the scouting report: He has great range in left field. He tracks balls well but doesn’t catch everything he gets to. He occasionally misses the routine play (St. Louis Sept 2014 anyone?), but he has good instincts and a very strong, accurate arm. #DontRunOnMarte. But can I give you a solid comparison between Marte and Alex Gordon or Christian Yelich? No. I do see Marte 150 times a year, but I only see the others maybe 10-15 times. That might be enough to provide a basic scouting report, but it isn’t enough of a sample size to have a deep understanding of the others’ strengths and weaknesses, and thus I can’t come up with a valid quantitative comparison between the players.

Back to WAR. In today’s world 1 WAR is valued at between $7 million and $9 million. How do we put any kind of accurate value on Starling Marte’s worth when one method calculates his 2015 season at 2.5 WAR, one has him at 3.6 and the third has him at 5.4? Using $8M/WAR, the range of value is $20-$43 million. Not very precise.
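Here is that dollar range worked out as a short Python sketch. It uses the $8M/WAR midpoint and the three 2015 figures from the table; the names are illustrative, and the straight multiplication is the same simplification used in the paragraph above.

```python
# Midpoint of the $7-9 million range quoted above, in millions of dollars.
dollars_per_war_millions = 8.0

# Marte's 2015 WAR under each of the three methods.
war_2015 = {"WARP": 2.5, "fWAR": 3.6, "bWAR": 5.4}

# Implied 2015 value under each method, in millions of dollars.
value_millions = {
    name: war * dollars_per_war_millions for name, war in war_2015.items()
}

low = min(value_millions.values())
high = max(value_millions.values())

for name, value in value_millions.items():
    print(f"{name}: ${value:.1f}M")
print(f"Range: ${low:.1f}M to ${high:.1f}M")
```

The top estimate is more than double the bottom one, for the same player in the same season.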

So remember, as we go through the season analyzing players and situations, that the data and the ongoing study of that data have revealed so much that has changed how we think about the game. Just looking at the numbers may tell us virtually everything about a player’s offensive performance. But we’ve got a ways to go to figure out the rest. And be careful when someone brings up WAR. It's a useful tool, but it's still pretty blunt.

