It has been more than five years since Derek Jeter began his final major league season. He was 39 years old, due to turn 40 in mid-season, and he was recovering from a serious injury that had wiped out most of his 2013 season. It would not be a good season, especially by his standards, but Derek did have a few well-timed moments that allowed his fans to remember the player he had once been.
He saved the best for last. In his final game at Yankee Stadium, September 25th against the Orioles, the Yankees led 5-2 heading into the 9th inning. Jeter had contributed to the effort, with a hard-hit RBI double in the first inning. He also reached on an error in the 7th inning with the bases loaded, resulting in two more runs scoring. It looked like his watch as the Yankee shortstop would end in the field, as David Robertson recorded the final out in some fashion. It was not to be. Nick Markakis walked, Adam Jones homered, followed by a Steve Pearce homer.
Save blown, game tied, and you couldn't blame any of that on poor shortstop defense, as there is no defense for walks and home runs (excepting the occasional outfielder going over the fence). The only ball in play for the inning was the final out, a fly ball to center field. As the Yankees prepared to bat, I don't think I was alone in wondering if Jeter would have his moment, as he was scheduled to bat third in the inning. Still, the question remained: would they get a runner on for him? If so, could he come through on demand?
Jose Pirela started the inning with a single, and the situation quickly came into focus. Antoan Richardson pinch ran for him. There was no chance that Brett Gardner would end the game with Jeter in the on-deck circle, with a homer or extra base hit, because Gardner knew his role in the saga and put down a sacrifice bunt to move Richardson over. Evan Meek, the Oriole pitcher, threw Derek a belt high pitch on the outer half of the plate. Derek did not wait, slashing a clean single between the first and second basemen, and Richardson beat the throw to the plate for the game-winning run.
For the game Jeter had two hits in five at bats, a single and a double, scored once, and was credited with 3 RBI. In the context of a Hall of Fame career that included 3,465 hits, this does not seem to be a remarkable game. Derek certainly had better. His career included 41 4-hit games, four 5-hit games, and ten multi-homer games. None of these games, however, resulted in a higher WPA (win probability added) than his final game at Yankee Stadium. Due to the timing of his plays (and the fact that WPA treats him hitting into an error with the bases loaded the same as if he hit a two-run double), his WPA for this game was .612, a new personal best.
Enough of that game, as I am getting far off topic, which was intended to be an exploration of Jeter's rankings on various defensive metrics. Derek Jeter played his entire career in the field as a shortstop. Not even a third of an inning at any other defensive position, although he did occasionally serve as the designated hitter. This is extremely rare for players with long careers. Ozzie Smith and Luis Aparicio also did not play anywhere besides shortstop. They were outstanding defenders, the anchors of the defense. Jeter, on the other hand, was not a very good shortstop. While he somehow won five Gold Gloves in his career, the verdict of the defensive metrics ranges from "bad" to "very bad" to "earth-shatteringly awful". It is strange that, contrary to the experience of almost every baseball player in history, a poor fielder was not at least tried at a less demanding position.
Let's look at these metrics, on a career level, from most favorable to most damaging.
All these numbers are subject to revised methodology and represent what I found in March 2019. To get runs per year, I multiplied the metrics expressed in plays by .75, the run value of an extra play made (or not made). For the denominator, I used the number of seasons rather than games or innings played, not counting 1995 (September callup) or 2013 (missed most of the season to injury).
Accuracy and Bias
Some of the difference in these estimates can be explained by bias, some by the accuracy of measurement. Something like Range Factor plus is an unbiased measure, though not particularly accurate. It looks at how many plays Jeter made per inning and compares that to the league average. It does not matter if Jeter went an entire game without a ball hit anywhere near shortstop. His RF+ would still get worse after this game is included. Because of this, RF+ should be considered a measure that lacks accuracy.
DRS and UZR are more accurate in that they use play-by-play data to only count balls in play against Jeter if they were hit in locations where the shortstop had a chance to make the play. While more accurate, however, these metrics introduce bias. The play-by-play data that underlies these metrics is (or was, while Jeter was active) recorded by human scorers watching the game. It is possible that, simply because a shortstop fielded a ball, the scorer would rate the ball as easier to field than if the shortstop had failed to get to a similarly placed batted ball. This is known as range bias. While the concept makes logical sense, it is difficult to know how much of a problem it was. In the days of Statcast, teams have better data on the location of each batted ball that does not rely on the observation of a scorer. Unfortunately, the introduction of this data source happened after Jeter's career.
In general, the metrics that are kindest to Jeter are the most accurate, but also the most biased. The ones he does worst in are unbiased, but lack accuracy. Range Factor+ is a very simple measure that can be calculated quickly. Jeter played 23,225 innings in his career and had a range factor (putouts + assists per 9 innings) of 4.04. During this time, the league average RF was 4.51. (Source: baseball-reference.com) This means he made 1,213 plays fewer than average. That's 910 runs (at .75 runs per play), or 51 below average per season.
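The Range Factor arithmetic above can be retraced in a few lines. This is only a sketch of the calculation as described; the function name is made up for illustration, and the 18-season denominator is an assumption inferred from the rule of counting seasons from 1995 to 2014 while leaving out two of them.

```python
RUNS_PER_PLAY = 0.75  # run value of one play, as stated in the article
SEASONS = 18          # assumption: 20 seasons (1995-2014) minus the 2 excluded

def plays_vs_average(innings, player_rf9, league_rf9):
    """Plays made relative to a league-average fielder over the same innings.

    Range factor is (putouts + assists) per 9 innings, so the gap in RF
    is multiplied by the number of 9-inning blocks played.
    """
    return (player_rf9 - league_rf9) * innings / 9

plays = plays_vs_average(23225, 4.04, 4.51)
runs = plays * RUNS_PER_PLAY
runs_per_season = runs / SEASONS

print(round(plays), round(runs), round(runs_per_season))  # -1213 -910 -51
```

Negative values mean fewer plays (or more runs allowed) than average.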
That's about the most extreme measure we can come up with. One problem with this, especially in comparison to the other metrics, is that assists are closer to what the other metrics are measuring. Putouts are mostly line drives and popups caught, or throws received from other fielders. While a few putouts by shortstops represent a ground ball with a runner on first where the shortstop fields the ball and steps on second, assists more commonly meet the definition of ground balls fielded and turned into outs.
An assists+ measure would be difficult to get from Baseball-Reference. If you were going to spend time to do that, you might as well correct another problem with Range Factor: innings should not be used as your denominator. Innings include strikeouts, which have been rising year over year for quite some time now. There is a great disparity in strikeouts between teams as well. Correcting for the denominator and also narrowing the focus to ground ball defense, I came up with Ground ball+. From 1996 to 2014, Jeter recorded outs on 644 fewer ground balls than the average shortstop did, given the same number of balls in play. That turns into 483 more runs allowed, or 27 more per season. It is certainly true that while he was playing for the Yankees Derek Jeter made fewer plays in the field compared to an average shortstop. What is not certain, however, is how clear the relationship is between these plays not made and increased hit totals "past a diving Jeter".
Ground Ball Defensive Efficiency
In 2011, with Jeter playing about 3/4 of his team's games, the Yankees had a ground ball defensive efficiency (GBDER) of .717. The following year, with Jeter playing 135 games, they were at .723. Jeter suffered a terrible injury in the playoffs that year and was only able to take the field 13 times in 2013. The Yankees replaced him with a mix of shortstops. Eduardo Nunez played more than anyone else, and his defensive statistics were very poor. Also playing a good bit of the games were Jayson Nix and Brendan Ryan, who were good to excellent defenders.
If Jeter was in fact a horrific defensive shortstop, far worse than anyone else, then replacing him suddenly with any random shortstop should result in an immediate observable improvement in the team stats. Did the Yankees improve in 2013? Yes, but just a bit, to .725. Jeter came back in 2014 and the GBDER fell to .721. This is the equivalent of maybe 5 runs worse, nowhere near the 25-run difference shown by the most extreme metrics. In 2015 Jeter had retired and was replaced by Sir Didi Gregorius, who by observation was a far more talented defender, particularly showing a much better arm. The GBDER? Unchanged at .721.
A shortstop is not wholly responsible for ground ball prevention; the other fielders have a say in the matter as well. It is possible that the Yankees had great fielders alongside him when he was playing, and terrible ones when he was not. That explanation seems a bit far-fetched though, as we have 3 distinct changes: going from Jeter to not Jeter, to Jeter, and back to not Jeter. The Yankees did not simultaneously swap out the other infielders when Jeter's availability changed.
From 2011 to 2015, the Yankees had a GBDER of .725 when they used any shortstop other than Jeter. With Jeter on the field, they had a .718. Per 2,000 ground balls, a typical season's worth, this represents about 14 fewer ground balls fielded, or 11 more runs. Considering he was 36 to 40 years old during this time, that is not an exceptionally bad result.
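The conversion from a GBDER gap to plays and runs is simple enough to sketch. The helper name below is hypothetical; the 2,000 ground ball season and the .75 runs per play are the article's own figures.

```python
def gbder_gap_to_runs(gbder_with, gbder_without, ground_balls=2000, runs_per_play=0.75):
    """Turn a difference in ground ball defensive efficiency into plays and runs.

    Returns (plays, runs); negative numbers mean fewer outs recorded.
    """
    plays = (gbder_with - gbder_without) * ground_balls
    return plays, plays * runs_per_play

plays, runs = gbder_gap_to_runs(0.718, 0.725)
print(f"{plays:.1f} plays, {runs:.1f} runs")  # -14.0 plays, -10.5 runs
```

The 10.5 runs is the figure rounded to 11 in the text.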
Where did the ground balls go?
From 2011-2015, the Yankees had 4,813 ground balls hit when Jeter was playing shortstop. When anyone other than Jeter was playing, there were 5,228 ground balls hit. Almost a perfect match. It's also a convenient period for analysis because his playing time alternates: Jeter playing most of the time, then not playing much, then playing most of the time, then not playing at all. There is no single clean break between Jeter and not-Jeter, where differences in the data could be blamed on other fielder or pitcher changes.
For an easier comparison, let's pro-rate the ground balls to 2000 in each bucket, a normal season total:
The first section shows the number of outs recorded on ground balls based on which fielder touched the ball first, so 17 outs made on tappers in front of the plate and fielded by catchers when Jeter was the shortstop. The second group is all ground balls that do not end up as outs. The 69 balls by non-Jeter shortstops with no outs represent a mix of errors, infield singles, and fielder's choices.
The bottom-line results show that per 2000 ground balls, Jeter made an out 61 fewer times than the other shortstops. This did not lead to 61 more hits though — the team recorded only 15 fewer outs. Other fielders were making the plays — 20 more plays for the first basemen, 30 more for second basemen, and 15 more for the pitchers. The third basemen were not making up for balls hit to the left side that Jeter couldn't field, as the 3rd basemen had fewer plays as well. The Yankees had a different ground ball distribution when Jeter was in the field, more hit to the right side of the field and fewer to the left. Whether this was intentional is unknown.
I can only look at numbers like this back to 2003. From 2003 on, Retrosheet has full results for ground balls, as far as who fielded it and whether a play was made. In addition, the number of ground balls at the league level is fairly consistent from 2003 on, so I'm not too worried about misclassification problems. Before 2003, we know which fielder recorded outs, but don't have that information for all hits.
From 2003 to 2005, Jeter played shortstop for 5521 ground balls, and the Yankees had a GBDER of .726 on these plays. Jeter was not at short for 857 ground balls, and they were worse off without him, a GBDER of .721. Most of the notJeter plays occurred in April and early May of 2003. On opening day 2003, Jeter tried to go first to third on a ground out to pitcher. Normally that would not be a wise play, but the Blue Jays had the shift on against Jason Giambi and nobody covering third base. That is, nobody except catcher Ken Huckaby, who made a heads up play to cover the open base, got there about the same time as Jeter, and put the tag on him for a double play. Jeter was injured on the play and would miss about 40 games.
From 2006 to 2010 Jeter played about 150 games a year, and the team's GBDER was .728 with him on the field. Over the 5 years his backups saw 1,048 ground balls, and they were more efficient, turning in a .752 GBDER.
Here's how it looks per 2000 ground balls:
Not the same patterns as above, but keep in mind that while the 2011-15 data had equal representation, in this chart the Jeter sample is more than 7 times larger than the notJeter sample. For what it's worth we see fewer balls hit to the right side, plays made by the first and second basemen. There are more balls hit towards third, with both more plays made and more hits going by into left field. Jeter made 27 fewer plays, with some of those resulting in hits to left field and others being infield singles or errors hit to short.
Combining the data
Jeter played a lot more in the field from 2003 to 2010, but as a consequence we don't have nearly as much of a control group — how the Yankees performed with his replacements. We have a near perfect control group for the 2011-15 period, since Jeter and his replacements faced an almost equal number of ground balls. I can combine the data with the weight being the geometric mean of the ground balls with Jeter in the field and the ground balls when he was not in the field. The result will put more weight on the later seasons simply because we don't have much of a control group for 2003 to 2010.
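To make the weighting concrete, here is a rough sketch of a geometric-mean weight and the weighted average it feeds. The function names and the tuple layout are my own assumptions. The text does not give Jeter's ground ball count for 2006-2010, nor does it say exactly how the years were bucketed, so the usage below only computes weights for the two periods whose counts are stated, and it does not attempt to reproduce the combined estimate.

```python
from math import sqrt

def geometric_weight(n_with, n_without):
    """Weight for a period: geometric mean of the two ground ball samples.

    A period with a tiny control group gets a small weight even if the
    other sample is huge.
    """
    return sqrt(n_with * n_without)

def weighted_gap(buckets):
    """Weighted average GBDER gap (with minus without) across periods.

    buckets: iterable of (gbder_with, n_with, gbder_without, n_without).
    """
    total = 0.0
    total_weight = 0.0
    for g_with, n_with, g_without, n_without in buckets:
        w = geometric_weight(n_with, n_without)
        total += w * (g_with - g_without)
        total_weight += w
    return total / total_weight

# Ground ball counts stated in the article (Jeter at short, Jeter not at short).
w_2003_05 = geometric_weight(5521, 857)
w_2011_15 = geometric_weight(4813, 5228)
print(round(w_2003_05), round(w_2011_15))  # 2175 5016
```

Even though Jeter saw more ground balls in 2003-2005 than in 2011-2015, the later period carries more than twice the weight because its control group is so much larger.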
The weighted average works out to the Yankees having a GBDER of about .009 worse when Jeter is in the field. In a full season of 2050 ground balls, it works out to 17.8 fewer plays in a year, or 13.4 runs below average. Can we use this combined estimate and apply it to years before 2003?
Here's how Jeter rates by Total Zone and Range factor, as a percentage of league range factor, by year group:
Both TZ and RF indicate that he was bad in his first few years, got a little bit better (but still below average), and then was much worse as an old player. His range factor is slightly lower in 2003-2010 than it was from 1995-2002, but as the league saw fewer plays at shortstop (more strikeouts and fly balls) his relative range factor was not quite as bad.
By TZ and RF, Jeter's early years are somewhere between his 2003-10 seasons, and his 2011-14 seasons. We might be able to estimate his defense for those early years as being similar to the combined results for 2003-2014. Under that assumption, Jeter's poor defense meant a .009 decrease in team DER over 23,225 career innings. That works out to 15.9 full seasons (9 innings per game, 162 games per year), 284 fewer plays made, and 213 fewer runs. That's not far off from the run estimates on baseball-reference, 243 runs using DRS for years where it is available and TZ for the years where it is not.
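The career extrapolation can be checked with a short calculation. This sketch takes the 17.8 plays per full season from the combined estimate above as given; the variable names are only for illustration.

```python
CAREER_INNINGS = 23225
PLAYS_PER_FULL_SEASON = 17.8  # combined 2003-2015 estimate from the article
RUNS_PER_PLAY = 0.75

# A "full season" here is 162 games of 9 innings each.
full_seasons = CAREER_INNINGS / 9 / 162
plays_lost = full_seasons * PLAYS_PER_FULL_SEASON
runs_lost = plays_lost * RUNS_PER_PLAY

print(round(full_seasons, 1), round(plays_lost), round(runs_lost))  # 15.9 284 213
```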
What happened in 2005?
In 2005, a funny thing happened: Derek Jeter had a range factor better than the league average. Jeter recorded 454 assists that year and had a 4.76 range factor per 9 innings, the only time in his career that his RF9 was better than the league average. Despite this, the other metrics didn't tell us his fielding was any better than usual. His TZ was -5, a bit worse than it was in 2004 and about the same as it would be in 2006. DRS on the other hand had him at -27, which was the worst single season result he'd have in that stat. Thinking back to the 2005 Yankees, I think of the Small and Chacon miracles. Short on pitching, they picked up two pitchers, Shawn Chacon and Aaron Small, who had never been great pitchers before and never would be again. Somehow, they combined for a 17-3 record as the Yankees won 95 games and yet another division title. Chien-Ming Wang was a rookie, Carl Pavano and Randy Johnson were in their first season as Yankees, and Kevin Brown was in his second. Other than Mike Mussina, none of the pitchers they used as starters had pitched very long for the Yankees. If the Yankee pitchers before this had a specific game plan that resulted in a shift of ground balls away from shortstop, perhaps the newcomers didn't get the memo.
Looking at the pitchers who allowed the most ground balls when Jeter was in the field, here they are ranked by the number of shortstop plays per 2000 ground balls:
Mussina at this point was a long time Yankee. Brown was in his second year with the team. They didn't have many balls hit towards short. Rivera had especially few. Part of this is that he made more fielding plays himself (19) than Jeter made behind him (12). Rivera was a great fielding pitcher and his specialty was breaking the bats of hitters, resulting in more than the usual number of weak tappers right back to the mound. Small, Johnson, Pavano, and Wang were in the top half of the list and all first year Yankees.
The pitchers in the top half had a combined GBDER of .744, and those in the lower half had a .730. At first glance this might say that pitching to a normal ground ball distribution was more effective than pitching to try to avoid the shortstop. However, I don't think this tells us anything. We are looking at selective sampling here. We're not looking at the number of ground balls hit to the shortstop because we don't have that. We're looking at plays made by the shortstop and trying to infer ground ball distribution. Perhaps Jeter just played his worst behind Kevin Brown, and if he had played better you would see both a higher GBDER for Brown and a higher number of GB-6-outs for him. In short, I don't think I found anything useful here but just want to show where I looked.
Some of the defensive metrics shown in the first chart have smaller ranges, such as the ones published and used in WAR calculations on Fangraphs and Baseball-reference. Some have much larger ranges, like WOWY and GB+. What is the real spread in defensive value? Is this something we can know?
Actually, I think it can be known. We know what the spread is for teams, so we can statistically infer what the spread should be for positions. I looked at all teams from 2003 to 2015, and the standard deviation in ground ball DER was .0158. (I adjusted for the year-to-year changes in league averages; it would be .0162 without the adjustment.) That's 13 years, with 390 team-seasons. Nineteen teams were more than .032 above or below average, about 5%. Bingo: that is what we should have with a normal distribution, where about 95% of values fall within 2 standard deviations.
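As a rough check on that normal-distribution claim, the share of team-seasons expected beyond a cutoff can be compared with the observed 19 of 390. This is a sketch under the assumption that team GBDER is normally distributed; the helper name is made up, and `math.erfc` from the standard library supplies the two-tailed probability.

```python
from math import erfc, sqrt

def two_tailed_share(z):
    """Share of a normal distribution more than z standard deviations from the mean."""
    return erfc(z / sqrt(2))

TEAM_SEASONS = 13 * 30   # 2003-2015, 30 teams per year
SD = 0.0158              # team GBDER standard deviation from the article
CUTOFF = 0.032           # distance from average used in the article

z = CUTOFF / SD
observed_share = 19 / TEAM_SEASONS
expected_share = two_tailed_share(z)

print(f"cutoff is {z:.2f} SD; observed {observed_share:.1%}, expected {expected_share:.1%}")
```

Both shares land between 4 and 5 percent, which is in line with the reading in the text.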
We know what the team spread is, how can we get the individual positions?
I will ignore the catchers, as they field very few ground balls. The team ground ball performance is the sum of the performance of the pitchers and 4 infielders. Take the spread for each position, square it, and add them up. Take the square root of this sum, and it should match your spread for the teams. Some positions handle more chances than others. The shortstop handles the most, with the first baseman handling the least among the four infielders, and pitchers less than that. I came up with the following numbers for spread:
Square those numbers, sum and take the square root, and you have 31.6 plays, which represents a rate spread of .0158 over a 2000 ground ball season. A typical leader and trailer at shortstop (at team level, or per Cal Ripken season) should be about 2 standard deviations above or below average. In a 30 team league the 2 teams furthest from the mean represent 6.7 percent of teams. We can expect our shortstop league leader (or trailer) to normally be 33 plays, or 25 runs away from average. I don't know if those numbers are exactly correct. I don't have a formula that you can reproduce to check the numbers. You can come up with your own assumptions, and as long as 1) pitchers have a smaller spread than the other positions, 2) First basemen less than the other infielders and most importantly 3) they add up at the team level, I won't argue with your assumptions.
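The square-and-sum step can be sketched as follows. The per-position spreads in this example are HYPOTHETICAL placeholders and are not the figures from my table: only the shortstop value of 16.5 plays is implied by the text (a leader at 2 SD being 33 plays from average), and the rest were picked only so that they satisfy the three conditions listed above and land near the team total.

```python
from math import sqrt

def team_spread(position_sds):
    """Combine independent per-position spreads (in plays) into a team spread."""
    return sqrt(sum(sd ** 2 for sd in position_sds.values()))

# HYPOTHETICAL spreads in plays per 2000 ground balls, for illustration only.
example_sds = {"SS": 16.5, "2B": 15.5, "3B": 15.0, "1B": 12.0, "P": 11.0}

team_sd_plays = team_spread(example_sds)
team_sd_rate = team_sd_plays / 2000

# A typical league leader or trailer is taken to be about 2 SD from average.
leader_plays = 2 * example_sds["SS"]
leader_runs = leader_plays * 0.75

print(f"{team_sd_plays:.1f} plays, rate {team_sd_rate:.4f}")
print(f"shortstop leader: {leader_plays:.0f} plays, {leader_runs:.2f} runs")
```

Any other set of spreads that combines to roughly 31.6 plays would serve equally well here.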
This does not mean, however, that we would expect the best or worst fielder to be 250 runs away from average over 10 years. Adding more seasons should increase the SD by the square root of the number of seasons. In other words, one SD for 16 seasons should be about 4 times as big as one SD for a single season. A measure like GB+ shows a single-season SD of about 30 plays, or double what we should get. I don't have any more on WOWY than what can be found at the link provided, but the results shown are consistent with the high SD I find with GB+. To my knowledge, Tango Tiger has never published single-year WOWY numbers. To further illustrate the wild spread of results for GB+, I'll share some of the highs and lows. The best season from 1996 to 2015 was by the Cardinals' Brendan Ryan, at +90 plays. In fourth place was Nomar Garciaparra at +60 for the 2002 Red Sox. This was two years before the Red Sox traded him mid-season because they thought they needed better shortstop defense to succeed in the playoffs. Jeter does have the worst season, -76 plays in 2000. The second worst is Pat Meares of the 1996 Twins. Jeter shows up a few times on the trailer list, as does Jimmy Rollins, who had a mix of good and bad seasons both as a young player and as an old one, with no explainable pattern.
As Tango Tiger noted in the opening paragraph of that post, "The 'spread' of the numbers is larger than I'd like, but, that's why we have regression." I fully agree with that. If the spread were that big we should see, with a league average around .735, some teams turn more than 80 percent of ground balls into outs, and others less than 65 percent. We don't. The best any team did was .777, the 2004 Cardinals, and the worst was .683, the 2015 Phillies.
One more metric
Let's try one more defensive metric. I'll look at GB+, but instead of using balls in play as the denominator I'll use ground balls (as coded in the Retrosheet data). As noted earlier, we only have complete data for batted ball types from 2003 on, so we can't look at Jeter's earlier seasons here. From 2003 to 2014, Jeter made 416 fewer plays than the average shortstop. This works out to 38 fewer per season (counting 11 seasons as 2013 is almost a blank for him), and a run value of -28 per season. This is very close to the earlier GB+, which had Jeter at -27 runs per season.
What is the advantage of looking at only ground balls? This allows me to look precisely at how many plays the other infielders made, compared to average, when he was in the field. If GB+ and other metrics largely based on the plays he didn't make at shortstop are accurately representing runs that he cost the team, then we should not see other infielders making up these plays. We should see an increase in balls that do not end up as outs. I could not do this with GB+, based on all balls in play, because there would be an obvious bias. Shortstops who make a lot of extra plays, per BIP, should on average have teammates who make a lot of plays in the infield. In addition, they should have outfield teammates making fewer plays than average, because we would expect these teams to play behind more ground ball pitchers.
Before I present the numbers for Jeter, there's one more probable bias to address. If the data show that Jeter's infield teammates were making more plays than average, does this mean that he's facing a skewed ground ball distribution? Not necessarily. The data might show something like that if Jeter was in fact costing his team 40 outs per year, but he played with other excellent infielders. The Yankees won so many games because other infielders were making up for Jeter's poor fielding, and they would have been even better had he been a competent fielder.
Here's what the data show. Jeter made 416 fewer plays, but other Yankee infielders made 278 more plays than average while he was on the field from 2003-14. Pitchers in particular were making a huge number of plays, 188 more than average. Would Jeter have made more plays if his pitchers had not made so many? Very likely he would have. We can't expect he would have made 188 more plays if his pitchers had been average, but we do know that those 188 plays are balls that turned into outs, and certainly did not hurt the team even though the shortstop did not make plays on them.
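The bookkeeping in that paragraph can be laid out in a couple of lines. This is only arithmetic on the figures quoted above; it does not settle whether the teammates' extra plays were redirected ground balls or simply good fielding.

```python
SHORTSTOP_PLAYS_VS_AVG = -416   # Jeter, 2003-14, from the article
OTHER_INFIELD_VS_AVG = 278      # other Yankee infielders with Jeter on the field

# Net change in ground ball outs across the whole infield.
net_plays = SHORTSTOP_PLAYS_VS_AVG + OTHER_INFIELD_VS_AVG

# Share of the shortstop deficit that shows up as extra plays elsewhere.
share_made_up = OTHER_INFIELD_VS_AVG / -SHORTSTOP_PLAYS_VS_AVG

print(net_plays, f"{share_made_up:.0%}")  # -138 67%
```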
What should be more telling as to the usefulness of this analysis is looking at how it works out for groups of players. If I say that Jeter did not cost his teams 416 plays because his teammates made up for two thirds of them, you might say that Jeter did indeed cost his team 416 plays, and was fortunate enough to play with very good fielding teammates. How about this: If I take a larger group of the best and worst fielding shortstops (in terms of plays above/below average), what should we expect for their teammates? If the shortstop plays made is an accurate reflection of the shortstop's defensive value, then I would not expect their teammates to be far from average. With a big enough group, we should have some of these extreme shortstops playing on good defenses, and some playing with bad defenses. This is what I find:
The average playing time for the best and worst shortstops in the multi-year data was about 5 seasons worth. The multi-season level results should be regressed less than single season data, as expected, but even at the multi-year level they should be regressed heavily.
What bothers me about the defensive ratings derived from looking at plays made is that the numbers seem far too extreme. It is just not easy to believe that a player could cost (in Jeter's case) or save his teams that many runs. My reservations about this do not mean that the numbers are wrong. The idea that Jeter cost his team 25 or 30 runs per year on average is an extreme claim. With such an extreme claim, I ask: what should we expect to see if this claim were in fact true about Jeter?
None of the three expectations were met. Yes, the Yankees played worse defense when Jeter was in the field, but not nearly as bad as the plays-made data suggest. His teammates did not entirely make up for the lost plays, but it appears that much of the deficit in plays at short was simply balls hit to other fielders. Team defenses have a standard deviation of about .016, and this is not nearly enough to make sense if shortstops were regularly 50 plays better or worse than average.
In a few months Derek Jeter will be inducted into the Hall of Fame. He fully deserves to go in, despite his below-average defense. By both Fangraphs and Baseball-Reference, he has a bit over 70 wins above replacement, even with the bad defensive ratings worked in. He made up for the defense in other ways. He was a fine hitter, an excellent baserunner, and his constant hustle allowed him to beat out extra infield hits and stay out of double plays. If I had been running the Yankees I would have tried to move him to another position, the typical career path for good-hitting, poor-fielding shortstops. The 2004 season seemed like an obvious opportunity, as they acquired the reigning Gold Glove winner at shortstop and in doing so opened up a spot at second base. Jeter's weaknesses on defense were that he didn't react as quickly as other players, and his arm wasn't as good as the top shortstops'. With those weaknesses, I don't think he would have been a good fit for the hot corner, and if those were the only two options, it might have been the correct decision to leave him at short and play A-Rod at third. At second base, his arm would not have been an issue and his speed and solid fundamentals would have, in my opinion, enabled him to play solid defense there. The window closed a year later, however, as the Yankees came up with a very good second baseman.
This page was last modified 12/19/2019