We can tell from observing baseball games that teams have different distributions of batted balls. One absolutely cannot tell, by watching, the difference between a .300 hitter and a .275 hitter. The difference is one hit every two weeks. It might be that a reporter, seeing every game the team plays, could sense the difference over the course of the year if no records were kept, but I doubt it. A fielder’s visible fielding range, which is his ability to move to the ball after it is hit, is vastly less important than his invisible fielding range, which is a matter of adjusting his position a step or two before the ball is hit.

What of systemic observation, though? If data providers were able to furnish reports with a high level of accuracy, we would see consistency between datasets. In reality, data providers can and will differ substantially even in the aggregate. Mitchel Lichtman compared rates of ground balls and fly balls and found differences there as well. A lack of accuracy in and of itself is worrying, but in theory it ought to be overcome by a larger sample size. That the disagreement persists in the aggregate suggests that there’s some sort of bias preventing the data sources from coalescing around similar figures over a long time span.

One potential source of bias is the position of the observer when recording the data. The data is collected by stringers seated in the press boxes at all 30 major league parks. Press box heights range from a low of 38 feet in Oakland to a high of 92 feet in Pittsburgh. As it turns out, there is a relationship between press box heights and line drive rates as measured by Gameday stringers. While the effect seems small, it can be significant in practice. Consider the two Chicago parks. U.S. Cellular Field has a press box height of 47 feet, while at Wrigley Field the press box is 67 feet off the ground. Given that, we can expect a player with a line drive rate of 22 percent at the Cell to have a line drive rate of 26 percent at Wrigley.

Having systemic differences between batted ball rates that persist over
long periods of time is one thing that may be preventing different data sources from reaching agreement even in larger samples.

Some of the broadcast cameras are located in or at the level of the press box. Sportvision has a set of dedicated cameras in the ballpark to collect very precise estimates of the pitched ball, typically within about an inch at home plate. After all, this is not simply evidence of a lack of accuracy but of a systemic difference in how batted balls are recorded.

The simplest explanation is that stringers are unable to score the absolute position of a batted ball. Instead, the stringer determines the position of the batted ball relative to one or more landmarks, and uses that to assign the position of the ball on the field. What this data suggests is that one of the landmarks the stringers use is the position of the fielders themselves.

The cure for the first problem is increased sample size, but that has the perverse effect of exacerbating the second problem. We know there are biases and discrepancies, but we are unable to evaluate each particular method to see exactly the magnitude of those effects.

Batted ball data has become so pervasive that it is difficult to conceive of evaluating fielding without it. If we allow ourselves to take a step back, though, we can recall that there were efforts at fielding analysis before we had batted ball data. Is there anything there that might shed some light on how best to evaluate fielding? Once again, we turn to Bill James, who was dissatisfied with his own range factor, as well as attempts to improve upon it.

Batting success of individuals may be successfully related to team wins because there is a natural relationship between individual batting statistics and team success. Pitching statistics of individuals may be successfully related to team wins because there is a natural relationship between individual pitching statistics and team wins. But fielding statistics of individuals are difficult to relate to
team wins because there is no natural relationship between individual fielding statistics and team success. How do you fix that? You fix that by starting in a different place. You don’t start with the individual fielding statistics. You start with the performance of the team. First, before you do anything else, establish the overall defensive quality of the team.

This observation has languished largely unnoticed, partly because the advent of systems relying on batted ball data seemed to supersede it and partly because it was part of his arcane and confusing win shares system. The fundamental insight behind fielding win shares was buried in baggage like claim points and other mathematical gyrations, and separating them would take a fair amount of effort.

James’s proposal turned traditional fielding analysis on its head. To that point, fielding analysis began with what a player did and then proceeded to a comparison with that player’s peers. This is a key insight because in looking at the totality of a team’s fielding chances and the distribution thereof, we can see when a player’s fielding performance is actually helping his team save runs, as opposed to simply taking fielding chances that would otherwise be going to another player. With a pitching staff that was heavy on strikeouts and ground balls, the outfielders simply didn’t see many flies, but they largely caught what came their way. This allows us to separate ability from opportunity.

The biggest indicator that a player is simply hogging chances rather than contributing is if his performance comes when the team itself is having little defensive success. We start by comparing infield to outfield and then break down between the positions in each unit. If the outfielders are making a lot of plays as a unit, but the team’s defense is not appreciably above average, that’s an indicator outfielders are simply getting more balls hit to them than would normally be the case. Conversely, if a team’s outfielders are making relatively
few plays and the team’s defense isn’t noticeably suffering as a result, the more likely explanation is fewer chances, not poor performance. Once we know what has been hit where and how often, we can make a reasonable estimate of an expected number of plays on those balls. At that point, it is much easier to isolate individual player skill.

Consider hypothetical versions of the Yankees and the Cardinals. The Yankees have Derek Jeter at shortstop. The Cardinals have Ozzie Smith. Both teams see an equal number of defensive chances, with a typical distribution of balls to each position. In most seasons, a vastly greater percentage of balls hit toward the Yankees’ shortstop position are going to go for hits based on the disparate abilities of the shortstops.

Biases will keep batted ball data from coalescing to similar conclusions over large sample sizes, so we know that increases in sample size will not necessarily improve our understanding. Metrics that eschew batted ball data don’t have that same concern. Consider, for instance, Derek Jeter. In his career he’s played behind 181 different pitchers, in dozens of different parks, and alongside 51 different third basemen and 35 different second basemen.

Derek Jeter is one of the most controversial players when it comes to fielding analysis. He’s well regarded by many for his fielding prowess at shortstop, having won five Gold Gloves in his career, but practically every accounting method of fielding prowess based on the numbers disdains his glove. That means that in Jeter’s 9,710 fielding chances, he’s made 41.5 fewer errors than the average shortstop in the same number of chances, or 2.4 fewer errors per year. Flashy plays are more likely to stick with voters and fans than an extra error here or there. The key to evaluating Jeter’s fielding isn’t what he does with balls he fields, but counting how many balls he gets to in the first place. That’s almost 1,000 fewer chances over his career, or 59 fewer chances per season. Derek Jeter’s
skill at avoiding errors is fundamentally a skill at avoiding baseballs.

Why don’t we notice this shortage of fielding chances? We start off watching the pitcher, then the batter, and then finally we watch the fielder. But before we finally turn our attention to the shortstop, he’s already positioned himself, read the ball coming off the bat, and started moving to where he thinks the ball is going to be. All of that is a vital element of fielding, and all of it happens while we’re watching someone else. Once we have finally turned our attention to where the fielder is, a ball to which he has reacted poorly during those crucial first few seconds will seem as if it had been out of his range all along.

The utility of batted ball data, over thousands of balls in play over the course of a career, is greatly diminished. Given a large sample, we expect the range of expected outs for different players at the same position to diminish substantially. Having metrics that work well for careers but less for individual seasons is an unsatisfactory solution, but it’s preferable to having metrics that work well for neither. Waiting upon commercial data providers of any stripe to solve our analysis problems for us means turning away from the spirit of inquiry that defines sabermetrics. As a Russian proverb tells us, we pray, but we also keep rowing toward the shore.

I know people say, “Well, you’ve got to score runs,” but you’ve got to stop them before you can score runs. Baseball has clearly delineated how it values each kind of player by their vastly divergent levels of compensation and job security, but is it correct? What is the relative value of offense and defense, and how does it affect team building?

Pretend you’re the general manager of a team that finished last season with a .500 record. What’s more, your team scored exactly as many runs as it allowed, 700 in both cases. Your third baseman, Aaron Average, has decided to retire and you need to find a replacement for his precisely average
performance. You have two candidates: Bobby Bats and George Gloves. Either one will cost you $5 million. Bobby Bats will contribute 20 more runs offensively than Aaron Average, and will match him defensively. George Gloves will save 20 runs more than Aaron Average would have on defense, while contributing identical value with the bat. Which player should you select?

The easiest way to answer this question is to use baseball’s version of the Pythagorean theorem. Originally developed by Bill James, it uses runs scored and runs allowed to estimate a team’s winning percentage. With 700 runs scored and 700 allowed, it gives last season’s team an expected winning percentage of exactly .500. It’s funny how contrived examples work out, isn’t it? Now let’s look at the team’s expected winning percentage next season under each scenario. For a balanced team, each run saved is slightly more valuable than each run added.

What about teams that aren’t as balanced? Have we uncovered a fundamental principle for team building? The Pythagorean theorem is just an estimate of how many games a team will win based on their runs scored and runs allowed. There is some benefit to tailoring your approach based on your overall team quality, but it’s far from clear that it makes sense to base your entire strategy around it. A far more useful plan is “Get better any way you can.” Score one for the obvious.

How do we go from that flash of brilliance to actually deciding which players to acquire? The best approach is to look at total value by combining hitting, base running, and fielding. The traditional approach to figuring a position player’s value is to tally up how he compares to average in batting, running, and fielding, add those numbers together, adjust for position and playing time, and voilà! This process is not perfectly accurate, because it assumes all runs are created equal, but it’s a useful starting point for discussion. Looking at the two biggest areas of impact for a position player, hitting and fielding, we see a big difference in ability to accurately measure value.
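
The Bobby Bats versus George Gloves comparison above can be worked through in a few lines of Python. This is only a sketch under stated assumptions: the text does not give the exponent, so the code assumes the exponent of 2 from James's original formulation, and the 162-game season used to convert percentages into wins is likewise an assumption. The function and variable names are illustrative and do not come from the text.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate winning percentage from runs scored and runs allowed.

    The exponent of 2 is an assumption (James's original form);
    the text itself does not state which exponent it uses.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


GAMES = 162  # assumed season length, used only to express results in wins

# Last season: 700 runs scored, 700 runs allowed.
baseline = pythagorean_win_pct(700, 700)

# Bobby Bats adds 20 runs on offense; George Gloves saves 20 on defense.
with_bats = pythagorean_win_pct(720, 700)
with_gloves = pythagorean_win_pct(700, 680)

scenarios = [
    ("Aaron Average", baseline),
    ("Bobby Bats", with_bats),
    ("George Gloves", with_gloves),
]
for name, pct in scenarios:
    print(f"{name}: {pct:.4f} ({pct * GAMES:.1f} expected wins)")
```

Under these assumptions the two candidates come out at roughly .5141 and .5145, so the glove man is ahead by well under a tenth of a win over a full season. That is consistent with the claim that a run saved is only slightly more valuable than a run added for a balanced team.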
