Using and Abusing Statistics–Baseball Edition

I like statistics and I like baseball, but the way I approach baseball stats might be a bit different from the way most other people do.

Pitchers used to be evaluated on the basis of wins and losses.  Then along came pitchers like Nolan Ryan, who were ludicrously good but had lousy win-loss records, and people started realizing that wins and losses make for lousy stats and started looking at alternatives.  By and large, that was a good thing–but with a caveat that people have forgotten.

More recently, people started realizing that some pitchers with ludicrously good ERAs are not that good, while others with lousy ERAs are better than their numbers.  Along came more advanced stats like SIERA and FIP.  By and large, this is probably a good thing–but again, with a caveat that people forget.

The caveat in both cases is that the objective in baseball is winning.  Even if you allow an average of just one run every nine innings, if you keep losing, you have still lost.  So winning is a perfectly valid way of measuring a baseball player’s performance.  It is, indeed, the only measure that is actually meaningful.  Everything else is secondary.

The problem is that there are 25 players on a major league roster, so the contribution of a single ballplayer to a win is conditional.  Steve Carlton, on terrible Philly teams, was more valuable, relatively speaking, than he would have been on a good team, even when he lost 20 games (as in 1973).  So how valuable would a single player be on another team, ceteris paribus?  Answering this involves constructing counterfactuals, which is something statistics–the real statistics–is supposed to be good at, since it came out of the experimental research tradition.  But it requires more complex thinking than most users of data, baseball and otherwise, seem interested in consuming, since it often cannot reduce performance to a single set of numbers.

Personally, I think ERA is still the best single set of numbers for evaluating pitchers, because of the ease of interpretation that it allows.  A pitcher with an ERA of 3 on a team that averages 4 runs a game is, on average, a winner, while the same pitcher on a team that averages 2 runs a game will, on average, be a loser, assuming that everything except the average offense (e.g. fielding, bullpen quality, etc.) stays the same.  That’s a bad assumption, obviously, but it avoids the even more egregious and troublesome assumption behind measuring pitchers by their win-loss records:  that everything, including offense, is the same–except, that is, the pitcher.
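To put rough numbers on the ERA-versus-run-support comparison above, here is a minimal sketch.  It swaps in Bill James's Pythagorean expectation, which is my choice for illustration and not something the argument above depends on.  It also assumes that the pitcher's ERA stands in for all the runs the team allows per game (no unearned runs, no bullpen), which is exactly the kind of "everything else stays the same" assumption discussed above.  The function name and the exponent of 2 are illustrative.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Rough expected winning percentage from average runs scored and allowed.

    Uses the Pythagorean expectation RS^k / (RS^k + RA^k); k = 2 is the
    classic choice and is an assumption here, not a fitted value.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Assumption: the pitcher's ERA of 3 is treated as the team's runs allowed.
era = 3.0
for offense in (4.0, 2.0):
    pct = pythagorean_win_pct(offense, era)
    print(f"offense averaging {offense:.0f} runs: expected win pct {pct:.3f}")
```

Under these assumptions the same pitcher comes out at about .640 with the 4-run offense and about .308 with the 2-run offense, so the pitcher has not changed at all while his record flips from winner to loser.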

Note that one can actually do a bit better at evaluating a pitcher, even with just ERA or win-loss record, by incorporating better statistical methods that don’t reduce everything to a single number.  Pitching performance and everything else are random variables:  the offense might score an average of 5 runs a game, but with a variance of 2, say.  The pitcher may give up an average of 2 runs a game, but with a variance of 2.  Another lineup may score an average of 4 runs a game, but with no variance whatever.  Another pitcher might give up 3 runs a game, but with a variance of 0.  The second pitcher always wins in front of the second lineup.  The first pitcher might be better on average, but he might lose, even in front of the second lineup.  But if you have the first lineup, and if the pitching and hitting performances are independent (they might not be–personal catchers and all that), perhaps you might want the first pitcher rather than the second–or perhaps not, depending on the distribution (which may not be normal).  Of course, this is a baseball application of the “tall Hungarian” problem.  A high variance distribution allows for gambling in a way that low variance distributions do not–whether you choose to gamble depends on the circumstances.  Sometimes, gambling is the only way–and occasionally, it pays off.
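The four pitcher/lineup matchups above can be worked out in a few lines.  This is only a sketch under loud assumptions: runs scored and runs allowed are treated as independent, normally distributed, continuous quantities (real run totals are integers and, as noted, may not be normal), and a "win" simply means scoring more than you allow.  The function and variable names are made up for the illustration.

```python
import math


def win_probability(off_mean: float, off_var: float,
                    pit_mean: float, pit_var: float) -> float:
    """P(runs scored > runs allowed), assuming independent normal run totals."""
    diff_mean = off_mean - pit_mean   # mean of (scored - allowed)
    diff_var = off_var + pit_var      # variances add under independence
    if diff_var == 0:
        # No randomness at all: the outcome is decided by the means alone.
        return 1.0 if diff_mean > 0 else 0.0
    z = diff_mean / math.sqrt(diff_var)
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# (mean, variance) pairs taken from the paragraph above.
lineups = {"first lineup": (5.0, 2.0), "second lineup": (4.0, 0.0)}
pitchers = {"first pitcher": (2.0, 2.0), "second pitcher": (3.0, 0.0)}

for lineup_name, (o_mean, o_var) in lineups.items():
    for pitcher_name, (p_mean, p_var) in pitchers.items():
        p = win_probability(o_mean, o_var, p_mean, p_var)
        print(f"{pitcher_name} with {lineup_name}: {p:.3f}")
```

Under the normality assumption, the second pitcher wins with certainty in front of the second lineup, while the first pitcher wins only about 92% of the time there.  In front of the first lineup the order reverses, though narrowly: roughly 93% for the first pitcher against 92% for the second.  Change the distributional assumption and that narrow edge could easily go the other way.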

Further incorporation of additional variables (fielding, relief corps quality, ground ball/fly ball ratios, and all that) will further reduce the variance, but will it completely eliminate the uncertainty?  Sometimes a Mark Lemke hits a grand slam after all, and an Omar Vizquel boots a grounder after all.  You don’t want to intentionally put Mark Lemke in a spot where he HAS to hit a home run–that would be silly.  But risks and gambles are what make baseball interesting, and betting on high variance/low mean is sometimes exactly what you must do to win–even if you will probably lose your gamble.
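Here is a small sketch of why the high variance/low mean bet can be the right one.  The numbers are entirely hypothetical and outcomes are assumed to be normally distributed, which is a convenience and not a claim about real players.  The "steady" option has the higher mean and small spread; the "gamble" has a lower mean and large spread.  What matters is how much you need.

```python
import math


def tail_probability(mean: float, sd: float, threshold: float) -> float:
    """P(outcome > threshold) for a normal outcome with the given mean and sd."""
    z = (threshold - mean) / sd
    # Upper tail of the standard normal distribution.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Hypothetical options: (mean, standard deviation), e.g. in runs produced.
steady = (3.0, 0.5)   # higher mean, low variance
gamble = (2.5, 2.0)   # lower mean, high variance

for needed in (2.0, 5.0):
    p_steady = tail_probability(*steady, needed)
    p_gamble = tail_probability(*gamble, needed)
    print(f"need more than {needed:.0f}: steady {p_steady:.4f}, gamble {p_gamble:.4f}")
```

When you need only a little, the steady option is far more likely to deliver.  When you need more than either option produces on average, the steady option almost never gets there and the gamble, while still a likely loser (about 1 in 10 here), is the only realistic chance.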

Now, being able to add more variables and reduce “errors” means that you will be able to make better, safer gambles, but that is hardly a sure thing.  An interesting observation that has been made about investments in risky assets is that the more data-intensive the research and analyses have become, the smaller the arbitrage opportunities have become:  not shocking, since, if an opportunity is obvious, people will grab on to it and pay a premium for it.  The consequence is that people are taking on more risk, because it is easier to bet on getting lucky than on being good–all the obvious answers have already been addressed.  I don’t know if this tradeoff is as well understood as it should be:  (relative) success is increasingly a sign of luck rather than skill.  But, at least when it comes to sports, we want to see the lucky as much as we do the skilled.  You don’t expect a nobody to get the walkoff hit that wins a playoff series, but that happens often enough.

The bottom line is twofold.  First, all useful statistics are conditional (or Bayesian in a sense).  Unconditionally good stuff gets arbitraged away fast–especially since unconditionally good stuff is obvious, even if you don’t know high powered stats.  Good players, good tactics, and good approaches are good only if they are good for the situations that you need them for, which are almost certain to vary from team to team.  The real value is not in knowing that player X has a WAR of 2, but in knowing how to best use a -2 WAR player (for another team, given how he was used there) to get positive wins out of him for your team.  This can be tackled statistically, but not by calculating a single number that putatively captures his entire value.  Second, the value of a player is spread out over the entire season.  A player’s performance at any one time is variable, a gamble, a lottery ticket.  You invest in probabilities, but sometimes General Sedgwicks get shot at improbable distances.  Working with probabilities and statistics CAN improve your chances at the gamble, but this is two dimensional–do you want to win big, at a big risk, or do you want to win small, at a small risk?  This comes with the additional proviso, of course, that your understanding of the universe is limited.  The lack of appreciation for risk and uncertainty is usually how one lies with statistics, or surprises the Belgians with unexpectedly tall Hungarians.

