An ESPN blog post has an interesting set of insights about how the use of data is transforming strategy in baseball. The short version: people are substituting risk for “data,” so to speak.
Now, that is a bit misleading: people have always used data to evaluate risk and to take risk intelligently, with or without big data. Mickey Mantle and Sandy Koufax were so good that you didn’t need data to tell you they were great ballplayers. You use data to systematically identify and evaluate the risk of the ballplayers who are not so obviously talented. Traditional scouting, in a sense, does the same thing: ballplayer X “looks like” Koufax based on qualities that may not be easily quantifiable, so I am willing to take the risk that he is a good ballplayer like Koufax. Of course, the chances are considerable that X is not Koufax, and the disadvantage of non-quantification is that you can’t retrospectively evaluate with precision where you went wrong, whether about X or about Koufax.
The ability to go back and retrospectively evaluate with some precision where things went wrong is the critical component of what makes something “science.” It is not necessarily impossible without quantification, but the limits of “precision” are reached quickly without the numbers. Yes, X did not look so much like Koufax after all, upon closer inspection…but how exactly NOT like Koufax was X? Numbers help pin down these differences with precision, whereas non-numerical differences, even if real, are harder to describe.
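The point above can be made concrete with a toy sketch: once the comparison is quantified, “how unlike Koufax” a prospect is becomes a precise, per-quality statement rather than an impression. Every stat line and spread below is invented for illustration, not a real career figure.

```python
def standardized_gaps(player, benchmark, spreads):
    """Per-stat gap between a player and a benchmark, expressed in units
    of the stat's typical spread across the league (a crude z-score)."""
    return {stat: (player[stat] - benchmark[stat]) / spreads[stat]
            for stat in benchmark}

# Hypothetical per-nine-inning rates (strikeouts, walks, home runs).
koufax_like = {"k9": 9.3, "bb9": 3.2, "hr9": 0.7}   # the benchmark profile
prospect_x  = {"k9": 7.8, "bb9": 4.1, "hr9": 0.9}   # the "looks like Koufax" player
league_sd   = {"k9": 1.5, "bb9": 0.9, "hr9": 0.3}   # assumed league spreads

gaps = standardized_gaps(prospect_x, koufax_like, league_sd)
for stat, z in sorted(gaps.items()):
    # A negative k9 gap means fewer strikeouts than the benchmark, and so on.
    print(f"{stat}: {z:+.2f} SD")
```

The retrospective payoff is exactly the one the paragraph describes: when X disappoints, the per-stat gaps say precisely where he was never Koufax in the first place.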
The problem with sabermetrics and the Big Data mentality in general, though, is that it dispenses with the science to a large degree. Is Danny Espinosa really more valuable than Johnny Giavotella by 2–3 games because fWAR says so? That is, will the Angels win 2–3 more games by replacing Giavotella with Espinosa, all else held equal? It is highly unlikely. fWAR may be a useful aggregate statistic, but how exactly it is computed is not clearly known (bWAR is more easily calculable, but there is no way to tell whether it measures actual “wins above replacement,” since it is just a patchwork of existing baseball stats), and even when it can be calculated, there is no way to tell whether it means what it claims to mean without a sense of the statistic’s imprecision, both in general and as applied to Espinosa, Giavotella, and the Angels specifically. Of course, this imprecision is the source of both risk and opportunity: fWAR says something, but we still don’t know exactly what will happen, and there is no way to evaluate it beforehand. So Eppler, the Angels’ GM, can take a chance that Espinosa will outperform his fWAR, and Rizzo, the Nationals’ GM, that he will not. That (subjective evaluation | Angels) > (subjective evaluation | Nationals) implies there is a profit to be made from the trade on present information, although, when the full truth is revealed, one side will very likely have lost relative to the other. (Are the prospects the Angels sent to the Nationals, who looked terrible last year, any good in next year’s reality?) The Angels are willing to pay more than what the Nationals consider Espinosa to be worth. So both sides win, until the truth is revealed, that is, by the end of the 2017 season.
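The trade logic in that inequality can be sketched in a few lines. The valuations below are invented round numbers, not anyone’s actual projections; the point is only that the trade clears precisely because the two front offices’ subjective numbers differ.

```python
def trade_makes_sense(buyer_value, seller_value):
    """Both sides 'win' on present information when the buyer's subjective
    valuation exceeds the seller's; the gap is the perceived surplus."""
    surplus = buyer_value - seller_value
    return surplus > 0, surplus

# Hypothetical valuations of Espinosa, in expected wins added for 2017.
angels_view    = 2.5   # Eppler's bet: Espinosa outperforms his fWAR
nationals_view = 1.0   # Rizzo's bet: he does not

ok, surplus = trade_makes_sense(angels_view, nationals_view)
print(ok, surplus)
```

Once the season ends, one of the two inputs turns out to have been wrong, and the “surplus” is revealed to have belonged to only one side.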
In other words, evaluating risks and gambling on them is, in a sense, fundamentally the opposite of what people usually think they are gaining when they do “science.” If the risks and gains can be made more precise through systematic evaluation of data, the former can be minimized and the latter maximized. You can identify a player who is like Koufax in every quantifiable manner and, if no one else knows the magic formula, lock him up before anyone else gets in on the act. You pay no penalty for the risk and reap all the gains. But once everyone knows the formula, competition will wipe out any gain you stand to make. You have to recognize the differences between X and Koufax and ask yourself how big a gap they imply for the outcomes expected from each. You may quantify these differences, you may subject them to complex models, or whatever. In the end, the answer is not precisely known; if it were, there would be no risk. So the choice becomes subjective.
One great irony is that, once the bounds of certainty are reached, there is no real gain to be had from quantification, at least in the short term. You may crunch numbers and churn out models, but what you don’t know, you still don’t know. You go through the number-crunching motions, at least in my experience, to satisfy your audiences, to give them the impression that you know what you are doing, even if you don’t, at least not precisely. You are no better than a traditional scout or, indeed, a fortune teller or witch doctor. At best, you are making an educated guess. At worst, you are guessing randomly. This explains why there are neither true believers nor true atheists in the trenches: you know what you know, and you know what you don’t. You minimize risk and maximize reward when you can, and you take a leap of faith when you can’t.
What you gain by having a theory–again, quantified or not–is that you’d know what to do when you are wrong. The theory, if properly concocted, would have given you an understanding of what to expect from different moving parts. The results that you expected did not obtain because at least some of the moving parts did not move as you expected. So what parts of your theory did not work? Again, quantification helps make things precise, but it is neither necessary nor sufficient.
The trouble starts when the numbers are no longer part of “science” but mere formulas whose value is simply that “they work,” just because. So what if they fail? What if a pitcher with a terrible FIP keeps pitching like an ace day in and day out? Some sabermetrics fans will keep insisting that the pitcher is wrong because he is really terrible, because FIP says so. But in the end, baseball games do have winners and losers, and that’s final, no matter what FIP says.

Of course, I can’t help but insert some political commentary here: in 2012 and again in 2016, the losers and winners of presidential elections alike went off blaming and crediting “big data” for the outcomes, with the losers degenerating into outright denialism and conspiracy theories. This is a sign of data being abused in the service of cargo-cultish pseudoscience, exactly the opposite of science. As described above with regard to sabermetrics, a science that works too well can breed this kind of attitude: up to some point, better application of existing science CAN simultaneously raise the mean and lower the variance. When the boundary of the science is reached, the only way forward is to gamble, because you don’t know.

In a sense, this is a far harder challenge for the social sciences than for the natural sciences, because the boundaries of uncertainty are far too close. How much starlight is deflected as it passes the sun is too esoteric a thing. On the other hand, while the vast majority of people may hold a party identification and vote it, the exceptions are numerous enough, and important enough in elections, that the “theory” (that people are partisan and vote party) is of little use except as an abstract starting point. Whether Jered Weaver or David Cone, in their heydays, were great pitchers, notwithstanding their terrible FIPs, is, I suppose, somewhere in between. As Enrico Fermi said, when results confirm hypotheses, you have made a measurement; when results contradict them, you have made a discovery.
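For what it’s worth, FIP itself is plain arithmetic over a pitcher’s “defense-independent” outcomes, which is part of why it is so easy to treat as an oracle. The league constant varies by year (roughly 3.0–3.2); 3.10 below is an assumed round value, and the season line is invented for illustration.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching: (13*HR + 3*(BB+HBP) - 2*K) / IP + C,
    where C is a league constant chosen to put FIP on the ERA scale."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Invented season line: modest strikeouts, plenty of walks and home runs.
# The formula calls this pitcher bad; the standings may disagree.
print(round(fip(hr=28, bb=55, hbp=6, k=110, ip=180.0), 2))
```

Nothing in the formula knows who won the games, which is exactly the gap the paragraph above is pointing at.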
Sometimes, discoveries are good things.