Baseball, Information, and the Brain

This FanGraphs article is possibly the most fascinating thing I have ever read about the neuroscience behind quick decision-making.

The problem with seeing the ball out of a pitcher’s hand, obtaining some information, and translating it into a reaction is that the information is usually too complex and too wrapped in uncertainty, and the time available is too short.  The article is probably fair in saying that most batters cannot really describe or explain what they see or how they process the information.  It is not a deliberate “analytical” process, but it is still a reaction that is both learned and “analytical,” if in a slightly different sense: the batter carries a fairly small set of probable reactions, built up through experience, analysis, and “instinct,” into which he can switch rapidly.  A set of mental shortcuts, if you will.

A useful analogue might be parties in politics.  There are just two bins, or four, depending on how one conceptualizes the universe: liberals and conservatives, Democrats and Republicans.  Most politics fits into these categories, or some combination thereof.  If something fits none of them, the brain is confused in the short term, and without an obsessive interest in figuring things out (and that kind of interest is rare in politics, especially since it requires leaving one’s opinions behind), it is not worth delving into too deeply.  So most people operate through a two-step process: does it fit the usual boxes of politics, and if it doesn’t, do I care?  The answer to the latter question is usually a big “no.”

The same is true of hitting a baseball and, presumably, of most other activities requiring a quick reaction: nobody who is any good is so simple-minded as to have just one box, so to speak.  But most people will have just a few boxes, which, thankfully for them, account for most of the universe.  (The same applies to sabermetrics: most of the usual boxes yield predictable results.  A high frequency of fly balls, for example, probably means the pitcher is not as good as his ERA indicates; that is the idea behind FIP.)  But if the expectations can somehow be subverted, you can fool the hitters.  While a strange politician who is neither exactly liberal nor conservative, neither Democrat nor Republican, will confuse the voters and lose elections (because confused voters don’t vote), getting batters confused is a useful skill if you are a pitcher.  All the better if you can confuse the sabermetricians along the way, because then your methods might be so complex that the batters won’t be able to adjust to you easily either.
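For the curious, here is a minimal sketch of the FIP idea in Python.  The formula is the standard one; the league constant (around 3.1) varies by season, and the hypothetical stat line below is made up for illustration.

```python
# A minimal sketch of the FIP idea: strip out balls in play and credit
# the pitcher only for the outcomes he controls most directly.
# The league constant (~3.1) varies by season; the stat line is made up.

def fip(hr: int, bb: int, hbp: int, k: int, ip: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# A hypothetical fly-ball pitcher with a shiny 3.20 ERA: the component
# stats suggest he has been lucky on balls in play.
print(f"FIP: {fip(hr=25, bb=50, hbp=5, k=140, ip=180.0):.2f}")  # ~4.27
```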

Theory, like Fog on Eyeglass…

There is something interesting about where the data-journalist types were led astray in the 2016 primaries: it was not (as much) the data.  Rather, it was the theory of politics that many of them began their coverage with, especially the now-infamous book The Party Decides.  In fact, while I have not examined the data systematically, a quick glance through the “polls plus” versus “polls only” forecasts at fivethirtyeight.com provides a potentially quantifiable clue to this problem: polls plus, which took into account institutional effects (e.g. endorsements), seems to have been systematically less accurate than polls only, which agnostically took the polling data as the guideline.
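One way that clue could be made quantifiable, sketched below with invented numbers (these are not FiveThirtyEight’s actual probabilities), is to score each forecast series against the outcomes with a Brier score, where lower is better.

```python
# A hedged sketch of how one might test the "polls plus vs. polls only"
# hunch: score each set of forecasts against what actually happened with
# a Brier score (lower is better).  All numbers here are invented
# placeholders, not FiveThirtyEight's actual figures.

def brier(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error of probabilistic forecasts against 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical win probabilities for one candidate across four primaries.
polls_plus = [0.40, 0.55, 0.30, 0.70]  # discounted for endorsements, etc.
polls_only = [0.60, 0.65, 0.45, 0.75]  # taking the polling at face value
actual     = [1,    1,    0,    1]     # 1 = the candidate won

print(f"polls plus: {brier(polls_plus, actual):.3f}")  # ~0.186 (worse)
print(f"polls only: {brier(polls_only, actual):.3f}")  # ~0.137 (better)
```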

This should be good news for “science,” but it will come as a black eye instead.  We have systematic, quantitative data that clues us in on where our previous theory, the one on which The Party Decides was grounded, went wrong.  This gives us the opportunity to rethink, refine, and update the theory, which is exactly how “science” is supposed to work hand in hand with data.  Instead, however, it will be accepted as evidence that the old theory, mistakenly accepted as “political science,” was just wrong and provided misleading clues about the future.  Rather than an opportunity to rethink, refine, and update, it will be used as a means to further denigrate the social sciences, already under attack as it is.

The title of this post was poached from an essay by Stephen Jay Gould, who, in turn, took it from a Charlie Chan movie.  The context in which Gould used the quote was the use of data by racial “science” and eugenics in the early 20th century, not an insignificant target, since racial science and eugenics drew the interest of some of the best classical statisticians of the period, and much of their most important work took place in that context.  Gould argued that, because these people were so enamored of their theories, they found increasingly creative ways to fudge the data and analyses when the data did not fit their worldview; thus the fog of theory obscured the insights offered by the data.  Of course, this is true in every empirical context: the old-fashioned baseball people were not stupid.  They had good reasons to subscribe to their “theories” about baseball, backed by some data, but without a systematic understanding of data analysis, they had little or no opportunity to rethink, refine, and update those theories, until many of them were just barely better than random chance.  Some of the more ardent sabermetricians were, I think, truly barbarians at the gates who did not know nor care much about baseball, only about data, and concocted whatever looked fine statistically; as a baseball fan, I am happy that they did not win out.  Their attitude, however, has still prevailed in some corners of sabermetrics and data analysis in general, as the current popularity of “data science” attests.

There is a substantial amount of blame to be laid on the social scientists in this failure, too: they did not approach the problems scientifically, although, with all the pressure to get things “right,” a more leisurely approach of analyzing “lessons learned” was perhaps never on offer.  This is unfortunate: the success of the US Navy during World War 2, in 1943 and beyond, came from thorough “lessons learned” research after the disastrous battles of 1942.  Failures help us learn.  Throw away failures, and we learn nothing.  Unfortunately, if failures are used as the criterion for dismissal and punishment, we will throw away all our failures and learn nothing.

Misuse of Statistics Begetting More Misuse

I am ambivalent, if you hadn’t guessed already, about the popular use of statistics.  Daniel Kahneman, among others, showed that people are inherently bad at statistical thinking.  My contention on this point has been that it is not simply that people don’t understand the basic statistical concepts, but that their subtle implications run counter to how people normally think.  The power of statistics is that it summarizes complex data into a neat, simple package, PLUS it provides a summary statistic of how wrong that package might be, conditional on the sample.  Everyone loves the simple package, not the fine print that comes with it.
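A minimal sketch of the package and its fine print, with made-up numbers:

```python
# The neat package and its fine print, on made-up data.  The mean is the
# simple summary everyone loves; the standard error and the interval are
# the fine print that says how wrong the summary might be.
import statistics

sample = [0.271, 0.305, 0.244, 0.289, 0.312, 0.258, 0.297, 0.233]  # invented batting averages

mean = statistics.mean(sample)
se = statistics.stdev(sample) / len(sample) ** 0.5  # standard error of the mean

print(f"the package:    {mean:.3f}")
print(f"the fine print: +/- {1.96 * se:.3f} (approx. 95% interval, conditional on this sample)")
```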

This article on FiveThirtyEight strikes me as emblematic of exactly this sort of seemingly profound misuse of statistics.  I don’t doubt that, in general, pitchouts, sac bunts, and other self-sacrificial baseball strategery have been fairly ineffective.  But I also find it hard to believe that they are universally ineffective, period.  The more likely scenario is that the use of such strategery has included a small proportion of useful applications mixed in with many ineffective ones, with the situations difficult to disentangle through simple statistics.  Thus, the average sac bunt in the historical sample was indeed ineffective, but with a small but meaningful subset, if only it could be identified, that was effective.

This is not an argument for a blanket condemnation of sacrifice bunts, then, but a call for better, more nuanced research.  What are the effective uses of sacrifice bunts?  How do you bunt more in the situations where it is actually valuable and avoid it where it is counterproductive?  One might suspect, in fact, that if the baseball people are not as stupid as naive sabermetricians assume them to be, and are actually combining their traditional skills with sabermetric insights productively, the ineffective sacrifice bunts have been eliminated from the sample much faster than the effective ones.  Has the value of sacrifice bunts been changing over time, then?  One might think it should be, and we would be able to tell, if we developed a useful methodology for distinguishing effective sacrifice bunts from ineffective ones, rather than trying to place a value on the average sacrifice bunt, the way so much misuse of statistics winds up doing.
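A toy simulation, with entirely invented run values, makes the point concrete: the average can look bad even when a genuinely valuable subset is hiding inside it.

```python
# A hedged sketch of the argument above, with made-up run values: if 80%
# of sacrifice bunts cost runs and 20% gain them, the average looks bad
# even though an identifiable subset is genuinely valuable.
import random

random.seed(42)

def simulated_bunt() -> tuple[str, float]:
    """Draw one hypothetical bunt: most are bad, a minority are good."""
    if random.random() < 0.20:
        return "effective", random.gauss(0.15, 0.05)   # small run gain
    return "ineffective", random.gauss(-0.10, 0.05)    # small run loss

bunts = [simulated_bunt() for _ in range(10_000)]

overall = sum(value for _, value in bunts) / len(bunts)
effective = [value for kind, value in bunts if kind == "effective"]

print(f"average bunt value:     {overall:+.3f} runs")                         # around -0.05
print(f"effective subset value: {sum(effective) / len(effective):+.3f} runs")  # around +0.15
```

The averaging step is exactly where the fine print gets thrown away: valuing “the average sacrifice bunt” collapses two very different populations into one misleading number.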