This is a fantastic dissertation, too good for a mere MA thesis. While it is fascinating enough just as a source of historical information on an interesting topic, it is also an instructive illustration of the problems of using, and abusing, data.
Intelligence assessment and statistical analysis are, fundamentally, the same problem. Both extrapolate from the knowns (the data) to evaluate the unknown, guided by a certain set of assumptions. Neither can ever be entirely accurate, as the “knowns” do not match up neatly with the “unknowns,” but if we do enough homework and/or are sufficiently lucky, we can deduce enough about the relationship between them to make the pieces fit. In other words, we cannot rely on the data itself to just tell us what we want to know. What we really want to know, the really valuable pieces of information, will be somehow unavailable–otherwise, we wouldn’t need to engage in the analysis in the first place.
The thesis points to an all too common problem in data analysis: the good data pertains to something that we don’t really need to know, or worse, something potentially misleading, while the questions that we really do want answered did not generate enough high-quality data.
In the case of military aviation in Japan, the good data came from the 1920s, when the Japanese, being aware of the backwardness of their aviation technology, actively solicited Western support in developing their capabilities, both military and industrial. Since they were merely trying to catch up, their work was largely imitative. Knowing the limitations of their existing technological base, they were happier to copy what was already working in the West, even if it was a bit old, than to try to innovate on their own. Since they had, literally, nothing to hide, they were open with the Westerners about the state of their technology, practices, and industries. The Westerners, in a way, already knew a lot of what the Japanese were working with anyway, since most of it consisted of copies of Western wares. In other words, the data was plentiful and of extremely high quality. But it also conformed to the Western stereotype of the Japanese as not especially technologically advanced or innovative.
By the 1930s, things were changing: not only were the Japanese developing new aviation technologies of their own, but the relationship with the West had cooled decisively. They became increasingly secretive about what they were doing and, as such, good data about the state of Japanese military aviation became both scarce and unreliable. But, in light of the increased likelihood of an armed clash between Japan and the West, the state of Japanese military aviation in the 1930s (or 1940, even, given when the war eventually did break out) was the valuable information, not its state in the 1920s. The problem, of course, is that, due to the low quality of the data from the 1930s, nothing conclusive could be drawn from it. While there were certainly highly informative tidbits here and there, especially viewed in hindsight, there was also a lot of utterly nonsensical junk. Distinguishing between the two was impossible, since, by definition, we don’t know what the truth looks like. Indeed, in order to be taken seriously at all, intelligence reports on Japanese aviation had to be prefaced with an appeal to existing stereotypes, that the Japanese were not very technologically savvy–which was, of course, more than mere prejudice, as it had been very much true, borne out by the actual data from the 1920s. In other words, this misleading preface became, in John Steinbeck’s words, the pidgin and the queue–a ritual that had to be practiced to establish credibility, whether it was actually useful or not.
This is, of course, the problem that befell the analysis of the data from the 2016 presidential election. All the data suggested, as with the state of Japanese military aviation, that Trump had no chance. But most of the good data, figuratively speaking, came from the wrong decade, or involved a matchup that did not exist. In all fairness, Trump was as mysterious as the Japanese military aviation of the 1930s. There were so many different signs pointing in different directions that evaluating what they added up to, without cheating via hindsight, would have been impossible. While many recognized that the data was the wrong kind of data, the problem was that good data pertaining to the question at hand simply did not exist. The best that the analysts could do was to draw up the “prediction,” with the proviso that it was based on “wrong” data that should not be trusted–which, to their credit, some did. This approach requires introspection, a recognition of the fundamental problem of statistics/intelligence analysis–that we don’t know the right answer, that we are piecing together known information of varying quality and a set of assumptions to generate these “predictions,” and that sometimes we don’t have the right pieces. The emphasis on “prediction,” and on getting “right answers,” unfortunately, interferes with that perspective. If you hedge the bet and invest in a well-diversified portfolio, you may not lose much, but you will gain little. Betting everything on a single risky asset ensures that, should you win, you will win big. Betting everything on the single less risky asset, likewise, ensures that you will probably gain more than you would by hedging all around–and if everyone is in the same boat, surely, they can’t all be wrong? (Yes, this is a variant of the beauty contest problem, a la Keynes, and its close cousin, the Grossman-Stiglitz problem, with the price system.)
I am not sure that, if the benefit of hindsight could be removed, an accurate assessment of Japanese military aviation capabilities in 1941 would have been possible. The bigger problem is that, because of the systematic problems in data availability, the more rigorously data-intensive the analysis (at least in terms of the mechanics), the farther from the truth its conclusions would have been. A more honest analysis that did not care much about “predicting” would have pointed out that the “good” data is mostly useless and the useful data is mostly bad, so that a reliable conclusion cannot be reached–i.e. we can’t “predict” anything. But there were plenty of others who were willing to make far more confident predictions without due introspection (another memory from the 2016 election) and, before election day, or the beginning of the shooting war, it is the thoughtless and not the thoughtful who seem insightful–the thoughtless can at least give you actionable intelligence. What good does introspection do?
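To make the point about rigor concrete, here is a toy simulation; it is only a sketch and nothing in it comes from the thesis. The capability scores (0.0 for the well-observed era, 1.0 for the era we care about), the sample sizes, and the noise levels are all invented for illustration. It combines the two batches of data by inverse-variance weighting, a standard way to weight data by its quality, and compares that with using the scarce, noisy data alone.

```python
import math
import random
import statistics

# Hypothetical numbers, chosen only for illustration.
TRUE_THEN = 0.0   # capability score in the well-observed era (the "1920s")
TRUE_NOW = 1.0    # capability score in the era we care about (the "1930s")


def precision_weighted(samples):
    """Combine samples by inverse-variance weighting.

    Returns (estimate, standard_error). Each sample's weight is
    n / sample_variance, so plentiful and clean data dominates.
    """
    weights = [len(s) / statistics.variance(s) for s in samples]
    means = [statistics.fmean(s) for s in samples]
    total = sum(weights)
    estimate = sum(w * m for w, m in zip(weights, means)) / total
    return estimate, math.sqrt(1.0 / total)


def simulate(seed=0):
    rng = random.Random(seed)
    # Plentiful, high-quality data about the wrong era.
    old = [rng.gauss(TRUE_THEN, 0.1) for _ in range(1000)]
    # Scarce, noisy data about the right era.
    new = [rng.gauss(TRUE_NOW, 2.0) for _ in range(10)]
    pooled, pooled_se = precision_weighted([old, new])
    new_only, new_se = precision_weighted([new])
    return pooled, pooled_se, new_only, new_se


if __name__ == "__main__":
    pooled, pooled_se, new_only, new_se = simulate()
    print(f"pooled:   {pooled:.3f} +/- {pooled_se:.3f}")
    print(f"new only: {new_only:.3f} +/- {new_se:.3f}")
```

Under these assumptions, the pooled estimate sits almost exactly on the old value with a tiny standard error: it is confident and wrong. The estimate from the new data alone is centered on the right value but is so noisy that it supports no firm conclusion, which is the uncomfortable but honest answer.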
Indeed, in the absence of good information, all that you can do is to extrapolate from what you already “know,” and that is your existing prejudice, fortified by good data from the proverbial 1920s. This is a problem that all data folks should be cognizant of. Always think: what don’t we know, and what does that mean for the confidence we should attach to the “prediction” we are making?