This tweet by Nate Silver is worth pondering in light of our present fetish for data-everything, whether it is data journalism or data “science.”
In my view, the value added by proper statistics is not that data tells us how “right” our narrative, theory, or whatever else is, but that it allows us to quantify how wrong our theories are and under what circumstances. The point of good statistical practice is not that “big” data is necessarily better data, but simply that, with enough of it and appropriate methods, we get a better idea of how wrong our theories are. That is what lends them credence: we can say the theories are wrong only about 1 time in a billion, or whatever the figure turns out to be. In other words, in the more qualitative approach to data, we know we are more right than wrong, but we don’t know how much more with any precision. In the more quantitative approach, we know that we are more right than wrong by, say, 99 to 1, under conditions X, Y, and Z. This is the real power of statistics: all theories are wrong, but at least we know how wrong we are.
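To make "knowing how wrong we are" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recipe: the function name `error_rate_with_margin`, the 10% miss rate, and the sample sizes are all made up for the example, and the margin uses the textbook normal approximation for a proportion, which is only reasonable when the sample is not tiny and the rate is not too close to 0 or 1.

```python
import math


def error_rate_with_margin(misses, n, z=1.96):
    """Estimate how often a theory is wrong, plus a rough margin of error.

    Hypothetical helper for illustration only. Uses the normal (Wald)
    approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0 or not 0 <= misses <= n:
        raise ValueError("need n > 0 and 0 <= misses <= n")
    rate = misses / n
    standard_error = math.sqrt(rate * (1 - rate) / n)
    return rate, z * standard_error


# Made-up scenario: the theory misses in 10% of the cases we observe.
# Same observed error rate, very different certainty about it.
for n in (100, 10_000):
    misses = n // 10
    rate, margin = error_rate_with_margin(misses, n)
    print(f"n={n}: wrong about {rate:.1%} of the time, give or take {margin:.2%}")
```

With a hundred times as many observations the estimated error rate is unchanged, but the margin around it shrinks by a factor of ten. More data did not make the theory more right; it only told us more precisely how wrong it is.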
The insinuation by Silver is doubly insulting, of course, because he implies not only that he and his ilk are more right, but that the old-fashioned qualitative approach is not empirical. That’s first-rate BS. The old journalists did not write fiction. They got their facts. They fit them into a theory of how the universe operated, and they drew up a narrative based on these. And yes, they did not and could not know how wrong they were, because their methods did not allow for such estimation. Meanwhile, so-called data journalists and so-called data scientists rarely pay attention to how wrong their facts and figures are (i.e., the variances), except as a nuisance. This is an instance of pseudoscience calling witchcraft a superstition.