The Atlantic has a nifty article on a cottage industry that has sprung up of late: trying to find statistical patterns in where the votes for Trump (and other candidates) come from. The truth is that, for all the fetishistic interest surrounding “big data” and its associated analytics, there really isn’t a whole lot to loading up some package and running the data, whether in Python using scikit-learn or statsmodels, or in R. Many patterns will be found. Some meaningful. Others, not. The real challenge in data analysis is, as the article observes, “taking the insights seriously without taking them as gospel” and “making sure to evaluate the assumptions their creators made.”
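To make the "many patterns will be found" point concrete, here is a minimal sketch using only Python's standard library. It is not taken from the article: the function names, sample sizes, and seed are all made up for illustration, and the cutoff of 2/sqrt(n) is only a rough rule of thumb for the 5% significance level of a correlation. The "outcome" and every "predictor" are pure random noise, so any predictor that clears the cutoff is a pattern with nothing behind it.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_obs=50, n_predictors=200, seed=0):
    """Count noise predictors that look 'significantly' correlated with a noise outcome.

    All values are hypothetical: nothing here is real election data.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    # Rough rule of thumb (an assumption): |r| > 2 / sqrt(n) is about the 5% level.
    threshold = 2 / math.sqrt(n_obs)
    hits = 0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(predictor, outcome)) > threshold:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_spurious()
    print(f"{hits} of 200 pure-noise predictors look 'significant'")
```

With a 5% cutoff, something on the order of one in twenty noise predictors should clear the bar by chance alone. Running the analysis is the easy part; deciding which of the hits deserve to be believed is where the work is.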
Every statistical model is wrong, in some form or another. The real insight comes from learning why and how a model is wrong and incorporating those lessons into the next one. This is not just big data. This is how science works.