The Poverty of Observational Data

A recent National Journal article does a good job of probing the limitations of traditional survey research and the modeling derived from it:

Overall, 2012 brought more attention than ever to poll aggregators, with their methods becoming more sophisticated. But where do they go from here?

“I don’t think there are great advances in averaging or modeling horse-race polling data,” Blumenthal said. “We are ultimately reliant on the quality of the data that’s collected.”

Blumenthal is correct; there is little more to be done with the static snapshots of public opinion that traditional polling delivers. There is only so much information contained in any one poll, or even in a large collection of polls.

What ES does is add new information about how specific messages (information, ads, etc.) push and pull opinion. It produces data from a counterfactual universe: what if every citizen watched David Gregory fiddle with a high-capacity magazine while poking at Wayne LaPierre? Would it shift opinion on gun control? Would LaPierre’s response move people? What about the impact of seeing both together?

This tells us how dynamic or stable opinion on a subject is, and where it is likely to move as an issue is engaged more seriously in the public sphere or in the media. Surveys and even tracking polls cannot accomplish this. Instead of just an X-ray snapshot, we get something more akin to an MRI of political psychology: data that create an image of how the application of a “treatment” affects a “patient.”
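To make the “treatment” and “patient” idea concrete, here is a minimal sketch of how a randomized message experiment estimates a shift in opinion. Everything in it is invented for illustration: the baseline support level, the size of the shift, and the function names are assumptions, not figures or methods from any real study.

```python
import random
import statistics

def simulate_message_experiment(n=20000, true_shift=0.05, baseline=0.45, seed=0):
    """Randomly assign each respondent to see a message (treatment) or not (control).

    All numbers are hypothetical: `baseline` is the share supporting a policy
    without the message, `true_shift` is how far the message moves that share.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        sees_message = rng.random() < 0.5  # coin-flip assignment
        p_support = baseline + (true_shift if sees_message else 0.0)
        supports = 1 if rng.random() < p_support else 0
        (treated if sees_message else control).append(supports)
    return treated, control

def difference_in_means(treated, control):
    """Estimated effect of the message: mean(treated) - mean(control)."""
    return statistics.mean(treated) - statistics.mean(control)

treated, control = simulate_message_experiment()
estimate = difference_in_means(treated, control)
print(f"estimated shift in support: {estimate:+.3f}")
```

Because a coin flip decides who sees the message, the two groups differ only in the message itself, so the gap between them can be read as the message's effect. A poll of the same people, taken once, would report a single level of support and nothing about how it moves.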

What made the Obama campaign so accurate in its predictions of the vote across contested states was its use of experimental results from the “lab” and the “field” in its voter modeling. Because the campaign had a large amount of experimental data showing how different kinds of people shifted in response to various messages (toward or away from Obama, greater or lesser likelihood of voting), it could predict the aggregate results with astonishing accuracy.

Traditional polling data alone can never accomplish this, as the Romney campaign’s shock at the outcome attests. That shock came despite the campaign’s use of “vector autoregression models” to decipher the data and make its predictions. No matter what shiny statistical talisman one uses, none can overcome the inherent limitations of observational data. (More can be done with quasi-experimental designs such as regression discontinuity, but these are more helpful in public policy than in political psychology.)
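One such limitation is self-selection, and a toy simulation shows why no amount of modeling on the observed gap fixes it. The sketch below is an assumption-laden illustration, not a reconstruction of any campaign's data: the exposure rates, support rates, and the `naive_gap` helper are all hypothetical.

```python
import random
import statistics

def naive_gap(self_selected, n=20000, true_shift=0.05, seed=1):
    """Support among people who saw a message minus support among those who did not.

    Hypothetical setup: half the population already leans toward support.
    If `self_selected` is True, leaners seek the message out (observational
    data); if False, exposure is a coin flip (an experiment).
    """
    rng = random.Random(seed)
    saw, did_not = [], []
    for _ in range(n):
        leans_support = rng.random() < 0.5  # hidden prior leaning
        if self_selected:
            p_exposed = 0.8 if leans_support else 0.2
        else:
            p_exposed = 0.5
        exposed = rng.random() < p_exposed
        p_support = (0.7 if leans_support else 0.2) + (true_shift if exposed else 0.0)
        supports = 1 if rng.random() < p_support else 0
        (saw if exposed else did_not).append(supports)
    return statistics.mean(saw) - statistics.mean(did_not)

observational_gap = naive_gap(self_selected=True)
experimental_gap = naive_gap(self_selected=False)
print(f"observational gap: {observational_gap:+.3f}")
print(f"experimental gap:  {experimental_gap:+.3f}")
```

In this made-up world the message truly moves support by 5 points. The randomized comparison lands near that figure, while the observational comparison comes out several times larger, because it mixes the message's effect with the prior leanings of the people who chose to watch it.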

So while there are likely no great advances to be made in traditional survey-based modeling, many exciting advances have already been made, and more are yet to come, using experimental data and creative research designs.

There’s a whole new world out there . . .