Michael LaCour, a UCLA graduate student in political science, has been accused of faking results for a paper that he published in the journal Science. Two days ago, Science retracted the article.
Four days ago, in my Ricochet post, I explained why I believe LaCour has faked the results of a second paper. Much of my explanation was speculation. For instance, I described how the confidence intervals for his estimates of media slants didn’t follow the pattern that one would have expected. I said in that post that I would have given 10:1 odds that the results were fake; that is, I was about 90% sure that this second paper was also an instance of fraud.
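For readers who want the arithmetic spelled out, here is a quick sketch of the odds-to-probability conversion (the only input is the 10:1 figure quoted above):

```python
# Converting betting odds to a probability: odds of a:b in favor of an
# event correspond to a probability of a / (a + b).
a, b = 10, 1              # 10:1 odds that the results were fake
p = a / (a + b)
print(f"{p:.3f}")         # 0.909 -- i.e., roughly 90% sure
```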
A few days ago, Emory political scientist Gregory Martin posted a note examining the same paper by LaCour. Approximately a year ago, Martin and his coauthor, Ali Yurukoglu, had asked LaCour for the computer code that he wrote for the paper. LaCour sent them the code, but Martin was unable to replicate the results that LaCour reported. Perhaps most shocking, Martin noticed the following problem: in his paper, LaCour wrote that all of his media data came from the UCLA Closed Caption (LACC) archive of news-show transcripts. The paper reported estimates for 60 news shows. Martin noticed, however, that 14 of those shows do not actually appear in the LACC data set.
Although one could imagine innocent explanations for the problems Martin found, the evidence was stunning. After seeing Martin’s results, I became about 99% sure that LaCour faked results in the second paper.
Late yesterday, Martin added two paragraphs to his report explaining additional evidence of fraud by LaCour. The evidence in these new paragraphs makes me approximately 99.99% sure that LaCour faked his results.
I’ll describe the new evidence in a moment. But first, here’s some background. A few years ago, while I was a professor in the UCLA political science department, I had a hallway conversation with my colleague Jeff Lewis. Lewis noted a potential problem with the media-bias paper I had written with Jeff Milyo. Our method assumed that observations are independent (a standard assumption with statistical models), but Lewis gave some reasons to doubt that the assumption holds in our data. He described a way to alter our method to correct for the potentially false assumption.
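To see why the independence assumption matters, here is a small simulation. It is emphatically not Lewis’s correction, which I won’t try to reconstruct; it only shows the underlying problem: when observations are positively correlated, a standard error computed as if they were independent overstates the precision of an estimate. The AR(1) process and every number in it are hypothetical choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho, reps = 1000, 0.9, 2000   # hypothetical sample size and autocorrelation

def ar1_series(rows):
    # Each row is one AR(1) series with autocorrelation rho and unit variance.
    x = np.empty((rows, n))
    x[:, 0] = rng.normal(size=rows)
    shock = rng.normal(size=(rows, n)) * np.sqrt(1 - rho**2)
    for t in range(1, n):
        x[:, t] = rho * x[:, t - 1] + shock[:, t]
    return x

x = ar1_series(1)[0]
naive_se = x.std(ddof=1) / np.sqrt(n)          # SE if the n draws were independent
true_se = ar1_series(reps).mean(axis=1).std()  # actual spread of the sample mean

print(f"naive i.i.d. standard error: {naive_se:.4f}")
print(f"actual standard error:       {true_se:.4f}")   # several times larger
```

Under independence the two numbers would roughly agree; with correlated data the naive formula can be off by several-fold, which is the kind of problem an alteration like the one Lewis described is meant to address.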
I also remember Lewis telling me that he had described his alteration to a PhD class that he was teaching, or possibly to a single student. I think I remember him also telling me that he incorporated his model (i.e., the altered version of the model Milyo and I built) into a homework problem for his class.
I’m now almost certain that the statistical model that LaCour describes on the last page of his appendix is the exact model that Lewis constructed.
Lewis constructed his model to be run on the data that Milyo and I gathered. Those data involve citations to think tanks made by members of Congress and media outlets. In contrast, LaCour’s paper uses data on “loaded political phrases” used by members of Congress and by the media outlets in the LACC archive.
Here now is some of the new evidence that Martin reports. First, the data file that LaCour’s code references is “/Users/michaellacour/Dropbox/MediaBias/Groseclose/counts.csv”. Curiously, the file path contains my name.
Approximately two or three years ago, I gave LaCour the data from the project Milyo and I conducted. LaCour’s method appears to use those data. But our data are based on think-tank citations, whereas LaCour’s purported results are based on loaded political phrases. If LaCour really executed the method that he says he executed, it would make no sense to use our data.
Second, LaCour writes in his appendix that his method assumes that the slant of each of the news shows he analyzes has a “prior distribution” with mean equal to 50.06. Meanwhile, in our media-bias paper, Milyo and I report that our estimate of the average voter’s ideology (on the “adjusted ADA” scale) is 50.06. LaCour must have gotten that number from our paper. However, it makes no sense for LaCour to use that number: the slants that he reports are on the “DW-Nominate” scale, which runs from -1 to 1, and no distribution confined to that range can have a mean of 50.06.
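The impossibility is trivial to check mechanically; here is a minimal sketch (the bounds are the published DW-Nominate range, and the variable names are mine):

```python
lo, hi = -1.0, 1.0    # the DW-Nominate slant scale runs from -1 to 1
prior_mean = 50.06    # the prior mean LaCour reports in his appendix

# The mean of any distribution supported on [lo, hi] must itself lie in
# [lo, hi], so a prior mean of 50.06 is impossible on this scale.
print(lo <= prior_mean <= hi)   # False
```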
Third, as Martin reports, LaCour’s code instructs his computer to compute slants for 20 news outlets. This happens to be the number of outlets that Milyo and I examined. In contrast, LaCour reports slant estimates for 60 outlets.
In sum, the code that LaCour claimed to use – and sent to Martin – does not make sense for the results he publishes in his paper. In contrast, the code makes perfect sense for the method that Jeff Lewis described to me that day in the hallway.
I strongly believe that something like the following happened: LaCour completely faked the media-slant results that he reports in the second paper and did not really write any computer code to estimate those results. When Martin and Yurukoglu asked him for his code, he sent the best substitute he could think of—the code that Jeff Lewis wrote for a different problem.