Joe Lenski is Executive Vice-President and lead researcher for Edison Research, the exit polling firm that provides all the data for major networks and outlets on election day in the United States. After weeks of trying and dozens of calls and emails, Mr. Lenski agreed to answer questions by email. Here are the questions and his answers, excepting number 10, without editing. I will be referring to this interview regularly in Part 3 and following of my series on Election Fraud Allegations.
At 5pm eastern, before polls close, Edison starts releasing general information to the news outlets that make up the National Election Pool (NEP) about demographic percentages (e.g. 18-29-year-olds made up 16% of the Democratic electorate in Wisconsin) and answers to more generalized questions (e.g. Is Ted Cruz honest and trustworthy?). Then at official poll closing time, even if people are still in line to vote, Edison releases the first full wave of exit poll numbers. That first full wave generally includes well over 90% of the final sample size. In New York, for instance, the sample size released at 9pm was 1367, or more than 98% of the final sample size of 1391. Is this generally correct or am I off base on anything?
Correct. Edison Research exit poll interviewers call in exit poll results three times during election day – once in the late morning, once in the mid-afternoon, and once shortly before the polls close in a state. The exit poll data that is released at 5PM to the news organizations comprising the National Election Pool (NEP members are ABC, CBS, CNN, FOX, NBC and the Associated Press) and any other news organizations subscribing to the exit poll include about two-thirds of the interviews that will be conducted on an election day. The exit poll results that are released around poll closing include nearly all of the voter interviews that are conducted during election day.
After the first full wave of polling, a second and even third wave of polling begins using results released by the particular state to help Edison get the demographic sampling and vote percentages correct. In New York, for instance, the first wave of full exit polling said Clinton won with Latino voters by eighteen percent, around 7 pm it had been 16%, but the final poll has Clinton winning the Latino vote by 28% even though the sample size grew by just 24 respondents. Is this generally correct, or am I off base on anything, and if so, could you explain what factors allow you to do this mathematically?
During election day before the poll close the exit poll data is weighted to adjust for respondent non-response by demographic groups. Our interviewers record the gender, approximate age and race of voters who decline to participate in the exit poll and the survey results are adjusted for the response rate by demographic groups so that the weighted results represent all voters. Typically younger voters are more likely to agree to participate in an exit poll than older voters so the percentage of older voters is typically adjusted upward to account for this non-response. The overall response rate in most election exit polls is between 40 and 50 percent of voters who are approached. In addition after the polls close in a state we compare our exit poll results in each precinct with the actual precinct returns and also with the vote totals for each county in the state. This allows us to adjust the exit poll results so that they match the distribution of votes actually cast by geographic region in each state.
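To make the non-response adjustment described above concrete, here is a minimal sketch in Python. It is not Edison's procedure or code: the function names, the two age groups, and every count are hypothetical, chosen only to show how weighting by response rate can move a topline.

```python
# Illustrative only: group labels, counts, and function names are hypothetical,
# not Edison's data or code.

def nonresponse_weights(completed, refused):
    """Weight each group by approached / completed, so respondents
    stand in for the refusers the interviewer tallied in their group."""
    weights = {}
    for group, n_completed in completed.items():
        approached = n_completed + refused.get(group, 0)
        weights[group] = approached / n_completed
    return weights


def weighted_shares(responses, weights):
    """responses maps (group, candidate) -> number of questionnaires.
    Returns candidate -> share of the weighted total."""
    totals = {}
    for (group, candidate), count in responses.items():
        totals[candidate] = totals.get(candidate, 0.0) + count * weights[group]
    grand_total = sum(totals.values())
    return {candidate: total / grand_total for candidate, total in totals.items()}


# Hypothetical precinct: 100 younger and 100 older voters approached.
completed = {"younger": 60, "older": 40}
refused = {"younger": 40, "older": 60}
responses = {
    ("younger", "A"): 42, ("younger", "B"): 18,
    ("older", "A"): 12, ("older", "B"): 28,
}

raw_share_a = (42 + 12) / 100  # unweighted share for candidate A
weights = nonresponse_weights(completed, refused)
adjusted = weighted_shares(responses, weights)

print(f"raw A share: {raw_share_a:.1%}")
print(f"weighted A share: {adjusted['A']:.1%}")
```

In this made-up precinct, candidate A leads 54% to 46% among the questionnaires actually filled out, but the race is tied at 50% once older voters, who refused more often, are weighted up. The later adjustment to actual precinct and county returns is a separate step that this sketch does not attempt.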
So far for 2016, Edison has entrance polled caucuses in two states (Iowa and Nevada) and exit polled primaries in twenty-three states. Is this correct?
Including the Indiana primary on May 3rd Edison has conducted entrance polls in Iowa and Nevada this year as well as 24 state primaries. We will also be conducting exit polls in West Virginia and Nebraska on May 10th.
[Note: I had sent these questions by email initially to Lenski before the Indiana primary and did not update the wording for this question before I sent them again after Indiana.]
On the Democratic side, I have numbers for the first full wave of polling for twenty-two of the twenty-three primaries. In nineteen of those twenty-two primaries, Hillary Clinton wound up doing better, often much much better, in the final results versus what the first full wave of exit polling projected. Does this sound right?
Exit poll estimates are constantly changing during the day and evening. I would not be able to answer exactly without knowing what time the estimates you are looking at were calculated. There are time of day effects on the exit poll. Older voters tend to be more likely to vote in the middle of the day. Working voters tend to either vote early before work or later after work.
Harry Enten of FiveThirtyEight wrote a piece for the Guardian in 2012 which addressed what seemed to many of FiveThirtyEight’s readers like surprising differences between exit polling and final results. Enten was working with numbers from 2004’s general election and a state election in Wisconsin in 2012 with differences between exit polling and final results of around 5 percentage points, seven points at the top most. That would be just at the edge of the Margin of Error, slightly over in the case of Wisconsin; Enten explained why this is possible and not indicative of fraud. In fact, however, 2016 Democratic contests have seen Edison’s first full wave miss outside the margin of error in nine of the twenty-three primaries I have numbers for, all of them in Clinton’s favor and all by between eight and fourteen points. Are these calculations correct, and if so, can you comment on why that is possible?
The calculation of margin of error for an exit poll is more complicated than a simple calculation of margin of error based upon the sample size. Exit polls have two stages of sampling – first stage is the selection of a sample of poll locations typically between 15 and 50 locations in a state; second stage is the random selection of voters within each polling location. This two stage sampling procedure introduces a Design Effect (sometimes referred to in the literature as DEFF) that increases the overall sampling error. Also there are many other contributions to total error in any survey. As I mentioned before approximately 40 to 50 percent of respondents participate in the exit poll. If the group of voters who refuse to respond to an exit poll differ from those who do participate in the exit poll that would introduce a source of error that is impossible to calculate based upon the information that we have before the polls close. From our interviewer observations we know that older voters are more likely to refuse to participate in an exit poll. There is also evidence that higher educated voters are more likely to participate in an exit poll. Voters may have other reasons for declining to participate in exit polls – they may not have brought their reading glasses, they may not like any of the news organizations who sponsor the exit poll and whose logos appear on the questionnaire, etc. All of these factors contribute to a total error for the exit poll survey that is much larger than a standard calculation of sampling error based upon the total number of interviews.
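The effect of a design effect on sampling error can be sketched with the textbook formula, in which the variance of a simple random sample is multiplied by DEFF. This is a rough illustration, not Edison's calculation: the function name is hypothetical, and the DEFF value of 2.0 is an assumption picked for the example, since no figure was supplied in this interview.

```python
import math


def margin_of_error(n, p=0.5, deff=1.0, z=1.96):
    """Half-width of an approximate 95% interval for one candidate's share.
    deff=1.0 is simple random sampling; deff > 1 inflates the variance."""
    return z * math.sqrt(deff * p * (1.0 - p) / n)


n_new_york = 1391   # final New York sample size quoted earlier in this interview
assumed_deff = 2.0  # hypothetical value, NOT a figure supplied by Edison

simple_moe = margin_of_error(n_new_york)
clustered_moe = margin_of_error(n_new_york, deff=assumed_deff)

print(f"simple random sample: +/- {simple_moe:.1%}")
print(f"with deff={assumed_deff}: +/- {clustered_moe:.1%}")
```

Under these assumptions the interval on a single candidate's share widens from about 2.6 points to about 3.7 points. In a strict two-candidate race the gap between the candidates is 2p - 1, so the margin of error on the gap is twice the figure for p. None of this captures the non-response error described above, which Lenski says cannot be calculated from the information available before the polls close.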
I now want to get your comment on some explanations that people, including you, have floated for the exit poll discrepancy level. One suggests fraud or other forms of officially miscounted ballots, several do not.
The substantive explanation you’ve been giving for why these exit polls cannot indicate fraud while exit polls in what you call “emerging democracies” might is that the longer form of your questionnaire versus a simple “who did you vote for?” may lead to over and undersampling of various demographics. Bernie Sanders’ best demographic, however, is 18-29-year-olds. Given the general perception of 18-29-year-old attention spans, wouldn’t this explanation actually suggest that you might oversample older Clinton supporters rather than under-sampling them?
I don’t see how your assumption about the attention spans of 18 to 29 year olds has any influence here. We know from our interviewer observations that younger voters are more likely to fill out an exit poll questionnaire than older voters.
Another theory, one that might explain some level of discrepancy is that Bernie Sanders’ supporters, sometimes pejoratively called Bernie Bros, are so enthusiastic that they just can’t wait to tell everyone, including Edison pollsters, about their hero. Do Edison polling practices account for this possibility or could this explain why Edison is consistently oversampling Sanders voters?
The “enthusiasm” of a candidate’s voters may indeed have influence on who chooses to fill out an exit poll questionnaire and who chooses not to. We do have some evidence from questions that we have asked during this primary season that Sanders voters are more excited about their candidate than Clinton voters. It would make sense to hypothesize then that Sanders voters would be more likely to choose to fill out an exit poll questionnaire than Clinton voters. However, I have no hard evidence to prove or disprove that hypothesis.
Nate Cohn of the New York Times’ Upshot suggests that several cycles worth of data proves that exit polls overestimate young people’s turnout and that that, combined with early voting, is skewing results toward Sanders in Exit Polls. Could you comment on those factors?
There is a pattern that the exit polls show more younger voters than surveys of voters using other survey methods especially telephone surveys of voters. It may be that even our adjustments of age demographics based upon our observations of non-response by age do not completely correct for this effect. It may be that telephone surveys of voters are more likely to contact older people. My guess is that the correct answer is somewhere in between but I have no hard evidence for that.
Finally, in terms of non-fraudulent explanations, in New York City, where your first wave of exit polling missed the final spread by sixteen percentage points, there were over 121,000 affidavit or provisional ballots cast. This equals 12% of all New York City ballots. Could this account for all of the problems in New York and does Edison ask people whether they voted provisionally or not?
The exit poll did not ask voters if they had voted by provisional ballot. I will not know how much effect that had on our estimates until I see the certified vote returns from the New York Board of Elections.
[Question 10 and its answer are being withheld until the final article for this series.]
It seems like a tougher field on the GOP side with so many candidates, including not one but two anti-establishment candidacies. In Georgia, for instance, where the first wave missed on the Democratic side by 12.2%, it nailed the GOP race with deadly accuracy: 40% Trump (versus a 38.8% finish), 24% Cruz (versus a 24.4% finish) and 23% Rubio (versus a 23.6% finish). It looks like you’ve missed the margin of error just once for Republicans. In Texas you had an ~10.6% error on the gap between Cruz and Trump. This makes sense. We often hear the margin of error is +/-x.x 19 times out of twenty. In this case, Edison has gotten it right on the GOP side within the margin of error 20 times out of 21 (for the figures I can find). On the Dem side, you’ve gotten it right within the margin of error just 13 times out of 22. Is this information correct and if so, why has Edison polling been so much more accurate on the Republican side this cycle?
As I mentioned above the calculation of total error for an exit poll survey differs from the standard sampling margin of error calculation that I assume that you are using so I wouldn’t agree with your statement about how many of the exit poll surveys were within the margin of error. However, if a differential non-response among younger voters is a cause for exit poll errors it would make sense that the errors would be larger on the Democratic side because the differences in vote between younger and older voters on the Democratic side in this primary season are much larger than on the Republican side. Bernie Sanders has been typically receiving 70+% of the vote among 17-29 year olds in the 2016 primaries while Hillary Clinton has been receiving 70+% of the vote among voters 65+. On the Republican side the Trump percentage among younger and older voters tends to only differ by ten points or less. It would then make sense that if the exit poll were overstating the number of younger voters it would have much more effect on the Democratic side.
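The arithmetic behind that last point can be sketched with a simplified two-group model (younger versus older voters). The model, the function name, and the 5-point overstatement are assumptions made for illustration; only the rough 70/30 split for Sanders and the ten-point gap for Trump come from the answer above.

```python
def share_error(young_overstatement, support_young, support_older):
    """Error in a candidate's overall share when the poll's share of younger
    voters is too high by young_overstatement (as a fraction of all voters).
    Two-group model: overall = s * support_young + (1 - s) * support_older,
    so replacing s with s + d shifts the overall share by d * (young - older)."""
    return young_overstatement * (support_young - support_older)


overstatement = 0.05  # hypothetical: poll shows 5 points too many younger voters

# Democratic side, using the rough splits from the answer above:
# Sanders at about 70% among the youngest voters, about 30% among the oldest.
sanders_error = share_error(overstatement, 0.70, 0.30)

# Republican side: Trump's support differs by ten points or less across ages.
# The 0.40 and 0.30 levels and their direction are hypothetical; only the
# size of the gap matters for the comparison.
trump_error = share_error(overstatement, 0.40, 0.30)

print(f"Sanders share overstated by {sanders_error:.1%}")
print(f"Trump share shifted by {trump_error:.1%}")
```

With these inputs the same 5-point age skew overstates Sanders's share by 2 points (4 points on the Sanders-Clinton gap in a two-way race) but moves Trump's share by only half a point. Real electorates have more than two age groups, so this shows the direction and rough scale of the effect rather than an estimate of it.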
Three final questions:
Given the level of scrutiny after New York, has Edison or will Edison be changing any of its polling practices for the final states you will be exit polling?
After each election Edison Research analyzes the exit poll results. We are constantly evaluating all of the exit poll procedures to make sure that we have the most accurate exit poll survey results possible.
Would you be willing to share the data with me that was released to NEP for the first wave of polling from 2016 Democratic and Republican contests. If not, could I at least have the topline from the first full wave for gender for the New Hampshire Democratic contest (that’s the one I’m missing)?
All exit poll data is archived at the Roper Center for Public Opinion Research. Any member of the Roper Center can analyze archived exit poll data.
I very much appreciate your willingness to give me some of your time and to go on the record for these issues. Do you have anything else that you would like to add that hasn’t been covered here?