On the ‘Name Recognition’ Shibboleth in the Democratic Primary Data Discussion

Now that the 2016 Democratic Primary cycle is (mostly) over, if it ever will be entirely, it has become fashionable once again to put forward name recognition as a reasonably large factor in what distinguishes one or more candidates from the others in early polling, both of the field at large and among specific demographic sub-groupings. As The Intercept’s Ryan Grim has noted in an article on Bernie Sanders having double the support of Kamala Harris among African American likely Democratic Primary voters, the name recognition argument was “casually dismissed when made by Sanders supporters” in 2016.

“Casually” puts it somewhat mildly. “Rudely and perfunctorily” would do better. Meanwhile, the same crowd is enthusiastically chalking up almost all of Joe Biden’s and Sanders’ poll strength over their preferred candidates to the name recognition shibboleth.

Put most simply, the ‘name recognition’ argument suggests that even large gaps in polling support might best be explained by how well candidates are known by voters at this stage in the race rather than by the likelihood that those differences may hold when voting begins a year or so from now.

The graph at the top of this article, as well as the one that follows this paragraph, indicates that while name recognition might explain as much as 50% to 70% of the variance in candidates’ support at this stage, it is far from a slam dunk that this is the only factor at play, or even the most dominant one. Using a wide variety of potential candidates, including ones very unlikely to run (Oprah Winfrey, Michael Avenatti, Michelle Obama, and Hillary Clinton) and ones who have recently announced they will not be running (Michael Bloomberg and Sherrod Brown), I have plotted name recognition (y-axis) against the best support the person has received in a 2018 or 2019 poll (x-axis). Name recognition, modelled in these ways, may have some explanatory power, but the result is far from what data-minded people would hope for: a tight cluster of points along a straight or evenly curved line running from bottom left to upper right in the graph. For those who, like me, are not formally trained in statistics, the R² number in the bottom left measures how much of the variance in the data is explained by the fitted regression, provided the entries are accurate. If the R² were at or near zero, name recognition would be said to explain none of the variance; the closer the number gets to 1, the closer the fit comes to a perfect explanation of the variation. Narrowing our data down to currently announced or reasonably likely potential candidates, using a curved power trendline rather than a linear one, and using a three-week average of polling data rather than the best poll, we can move the R² from about 0.5 (roughly 50% of the variance explained) to around 0.7 (roughly 70%).
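For readers who want to see how an R² figure of this kind is arrived at, here is a minimal sketch in plain Python. It is not the spreadsheet behind the charts: the function names are my own, the sample points are made up purely for illustration, and I am assuming that a "power trendline" is fitted the way spreadsheet tools commonly do it, as a straight line through the logarithms of both variables.

```python
import math


def r_squared_linear(xs, ys):
    """R^2 of an ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variance left unexplained by the line vs. total variance in y.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def r_squared_power(xs, ys):
    """R^2 of a power trendline y = c * x**k.

    Assumption: the curve is fitted as a line in log-log space, so the
    R^2 reported is that of log(y) against log(x). All values must be > 0.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    return r_squared_linear(log_x, log_y)


# Made-up illustrative points, NOT the article's data:
# poll support (x, percent) and name recognition (y, percent).
support = [1.0, 2.0, 5.0, 8.0, 21.0, 30.0]
recognition = [30.0, 45.0, 61.0, 90.0, 75.0, 95.0]

print("linear R^2:", round(r_squared_linear(support, recognition), 3))
print("power  R^2:", round(r_squared_power(support, recognition), 3))
```

Swapping in the real name recognition and polling figures for each candidate would show how the two kinds of trendline compare on the same data.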

In both cases, candidates or potential candidates below the red dotted line are doing better than the model would expect if ‘name recognition’ were a perfect fit for how candidates are performing right now, or performed in their best poll. Candidates above the red dotted line are performing worse than expected given their name recognition: a bit worse if they sit close to the line, far worse the further above it they are. In the top chart, Michael Bloomberg, whose name recognition according to Gallup polling is near 90% but whose best poll was around 8%, is well above the line and a good example of why being well known is not enough to guarantee good polling.

By the same token, this analysis has given me a reason to reconsider my skepticism about Beto O’Rourke’s potential to do quite well. The highest measurement of name recognition I can find for him is 61% in the most recent Morning Consult data to measure his favorability. But his best poll was a remarkable 21% as measured by Change Research just before Christmas. While he has now fallen to 5% or so in the three-week average, that still puts him higher than would be expected given how well-known he is by voters at this stage. If, as expected, he officially joins the race later this week and has a good kick-off bounce, he could well rejoin the small cadre of candidates regularly polling in the double digits.

But there are clearly other factors beyond name recognition at play: 1) proximity to Barack Obama (see Michelle Obama’s and Joe Biden’s high support, as well as the impact of favorable comments by Obama about Harris and Beto); 2) real or perceived ability to beat Trump, much of which can be measured by polling (“Bernie would have won,” Joe Biden’s favorability ratings and consistent double-digit leads against Trump, and polls showing Democrats most want a candidate who can beat Trump, for example); 3) proximity to the movement Left led by Bernie Sanders, including the ability to attract small donors rather than relying solely on large, corporate contributions; and 4) which candidate is being hyped by CNN and FiveThirtyEight as the flavor of the month. Beto fit the bill for the last of these in December and saw a huge bump in support accordingly. Elizabeth Warren was briefly the “it-candidate” in the first two weeks of January after announcing early and leading in the first DailyKos straw poll. That place was then taken up by Harris from mid-January to mid-February, but her bubble appears to have popped a bit over the last several weeks, with a real or relative decline in each of the last eight state or national polls since Sanders announced his candidacy on February 19.

In keeping with my analysis of the data, I have added a name recognition adjustment to my weekly updated candidate rankings, to be found in the Twitter thread here. By January 2016, Bernie Sanders had reached around 85% name recognition (about the level at which Elizabeth Warren is now). For candidates in the top eight spots, the rankings will generously assume that they can perform at least that well, and their support has been adjusted upward on a linear basis to an 85% level. Candidates in the 9th and 10th spots are assumed to be able to reach at least 75% if they can make the debates and run a decent campaign, and candidates ranked 11th or lower are adjusted up to 65% name recognition if they have not yet reached that level.
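The adjustment described above can be sketched in a few lines of Python. This is one plausible reading rather than the exact spreadsheet formula: I am assuming that "linear" means multiplying a candidate’s support by the ratio of the target name recognition to current name recognition, and that candidates already at or above their tier’s target are left alone. The function names are hypothetical.

```python
def target_recognition(rank):
    """Target name recognition (percent) for a candidate's ranking tier."""
    if rank <= 8:
        return 85.0   # top eight spots
    if rank <= 10:
        return 75.0   # 9th and 10th spots
    return 65.0       # 11th and lower


def adjusted_support(support, recognition, rank):
    """Scale poll support up to the tier's target name recognition.

    Assumption: the adjustment is support * target / recognition.
    Candidates already at or above the target are returned unchanged.
    """
    if recognition <= 0:
        raise ValueError("recognition must be positive")
    target = target_recognition(rank)
    if recognition >= target:
        return support
    return support * target / recognition


# Example using figures quoted in this column: roughly 5% support in the
# three-week average and 61% name recognition, placed in the top-eight tier.
print(round(adjusted_support(5.0, 61.0, 4), 2))
```

On this reading, a candidate polling at 5% with 61% name recognition in the top tier is credited with a little under 7%, which is the sort of shift that can reorder closely bunched candidates.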

As for the weekly rankings, Biden continues to lead, but that lead has shrunk a fair bit as Sanders jumped 5% on improved national polling. Adding the name recognition adjustment for candidates vaulted O’Rourke into fourth spot, displacing Warren to fifth. I have also added Stacey Abrams (impressively already in sixth spot), Marianne Williamson, and Andrew Yang, while Bloomberg and Brown have been removed.

Doug Johnson Hatlem writes on polling, elections data, and politics. For questions, comments, or to inquire about syndicating this weekly column for the 2020 cycle in your outlet, he can be contacted on Twitter @djjohnso (DMs open) or at djjohnso@yahoo.com (subject line #10at10 Election Column).