How to spot a wonky poll

Polls can be wrong, but I hesitate to call a poll from a reputable source FAKE.

OUTLIER MAYBE, but calling a poll from a reputable source fake is too Republican for me…

Some of the expectations about the 2022 elections based on polling:

Expectations that Republicans would score sweeping victories no doubt were buoyed by the predictions of RealClearPolitics. It projected that the GOP stood to gain three Senate seats and control the upper house by 53 seats to 47 — an outcome that proved illusory.

While hedged, the final, so-called “Deluxe” forecast posted at Silver’s FiveThirtyEight and updated on Election Day did little to dampen expectations of a GOP wave. The forecast said Republicans had a 59% chance of winning control of the Senate.

“To be blunt,” Silver wrote, “59 percent is enough of an edge that if you offered to let me bet on Republicans at even money, I’d take it.”

We all know how 2022 turned out: NO RED WAVE, as so many had predicted.

The Conversation

As an election nears, many partisans start heading for the nearest window ledge every time they see a poll bearing unwelcome news. Sometimes they have good reason to be concerned, but often the poll is an outlier, one not particularly representative of other polling conducted contemporaneously. There is an old saying that if something looks too good to be true, it probably is. In polling, a good rule of thumb is that if the results of a poll look very different from others, it is more likely than not an outlier.

Cook Political Report

It’s pretty rare that a pollster calls his own survey an “outlier.” But that’s exactly what happened last week after a Monmouth University poll showed an approximate three-way tie between Bernie Sanders, Elizabeth Warren and Joe Biden. Patrick Murray, director of the Monmouth University Polling Institute — an A-plus-rated pollster according to FiveThirtyEight — issued a statement describing his latest Democratic primary poll as an outlier that diverged from other recent polls of the race. (Indeed, there were quite a few national polls last week, and most of them continue to show Biden in front, with about 30 percent of the vote, and Sanders and Warren in the mid-to-high teens.)

But Murray doesn’t have any real reason to apologize. Outliers are a part of the business. In theory, 1 in 20 polls should fall outside the margin of error as a result of chance alone. One out of 20 might not sound like a lot, but by the time we get to the stretch run of the Democratic primary campaign in January, we’ll be getting literally dozens of new state and national polls every week. Inevitably, some of them are going to be outliers. Not to mention that the margin of error, which traditionally describes sampling error — what you get from surveying only a subset of voters rather than the whole population — is only one of several major sources of error in polls.
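The sampling error mentioned above is easy to quantify. As a hedged sketch (the 30% candidate share and 1,000-person sample are illustrative assumptions, not figures from any poll in this piece), the standard 95% margin of error for a surveyed proportion works out like this:

```python
import math

def margin_of_error(p, n, z=1.96):
    # 95% margin of error for a proportion p estimated
    # from a simple random sample of n respondents
    return z * math.sqrt(p * (1 - p) / n)

# A candidate polling at 30% in a hypothetical 1,000-person survey:
moe = margin_of_error(0.30, 1000)
print(f"+/- {moe * 100:.1f} points")  # about +/- 2.8 points
```

So a reading anywhere from roughly 27% to 33% would be consistent with the same underlying support — before even accounting for the other sources of error the passage mentions.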


Why outlier polls are a good thing

The big reason why pollsters should publish outlier polls is that it proves they’re running an honest shop. Every poll has a margin of error, but remember that the margin of error (as traditionally reported) covers only 95% of all results. There is going to be that one out of 20 times when a result falls outside the margin of error.
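That "one out of 20" is not a rule of thumb but a direct consequence of how the margin of error is defined, and it's easy to check by simulation. The sketch below (true support, sample size, and trial count are all assumed for illustration) draws many hypothetical polls and counts how often the observed share lands outside the 95% margin of error:

```python
import random

def share_outside_moe(true_share=0.30, n=1000, trials=5000, seed=42):
    # Simulate `trials` polls of n voters each, where each voter supports
    # the candidate with probability `true_share`, and return the fraction
    # of polls whose result falls outside the 95% margin of error.
    rng = random.Random(seed)
    moe = 1.96 * (true_share * (1 - true_share) / n) ** 0.5
    misses = 0
    for _ in range(trials):
        supporters = sum(rng.random() < true_share for _ in range(n))
        if abs(supporters / n - true_share) > moe:
            misses += 1
    return misses / trials

print(share_outside_moe())  # roughly 0.05 — about 1 poll in 20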

With so many polls being conducted of the Democratic race nationally, we should see a decent number of polls that fall outside the margin of error of the average. Monmouth was one of those cases. There should be many more “outliers” to follow.

In statistics, an outlier is a data point that differs significantly from other observations.[1][2] An outlier may be due to variability in the measurement, an indication of novel data, or it may be the result of experimental error; the latter are sometimes excluded from the data set.[3][4] An outlier can be an indication of an exciting possibility, but can also cause serious problems in statistical analyses.

Outliers can occur by chance in any distribution, but they can indicate novel behaviour or structures in the data-set, measurement error, or that the population has a heavy-tailed distribution. In the case of measurement error, one wishes to discard them or use statistics that are robust to outliers, while in the case of heavy-tailed distributions, they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution. A frequent cause of outliers is a mixture of two distributions, which may be two distinct sub-populations, or may indicate ‘correct trial’ versus ‘measurement error’; this is modeled by a mixture model.