# Nielsen now gives it to Libs

The latest Nielsen poll, with results published in the Fairfax papers on July 31, gives the election to the Coalition by 52% to 48%. It's at the outer edge of where the polling results have been and could be just a statistical fluke. Some commentators say the result represents too large a move and that moves like this don't happen in the real world. So why do these commentators bother with polls if they know what happens in the real world better than the polls do?

The Nielsen poll is part of a pattern across all the polls which shows Labor and the Coalition neck and neck. A tightening was always likely, as we saw last election when Labor's early lead evaporated. Whether the Nielsen poll is accurate we will never know, because an election wasn't held over the period, but it may have an effect on the real poll.

At the moment all polling shows that the vast majority of voters expect a Labor win. That means they are not thinking about Tony Abbott as a future prime minister. Our qualitative polling shows that when they do think of him as a future prime minister, he is a net negative. Labor should hope for a few more polls like the Nielsen one to get their campaign back on track. There are three weeks yet to go before the election, and that is a long time for sentiment to move around.
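To put the "statistical fluke" point in context, the sampling error of a single poll can be sketched quickly. This is a minimal illustration only: the sample size below is an assumption, not Nielsen's actual fieldwork number.

```python
import math

# Approximate 95% margin of error for one poll, assuming simple
# random sampling. n = 1400 is an assumed, illustrative sample size.
n = 1400
p = 0.52          # reported Coalition two-party-preferred share

se = math.sqrt(p * (1 - p) / n)
moe = 1.96 * se   # half-width of the 95% confidence interval

print(f"standard error: {se:.4f}")
print(f"95% margin of error: +/- {moe * 100:.1f} points")
```

On these assumptions the margin of error is about ±2.6 points, so a 52-48 reading overlaps a 50-50 race at the edge of its interval, which is why a single such poll can't settle the argument either way.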

## Comments

I thought you lot died out with the abacus.

I'm quite happy to side with every national statistics bureau and federal reserve, let alone core sectors of the capital markets like the reinsurance industry, on this one, thanks.

The fact that you are still sitting here banging on mercilessly about *either* Morgan or Nielsen having to be an outlier goes to the heart of the problem - a large failure to understand the statistics involved. And you have the hide to call other people charlatans when you're obsessed by a mathematical non-sequitur!

Let's say, hypothetically, we had five polls taken in the same period by five separate organisations (which we now actually do in reality), where house effects were zero and the population was equally accessible, creating zero design effect on the confidence interval.

Let's say four of those five polls produced very similar results (those four overlapping on a tight range of values with high probability density) and one poll - the fifth - did not. With this fifth poll, only the right-hand tail of its distribution overlapped this key range, and only beyond 1.5 standard deviations from its mean.

To really show the silliness here:

- I would be saying one of these polls is undercooked - even though that poll in isolation was still within 2 standard deviations of the mean of the true underlying value of public opinion derived from all five polls (not an outlier in any true sense of the word). I could also state the size of the probability involved.

- you, on the other hand, would be arguing that the fifth poll is actually really good because some other single poll taken a day earlier was an actual outlier in the opposite direction! To make it worse, you then say that any analysis of the combined probabilities of those five distributions that says anything to the contrary is simply the work of a charlatan!
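The five-poll thought experiment above can be sketched numerically. Everything here is invented for illustration - the assumed true value, the sample size, and the simulated draws are not real polling data.

```python
import math
import random

random.seed(1)
true_p = 0.50   # assumed true underlying two-party-preferred share
n = 1000        # assumed sample size for each of the five polls

# Simulate five independent polls of the same population
# (zero house effects, zero design effect, as in the hypothetical).
polls = []
for _ in range(5):
    sample = sum(random.random() < true_p for _ in range(n))
    polls.append(sample / n)

mean = sum(polls) / len(polls)
se = math.sqrt(true_p * (1 - true_p) / n)

# Express each poll's distance from the five-poll mean in standard deviations.
for i, p in enumerate(polls, 1):
    z = (p - mean) / se
    print(f"poll {i}: {p:.3f}  ({z:+.1f} sd from the 5-poll mean)")
```

Run this a few times with different seeds and you see the point both sides are circling: even with identical methodology and an identical population, five honest polls scatter, and a reading 1.5 sd off the pack is unusual but far from impossible.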

Are you trying to be Christopher Monckton of the polling world or something - because you're doing a damn good job of it at the moment?

When you throw a die, there is a one in six chance each time that a six will come up, no matter how many times you have thrown a six in a row before. So polls last week have no bearing on polls this week.

When you conduct a poll this week with a particular sample size, there is a calculable probability that it lands within a given distance of the true value. The fact that two polls come up with the same answer slightly increases the probability that they are correct, because combining them effectively increases your sample size.
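The claim that two agreeing polls add precision can be made concrete. Treating two polls of the same population as one pooled sample shrinks the standard error by a factor of √2 - a sketch under a simple-random-sampling assumption, with an illustrative sample size:

```python
import math

p = 0.52   # shared result of both polls (illustrative figure)
n = 1000   # assumed sample size of each poll

se_single = math.sqrt(p * (1 - p) / n)        # one poll alone
se_pooled = math.sqrt(p * (1 - p) / (2 * n))  # both polls pooled

print(f"single-poll SE: {se_single:.4f}")
print(f"pooled SE:      {se_pooled:.4f}")
print(f"ratio:          {se_single / se_pooled:.3f}")  # ~ sqrt(2)
```

Doubling the sample only tightens the interval by about 29%, which is why agreement between two polls is suggestive rather than decisive.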

If you had done the calculations on this one properly by combining the samples you would find that Morgan is the outlier. End of story. It doesn't make any of them right.
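One way to run the "combine the samples" check is to pool every poll into a single estimate and then express each poll's result as a z-score against it. The pollster names and figures below are invented for illustration - they are not the actual July-August 2010 results.

```python
import math

# (pollster, share for one side, sample size) - all figures invented
polls = [
    ("Poll A", 0.500, 1000),
    ("Poll B", 0.505, 1100),
    ("Poll C", 0.520, 1400),
    ("Poll D", 0.495, 1200),
    ("Poll E", 0.555, 900),   # deliberately off the pace
]

# Pooled estimate: total "yes" responses over total respondents.
total_yes = sum(p * n for _, p, n in polls)
total_n = sum(n for _, _, n in polls)
pooled = total_yes / total_n

for name, p, n in polls:
    se = math.sqrt(pooled * (1 - pooled) / n)
    z = (p - pooled) / se
    flag = "  <- candidate outlier" if abs(z) > 2 else ""
    print(f"{name}: {p:.3f}  z = {z:+.2f}{flag}")
```

With these invented numbers, only the poll sitting well away from the pack exceeds 2 standard deviations from the pooled estimate; the procedure identifies a candidate outlier but, as the text says, it doesn't make any of the polls right.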

You can claim to have the weight of the insurance industry behind you, but as anyone who reads financial statements knows, insurers regularly make underwriting losses and more often than not only make profits from their investments, because the models aren't the real world - they are just best guesses.

What you are doing is substituting best guesses for reality, and then claiming they are real. As there aren't elections every week you get away with this because most people have no idea what you are talking about.

You also seem to have no idea how slippery the concepts are that you are measuring. When someone takes a poll mid-term respondents aren't thinking in terms of alternatives. They are referenda on the government, by and large. When you get into an election period things change significantly.

The mathematics you are applying to polling is just an elegant parlour game - it has no bearing on anything. It tells no-one anything they can use, and it's not properly conceived. I guess the first follows from the last.

You reckon you're a pollster and on your site you had a sarcastic and derogatory swipe at Pollytics where:

a) You didn't have the faintest clue about what I was doing, its methodology or its robustness - anything about it at all, actually. Just some ho-hum assumption that you made up in a drive-by smear.

b) When pinged on it, you've fallen back on simplistic nonsense (and a raging non-sequitur) like declaring that one poll must have been an outlier in order to justify another result (when there is not actually a *single* *piece* *of evidence* to suggest any outlier here by any pollster over the entire campaign so far). Worse, you've resorted to using ridiculous high-school levels of mathematics in an attempt to justify it, e.g. your latest: dice! As if discrete uniform probability distributions are even remotely relevant here, when this whole thing is *about* probability density (for others - and maybe Graham - a die has an equal probability density across the full range of its 6 possible values, unlike survey results which, for good pollsters, are distributed approximately normally, exhibiting a higher density towards their centre than towards their tails). It doesn't even make for an accurate analogy.

The pooled sample example is another - pooling requires specific and necessarily arbitrary definitions of the time period within which polls can be pooled. From what date do we start and end the pooling? Why those dates? (No - convenience isn't an answer, Graham.) Do we start on the 26th? The 28th? Why? When do we end - the 29th? The 31st? The 2nd? Why? What makes those dates more worthy than a day or two either side? (Again, no - convenience isn't an answer here, Graham.) The arbitrary choice of window determines the outcome, which is therefore itself arbitrary.
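The window-sensitivity complaint can be illustrated directly: shift the pooling window by a couple of days and polls drop in or out, moving the pooled figure. Dates, shares and sample sizes below are invented for the sketch.

```python
from datetime import date

# (fieldwork end date, share for one side, sample size) - all invented
polls = [
    (date(2010, 7, 26), 0.505, 1000),
    (date(2010, 7, 28), 0.500, 1100),
    (date(2010, 7, 30), 0.490, 1400),
    (date(2010, 8, 1),  0.485, 1200),
]

def pooled_estimate(start, end):
    """Pool all polls whose fieldwork ended inside [start, end]."""
    window = [(p, n) for d, p, n in polls if start <= d <= end]
    yes = sum(p * n for p, n in window)
    total = sum(n for _, n in window)
    return yes / total

# Two equally defensible windows give different pooled answers.
a = pooled_estimate(date(2010, 7, 26), date(2010, 7, 30))
b = pooled_estimate(date(2010, 7, 28), date(2010, 8, 1))
print(f"window 26-30 July:      {a:.4f}")
print(f"window 28 July - 1 Aug: {b:.4f}")
```

When opinion is actually moving across the period, the two windows disagree by construction, which is the nub of the objection to a single pooled number.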

It is a third-best option when we already have the data we need to work it out - the probability density and uncertainty of each poll. A standard but slightly more complicated method doesn't destroy the time-dependency of the information (time dependency such as public opinion moving across the entire period in question here - which we know occurred from the 25th to the 2nd with p < 0.01, and which we knew at the time of Nielsen with p < 0.04). Pooling destroys information we want in this case, without letting us use any of the benefits that pooling can otherwise provide.
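A "p < 0.01 that opinion moved" style of claim is, at its simplest, a two-proportion z-test between an early and a late poll. The numbers below are invented for illustration, not the actual 2010 readings.

```python
import math

# Early and late polls of the same question (figures invented)
p1, n1 = 0.530, 1400   # e.g. fieldwork around the 25th
p2, n2 = 0.475, 1400   # e.g. fieldwork around the 2nd

# Pooled proportion under the null hypothesis of no movement
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Two-sided p-value via the normal CDF (Phi(x) = 0.5 * (1 + erf(x / sqrt(2))))
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With a gap of 5.5 points between two samples of 1,400, the test rejects "no movement" at p < 0.01 - the kind of calculation the comment is gesturing at, as distinct from pooling the two samples into one flat number.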

c) Then, after all this nonsense, from the initial drive-by smear through to the half-baked justifications, from the simplistic, almost high-school statistics examples not up to the job through to the naivety and arrogance (nice mix, by the way) - you have the hide to keep complaining about abuse while calling people (me) charlatans.

I have nothing further to add - it's already been said.

I'll leave it to readers to make up their own mind here on just which one of us needs to get our act together.

Presumably your business depends to some extent on the analysis on your blog, so I can understand you wanting to defend it, but I can't understand your defensive aggression, which would deter me from ever recommending you to anyone and which I would have thought would have the same effect on any reasonable person reading this exchange.

When you make a mistake, the smart thing to do is to admit it.

What time period would I recommend we use? Well, it looks like the polls were all done during the same three days, so let's use that time period. Is this convenient? Yes it is. Is it poor methodology? No it isn't. You have no idea when the various polling organisations did most of their interviews within that period, so your highfalutin maths may well be dealing with the wrong things if you think you are taking some sort of time dependency into account.

But of course you don't think this because you note in your comments you don't know how many days Galaxy was in the field. So much for the time period counting! If it did you wouldn't have been able to do your calculation missing this bit of information.

Typically fewer successful interviews occur as a poll progresses, because you are hunting for the more elusive representatives in the community, such as young men under 30. They're sometimes so hard to find that their responses are faked by weighting a smaller sub-sample so it looks like a larger one.
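The up-weighting of scarce respondents described above has a measurable cost: Kish's effective sample size, n_eff = (Σw)² / Σw², shrinks as the weights become more uneven. A sketch with invented counts and weights:

```python
# Illustrative weighting scheme (all numbers invented):
# 950 easy-to-reach respondents at weight 1.0, plus 50 hard-to-reach
# young men up-weighted by 3x to stand in for the 150 the quota wanted.
weights = [1.0] * 950 + [3.0] * 50

n = len(weights)
# Kish effective sample size: (sum of weights)^2 / (sum of squared weights)
n_eff = sum(weights) ** 2 / sum(w * w for w in weights)

print(f"nominal sample size:   {n}")
print(f"effective sample size: {n_eff:.0f}")
```

On these assumed weights, 1,000 interviews behave like roughly 864 - the published margin of error understates the real uncertainty unless the pollster accounts for this design effect.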

And that's just one of the imprecisions.

Being good at maths is one thing. Understanding the weaknesses in the mathematical approach, and the problems inherent in the data you are trying to measure, is another.

Then knowing what to do with the real world implications is another thing again.

You fail on all of these. And you fail on the basic mathematics. In all the polling that has occurred since this exchange started, Nielsen is an outlier, but not by as large a margin as Morgan.

Your trend is probably catching up with this fact. But that is the problem with trends. They're not leaders, they are followers. Pay more attention to the actual polls and less to your preconceptions (mathematical or political) and you'll do a lot better.
