Tuesday, November 4, 2008

WHAT ABOUT AVERAGING POLLS?

There is some merit in the idea that one should double check data for accuracy. This is generally done by those taking and recording serious data. Sometimes triple takes and more are conducted to make sure we are not making a mistake.

It can be fun using vernier devices that allow for fantastic levels of apparent precision. Something so simple, yet so perceptually complex. Here is an interface between calculation and geometry. Genuine focus is necessary to learn these skills, which like anything else can become easier and even automatic with practice.

Then we want more samples of data if we want to make a somewhat precise stab at characterizing a group along this or that parameter. So we have polling, and maybe a thousand or two thousand people are questioned in an effort to characterize the political views of some part of the population. In Presidential polls we are generally measuring the opinions of "likely voters" or some other designated group, but "likely voters" is the most likely! We see that such polls are said to have a margin of error of about two points for a population of over a hundred million people.
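For the curious, here is a minimal sketch of where a figure like that comes from. It assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst case of an evenly split opinion; real polls use weighting schemes that this ignores, and the function name is just my own invention for illustration.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a simple random sample.

    Assumes the population is much larger than the sample, so the
    population size itself drops out of the formula.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    return z * standard_error

# Margin of error, in percentage points, for the sample sizes mentioned above.
for n in (1000, 2000):
    print(n, round(100 * margin_of_error(n), 1))
```

Under these assumptions a thousand respondents give roughly three points and two thousand give roughly two, and the hundred million people in the population never enter the calculation.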

So that truly funny math, probability and statistics, can presumably guide us as to whether we can somehow average the results of various polls.

Of course, polls are designed to find out similar things in these elections. We might say that they are instruments that ought to be highly positively correlated, but are they actually correlated at r > .80? I don't think so, my friends, and this is probably way above any actual correlation. Doesn't this have profound implications for adding and subtracting means and other statistics computed from separate databases?
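One way to see why correlation matters when combining means is to look at the uncertainty of a plain average. The sketch below is only an illustration under strong assumptions of my own: every poll has the same standard error, and the errors of every pair of polls share one common correlation, here called rho. Neither assumption is likely to hold exactly for real polls.

```python
import math

def se_of_average(se_single, n_polls, rho):
    """Standard error of the plain average of n_polls estimates.

    Assumes each estimate has standard error se_single and that every
    pair of estimates has errors correlated at rho (0 <= rho <= 1).
    """
    return se_single * math.sqrt((1 + (n_polls - 1) * rho) / n_polls)

# Hypothetical case: five polls, each with a standard error of 1.5 points.
for rho in (0.0, 0.5, 1.0):
    print(rho, round(se_of_average(1.5, 5, rho), 2))
```

If the errors are unrelated, averaging five polls shrinks the uncertainty a good deal. If they all err together, averaging buys nothing at all, which is the worry in a nutshell.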

Of course, if there is a legitimate way to compensate for these and other problems, we can come up with odds for things that seem beyond our power to estimate at this time. I am assuming there is some sort of more or less statistically sane meta-study analysis. I am assuming a lot of this data is already gathered with some statistical notions in mind, e.g., the size of the sample groups and polling questions that are useful to a political campaign, a news source or some other propaganda operation directed at slanting the polls a particular way.
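For what it is worth, the simplest textbook version of such a combination is to weight each poll by the inverse of its sampling variance. The sketch below is not a claim about how any actual poll aggregator works. It assumes the polls are independent simple random samples of the same population answering the same question, which are exactly the assumptions in doubt here, and the numbers fed to it are made up.

```python
import math

def pool_polls(polls):
    """Inverse-variance weighted average of poll results.

    polls: list of (share, sample_size) pairs, with share between 0 and 1.
    Returns the pooled share and its standard error, assuming the polls
    are independent and measure the same underlying quantity.
    """
    weights = []
    for share, sample_size in polls:
        variance = share * (1 - share) / sample_size
        weights.append(1 / variance)
    total_weight = sum(weights)
    pooled = sum(w * share for w, (share, _) in zip(weights, polls)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se

# Made-up example: three polls of one candidate's share among likely voters.
example_polls = [(0.52, 1000), (0.49, 1500), (0.51, 800)]
pooled_share, pooled_se = pool_polls(example_polls)
print(round(pooled_share, 3), round(pooled_se, 4))
```

The pooled standard error comes out smaller than that of any single poll, but only because the calculation takes independence on faith.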

Adding the means or other statistics from two different polls is somewhat like adding up the number of apples and potatoes in order to find out how many pieces of fruit there are. The potato may be the apple of the earth for the French, but it is still a vegetable to me. Making the correct assumptions and seeing the actual fluctuation of data in a meaningful way are the crucial things in these otherwise tiresome calculations. Of course, once you realize that your calculations are a gift to treasure, it is all worth it.

What to do, though, about the devilish errors and my complete bovine ignorance about so many things mathematical, which continue to challenge an often unenlightened curiosity? For this I have found it useful to contemplate my blessings, e.g., that I was not born divine, yuck, yuck!