Polling and Statistical Models Can't Predict the Future

Statistics are like fortune-telling
Polling and statistical models are surrounded by an undeserved air of finality. There is an assumption that applying statistical rigor suddenly makes sense of nonsense. From political polling to business measurements, these tools are used to boil complex behaviors down to nuggets of "truth".

This works fine for simple ratios. It falls apart when used to describe how and why groups of people make decisions.


The kinds of statistical tools I have in mind range from political polls to web analytics to business intelligence. Used to predict how people will vote or what will make people purchase a product, they should be viewed with deep skepticism.

People Are Unpredictable

"Predictive analytics" carries a lot of buzz, but its forecasts are wrong. The answers are especially sinister because they come with the cachet of complex formulas and confusing, albeit persuasive, diagrams.

When it comes to human decision-making, these models start with the fatal assumption that it is simple enough to be predictable.

Even with the generous assumption that we can understand exactly what combination of variables prompted somebody to make a specific decision, people are constantly changing. In fact, they have probably changed since the last time they were measured.

The Environment

People are influenced by their environment in innumerable ways. Trying to understand what they will do next assumes that an analyst can make a comprehensive list of all the influential variables and measure them. But people's environments change even more quickly than they themselves do. Everything from the weather to their relationship with their mother can change the way people think and act. All of those variables are unpredictable. How they will affect a person is even less predictable. Put in the exact same situation tomorrow, that person may make a completely different decision. This means that a statistical prediction is only valid in sterile laboratory conditions, which makes it far less useful than it first seemed.

Blame the Analyst

Measuring changes the measurement
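The observer effect (popularly, if loosely, called the Heisenberg Principle, and known in social science as the Hawthorne effect) plays havoc with predictions, too. If people know they are being measured, their behavior will change. They will pay more attention to whatever they think they are being measured for.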

Finally, in putting together a model that predicts the future, analysts are their own enemy. They unconsciously start with their own biases and assumptions. The model will reflect what they think is the most logical way to view a situation, what they think should stand out to a person and why. They find correlations in data and try to explain them through the lens of their own experiences. Despite an analyst's best efforts, their own fingerprints will be on their forecasts.
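To see how easily an analyst can find a correlation worth "explaining," here is a minimal Python sketch. It assumes purely made-up data: every column is independent random noise, so any relationship between two columns is coincidence. The function names and the parameter values (40 variables, 20 samples) are hypothetical choices for illustration, not taken from any real analysis.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_variables=40, n_samples=20, seed=0):
    """Generate unrelated random columns and return the largest
    absolute correlation found between any two of them."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    columns = [
        [rng.gauss(0, 1) for _ in range(n_samples)]
        for _ in range(n_variables)
    ]
    # Every pair is pure noise, yet the best of many pairs looks meaningful.
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))
```

Calling `strongest_chance_correlation()` compares 780 pairs of unrelated columns and reports the strongest match. With this few samples and this many pairs, the winner will typically look like a solid relationship, which is exactly the kind of pattern an analyst is tempted to explain after the fact.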

When It's Believable

Polling and statistical models are wonderfully useful in predicting non-sentient activity, though: chemical reactions, computer programs, medical applications. When the activity doesn't involve people making decisions, the numbers can be trusted. The same holds when a model points out only the most basic correlations (e.g., when a person gets hungry, he is very likely to eat food within six hours).

Otherwise, treat every statistical forecast as a case of caveat emptor.

2 comments:

  1. Polls are so fully manipulated as to make them completely useless... unless they show that my guy is winning!

  2. Certainly human behavior cannot be predicted accurately in general - agreed that we're unpredictable. However, predictive analytics is valuable because you do not need to predict accurately - predicting better than guessing is often more than sufficient to drive mass scale operations more effectively, e.g., in the targeting of marketing, fraud investigation, law enforcement investigations, financial credit risk assessment, etc. In my book "Predictive Analytics," I call this "The Prediction Effect" - a little prediction goes a long way.
