Fooled by Correlation

By Stijn Vanden Bossche

What has always interested me is the degree to which one variable is related to another, and how often a correlation shows up where you wouldn’t expect it at all. Not every modeler seems to share that interest, though: we find that poor results from risk analysis models are often caused by missing or misused dependency modeling techniques.

For me, the first step is to look at the available data and fit a correlation structure to it, to see whether there is a pattern and what kind of pattern shows up. The next step is harder: working out where the correlation could come from. Through surveys and research I found some funny or weird correlations between things that seem completely unrelated, but that must have an explanation or a common driving factor. I’ll give it my best shot to find an explanation, but please feel free to help me out and send other possible explanations and thoughts directly to me. We are organizing a free webcast on correlation, but more on that at the bottom of this email.

Correlation 1: 74 percent of people who can type without looking at the keyboard prefer a restaurant to a fast food chain, compared with 56 percent of people in general.
My first thought is: the next time I’m in McDonald’s I will share this piece of trivia with the cashier, in the hope that she wants to prove herself and mistypes the amount I have to pay.

My second thought is: people who can type without looking are most likely office workers. They are more used to going to restaurants for lunch meetings and dinners with clients. Office workers also tend to earn more than non-office workers, and we all know how much you need to pay for a juicy steak (or a cashew-mushroom pâté for the vegetarians among us) in a good restaurant.

Correlation 2: 82 percent of people with tattoos prefer hot weather over cold, compared with 63 percent of people in general.
Contrary to popular belief, I think people with tattoos are happy with their body and want to decorate it, much like women (and sometimes men) who put on lipstick or wear earrings. Although some tattoos are very personal, a lot of tattoos are a way to express your identity and beliefs to the outside world. The hotter it is, the fewer clothes you have to wear, and therefore the more of your beliefs and identity you can show.

Correlation 3: A UK study showed there was a strong correlation between wearing coats and car accidents.
A first study claimed that wearing a coat obstructs the driver’s movements and therefore makes driving more dangerous. That explanation was wrong: a second study showed that coats don’t actually restrict free movement. Most people wear coats when it rains, and rain makes driving more dangerous. In this example rain is the hidden, common “driving” factor.

By the same line of reasoning, more accidents occur when the windshield wipers are on, so we should consider turning off the windshield wipers.
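
To make the hidden-factor idea concrete, here is a minimal simulation sketch of the coat example. All probabilities in it are made-up illustration values, not figures from the UK study, and the names `simulate` and `accident_rate` are hypothetical helpers. Coats and accidents are generated so that each depends only on rain, never on each other.

```python
import random

def simulate(n=200_000, seed=42):
    """Generate (rain, coat, accident) triples. All probabilities are
    hypothetical; coat and accident each depend on rain only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.random() < 0.30                            # it rains on 30% of trips
        coat = rng.random() < (0.80 if rain else 0.20)        # coats are worn mostly in rain
        accident = rng.random() < (0.10 if rain else 0.03)    # rain raises accident risk
        rows.append((rain, coat, accident))
    return rows

def accident_rate(rows, coat, rain=None):
    """Accident frequency for one coat status, optionally within one weather type."""
    selected = [a for r, c, a in rows if c == coat and (rain is None or r == rain)]
    return sum(selected) / len(selected)

rows = simulate()

# Ignoring the weather, coat wearers appear to crash more often.
overall_coat = accident_rate(rows, coat=True)
overall_no_coat = accident_rate(rows, coat=False)

# Within rainy trips only, the coat makes no real difference.
rain_coat = accident_rate(rows, coat=True, rain=True)
rain_no_coat = accident_rate(rows, coat=False, rain=True)

print(f"All trips:   coat {overall_coat:.3f} vs no coat {overall_no_coat:.3f}")
print(f"Rainy trips: coat {rain_coat:.3f} vs no coat {rain_no_coat:.3f}")
```

Comparing the two printed lines shows the pattern: the gap between coat wearers and non-wearers is clear across all trips, but it largely disappears once you look only at rainy trips, because rain was driving both variables.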

Other funny/weird conclusions:

  • 26 percent of people who never took a ride on a motorcycle are multilingual, compared with 40 percent of people in general.
  • 15 percent of people who don’t like mayonnaise are good dancers, compared with 29 percent of people in general.
  • In general, 33 percent of people say their friends are mostly of the opposite sex. But among those with oily hair, 46 percent say their friends are mostly of the opposite sex.

If you can come up with a good explanation of why these variables could be correlated, please send me an email with some good arguments. The winner gets a risk analysis training video of their choice (see www.vosesoftware.com/trainingsuite.php) and will be announced in the next newsletter as the sharpest knife in the drawer.

Other dangers in the world of correlation
A big danger in correlation is the assumption that correlation is always symmetrical. Rank order correlation, for example, implies a symmetrical correlation. For a positive correlation, this assumes that when one variable is at its low end the other is too, and that when one variable is at its high end the other is too, with the same strength at both ends. In practice we often see variables that are strongly correlated at one end but only weakly at the other. In that case we can’t use a symmetrical dependency modeling technique and have to turn to an asymmetrical one, such as the Clayton copula.
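
As an illustration of correlation that is strong at one end only, the sketch below samples pairs from a Clayton copula with the standard conditional-inversion method and compares how often the two variables are extreme together in each tail. It is a standalone Python sketch, not ModelRisk’s implementation; the parameter value θ = 2, the 5% tail cut-off and the helper names `sample_clayton` and `tail_match` are assumptions chosen for the example.

```python
import random

def sample_clayton(n, theta, seed=1):
    """Draw n pairs (u, v) from a Clayton copula (theta > 0)
    using conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # 1 - random() lies in (0, 1], so we never raise 0 to a negative power.
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Invert the conditional distribution of v given u at probability w.
        v = (u ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

def tail_match(pairs, q, lower):
    """Among pairs with u in the chosen q-tail, the share with v in the same tail."""
    if lower:
        in_tail = [(u, v) for u, v in pairs if u < q]
        return sum(v < q for _, v in in_tail) / len(in_tail)
    in_tail = [(u, v) for u, v in pairs if u > 1.0 - q]
    return sum(v > 1.0 - q for _, v in in_tail) / len(in_tail)

pairs = sample_clayton(n=100_000, theta=2.0)
low = tail_match(pairs, q=0.05, lower=True)     # both variables in their lowest 5%
high = tail_match(pairs, q=0.05, lower=False)   # both variables in their highest 5%

print(f"P(v in lowest 5%  | u in lowest 5%):  {low:.2f}")
print(f"P(v in highest 5% | u in highest 5%): {high:.2f}")
```

For a Clayton copula the lower-tail dependence coefficient is 2^(-1/θ), about 0.71 for θ = 2, while the upper-tail coefficient is zero. The simulated lower-tail share should therefore come out far higher than the upper-tail share, which is exactly the one-sided behaviour a single rank correlation coefficient cannot describe.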

More information on copulas can be found here.

Want to know more about correlation?
Every week we provide a free webcast on a specific risk analysis topic. On November 19 the topic of the webcast is Correlation. As mentioned before, correlation is a critical component of a risk analysis model in any field. Failure to correctly model the correlation between uncertain variables will give misleading, and sometimes dangerously wrong, results, yet the topic receives very little attention. In this webcast we will show how to correctly model correlation using symmetrical and asymmetrical copulas. Subscribe below.

Because correlation is so important, ModelRisk offers a uniquely large set of correlation tools, including the empirical copula.

A free 15-day trial of ModelRisk can be found here.

Free Webcasts

In an effort to promote good practice in risk modeling we have launched a set of free webcasts. With this initiative we want to bring risk managers and decision makers together and engage them in a discussion on different topics. In the first 15 minutes an expert will give a lecture on a specific feature, method or industry example model. In the next 15 minutes we want to have an open discussion with the participants, where they can share their thoughts on that particular topic. After 30 minutes we will make time to answer general ModelRisk questions. The topics for the next 4 weeks are displayed below. If a certain topic interests you, please subscribe. We have already held two webcasts, which will be available on our website shortly.

Title | Lecturer | Date | Time
An introduction to the PK/PD module in ModelRisk | Timour Koupeev | Nov. 12, 2012 | 11AM EST / 3PM UTC
Correlation – what it is, why it’s important, how to estimate it, how to implement it | David Vose | Nov. 19, 2012 | 11AM EST / 3PM UTC
Linking ModelRisk to databases | Timour Koupeev | Oct. 29, 2012 | 11AM EST / 3PM UTC
Splicing Distributions | David Vose | Dec. 3, 2012 | 11AM EST / 3PM UTC

We Are Here For All Your Risk Analysis Requirements:
Software, Training & Consulting
