Lecture 16: Correlation and Causation

I'd like to make a few comments about correlation and causation. It is often said that a correlation between two variables does not imply causation. This is generally true. Just because the amount of time spent working on an exam is positively correlated with the grade on the exam does not mean that spending more time on an exam will necessarily lead to a better grade. The two might be correlated for some other reason.

In some instances, the adage about correlation and causation is actually not true. Consider the situation in which we have a well-conducted experiment that compares two groups. We compute a correlation coefficient and there is a significant relationship between the experimental variable (i.e., group membership) and the dependent variable. This would, of course, be the same as conducting a t-test. Assuming we have eliminated any confounds and we used random assignment, we should be able to conclude that the independent variable caused the dependent variable. So in this case a correlation between the independent variable and the dependent variable indicates that the independent variable caused the dependent variable.
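The claim that the correlation and the t-test are the same analysis can be checked numerically. The sketch below is only an illustration: the scores are made up, the helper names pearson_r and pooled_t are hypothetical, and it assumes the equal-variance (pooled) independent-samples t-test. It codes group membership as 0 or 1, correlates it with the dependent variable, and converts r to t using t = r * sqrt(n - 2) / sqrt(1 - r^2).

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance (group_b minus group_a)."""
    na, nb = len(group_a), len(group_b)
    ma, mb = mean(group_a), mean(group_b)
    ss_a = sum((v - ma) ** 2 for v in group_a)
    ss_b = sum((v - mb) ** 2 for v in group_b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mb - ma) / math.sqrt(pooled_var * (1 / na + 1 / nb))


# Made-up scores for two randomly assigned groups (illustration only).
control = [4.0, 5.0, 6.0, 5.0, 4.0]
treatment = [6.0, 7.0, 8.0, 7.0, 9.0]

# Code group membership as 0 (control) or 1 (treatment).
group = [0] * len(control) + [1] * len(treatment)
scores = control + treatment

r = pearson_r(group, scores)
n = len(scores)

# Convert the correlation to a t statistic with n - 2 degrees of freedom.
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_direct = pooled_t(control, treatment)

print("r =", round(r, 4))
print("t from r =", round(t_from_r, 4))
print("t from t-test =", round(t_direct, 4))
```

The two t values agree, so testing the correlation for significance and running the t-test lead to the same conclusion. What licenses the causal claim is the random assignment, not the choice of statistic.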

If a non-experimental study is conducted, there are likely to be some alternative explanations for a relationship between the independent and the dependent variable, because we did not randomly assign participants to experimental conditions.

A Couple of Common Alternative Explanations for a Correlation Between X and Y

There are three general reasons we cannot conclude that X caused Y just because X and Y are correlated (in cases where X is not experimentally manipulated). First, we don't know for sure that Y did not cause X. I'll use arrows to indicate causation.

X----> Y (X causes Y)

But it might be that Y causes X, as in the following diagram:

Y-----> X (Y causes X)

Second, we don't know that they don't cause each other:

X <----> Y (X and Y cause each other)

Finally, there is often a third variable (call it Z) that might cause both X and Y, as this diagram points out:

X <---- Z ----> Y (Z causes both X and Y)

The latter is referred to as the "third variable problem". My favorite example of the third variable problem is the correlation between the number of fire hydrants in a city and the number of dogs in a city. Cities with more fire hydrants tend to have more dogs. Why the relationship? A third variable. Any guesses what the third variable might be?
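To see how a third variable can produce a correlation on its own, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the variables are generic (Z, X, Y) rather than hydrants and dogs, the numbers are arbitrary, and the helper names pearson_r and residuals are hypothetical. Z influences both X and Y, but X and Y never influence each other. The last step removes the part of X and Y that Z predicts and correlates what is left over, which is one common way of statistically controlling for a third variable.

```python
import math
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, zs):
    """What is left of ys after a simple least-squares regression on zs."""
    mz, my = mean(zs), mean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Z is the third variable. X and Y each depend on Z plus independent noise;
# neither X nor Y appears in the other's formula.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# Raw correlation between X and Y: substantial, despite no causal link.
r_xy = pearson_r(x, y)

# Correlation between X and Y after removing what Z predicts in each.
r_xy_given_z = pearson_r(residuals(x, z), residuals(y, z))

print("r(X, Y) =", round(r_xy, 3))
print("r(X, Y) controlling for Z =", round(r_xy_given_z, 3))
```

The raw correlation is strong, and it drops to roughly zero once Z is taken into account. In a real non-experimental study we rarely know, or have measured, every possible Z, which is why the third variable problem is hard to rule out.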