High School: Statistics and Probability
Interpreting Categorical and Quantitative Data HSS-ID.C.8
8. Compute (using technology) and interpret the correlation coefficient of a linear fit.
We all know what happens when you assume. Yeah, it makes a...fool...out of you and me.
Students should know that we need to check our assumptions, especially in math. So, for example, if we fit a linear model to a set of data, we can check whether that assumption was at least somewhat appropriate. We don't want to make a fool of the data, and we certainly don't want the data to make a fool out of us.
The correlation coefficient is a number that measures the strength of association between two variables. In particular, the Pearson product-moment correlation coefficient is a measure of the linear association between two variables. It was named after Karl Pearson, who's the reason your students are studying statistics since he is considered the "father" of the field. Tell your students to pelt him with spitballs.
We're sure Pearson won't mind if your students just call his coefficient the "correlation coefficient." They should, however, remember that it has the symbol r and that it ranges from -1 to 1. A positive coefficient suggests a positive correlation between the data. This means that as the independent variable (x) increases, the dependent variable (y) tends to increase too. A coefficient equal to 1.0 is the extreme case: a perfect positive linear relationship, with every point sitting exactly on a line with positive slope.
A negative coefficient suggests a negative correlation between the data, or as the independent variable (x) increases, the dependent variable (y) tends to decrease. A coefficient equal to -1.0 means every point sits exactly on a line with negative slope. Positive is positive, negative is negative. Hopefully not earth-shattering for your students.
If the coefficient equals 0 (or sits very close to it), we have made an incorrect assumption. The data has made a fool of us and there is no linear correlation. However, just because the linear correlation coefficient equals 0 doesn't mean the two variables are unrelated. They could follow a curve, such as a parabola, that a straight line simply can't capture.
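To see how r can equal 0 even when the data follow a clear pattern, here is a minimal Python sketch. The function name pearson_r and the data points are our own, made up for illustration. It computes r directly from deviations about the means, first for points on a parabola and then for points on a straight line.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: x values symmetric about 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x ** 2 for x in xs]      # points on a parabola
ys_line = [2 * x + 1 for x in xs]    # points on a straight line

r_curve = pearson_r(xs, ys_curve)
r_line = pearson_r(xs, ys_line)
print(r_curve)  # 0.0: no linear correlation, despite the perfect curve
print(r_line)   # 1.0: perfect positive linear correlation
```

The parabola's y values are completely determined by x, yet r comes out to 0 because the left half of the curve falls exactly as fast as the right half rises.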
In addition to being positive or negative, the correlation coefficient can be weak or strong. Strong correlations can bench press 400, while weak ones can barely lift a dumbbell. The closer r is to -1 or 1, the stronger the correlation. An arbitrary cutoff for a strong correlation is r less than -0.8 or greater than 0.8. If r is between -0.5 and 0.5, we consider that a weak correlation and send the data back to the gym.
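The cutoffs above can be turned into a small helper, sketched here in Python. The function name describe_r is hypothetical. The "moderate" label for values between the two cutoffs is our own assumption, since the cutoffs given here leave that middle zone unnamed.

```python
def describe_r(r):
    """Return (strength, direction) for a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)
    if size > 0.8:
        strength = "strong"
    elif size < 0.5:
        strength = "weak"
    else:
        # Assumption: the in-between zone has no name in the text.
        strength = "moderate"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

print(describe_r(0.93))   # ('strong', 'positive')
print(describe_r(-0.31))  # ('weak', 'negative')
```

Since the cutoffs are arbitrary, students should treat these labels as a rough guide rather than a verdict.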
Students should know that the correlation coefficient can be calculated with the following formula:
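One standard way to write the Pearson formula is shown here. The notation is our assumption, since textbooks vary: there are n data pairs (x_i, y_i), and x̄ and ȳ are the means of the x and y values.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The numerator adds up how x and y vary together, and the denominator scales the result so that r always lands between -1 and 1.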
This equation is pretty complicated and, quite honestly, a bit of a pain to use. If you want to make your students calculate it by hand, make sure they do so carefully. Luckily, there are lots of ways to use technology to calculate this value. Students can use their TI calculators or Excel. Hip, hip, hooray for technology.