How To Calculate The Correlation Coefficient (With Examples)

By Samantha Goddiess - May. 18, 2021

Correlation does not imply causation.

It is a commonly used phrase, one you’ve more than likely heard at one point or another. It means that a cause-and-effect relationship cannot be determined just because there is some observed correlation between two variables.

We can’t infer causation simply because there is some degree of correlation.

We may observe that those who focus a lot of attention on their core during workouts lose belly fat. The fat loss could result from working out more than usual; it is not necessarily due to the core work.

Have you ever wondered where the phrase comes from? If you’re a statistician or you’re familiar with the correlation coefficient, then you’re already aware.

Although the phrase is widely used in popular culture, it comes directly from how the correlation coefficient works.

What Is the Correlation Coefficient?

The correlation coefficient is a statistical measure of how strong a linear relationship is between two variables.

Correlation is based on observational data, so while we may determine a strong relationship between two variables, it does not necessarily indicate that one causes the other.

Thus the phrase “correlation does not imply causation.”

To calculate the correlation coefficient, you will need a numerical value to represent each variable, your “X” and your “Y.”

The data should be free of extreme outliers, which can distort the result, and the association between the two variables should be roughly linear.

Pearson Product-Moment Correlation (PPMC)

The Pearson product-moment correlation, more commonly referred to as the Pearson correlation coefficient, is the most common method for determining correlation. This method was developed in 1895 by Karl Pearson, a British statistician who is considered a founding father of modern statistics.

If you’ve ever taken a statistics course or have previously calculated correlation coefficients, then you have most likely encountered the Pearson correlation coefficient.

Luckily, for those who are not as mathematically inclined, most spreadsheet programs and statistical applications have correlation functions to run this formula for you. TI-83 calculators can find the correlation coefficient with the right function. You can also find plenty of pre-made calculators online.

If you want or need to calculate the correlation coefficient by hand, there are steps to follow.

  1. Make a chart. Just as you would in a spreadsheet, you need to chart out all of your information. You want to include a column for your x variable, your y variable, xy, x², and y². Use the given information to fill out the chart.

  2. Find the Σ for each column. This would be the SUM function in Excel if we were using a spreadsheet. Find the sum of each column and put it at the bottom of the respective column.

  3. Find the result. To find the Pearson correlation coefficient, r, you must complete the formula. This formula is divided into a numerator and denominator.

    The numerator consists of the following equation:

    nΣxy – ΣxΣy

    The denominator consists of the following equation:

    √( [nΣx² – (Σx)²] [nΣy² – (Σy)²] )

    In its complete form, that looks like:

    r = (nΣxy – ΣxΣy) / √( [nΣx² – (Σx)²] [nΣy² – (Σy)²] )

    n is the number of paired (x, y) values you have. Σ is the Greek capital letter sigma and is used in mathematics to indicate the sum of multiple terms.

Whether you work the formula by hand, in a spreadsheet, or on a calculator, the result will be the same.
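The by-hand steps above can be sketched in a few lines of Python. This is only an illustration: the function name pearson_r and the sample data (hours and scores) are made up for the example and are not from any particular library or data set.

```python
import math

def pearson_r(xs, ys):
    """Pearson r using the sum-based formula from the steps above."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)

    # Step 1: build the extra chart columns xy, x squared, and y squared.
    xy = [x * y for x, y in zip(xs, ys)]
    x_sq = [x * x for x in xs]
    y_sq = [y * y for y in ys]

    # Step 2: find the sum of each column.
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy, sum_x_sq, sum_y_sq = sum(xy), sum(x_sq), sum(y_sq)

    # Step 3: numerator divided by denominator.
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x_sq - sum_x ** 2) * (n * sum_y_sq - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical data: hours studied and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 4))  # prints 0.7746
```

For this made-up data the column sums are Σx = 15, Σy = 20, Σxy = 66, Σx² = 55, and Σy² = 86, so the numerator is 30 and the denominator is √1500.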

Some additional terms you should know when calculating the correlation coefficient are covariance and standard deviation.

Covariance is a measure of how the two variables change together. It indicates the direction of the linear relationship: whether it is positive or negative. Since correlation indicates both direction and strength, and the direction comes from the covariance, finding the covariance is necessary to find the correlation coefficient.

Standard deviation is a measure of variability. A high variability implies a greater deviation of the values from their mean.

We can determine the correlation coefficient by dividing the covariance of the two variables by the product of their standard deviations.

Once the formula has returned the correlation coefficient, r, the relationship can be determined. The PPMC formula returns a value between -1 and 1. The closer the value is to -1 or 1, the stronger the relationship; the closer it is to 0, the weaker the relationship.
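As a rough sketch of that relationship, the snippet below computes the covariance and both standard deviations and then divides. The helper names and the sample data are invented for the example. It uses the population versions (dividing by n); the sample versions (dividing by n - 1) give the same r as long as the same choice is used for both the covariance and the standard deviations.

```python
import math

def covariance(xs, ys):
    """Population covariance: the mean product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def std_dev(values):
    """Population standard deviation: square root of a variable's covariance with itself."""
    return math.sqrt(covariance(values, values))

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Hypothetical data, chosen only for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(round(covariance(hours, scores), 4))   # prints 1.2
print(round(correlation(hours, scores), 4))  # prints 0.7746
```

The covariance (1.2 here) depends on the units of the data, while the correlation is always between -1 and 1, which is what makes it comparable across data sets.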

Understanding Your Results

The result’s absolute value indicates how strong the relationship is, and its sign indicates the direction. For direction, there are only three potential outcomes.

  1. Positive relationship. A value greater than zero indicates a positive relationship. This means that as the value of one variable increases, the value of the second variable tends to increase as well.

    Shown in a scatter plot, the points will trend upward from left to right.

    Example 1:

    The more hours you work, the higher your paycheck will be.

    Example 2:

    As a child grows, so too does their shoe size.

  2. Negative relationship. A value less than zero indicates a negative relationship. With a negative relationship, as the value of one variable increases, the value of the other variable tends to decrease.

    Shown in a scatter plot, the points will trend downward from left to right.

    Example 1:

    The slower you drive, the longer your trip will take.

    Example 2:

    The more you exercise, the less you’ll weigh.

  3. No relationship. A value of zero indicates that there is no linear relationship between the variables. An increase or decrease in one variable is not associated with a consistent increase or decrease in the other variable.

    Shown in a scatter plot, the points will show no upward or downward trend.

    Example 1:

    The amount of tea someone drinks vs. how British they are.

    Example 2:

    The price of chocolate vs. the price of cereal.

Quinnipiac University’s Political Science Department provided “crude estimates” for interpreting the strength of a correlation using the absolute value result of the Pearson correlation coefficient.

Taking the absolute value of the result, you can determine the strength of the relationship between the two variables. The closer the Pearson correlation coefficient, r, is to 1, the stronger the relationship.

  • .70+ — very strong relationship

  • .40 to .69 — strong relationship

  • .30 to .39 — moderate relationship

  • .20 to .29 — weak relationship

  • .01 to .19 — no or negligible relationship

  • 0 — no relationship
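To make the interpretation concrete, here is a small sketch that turns a value of r into a direction and a strength label using the crude estimates listed above. The function name describe is made up for this example. One assumption to flag: the list leaves small gaps between ranges (for example, between .69 and .70), so this sketch treats each range's lower bound as a cutoff.

```python
def describe(r):
    """Return (direction, strength) labels for a correlation coefficient r."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")

    # The sign of r gives the direction of the relationship.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # The absolute value of r gives the strength.
    # Assumption: each range's lower bound is used as the cutoff.
    size = abs(r)
    if size >= 0.70:
        strength = "very strong"
    elif size >= 0.40:
        strength = "strong"
    elif size >= 0.30:
        strength = "moderate"
    elif size >= 0.20:
        strength = "weak"
    elif size > 0:
        strength = "no or negligible"
    else:
        strength = "no relationship"
    return direction, strength

print(describe(0.7746))  # prints ('positive', 'very strong')
print(describe(-0.35))   # prints ('negative', 'moderate')
```

These labels are rules of thumb, so treat the output as a starting point for interpretation and not as a verdict.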

Thirteen Ways of Interpreting the Correlation Coefficient

Joseph Lee Rodgers and W. Alan Nicewander published an article in “The American Statistician” in 1988, “Thirteen Ways to Look at the Correlation Coefficient,” that described thirteen ways to interpret the correlation coefficient. They are as follows:

  1. Correlation as a function of raw scores and means

  2. Correlation as standardized covariance

  3. Correlation as the standardized slope of the regression line

  4. Correlation as the geometric mean of the two regression slopes

  5. Correlation as the square root of the ratio of two variances

  6. Correlation as the mean cross-product of standardized variables

  7. Correlation as a function of the angle between two standardized regression lines

  8. Correlation as a function of the angle between two variable vectors

  9. Correlation as a rescaled variance of the difference between standardized scores

  10. Correlation estimated from the balloon rule

  11. Correlation in relation to the bivariate ellipses of isoconcentration

  12. Correlation as a function of test statistics from designed experiments

  13. Correlation as the ratio of two means
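As one illustration from the list above, item 8 treats each variable as a vector: once both variables are centered on their means, r equals the cosine of the angle between the two vectors. The sketch below is a minimal demonstration of that idea; the helper names and data are invented for the example and do not come from the article being summarized.

```python
import math

def centered(values):
    """Subtract the mean from every value so the data is centered on zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def cosine_between(u, v):
    """Cosine of the angle between two vectors: dot product over product of lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(b * b for b in v))
    return dot / (length_u * length_v)

# Hypothetical data, chosen only for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = cosine_between(centered(hours), centered(scores))
print(round(r, 4))  # prints 0.7746
```

Vectors pointing the same way give a cosine of 1, opposite vectors give -1, and perpendicular vectors give 0, which matches the range of r.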

Disadvantages of the Correlation Coefficient

We circle back to “correlation does not imply causation.” The Pearson correlation coefficient cannot differentiate between dependent variables and independent variables. In simpler terms, it cannot tell cause from effect.

The results yielded from this formula can only determine whether or not there is a linear relationship. Unfortunately, that means the results can be misleading.

A strong correlation between variables does not necessarily mean that one causes the other.

Similarly, zero correlation does not necessarily mean that there is no relationship. The relationship between the two variables could simply be non-linear. The Pearson correlation coefficient cannot capture non-linear relationships.
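A quick sketch shows how this can happen. In the made-up data below, y is exactly x squared, so the two variables are perfectly related, yet the Pearson formula returns 0 because the relationship is a curve and not a line. The function name pearson_r is an assumption for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson r using the sum-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x_sq = sum(x * x for x in xs)
    sum_y_sq = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x_sq - sum_x ** 2) * (n * sum_y_sq - sum_y ** 2)
    )
    return numerator / denominator

# Hypothetical data: y depends entirely on x, but not in a straight line.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # [4, 1, 0, 1, 4]
r = pearson_r(xs, ys)
print(r)  # prints 0.0
```

The negative x values pull the line one way and the positive x values pull it back by the same amount, so the products cancel and the numerator is 0. Plotting the data before trusting r is a simple way to catch cases like this.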

Illusory correlations, or false correlations, can often occur as well. It is not uncommon to believe a relationship exists between two variables despite there not being one. The belief that full moons cause crazy behavior is a common illusory correlation.

The Correlation Coefficient In Real Life Situations

Sure, it’s fancy mathematics. When are you going to use this outside of the classroom? Never, right?

Maybe, but maybe not. It all depends on the career path you choose. Your job and industry may make frequent use of the correlation coefficient.

There are plenty of industries that regularly apply the correlation coefficient and its results:

  1. Insurance. Insurance companies use correlations between risk factors and claims when calculating their rates. This is why they have different rates based on the age, gender, location, etc., of the potential client.

  2. Stocks. The use of the correlation coefficient is commonplace when investing. It allows you to develop a sound investment strategy and is extremely helpful when deciding which stocks to invest in.

    Investors will often use negative correlation to diversify their portfolios. They may choose a mixture of positively and negatively correlated stocks with varying degrees of strength to diversify even further. The correlation coefficient is also used to calculate portfolio volatility.

  3. Medicine. The correlation coefficient is often used in medical research. Researchers can identify associations between certain variables that may help them treat patients better or predict results, although a correlation alone does not establish a causal relationship.

    Correlational studies are a very common type of research done in medical fields. This method is used for preliminary information gathering or in place of an experiment when one cannot be completed.
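To illustrate the portfolio volatility point from the Stocks item above, here is a minimal sketch using the standard two-asset formula, in which the correlation between the two assets appears directly. The weights, volatilities, and function name are hypothetical values chosen for the example, not investment guidance.

```python
import math

def two_asset_volatility(w1, sigma1, w2, sigma2, rho):
    """Volatility of a two-asset portfolio.

    w1, w2: portfolio weights (assumed to add up to 1)
    sigma1, sigma2: each asset's volatility (standard deviation of returns)
    rho: correlation coefficient between the two assets' returns
    """
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2 * w1 * w2 * rho * sigma1 * sigma2
    )
    return math.sqrt(variance)

# Hypothetical portfolio: half in a 20% volatility stock, half in a 10% one.
for rho in (1.0, 0.0, -1.0):
    vol = two_asset_volatility(0.5, 0.20, 0.5, 0.10, rho)
    print(rho, round(vol, 4))
```

With these made-up numbers, the portfolio's volatility falls as the correlation drops from 1 toward -1, which is the reasoning behind using negatively correlated stocks to diversify.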


Author

Samantha Goddiess

Samantha is a lifelong writer who has been writing professionally for the last six years. After graduating with honors from Greensboro College with a degree in English & Communications, she went on to find work as an in-house copywriter for several companies including Costume Supercenter, and Blueprint Education.
