Sometimes a correlation means absolutely nothing and is purely accidental (especially when you compute millions of correlations among thousands of variables), or it can be explained by confounding factors. For instance, the fact that the cost of electricity is correlated with how much people spend on education is explained by a confounding factor: inflation, which makes both electricity and education costs grow over time. This confounding factor has a bigger influence than true causal factors, such as more administrators or government-funded student loans boosting college tuition.
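Both effects are easy to reproduce on synthetic data. The sketch below is only an illustration under stated assumptions: the 200 noise variables, the 3% yearly inflation rate, the 2% noise level, and all variable names are made up for the example and are not real electricity or education figures.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Standard (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# 1) Pure chance: 200 unrelated noise variables with 10 observations each.
#    That is 19,900 pairs, so some pairs look strongly correlated by accident.
noise = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
strongest = max(abs(pearson(a, b)) for a, b in combinations(noise, 2))

# 2) Confounding: two series that share nothing except a common price level.
years = range(40)
price_level = [1.03 ** t for t in years]  # hypothetical 3% yearly inflation
electricity = [p * (1 + rng.gauss(0, 0.02)) for p in price_level]
education = [p * (1 + rng.gauss(0, 0.02)) for p in price_level]
raw_corr = pearson(electricity, education)

# Dividing by the price level removes the confounder; only independent
# noise is left, so the correlation should drop sharply.
real_electricity = [e / p for e, p in zip(electricity, price_level)]
real_education = [e / p for e, p in zip(education, price_level)]
adjusted_corr = pearson(real_electricity, real_education)

print(f"strongest correlation among pure noise: {strongest:.2f}")
print(f"raw correlation (inflation included):   {raw_corr:.2f}")
print(f"correlation after removing inflation:   {adjusted_corr:.2f}")
```

Here the confounder is known and can be divided out. In practice it often has to be guessed or estimated, which is what makes spurious correlations hard to spot.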
Even when there is a correlation that can be leveraged to solve a problem, for example a drug found to work better than a placebo for a medical condition, it may work well for some people and not for others: the correlation is not universally strong. Also, causation only matters in specific contexts, such as root cause analysis, where you need to fix the cause. Sometimes it does not matter as long as it works, for instance a drug that works against a medical condition even if nobody knows why.
Besides, the standard correlation (an L^2 metric) is sensitive to outliers and is not a great metric. An L^1 metric for measuring correlation is more robust.
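The outlier problem can be seen with a single bad data point. The sketch below does not implement the L^1 metric mentioned above, which is not specified here. As a stand-in, it uses the well-known rank-based Spearman correlation to show what a more robust measure looks like. The data and the helper names (`ranks`, `spearman`) are made up for the example.

```python
def pearson(x, y):
    """Standard (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank of each value (1 = smallest); assumes no ties, true for the data below."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Twenty points on a perfectly decreasing line.
x = list(range(1, 21))
y = [21 - v for v in x]

# The same data plus one extreme outlier.
x_out = x + [1000]
y_out = y + [1000]

print(f"Pearson, clean data:    {pearson(x, y):.2f}")
print(f"Pearson, one outlier:   {pearson(x_out, y_out):.2f}")
print(f"Spearman, one outlier:  {spearman(x_out, y_out):.2f}")
```

A single outlier flips the standard correlation from perfectly negative to nearly +1, while the rank-based version stays clearly negative (roughly -0.7), because ranks limit how much any one point can contribute.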
Below are a few examples of spurious correlations. Click here to check out the 15 examples.