Module 7: Lesson 3


Introduction to Statistical Issues

This lesson examines statistical issues that can arise when exploring multi-dimensional data.

Objectives

By the end of this lesson, you will be able to

  • articulate the basics of several statistical issues, such as Simpson's paradox and the base rate fallacy,
  • understand how to identify and mitigate statistical issues that might affect data analytics, and
  • articulate the difference between correlation and causation.
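
As a rough illustration of the first objective, the sketch below shows how Simpson's paradox can appear in grouped counts. The numbers are illustrative, patterned on the often-cited kidney stone treatment example. They are an assumption and are not taken from this lesson's materials. The names `counts`, `rate`, and `pooled_rate` are likewise hypothetical.

```python
# Illustrative (successes, trials) counts per treatment and subgroup.
# These values are an assumption for demonstration, not lesson data.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Success rate within a single subgroup."""
    return successes / trials


def pooled_rate(groups):
    """Success rate after pooling all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(n for _, n in groups.values())
    return successes / trials


# Print per-subgroup and pooled rates for each treatment.
for treatment, groups in counts.items():
    per_group = ", ".join(f"{g}: {rate(*c):.2f}" for g, c in groups.items())
    print(f"{treatment}: {per_group}, pooled: {pooled_rate(groups):.2f}")

# Treatment A has the higher rate in every subgroup...
a_wins_each_group = all(
    rate(*counts["A"][g]) > rate(*counts["B"][g]) for g in ("small", "large")
)
# ...yet treatment B has the higher rate once the subgroups are pooled.
b_wins_pooled = pooled_rate(counts["B"]) > pooled_rate(counts["A"])
```

In this sketch the reversal comes from unequal subgroup sizes: treatment A is applied mostly to the harder "large" cases, which pulls its pooled rate down even though it does better within each subgroup.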

Time Estimate

Approximately 1 hour.

Activities

Video: Watch the Introduction to Statistical Issues video, which demonstrates some of the statistical issues that can affect analytics.

Reading: Read about paradoxes of probability and how to avoid them when exploring multi-dimensional data.

Reading: Read about statistical misinterpretation and how to prevent it from affecting results when exploring multi-dimensional data.

Explore the Spurious Correlations website to better understand the phrase "correlation does not imply causation."


© 2017: Robert J. Brunner at the University of Illinois.

This notebook is released under the Creative Commons license CC BY-NC-SA 4.0. Any reproduction, adaptation, distribution, dissemination or making available of this notebook for commercial use is not allowed unless authorized in writing by the copyright holder.