As part of exploratory data analysis, a histogram is often used to understand how data are distributed, and this technique can also be used to compute a probability mass function (PMF) from a data set, as was shown in an earlier module. However, the binning approach has issues, including a dependence on the number and width of the bins used to compute the histogram. One approach to overcome these issues is to fit a function to the binned data, which is known as parametric estimation. Alternatively, we can construct an approximation to the data by employing non-parametric density estimation. The most commonly used non-parametric technique is kernel density estimation (KDE). In this module, you will learn about density estimation and specifically how to employ KDE. One often overlooked aspect of density estimation is that the resulting model is itself a representation of the data, which can be used to generate new data that resemble the original. This concept is demonstrated by applying density estimation to images of handwritten digits and sampling from the resulting model.
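As a rough illustration of these ideas, the sketch below uses only NumPy and is not taken from the module's lessons. It shows how a histogram-based density changes with the bin count, evaluates a simple Gaussian KDE, and draws new samples from the fitted model. The synthetic two-component data set, the helper name `gaussian_kde_eval`, and the normal-reference rule of thumb for the bandwidth are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample: a mixture of two normal distributions.
data = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(1.0, 1.0, 700)])

# Histogram-based density: the estimate depends on the number of bins.
for bins in (5, 50):
    density, edges = np.histogram(data, bins=bins, density=True)
    print(bins, "bins, peak density:", round(float(density.max()), 3))


def gaussian_kde_eval(x, sample, bandwidth):
    """Evaluate a Gaussian kernel density estimate at the points x."""
    z = (x[:, None] - sample[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth


# Bandwidth from a normal-reference rule of thumb (an assumption, not a
# recommendation; other choices give smoother or rougher estimates).
bandwidth = 1.06 * data.std(ddof=1) * len(data) ** (-1 / 5)

grid = np.linspace(data.min() - 3 * bandwidth, data.max() + 3 * bandwidth, 1000)
kde = gaussian_kde_eval(grid, data, bandwidth)

# A density estimate should integrate to approximately one.
area = float(np.sum(kde) * (grid[1] - grid[0]))
print("area under KDE:", round(area, 3))

# Sampling from a Gaussian KDE: pick observed points at random and
# perturb each one with noise drawn from the kernel.
new_samples = rng.choice(data, size=500) + rng.normal(0.0, bandwidth, 500)
```

Unlike the histogram, the KDE has no bin edges, but it still depends on a smoothing choice, here the bandwidth. The final line shows the sense in which the fitted density acts as a model from which new data can be drawn.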
Activities and Assignments | Time Estimate | Deadline | Points |
---|---|---|---|
Module 8 Overview Video | 10 Minutes | N/A | N/A |
Module 8 Lesson 1: Why learn Data Analytics? | 1 Hour | N/A | N/A |
Module 8 Lesson 2: Introduction to Density Estimation | 2 Hours | N/A | N/A |
Module 8 Lesson 3: Advanced Density Estimation | 2 Hours | N/A | N/A |
Module 8 Assignment | 2 Hours | N/A | N/A |
© 2017: Robert J. Brunner at the University of Illinois.
This notebook is released under the Creative Commons license CC BY-NC-SA 4.0. Any reproduction, adaptation, distribution, dissemination or making available of this notebook for commercial use is not allowed unless authorized in writing by the copyright holder.