This course builds a practical foundation for data analytics by teaching students basic tools and techniques that can scale to large computational systems and massive data sets. Students will first learn how to use the Python programming language, with a focus on the aspects of the language and the associated Python modules that are relevant for data analytics. Python will be introduced through Jupyter notebooks, and the introduction will cover the NumPy, SciPy, Matplotlib, Pandas, and Seaborn modules. These capabilities will be demonstrated through simple data analytics tasks such as obtaining data, cleaning data, visualizing data, and performing basic data analysis. In addition, students will learn how to work with the Unix file system, which is used by most big data tools and technologies.
The original content for this course consists of Jupyter notebooks, which GitHub automatically renders as standard HTML webpages. Click Here to get started browsing the original notebook content.
The Jupyter notebook system can convert the original notebooks into a variety of formats, including HTML, the standard format used to create web pages. Click Here to get started browsing the notebook content as a static website.