STAT 487: Introduction to Statistical Analysis with Python

About

Credits

Because Python is so widely used as a statistical analysis tool, statisticians are increasingly expected to use it for descriptive and inferential data analysis. The course takes a case study approach to introduce students to Python. Students will learn to work with complex data and will gain hands-on experience using Python to conduct statistical analyses.

Course Topics

The overall goal of the course is to introduce novice Python users to the Python language, Pandas library, Statsmodels package, and data visualization tools to implement the statistical knowledge gained in STAT 500.

Specific goals:

Import, manipulate, analyze and export DataFrames in Python using Pandas
Manipulate arrays in Python using NumPy
Visualize data using Matplotlib and Seaborn
Analyze data in Python using Statsmodels
Identify technical documentation to solve programming tasks using web-based resources
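
As a rough illustration of the first two goals, the sketch below imports, manipulates, summarizes, and exports a small DataFrame. The data, column names, and variable names are hypothetical and are not taken from the course materials.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a course case study.
raw_csv = io.StringIO(
    "group,score\n"
    "a,1.0\n"
    "a,3.0\n"
    "b,2.0\n"
    "b,6.0\n"
)

# Import: read CSV text into a DataFrame.
df = pd.read_csv(raw_csv)

# Manipulate: add a column computed from a NumPy array.
scores = df["score"].to_numpy()
df["centered"] = scores - np.mean(scores)

# Analyze: descriptive statistic (mean score) for each group.
group_means = df.groupby("group")["score"].mean()

# Export: write the DataFrame back out as CSV text.
exported = df.to_csv(index=False)

print(group_means)
print(exported)
```

In a Colab notebook the same steps would typically read from and write to files rather than in-memory text.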

Course Author(s)

Dr. Linda Clark

Software

The course uses Google Colab through the Penn State Google Suite. No purchase is necessary. Students will access Colab during the first week of class.

Textbook

There are no required texts for this class.

Last updated:

FA23

Assessment Plan

40% of final grade: 12 Colab Notebook Activities

60% of final grade: Final Project: Applied Statistical Analyses in Python

The project is split into four assignments, each worth 15% of the final grade:

15% Section 1: Import data and ensure alignment between the statistical question and data formatting
15% Section 2: Exploratory data analysis
15% Section 3: Inferential data analysis
15% Section 4: Visualizations to make meaning of inferential results
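
The weighting above can be checked with a short calculation. This is a minimal sketch that assumes each component is scored on a 0 to 100 scale; the component keys and the `final_grade` function are hypothetical names chosen for illustration, not part of any official grading tool.

```python
# Weights in percentage points, taken from the assessment plan.
WEIGHTS = {
    "notebook_activities": 40,
    "project_section_1": 15,
    "project_section_2": 15,
    "project_section_3": 15,
    "project_section_4": 15,
}

# The components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def final_grade(scores):
    """Return the weighted final grade for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Example: full marks on the notebooks, 80 on each project section.
example_scores = {
    "notebook_activities": 100,
    "project_section_1": 80,
    "project_section_2": 80,
    "project_section_3": 80,
    "project_section_4": 80,
}
print(final_grade(example_scores))
```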

Prerequisites

STAT 300 or STAT 460 or STAT 461 or STAT 462 or STAT 500