About
Because Python is widely used for statistical analysis, statisticians are increasingly expected to use it for descriptive and inferential data analysis. The course takes a case-study approach to introducing Python. Students will learn to work with complex data and will gain hands-on experience conducting statistical analyses in Python.
Course Topics
The overall goal of the course is to introduce novice Python users to the Python language, the Pandas library, the Statsmodels package, and data visualization tools, so that they can apply the statistical knowledge gained in STAT 500.
Specific goals:
- Import, manipulate, analyze, and export DataFrames in Python using Pandas
- Manipulate arrays in Python using NumPy
- Visualize data using Matplotlib and Seaborn
- Analyze data in Python using Statsmodels
- Identify technical documentation to solve programming tasks using web-based resources
Course Author(s)
Dr. Linda Clark
Software
The course uses Google Colab through the Penn State Google Suite. No purchase is necessary. Students will access Colab during the first week of class.
Textbook
There are no required texts for this class.
Assessment Plan
40% of final grade: 12 Colab Notebook Activities
60% of final grade: Final Project: Applied Statistical Analyses in Python
The project is split into 4 separate assignments:
- 15% Section 1: Import data and ensure alignment between the statistical question and data formatting
- 15% Section 2: Exploratory data analysis
- 15% Section 3: Inferential data analysis
- 15% Section 4: Visualizations to make meaning of inferential results
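As a rough illustration of how the weights listed above combine, the following Python sketch computes a final grade. Only the 40% and 15% weights come from the assessment plan. The function name `final_grade`, the 0-100 score scale, and the use of a single overall notebook score are assumptions made for this example, not course policy.

```python
# Weights taken from the assessment plan; everything else here is illustrative.
NOTEBOOK_WEIGHT = 0.40          # 12 Colab Notebook Activities, combined
PROJECT_SECTION_WEIGHT = 0.15   # each of the 4 final project sections


def final_grade(notebook_score, section_scores):
    """Return a weighted final grade on an assumed 0-100 scale.

    notebook_score: overall score for the notebook activities (assumed 0-100).
    section_scores: scores for the 4 project sections (assumed 0-100 each).
    """
    if len(section_scores) != 4:
        raise ValueError("expected scores for exactly 4 project sections")
    # Each project section contributes 15%, for 60% in total.
    project_part = sum(PROJECT_SECTION_WEIGHT * s for s in section_scores)
    return NOTEBOOK_WEIGHT * notebook_score + project_part


# Hypothetical scores, used only to show the call.
print(round(final_grade(90, [80, 85, 95, 100]), 2))
```

The four project sections together account for the 60% assigned to the final project, so the weights sum to 100%.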
Prerequisites
STAT 300 or STAT 460 or STAT 461 or STAT 462 or STAT 500