About
Python is widely used for statistical analysis, so statisticians increasingly need to learn it to perform descriptive and inferential data analysis. The course takes a case study approach to introduce students to Python. Students will learn to work with complex data and will gain hands-on experience conducting statistical analyses in Python.
Course Topics
The overall goal of the course is to introduce novice Python users to the Python language, the pandas library, the statsmodels package, and data visualization tools so that they can apply the statistical knowledge gained in STAT 500.
Specific goals:
- Import, manipulate, analyze, and export dataframes in Python using pandas
- Analyze data in Python using statsmodels
- Manipulate arrays in Python using NumPy
- Use web-based resources to find technical documentation for solving programming tasks in Python
- Visualize data using Matplotlib and Seaborn
Course Author(s)
Dr. Linda Clark
Software
The course uses Google Colab through the Penn State Google Suite. No purchase is necessary. Students will access Colab during the first week of class.
Textbook
There are no required texts for this class.
Assessment Plan
40% of final grade: 12 Colab Notebook Activities
60% of final grade: Final Project: Applied Statistical Analyses in Python
The project is split up into 4 separate assignments:
- 15% Section 1: Import data, ensure alignment between statistical question and data formatting
- 15% Section 2: Exploratory data analysis
- 15% Section 3: Inferential data analysis
- 15% Section 4: Visualizations to make meaning of inferential results
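As a rough illustration of how the weights above combine, the following Python sketch computes a weighted final grade. The weights come from the assessment plan. The component names, the example scores, and the assumption that the 12 Colab notebook activities are summarized as a single 0-100 score are hypothetical and are not part of the official grading policy.

```python
# Weights taken from the assessment plan (40% activities, 4 x 15% project sections).
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "colab_activities": 0.40,
    "project_section_1": 0.15,
    "project_section_2": 0.15,
    "project_section_3": 0.15,
    "project_section_4": 0.15,
}


def final_grade(scores):
    """Return the weighted final grade (0-100) from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "colab_activities": 90,
    "project_section_1": 80,
    "project_section_2": 85,
    "project_section_3": 75,
    "project_section_4": 95,
}

print(round(final_grade(example_scores), 2))
```

With these example scores, the activities contribute 36 points and the four project sections contribute 50.25 points in total.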
Prerequisites
STAT 300 or STAT 460 or STAT 461 or STAT 462 or STAT 500