DataFrames and Series in Pandas

This section of the workshop covers data ingestion, cleaning, manipulation, analysis, and visualization in Python.

We build on the skills learned in the Python fundamentals section and introduce the pandas library.

At the end of this section, you will be able to:

Access data stored in a variety of formats
Combine multiple datasets based on observations that link them together
Perform custom operations on tables of data
Use the split-apply-combine method for analyzing sub-groups of data
Automate static analysis on changing data
Produce publication quality visualizations
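
As a rough preview of a few of these goals, here is a minimal sketch that reads a small table, links it to a second table, and summarizes by sub-group. The data, column names (`store`, `month`, `revenue`, `region`), and variable names are invented for illustration and do not come from the workshop materials.

```python
import io

import pandas as pd

# Hypothetical data, made up for this sketch only.
sales_csv = io.StringIO(
    "store,month,revenue\n"
    "A,1,100\n"
    "A,2,120\n"
    "B,1,80\n"
    "B,2,90\n"
)
stores = pd.DataFrame({"store": ["A", "B"], "region": ["East", "West"]})

# Access data: read_csv accepts a file path, a URL, or a file-like object.
sales = pd.read_csv(sales_csv)

# Combine datasets using the column that links their observations.
merged = sales.merge(stores, on="store", how="left")

# Split-apply-combine: split by region, sum revenue, combine into a Series.
revenue_by_region = merged.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

Each step is covered in detail in the corresponding part of this section; the sketch only shows how the pieces fit together.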

In the end, our goal with this section is to give you the skills to, at a minimum, immediately replicate your current data analysis workflow in Python with no loss of total (computer + human) time.

This is a lower bound on the benefits you should expect to receive by studying this section.

The expression “practice makes perfect” is especially true here.

As you work with these tools, both the time to write and the time to run your programs will fall dramatically.

Introduction
Basic Functionality
The Index
Storage Formats
Cleaning Data
Reshape
Merge
GroupBy
Time Series