Course Overview
This course provides students with Python data science skills that can immediately be applied in real life. The course focuses on Pandas as the primary tool, using related packages such as NumPy and Seaborn to enhance processing and visualization.
Outline
- Pandas input and output
Reading data into Pandas dataframes and exporting to various formats.
- General input considerations
- Reading CSV Files
- Data cleaning
- Reading other data formats
- Exporting data
- Pandas filtering and sorting
Selecting subsets of dataframes for focused analysis.
- Indexing rows and columns
- Multi-indexing
- Selection by conditions
- Sorting data
- Pandas grouping and aggregation
Consolidating data and providing sums and other aggregate values.
- Using groupby()
- Aggregate functions
- Using data summaries
- Alternate approaches
- Pandas data transformation
Manipulating datasets for simpler analysis.
- Applying functions to data
- Renaming columns and indexes
- Inserting and removing data
- Combining and merging dataframes
- Reshaping datasets
- Advanced Matplotlib
Going beyond the basics with Matplotlib.
- Components of a figure
- Multiple plots
- Complex plots
- Matplotlib options and settings
- Customizing styles (and everything else)
- Seaborn
Learning how Seaborn supplements and improves on Matplotlib.
- What does Seaborn provide?
- Using themes
- Advanced plot types
- Fine-tuning the details
- Using NumPy
Loading large datasets into NumPy arrays for further analysis.
- NumPy basics
- Creating arrays
- Indexing and slicing
- Built-in functions
- Reading and writing data
- Useful SciPy subpackages
A look at some of the 20-odd SciPy subpackages.
- What is SciPy?
- scipy.stats
- scipy.interpolate
- scipy.optimize
Prerequisites
Participants should have a solid foundation in Python and introductory Pandas concepts. This course assumes prior experience with Python syntax, basic data structures, and simple data manipulation in Pandas.
Who Should Attend
This course is designed for data professionals who already have foundational Python and Pandas skills and want to apply Python more effectively to real-world data analysis problems. Typical roles include data analysts, data scientists, engineers, and researchers.