Introduction to Data Processing with Python
This is the course content for Introduction to Data Processing with Python, which has been developed and maintained by OpenTechSchool.
Welcome
Welcome to Introduction to Data Processing with Python. In this workshop we will take you through the fundamentals of working with text and other types of data in Python. We’ll learn how to read data from files into data structures in our program, and how to extract the information we want. We’ll display that data in graphs and charts, and get a glimpse of the world of Open Data that’s available online. You’ll never want to wrangle a spreadsheet again!
We only expect you to know a little Python, not a lot. If you’ve done our Introduction to Programming workshop then that will be perfect.
Core workshop material
- Recap of Python essentials - A quick recap of some of the Introduction to Programming essentials.
- Data Structures in Python - An introduction to the list and dictionary data structures.
- Introducing IPython Notebook - A whole new way to work with Python!
- Analyzing a survey - Once we have our text in Python, what can we do with it?
- Creating Charts - Using IPython Notebook with matplotlib to create charts.
- CSV Files - Reading comma-separated data.
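As a taste of how these topics fit together, here is a minimal sketch that reads comma-separated data into dictionaries and counts the answers in one column. The sample data, the column name `favourite_colour`, and the helper `count_column` are all made up for illustration and are not taken from the workshop material.

```python
import csv
import io
from collections import Counter

# Hypothetical survey data, invented for this example. In practice the
# data would come from a file opened with open("some_file.csv").
SAMPLE_CSV = """name,favourite_colour
Alice,blue
Bob,green
Carol,blue
"""


def count_column(csv_file, column):
    """Count how often each value appears in one column of CSV data."""
    reader = csv.DictReader(csv_file)  # each row becomes a dictionary
    return Counter(row[column] for row in reader)


# io.StringIO lets us treat the sample string like an open file.
counts = count_column(io.StringIO(SAMPLE_CSV), "favourite_colour")

# Print the answers from most to least common.
for colour, total in counts.most_common():
    print(colour, total)
```

The resulting counts are the kind of data that could then be passed to matplotlib to draw a bar chart.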
Extra fun stuff
- Alternative Approaches - Other ways to store and process data (Pandas, SQL databases).
- Open Data Sources Data.
Reference material
More Advanced Follow Up Material
If you’re wondering how to “level up” to more advanced data analysis after mastering the material here, you could try these:
- A tutorial series by Hernan Rojas showing the basics of using the Python library “pandas” for data analysis. You can click on the lessons directly to view them in your browser, or click the “(download)” link on the right of the page to download a ZIP file containing them as IPython Notebook files.
- If you have statistical modelling experience then this video tutorial by Skipper Seabold from SciPy 2012 (accompanying material as IPython Notebook files) comes highly recommended.
- SciPy tutorial notes, which can be used for a full course on scientific computing with Python.