DSC 478
Fall 2020


 Online Resources & Reference Material

General Python Resources

Important Tools and Libraries

  • IPython: A REPL for easy interactive Python development. It is extremely useful for testing ideas one line of code at a time. We will use IPython Notebook (a web-based interactive shell for Python) extensively in this class.
  • Jupyter Notebook (formerly IPython Notebook)
  • Jupyter Notebook Tutorial: A nice tutorial video by Corey Schafer.
  • matplotlib: A plotting library capable of generating production-level visualizations programmatically. Its MATLAB-like syntax makes plotting straightforward.
  • NumPy: The fundamental package for scientific computing with Python.
  • SciPy: The open source library for mathematics, science, and engineering.
  • scikit-learn: A robust machine learning library built on top of NumPy, SciPy, and matplotlib. Includes a wide variety of modeling techniques.
  • Pandas (Python Data Analysis Library): Data structures and tools for common data analysis tasks, including an efficient data frame implementation (similar to data frames in R).
  • BeautifulSoup: A general parsing library particularly useful for parsing HTML and XML.
  • NLTK: Natural Language Toolkit for Python, including tools for text preprocessing, tokenization, and vectorization (you may also be interested in an online book that shows how NLTK is used).
  • NetworkX: A Python library for the creation, manipulation, and analysis of graphs and networks.

Installation of Python and Scientific Libraries

References for Data Analysis in Python

Other Relevant Tools & Resources

Data Sets


Copyright ©, Bamshad Mobasher, DePaul University.