Advanced Programming Course for Scientists
Target Audience
- This is an advanced course for people who already have a programming background and want to process scientific data. It is especially suitable for students and people working in scientific environments.
Prerequisites
- Programming background in Python
- Understanding and being able to use data structures in Python such as lists, dictionaries, sets, and tuples.
- Scientific background (at least a BSc in a science subject).
Objectives
- Understand the differences between environments such as Matlab, R, and Python.
- Introduction to tools used by scientists to process their data.
- Introduction to Machine Learning
Course Format
- The course runs for 40 academic hours (5 full days or 10 meetings of 4 hours each).
- The course includes approximately 40% hands-on lab work.
- The course can be followed using Python 3.
Language
- The course can be given in either Hebrew or English; slides and materials are in English.
Syllabus
- Overview of Python syntax
  - Scalars, Lists, Dictionaries, Tuples, Sets
  - I/O
  - Control flow: for-in and while loops, if, elif, else
  - Files (Plain text, CSV, Excel, JSON, YAML)
  - Functions
  - Modules
- Git and GitHub
  - Working with Git locally
  - Cloning a remote repository
  - Forking and sending pull requests
- Introduction to Testing
  - doctest
  - pytest
- Data, Algorithms, and complexity
  - What's behind the data structures of Python.
  - What a hashing algorithm is, and when and how to use it.
  - How picking a data structure impacts algorithmic complexity.
  - Stack (LIFO), Queue (FIFO)
  - Trees, Graphs, Binary graphs
  - Why and how NumPy arrays are much faster than regular Python lists.
- Using Jupyter notebook
  - Navigation
  - Files
  - Markdown
- Introduction to NumPy
  - Arrays, matrices
  - Indexing of matrices in NumPy
  - Transformations on arrays
  - Data type and conversion to
  - Selecting data
  - Read/write data
  - NumPy and Matlab
- Introduction to Pandas
  - DataFrame and Series in Pandas
  - Indexing methods in Pandas
  - Groupby in Pandas
  - Mathematical Operations in Pandas
  - Indexing a data Series
  - Modifying data
  - Merging DataFrames
  - Dates in Pandas
  - Create graphs in Pandas
- Data Visualization
  - Matplotlib
  - Seaborn
  - Bokeh
  - HoloViz
  - Histograms
  - Heatmaps
  - Various plots (line, scatter)
- The Scientific libraries
  - NumPy
  - Pandas
  - SciPy
  - Matplotlib
  - Seaborn
  - Comparing with Matlab and R
- Image manipulation
  - OpenCV
- Natural Language Processing
  - spaCy
- Machine learning
  - scikit-learn
  - Linear regression
  - Correlation vs. Regression
  - Factor analysis
  - Logistic regression
  - Classification
  - Decision Tree
  - Random forest
  - Supervised learning
  - Unsupervised learning
- Parallel programming
  - multiprocessing
  - forking
  - threading
  - async programming
- Accessing web APIs
  - Getting a token
  - Sending a GET request
  - Sending a POST request
  - Retrieving a JSON document
  - Submitting data
- Web scraping
  - Using the Python requests module
  - Using Scrapy
  - Using Selenium
- GUI Using Python Tk
  - File selector
  - Buttons
  - Entry box
  - Event handling
  - Showing progress
  - Timer event
- Relational Database access (SQL)
  - Quick overview
  - Creating a simple schema
  - Creating the database (SQLite)
  - Accessing an SQL database
  - Inserting data into the database
  - Fetching data from the database
  - Updating data
  - Deleting data
- XML processing
  - DOM
  - SAX
- Web-based GUI - Flask
  - Introduction to Web development
  - Introduction to HTML
  - Introduction to CSS
  - Simple examples using Flask
Contact
Contact: Gabor Szabo gabor@hostlocal.com
Phone: +972-54-4624648