COMP 551: Applied Machine Learning - Winter 2022
Contact: comp551mcgill@gmail.com (please make sure to use this email to receive a timely response)
- Jan 05, 2022 - Apr 12, 2022
- Tuesday & Thursday, 1:00 pm - 2:25 pm
- Remote [lectures will be recorded]
- Instructor: Reihaneh Rabbany
- Ayush Jain : Head TA
- Elham Daneshmand : Quiz TA
- Brendon McGuinness
- Sai Praneeth
- Safa Alver
- Xiru Zhu
- Yinan Wang
- David Venuto
Overview
- This course covers a selected set of topics in machine learning and data mining, with an emphasis on good methods and practices for deployment of real systems. The majority of sections are related to commonly used supervised learning techniques, and to a lesser degree unsupervised methods. This includes fundamentals of algorithms on linear and logistic regression, decision trees, support vector machines, clustering, neural networks, as well as key techniques for feature selection and dimensionality reduction, error estimation and empirical validation.
- Prerequisites
- This course requires programming skills (python) and basic knowledge of probabilities, calculus and linear algebra. For more information see the course prerequisites and restrictions at McGill's webpage.
Textbooks
- [Bishop] Pattern Recognition and Machine Learning by Christopher Bishop (2007)
- [Goodfellow] Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (2016)
- [Murphy] Machine Learning: A Probabilistic Perspective by Kevin Murphy (2012)
- Chapters from these three books are cited as optional reference materials for the slides.
There are several other related references:
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman (2009)
- Information Theory, Inference, and Learning Algorithms, by David MacKay (2003)
- Bayesian Reasoning and Machine Learning, by David Barber (2012)
- Understanding Machine Learning: From Theory to Algorithms, by Shai Shalev-Shwartz and Shai Ben-David (2014)
- Foundations of Machine Learning, by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (2018)
- Dive into Deep Learning, by Aston Zhang, Zachary Lipton, Mu Li, and Alexander J. Smola (2019)
- Mathematics for Machine Learning, by Marc Peter Deisenroth, A Aldo Faisal, and Cheng Soon Ong (2019)
- A Course in Machine Learning, by Hal Daume III (2017)
- Hands-on Machine Learning with Scikit-Learn and TensorFlow, by Aurelien Geron (2017)
- Machine Learning, by Tom Mitchell (1997)
- Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar (2020)
- Machine Learning, Dynamical Systems and Control, by Steven L. Brunton and J. Nathan Kutz (2019)
- Probabilistic Machine Learning: An Introduction, by Kevin P. Murphy (2021)
Schedule
- Thu., Jan. 6
- Tue., Jan. 11
- Thu., Jan. 13
- Tue., Jan. 18
- Thu., Jan. 20
- Tue., Jan. 25
- Thu., Jan. 27
- Tue., Feb. 1
- Thu., Feb. 3
- Tue., Feb. 8
- Thu., Feb. 10
- Tue., Feb. 15
- Thu., Feb. 17
- Tue., Feb. 22
- Thu., Feb. 24
- Tue., Mar. 1
- Thu., Mar. 3
- Tue., Mar. 8
- Thu., Mar. 10
- Tue., Mar. 15
- Thu., Mar. 17
- Tue., Mar. 22
- Thu., Mar. 24
- Tue., Mar. 29
- Thu., Mar. 31
- Tue., Apr. 5
- Thu., Apr. 7
- Tue., Apr. 12
Outline
- *** quick note ***
- Please note that this list of topics is tentative and copied from last year's offering of the course; we might add/drop some topics or change the order as the course progresses.
- Introduction
- slides
- Nearest Neighbours
- slides, notebook (Colab), reference: chapter 1 [Murphy]
- Classification and regression trees
- slides, notebook (Colab), reference: 16.1-16.2.6 [Murphy], 14.4 [Bishop]
- Core concepts
- slides, notebook for model selection (Colab), notebook for curse of dimensionality (Colab)
- Maximum likelihood and Bayesian Reasoning
- slides, notebook (Colab), reference: 2-2.3 [Bishop], 3-3.5 [Murphy]
- Naive Bayes
- slides, notebook (Colab), reference: 3.5-3.5.4 [Murphy]
- Linear regression
- slides, notebook (Colab), reference: 7-7.3.3 [Murphy], 3-3.1.2 [Bishop]
- Logistic and softmax regression
- slides, notebook (Colab), reference: 8.1-8.3.3 [Murphy], 4.1-4.1.3 + 4.3-4.3.3 [Bishop]
- Gradient descent methods
- slides, notebook (Colab), reference: 8.3.2 [Murphy] and this overview by S. Ruder (in pdf)
- Regularization
- slides, notebook (Colab), reference: 3.1.4-3.3 [Bishop]
- Perceptrons & Multilayer Perceptrons
- slides, Perceptrons Colab, MLP demo, reference: 4.1.1-4.1.3 + 4.1.7 [Bishop], 6-6.5 + parts of 7 [Goodfellow]
- Gradient computation and automatic differentiation
- slides, notebook (Colab), reference: 6.5 + 8.2 [Goodfellow], blog post, visualization
- Convolutional neural networks
- slides, reference: 9 [Goodfellow], blog post, optional reading
- Linear support vector machines
- slides, notebook (Colab), reference: 4.1.1-4.1.3 + 4.1.7 + 7.1-7.1.4 excluding kernels [Bishop]
- Bagging & Boosting
- slides, notebook (Colab), reference: 3.2 [Bishop], demos for Bias-Variance Tradeoff, Gradient Boosting explanation, and Interactive playground
- Unsupervised learning
- slides, notebook (Colab), reference: 25.5 [Murphy] and 9.1 [Bishop], demos for K-Means and DB-SCAN
- Dimensionality reduction
- slides, notebook (Colab), reference: 12.2 [Murphy], 12.1 [Bishop], demo
- Learning with graphs
- slides
- Frontiers
Evaluation
- Regular Practice Quizzes [20%]
- One per lecture to check the key concepts discussed in the last lecture
- Timed: to be completed within 1 hour of starting the quiz
- Available until the start of the next lecture
- Late Mid-term Exam [30%]
- Online, during class on March 22nd, might have an oral component
- Hands-on Mini-Projects [50%]
- Four programming assignments to be done in groups of three (no exceptions to this, given the grading load on TAs)
- Groups can stay the same between projects; you can also regroup when needed
- All group members receive the same mark unless there are major complaints from group-mates about not contributing, not responding, etc., which will be resolved on a case-by-case basis. If a significant difficulty/conflict arises, please send an email to the course email with 'Group-Issue' in the subject line
- Mark the due dates: Feb 8th [10%], March 1st [15%], March 29th [15%], April 26th [10%]
- Work submitted for evaluation as part of this course may be checked with text-matching software within myCourses
- Late submission policy
- All due dates are 11:59 pm in Montreal, unless specified otherwise [e.g. check the due dates for quizzes].
No make-up quizzes will be given.
For mini-projects, 2^k percent will be deducted for k days of delay (e.g. 2% for one day late, 4% for two, 8% for three); see the short sketch after this list.
If you experience barriers to learning in this course (including COVID-related issues), submitting the projects, etc., please do not hesitate to discuss them with me directly, and please make sure to put "551 special" in the subject line so that I see your email [for general course correspondence, please use the course email: comp551mcgill@gmail.com]. As a point of reference, you can reach the Office for Students with Disabilities at 514-398-6009.
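A minimal sketch of how the mini-project late penalty plays out, for illustration only (this is not official grading code; the assumption that the deduction applies to a grade out of 100 and is capped at 100% is ours):

    # Illustrative sketch of the 2^k percent late-penalty rule for mini-projects.
    # Assumption: the penalty is deducted from a project grade out of 100 and capped at 100%.

    def late_penalty_percent(days_late: int) -> float:
        """Percent deducted for `days_late` full days of delay (2^k rule)."""
        if days_late <= 0:
            return 0.0
        return float(min(2 ** days_late, 100))

    def adjusted_grade(raw_grade: float, days_late: int) -> float:
        """Project grade (out of 100) after applying the late penalty."""
        return max(raw_grade - late_penalty_percent(days_late), 0.0)

    for k in range(5):
        print(f"{k} day(s) late -> {late_penalty_percent(k):.0f}% deducted")
    # 0 day(s) late -> 0% deducted
    # 1 day(s) late -> 2% deducted
    # 2 day(s) late -> 4% deducted
    # 3 day(s) late -> 8% deducted
    # 4 day(s) late -> 16% deducted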
Academic Integrity
- "McGill University values academic integrity. Therefore, all students must understand the meaning and consequences of cheating, plagiarism and other academic offenses under the Code of Student Conduct and Disciplinary Procedures" (see McGill's webpage for more information). (Approved by Senate on 29 January 2003)
Online Resources
- Learning plan
- metacademy
- Video Playlists
- StatQuest
- FreeCodeCamp
- Essence of linear algebra and Neural Networks by 3Blue1Brown
- Mathematics for ML by David Rolnick
- Courses with Playlist and/or Code
- Introduction to Machine Learning by Google
- Machine Learning by Stanford
- Deep Learning by UC Berkeley
- Hinton's Lectures on Neural Networks for Machine Learning
- Deep Learning & Linear Algebra courses by fastai
- Learning from Data by Caltech
- Deep Learning (with PyTorch) playlist and course by NYU
- Deep Learning by Stanford
- Deep Learning by deeplearning.ai
- Introduction to Deep Learning by MIT
- Information Theory, Pattern Recognition, and Neural Networks by David MacKay
- Books with Code
- Probabilistic Machine Learning: An Introduction by Kevin Murphy (book 1)
- Dive into Deep Learning by Aston Zhang, Zachary Lipton, Mu Li, and Alexander J. Smola
- Machine Learning Notebooks for O'Reilly book Hands-on Machine Learning with Scikit-Learn and TensorFlow
- Similar Courses - Graduate Level
- https://www.cs.toronto.edu/~rgrosse/courses/csc2515_2019/
- https://www.cs.cornell.edu/courses/cs4780/2019fa/
- Similar Courses - Undergraduate Level
- https://cs.mcgill.ca/~wlh/comp451/schedule.html
- https://www.cs.toronto.edu/~rgrosse/courses/csc311_f20/
- https://www.cs.toronto.edu/~rgrosse/courses/csc411_f18/
- http://cs229.stanford.edu/syllabus-fall2020.html
- https://cs230.stanford.edu/lecture/
- Cheatsheets: https://stanford.edu/~shervine/teaching/
- Previous Versions of this Course
- Fall 2019
- Winter 2020
- Fall 2020