263-5300-00L  Guarantees for Machine Learning

Semester: Spring Semester 2021
Lecturers: F. Yang
Periodicity: yearly recurring course
Language of instruction: English
Comment: Number of participants limited to 30.

Last cancellation/deregistration date for this graded semester performance: 17 March 2021! Please note that after that date no deregistration will be accepted and a "no show" will appear on your transcript.


Abstract: This course is aimed at advanced master's and doctoral students who want to conduct independent research on the theory of modern machine learning (ML). It teaches classical and recent methods in statistical learning theory that are commonly used to prove theoretical guarantees for ML algorithms. This knowledge is then applied in independent project work that focuses on understanding modern ML phenomena.
Objective: Learning objectives:

- acquire enough mathematical background to understand a good fraction of the theory papers published in typical ML venues. For this purpose, students will learn common mathematical techniques from statistics and optimization in the first part of the course and apply this knowledge in the project work.
- critically examine recently published work in terms of relevance and identify impactful (novel) research problems. This will be an integral part of the project work and involves experimental as well as theoretical questions.
- find and outline an approach to prove a conjectured theorem (or some subproblem of it). This will be practiced in lectures, exercises, and homework, and potentially in the final project.
- effectively communicate and present the problem motivation, new insights, and results to a technical audience. This will be learned primarily through the final presentation and report, as well as through peer grading of other students' talks.
Content: This course covers foundational methods in statistical learning theory for proving theoretical guarantees for machine learning algorithms, touching on the following topics:
- concentration bounds
- uniform convergence and empirical process theory
- high-dimensional statistics (e.g. sparsity)
- regularization for non-parametric statistics (e.g. in RKHS, neural networks)
- implicit regularization via gradient descent (e.g. margins, early stopping)
- minimax lower bounds

The project work focuses on current theoretical ML research that aims to understand modern phenomena in machine learning, including but not limited to:
- how overparameterization could help generalization (RKHS, NN)
- how overparameterization could help optimization (non-convex optimization, loss landscape)
- complexity measures and approximation-theoretic properties of randomly initialized and trained NNs
- generalization of robust learning (adversarial robustness, tradeoff between standard and robust error, distribution shift)
Prerequisites / Notice: Students must have a strong mathematical background (basic real analysis, probability theory, linear algebra) and good knowledge of the core machine learning concepts taught in courses such as “Introduction to Machine Learning”, “Regression” / “Statistical Modelling”. In addition to these prerequisites, this class requires a high degree of mathematical maturity, including abstract thinking and the ability to understand and write proofs.

Students have usually taken a subset of the following courses: Fundamentals of Mathematical Statistics, Probabilistic AI, Neural Network Theory, Optimization for Data Science, Advanced ML, Statistical Learning Theory, Probability Theory (D-MATH).