This course is designed to prepare Master's students for successful research in ML and to help PhD students find new research ideas related to ML theory. Content-wise, the technical part focuses on generalization bounds via uniform convergence and on non-parametric regression.
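To give a flavor of the first technical block, here is a representative result of the kind the course develops: a standard Rademacher-complexity bound, stated informally following e.g. Wainwright, Chapter 4. The snippet and its notation are illustrative, not the course's official statement.

```latex
% A representative uniform-convergence bound (informal; cf. Wainwright, Ch. 4).
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
For a loss class $\mathcal{F}$ taking values in $[0,1]$ and an i.i.d.\ sample
of size $n$, with probability at least $1-\delta$, simultaneously for all
$f \in \mathcal{F}$,
\[
  R(f) \;\le\; \widehat{R}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}},
\]
where $R$ is the population risk, $\widehat{R}_n$ the empirical risk, and
$\mathfrak{R}_n(\mathcal{F})$ the Rademacher complexity of the loss class.
\end{document}
```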
By the end of the course
How to get there
Homeworks are designed to
No late homework is accepted.
Each homework write-up must be neatly typeset as a PDF document using TeX, LaTeX, or a similar system (for more details see below); this is deliberate practice in efficient technical typesetting. Ensure that the following appear on the first page of the write-up:
Submit your write-up, one page per question, as a single PDF file by 11:59 PM on the specified due date on Gradescope. Follow the instructions there and mark the pages that belong to the corresponding questions. See further details on the homework sheet.
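If you have not typeset coursework in LaTeX before, a minimal skeleton along the following lines may help; the file name, packages, and layout are illustrative suggestions, not an official course template.

```latex
% hw1.tex -- hypothetical minimal homework skeleton (illustrative only)
\documentclass[11pt]{article}
\usepackage{amsmath,amssymb,amsthm}

\title{Homework 1}
\author{Your Name \\ Student ID \\ Discussion partners: (list their IDs here)}
\date{\today}

\begin{document}
\maketitle

\section*{Question 1}
Your solution here.

% one page per question: start each new question on a fresh page
\newpage
\section*{Question 2}
Your solution here.

\end{document}
```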
Some questions will be graded by the TAs. All questions will be self-graded by you.
Discussions on Piazza
We expect you, as graduate students, to take this class because you want to learn the material and how to do research. All assessments are designed to maximize learning. Cheating harms you above all, so it is in your own interest to adhere to the following policy.
All homework is submitted individually and must be in your own words.
For homeworks 1–2, you may discuss the problems only at a high level with up to three classmates; list their IDs on the first page of your homework. Everyone must still submit an individual write-up in their own words; indeed, your discussions with classmates should stay at a high enough level that writing them up in anything but your own words is impossible.
We prefer that you do not dig around for homework solutions; if you do rely on external resources, cite them and still write your solutions in your own words.
When integrity violations are found, they will be reported to the department’s evaluation board.
| Date | Topics | Reading | Homework & project |
|---|---|---|---|
| 19.2. | Logistics, uniform convergence, Rademacher complexity | MW 2, 4 | HW 1 |
| 26.2. | Uniform law proof, VC dimension and Rademacher contraction | MW 4 | HW 1 due |
| 4.3. | Margin bounds, metric entropy and chaining | MW 5 | HW 1 solutions |
| 11.3. | Chaining, localized complexities and the critical inequality | MW 5, 13 | HW 2, HW 1 self-grade due |
| 18.3. | Non-parametric regression, from feature maps to RKHS | MW 12, 13 | Project proposal |
| 25.3. | From kernels to RKHS, error bounds for RKHS | MW 12, 13 | HW 2 due, HW 2 solutions |
| 1.4. | Mercer's and Bochner's theorems, random features, 2-layer NNs | MW 12, SC 4 | HW 2 self-grade due |
| 8.4. | Gaussian processes vs. penalized regression, random design | MW 13, 14 | HW 3 |
| 22.4. | Minimax lower bounds | MW 15 | HW 3 solutions, HW 3 due |
| 6.5. | Implicit regularization: theory and practice | | Mid-project drafts due |
| 13.5. | Presentations 1 (see full schedule) | | |
| 20.5. | Presentations 2 (see full schedule) | | |
| 27.5. | Presentations 3 (see full schedule) | | |
| 14.6. | No class | | Project reports due |

Reading refers to chapter numbers in MW (Wainwright) and SC (Steinwart and Christmann); see the resources below.
The book links below point to online resources that are freely accessible from within the ETH Zurich network.
Martin Wainwright: High-Dimensional Statistics (core reference for the course)
Some more background reading for your general wisdom, knowledge and entertainment
Keener: Theoretical Statistics (e.g. asymptotic optimality of the MLE, UMVU estimation, testing)
Steinwart and Christmann: Support Vector Machines (a more mathematical treatment of RKHSs)