- Time: Tuesdays 10-12 CAB G59, Fridays 14-16 CHN G42 (to be confirmed)
- Lectures *will not* be recorded! This is an in-person only class
- Instructor: Fanny Yang
- Teaching assistants:
  - Konstantin Donhauser (konstantin.donhauser at inf.ethz.ch), Julia Kostin (julia.kostin at inf.ethz.ch)
- Office hours: upon request via email

- Sign up on waitlist until **September 29th**
- De-register until **October 11th**: if you do not appear at the oral exam and do not present a project, it will count as a no-show

*E-mails about logistical inquiries will most likely not be answered; please post privately on Moodle instead.*

- Please ask all of your questions on Moodle; eternal gratitude from your peers is ensured ;). Moodle will also be used for all announcements, discussion of homework and lectures, and parts of the assignments. If you’re still on the waitlist, you can enroll using the password Dudley@ETH2023
- Gradescope: enroll with entry code NP867G (to hand in homework)

This course is designed to prepare Master's students for successful research in ML, and to prepare PhD students to find new research ideas related to ML theory. Content-wise, the technical part will focus on generalization bounds using uniform convergence, and on non-parametric regression.

By the end of the course, you should be able to

- both easily read and write theorems that provide generalization guarantees for machine learning algorithms
- find high-impact questions and theorems to prove and work on that you are highly passionate about
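For orientation, a typical theorem providing a generalization guarantee is the uniform convergence bound via Rademacher complexity sketched below. It is meant only as an illustration in generic notation: the function class $\mathcal{F}$, the sample size $n$ and the confidence level $\delta$ are assumptions not taken from the course material, and the lectures may state the result with different constants or conditions.

```latex
Let $\mathcal{F}$ be a class of functions $f : \mathcal{X} \to [0,1]$ and let
$x_1, \dots, x_n$ be i.i.d.\ samples from a distribution $P$.
Then, with probability at least $1 - \delta$,
\[
  \sup_{f \in \mathcal{F}}
  \Bigl( \mathbb{E}_{x \sim P}[f(x)] - \frac{1}{n} \sum_{i=1}^{n} f(x_i) \Bigr)
  \;\le\; 2\,\mathcal{R}_n(\mathcal{F}) + \sqrt{\frac{\log(1/\delta)}{2n}},
\]
where
\[
  \mathcal{R}_n(\mathcal{F})
  = \mathbb{E}\Bigl[ \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \varepsilon_i f(x_i) \Bigr]
\]
is the Rademacher complexity of $\mathcal{F}$, with $\varepsilon_1, \dots, \varepsilon_n$
i.i.d.\ uniform on $\{-1, +1\}$ and independent of the samples.
```

The first term comes from a symmetrization argument and the second from a bounded-differences (McDiarmid) concentration step, both of which appear as lecture topics in the schedule.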

**Learning objectives**

- Acquire enough mathematical background to understand a good fraction of theory papers published in the typical ML venues. For this purpose, students will learn common mathematical techniques from statistics and optimization in the first part of the course and apply this knowledge in the project work.
- Critically examine recently published work in terms of relevance and determine impactful (novel) research problems. This will be an integral part of the project work and involves experimental as well as theoretical questions.
- Find and outline an approach (some subproblem) to prove a conjectured theorem. This will be practiced in lectures, exercises and homework, and potentially in the final project.
- Effectively communicate and present the problem motivation, new insights and results to a technical audience. This will be primarily learned via the final presentation and report as well as during peer-grading of peer talks.

- 10% HW, 50% oral midterm, 40% project
- Homework: randomly selected homework problems will be graded

- Project report and presentation: see project website
- Presence is mandatory in the last four weeks of classes during presentations

- Homework is designed for you to
  - do some technical (“just algebra”) work that needs to be practiced individually
  - learn how to read more material on the matter effectively (**homework content will be part of the midterm exam!**)

- No late homework
- Each homework write-up must be neatly typeset as a PDF document using TeX, LaTeX, or similar systems (for more details see below). This is for you to practice getting efficient at it. Make sure you indicate on the first page which students you discussed the assignment with, but do not add your **own** name to the sheet.
- Submit your write-up as a single PDF file by 11:59 PM of the specified due date to Gradescope. Follow the instructions and mark the pages that belong to the corresponding questions. See more details on the homework sheet.
- Make sure that your name on Gradescope matches your real name.
- All questions will be graded by the TAs.
- Discussions on Moodle
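The write-up requirements above could be met with a minimal LaTeX skeleton along the following lines. This is only a sketch, not an official course template: the title, the section names and the one-problem-per-page layout are assumptions made for illustration, and the homework sheet remains the authoritative source for formatting details.

```latex
\documentclass[11pt]{article}
\usepackage{amsmath, amssymb, amsthm}

% Hypothetical header: adapt the homework number as needed.
\title{Homework 1}
\author{} % left empty on purpose: do not add your own name to the sheet
\date{}

\begin{document}
\maketitle

% Collaborators go on the first page, as the policy requires.
\noindent\textbf{Discussed with:} (list the classmates you discussed with, or ``none'')

\section*{Problem 1}
% Placeholder content: any displayed math works the same way.
For a random variable $X$ with finite variance and any $t > 0$,
\[
  \mathbb{P}\bigl(|X - \mathbb{E}X| \ge t\bigr) \le \frac{\operatorname{Var}(X)}{t^2}.
\]

% Assumption: starting each problem on a new page makes it easier
% to mark which pages belong to which question when submitting.
\newpage
\section*{Problem 2}
Your solution here.

\end{document}
```

Compiling this with `pdflatex` produces a single PDF that can be uploaded as is.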

Since you are graduate students, we expect you to take this class because you want to learn the material and how to do research. All assessments are designed to maximize the learning effect. Cheating will only harm yourself, so it is in your own interest to adhere to the following policy.

All homework is submitted individually, and must be in your own words.

You may discuss the homework only at a high level with up to two classmates; please list their IDs on the first page of your homework. Everyone must still submit an individual write-up in their own words; indeed, your discussions with classmates should be so high-level that your write-up could not be anything other than your own words.

We prefer you do not dig around for homework solutions; if you do rely upon external resources, cite them, and still write your solutions in your own words.

When integrity violations are found, they will be submitted to the department’s evaluation board.

- Links to files are dead until the date of
- Subject to frequent changes, check back often!
- The links to the material (handout, HW and solutions) are not active until announced on Moodle
- The slides are not shown as-is during lecture, but they contain a superset of the content of each lecture

| Date | Topic | Location | Material | Assignments |
|---|---|---|---|---|
| 19.9. | No class | | | |
| 22.9. | No class | | | |
| 26.9. | Lecture: Introduction and concentration bounds, Recording | CAB G59 | MW 2 | |
| 29.9. | Lecture: Uniform tail bound and McDiarmid | CHN G42 | MW 2,3,4 | HW 1 |
| 3.10. | Lecture: Azuma-Hoeffding and the uniform law | CAB G59 | MW 2,4 | |
| 6.10. | Lecture: Uniform law and Rademacher complexity | CHN G42 | | |
| 10.10. | Lecture: VC bound and margin bounds, Exercise sheet | CAB G59 | SS 26 | De-registration deadline 11.10. |
| 13.10. | No class | | | HW 1 due 12.10., Project sign-up |
| 17.10. | Lecture: Metric entropy | CAB G59 | MW 5 | |
| 20.10. | Lecture: Chaining | CHN G42 | HW 1 sol | |
| 24.10. | Lecture: Non-parametric regression and kernels | CAB G59 | SC 4, MW 12, MW 13 | Project proposals due |
| 27.10. | Lecture: Kernel ridge regression | CHN G42 | HW 2 | HW 2 |
| 31.10. | Lecture: Random design | CAB G59 | | |
| 3.11. | No class | | | |
| 7.11. | No class | | | |
| 10.11. | Lecture: Minimax lower bounds | CHN G42 | MW 14, MW 15 | HW 2 due 9.11. 23:59, HW 2 sol |
| 14.11. | Interactive session: Lower bounds for semi-supervised learning | CAB G59 | Exercise sheet, Solution | |
| 17.11. | No class | | | |
| 20./21.11. | Oral midterm | TBA | | |
| 24.11. | Guest lecture on Zoom by Matus Telgarsky (Courant, NYU): Implicit bias | Zoom | | |
| 28.11. | Guest lecture by Gil Kur | CAB G59 | | Mid-project drafts due |
| 1.12. | Guest lecture by Vidya Muthukumar and FY: Overparameterization and double descent | CHN G42 | | |
| 5.12. | No class | CAB G59 | | |
| 8.12. | [Presentations 1], see full schedule | CHN G42 | | |
| 12.12. | [Presentations 2], see full schedule | CAB G59 | | |
| 15.12. | [Presentations 3], see full schedule | CHN G42 | | |
| 19.12. | [Presentations 4], see full schedule | CAB G59 | | [Peer-grading due] |
| 12.1. | No class | | | Project reports due |

The linked books are online resources that are freely accessible from within the ETH Zurich network:

**Learning Theory**

Martin Wainwright: High-dimensional statistics (core reference for the course)

Percy Liang: Statistical Learning Theory, Stanford Lecture notes

Steinwart and Christmann: Support Vector Machines: more mathematical treatment of RKHS

**Some more background reading for your general wisdom, knowledge and entertainment**

Keener: Theoretical Statistics: e.g. asymptotic optimality (MLE), UMVU testing

van der Vaart and Wellner: Weak Convergence and Empirical Processes