Course basics
- Logistics
- Learning objectives
Evaluation
- Homework information
- Academic integrity for homeworks
Schedule & course content
References
- Course content
- Typesetting
Interactive sessions
- Some notes on our setup in gather.town environment

Course basics

Logistics

Time: Thursdays 12-14, Fridays 12-13
Location of lectures: for now, online over zoom (passcode given via email)
Location of interactive sessions: online over gather.town (works only on a computer, not on a phone); see schedule
Lectures will also be recorded and posted online
Instructor: Fanny Yang
Teaching assistants:
- Konstantin Donhauser (konstantin.donhauser at inf.ethz.ch), Parnian Kassraie (parnian.kassraie at inf.ethz.ch)
- Office hours: online over zoom upon request via email
Sign up on waitlist until March 7th
De-register until March 17th
E-mails with logistical inquiries will most likely not be answered; please post privately on campuswire instead

We will use the following platforms:

campuswire: please ask questions here; eternal gratitude from your peers is ensured ;). It will also be used for all announcements, discussion of homeworks and lectures, and parts of the assignments.
gather.town for interactive sessions
gradescope for homeworks: enroll with entry code 5VRB46
hackmd: we will use this for some collaborative work. Please sign up so that you can start working on a joint sheet right away.

Learning objectives

This course is designed to prepare Master's students for successful research in ML, and to prepare PhD students to find new research ideas related to ML theory. Content-wise, the technical part will focus on generalization bounds using uniform convergence, and on non-parametric regression.

By the end of the course, you should be able to

both easily read and write theorems that provide generalization guarantees for machine learning algorithms
find high-impact questions and theorems that you are passionate about proving and working on

acquire enough mathematical background to understand a good fraction of theory papers published in the typical ML venues. For this purpose, students will learn common mathematical techniques from statistics and optimization in the first part of the course and apply this knowledge in the project work
critically examine recently published work in terms of relevance and determine impactful (novel) research problems. This will be an integral part of the project work and involves experimental as well as theoretical questions
find and outline an approach (some subproblem) to prove a conjectured theorem. This will be practiced in lectures, exercises and homeworks, and potentially in the final project.
effectively communicate and present the problem motivation, new insights and results to a technical audience. This will be primarily learned via the final presentation and report as well as during peer-grading of peer talks.

Evaluation

10% HW, 50% oral midterm, 40% project
Homework:
- some graded homework problems
- the rest is self-graded with mandatory hand-in. The discrepancy between your own score and ours for the selected problems will enter the final homework grade. This is to encourage you to go through the solutions carefully while self-grading.
Project report and presentation: see project website
Presence is mandatory in the last four weeks of classes during presentations
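
As a rough illustration of how the weights above combine, assuming all three components are scored on the same scale (the symbols below are introduced here for illustration only, and any rounding rules are not specified on this page):

```latex
\[
  G_{\text{final}} = 0.1\, G_{\text{HW}} + 0.5\, G_{\text{mid}} + 0.4\, G_{\text{proj}}
\]
% Hypothetical example scores: G_HW = 5.0, G_mid = 5.5, G_proj = 4.5
\[
  0.1 \cdot 5.0 + 0.5 \cdot 5.5 + 0.4 \cdot 4.5 = 0.5 + 2.75 + 1.8 = 5.05
\]
```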

Homework information

Homeworks are designed so that you
- do some technical (“just algebra”) work that needs to be practiced individually
- learn how to read more material on the matter effectively (homework content will be part of the midterm exam!)
No late homework is accepted
Each homework write-up must be neatly typeset as a PDF document using TeX, LaTeX, or similar systems (for more details see below). This is for you to practice getting efficient at it. Ensure that the following appear on the first page of the write-up:
- your name,
- your Student ID, and
- the names and IDs of any students with whom you discussed the assignment.
Submit your write-up, one page per question, as a single PDF file by 11:59 PM of the specified due date to gradescope. Follow the instructions and mark the pages that belong to the corresponding questions. See more details on the homework sheet.
Some questions will be graded by the TAs. All questions will be self-graded by you.
Discussions take place on campuswire
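
The first-page and one-page-per-question requirements above could be met with a skeleton along the following lines. This is only a minimal sketch, not an official template: the title, names, IDs and section labels are placeholders, and the homework sheet may ask for a different layout.

```latex
\documentclass[11pt]{article}
\usepackage{amsmath, amssymb}

% Placeholder values: replace with your own details.
\title{Homework 1}
\author{Your Name (Student ID: 00-000-000)}
\date{}  % leave the date empty

\begin{document}
\maketitle

% Names and IDs of students with whom you discussed the assignment.
\noindent\textbf{Discussed with:} Name A (ID), Name B (ID)

\section*{Question 1}
% Your solution to question 1.

\newpage  % start each question on a new page
\section*{Question 2}
% Your solution to question 2.

\end{document}
```

Compiling this with pdflatex produces a single PDF in which each question starts on its own page, which makes marking pages per question on gradescope straightforward.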

Academic integrity for homeworks

As graduate students, you are expected to take this class because you want to learn the material and how to do research. All assessments are designed to maximize the learning effect. Cheating only harms you, so it is in your own interest to adhere to the following policy.

All homework is submitted individually, and must be in your own words.
You may discuss only at a high level with up to two classmates; please list their names and IDs on the first page of your homework. Everyone must still submit an individual write-up, and yours must be in your own words; indeed, your discussions with classmates should be at such a high level that your write-up cannot be anything but your own words.
We prefer you do not dig around for homework solutions; if you do rely upon external resources, cite them, and still write your solutions in your own words.
When integrity violations are found, they will be submitted to the department’s evaluation board.

Schedule & course content

Subject to frequent changes, check back often!
The slides are not shown as is during lecture, but they contain a superset of the content of each lecture

Date	Topic	Location	Material	Assignments
25.2.	Logistics, Risk decomposition [Notes]	Recording	MW 1	HW 1
26.2.	Concentration bounds and uniform convergence [Notes]	Recording	MW 2, 3, 4
4.3.	Azuma-Hoeffding, McDiarmid, Uniform Law [Notes]	Recording	MW 2, 4	HW 1 due, HW 1 sol
5.3.	Symmetrization and Rademacher complexity [Notes]	Recording
11.3.	VC bound, Rademacher contraction [Notes] Margin bound (ex)	Recording → Gather	MW 4	HW 1 self-grade due
12.3.	Margin bound proof	Recording		HW 2, Project sign-up
18.3.	Structural risk minimization, metric entropy [Notes]	Recording	SS 7, 26, MW 5	Project proposal due
19.3.	Chaining and Dudley’s integral [Notes]	Recording	MW 5
25.3.	From feature maps to kernels to RKHS [Notes]	Recording	MW 12
26.3.	From RKHS to features, Mercer’s Theorem [Notes]	Recording	SC 4, MW 12
1.4.	Kernels in high dimensions (ex)	Gather	Paper	HW 2 due, HW 2 sol
2.4.	No class, holiday
7.-8.4.	Holidays, enjoy!			HW 3
15.4.	Non-parametric regression and localized complexities [Notes]	Recording	MW 13
16.4.	Risk bounds for kernel ridge regression (KRR) [Notes]	Recording	MW 13
22.4.	Random design, Minimax lower bounds [Notes]	Recording	MW 14, MW 15
23.4.	Minimax lower bounds [Notes]	Recording	MW 15
29.4.	Minimax lower bounds [Notes] NTK vs. NN (ex)	Recording		HW 3 due, HW 3 sol, HW 2 self-grade due
30.4.	NTK vs. NN	Recording
6.5.	Oral midterm
7.5.	Oral midterm
13.5.	No class, holiday
14.5.	Interpolation and double descent	Recording		Mid-Project drafts due
20.5.	Implicit bias [Notes]	Recording		HW 3 self-grade due
21.5.	Project feedback
27.5.	Presentations 1, see full schedule
28.5.	Presentations 2, see full schedule
3.6.	Presentations 3, see full schedule
4.6.	Presentations 4, see full schedule			Peer-grading due
18.6.	No class			Project reports due

References

Course content

Book links point to online resources that are freely accessible from within the ETH Zurich network

Learning Theory

Martin Wainwright: High-dimensional statistics (core reference for the course)
Percy Liang: Statistical Learning Theory, Stanford Lecture notes
Shalev-Shwartz, Ben-David: Understanding Machine Learning
Anthony & Bartlett: Neural Network Learning

Some more background reading for your general wisdom, knowledge and entertainment

Keener: Theoretical Statistics: e.g. asymptotic optimality (MLE), UMVU testing
Steinwart and Christmann: Support Vector Machines: more mathematical treatment of RKHS
Tsybakov: Introduction to Nonparametric Estimation
van der Vaart and Wellner: Weak Convergence and Empirical Processes
Boucheron, Lugosi, Massart: Concentration inequalities
Ledoux, Talagrand: Probability in Banach Spaces

Typesetting

For LaTeX, see 1, 2 or 3, 4
For Pandoc Markdown by John MacFarlane, refer to my git repo with sample instructions on how to use Pandoc for simple math notes and webpages

Interactive sessions

We meet in gather.town. Please make sure before the start of the lecture that you can enter gather.town; some people have had problems with the microphone and camera in the past. The problem sheets for the sessions can be found on the website.

The virtual interactive session will take place as follows:

Everyone goes into the gather.town main hall where the podium is and the chairs are.
There, the speaker briefly presents the problem and instructions. The problem is usually divided into 3-4 sub-problems. The problem sheet can be found on the website.
Each participant can choose which of the presented sub-problems he or she would like to solve.
For each sub-problem there are two rooms in our meeting space. The goal is for each room to independently solve the corresponding sub-problem. Please spread out so that no more than 3-4 students work together in one room.
In each room: 25 minutes of

discussion - you may use the prepared hackmd link or scribbletogether (on an iPad/tablet, use the app) to collaborate (press x to open)
a representative prepares a 6-minute presentation using hackmd or scribble

After 25 minutes: 20 minutes of short presentations

One group per question (random choice) will be called to go on stage
Introduce yourself and group members by names
Present your results w/ screenshare (6 min.), take questions (1 min.)
To ask questions please move onto the red carpet in the big hall

Some notes on our setup in gather.town environment

You can hear and see only your direct neighbors
If you enter a public space (red carpet and the spot directly behind the podium), everyone can see and hear you
You can move using the arrows on your keyboard
We have created private spaces, which you enter when you walk into one of the side rooms. Every group is assigned to one of these rooms, where they can discuss and solve the assigned problems.
In each room there are two whiteboards. If you stand right next to one of them, you can press x to open the linked hackmd or scribble page, which you can use to solve the problem interactively
One of the whiteboards contains a scribble link, which you can use as an interactive whiteboard. You can also join the board on your tablet using the 4-digit code (via browser or app, the latter is easier)
The other whiteboard contains a link to a hackmd page where you can jointly write in markdown. For markdown syntax see e.g. this primer. To add formulas, use “$$” and LaTeX syntax. You can also start standard LaTeX environments such as ‘\begin{align}’
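
To illustrate the formula syntax mentioned above, here is a short sketch of what one might type into a hackmd sheet. The two formulas are standard textbook statements chosen only as examples (Hoeffding's inequality and the expansion of a squared norm in a real inner-product space); they are not taken from a specific problem sheet. Depending on how hackmd renders math, the align environment may need to be wrapped in “$$” as well.

```latex
Hoeffding's inequality for independent $X_1, \dots, X_n \in [a, b]$:

$$
\mathbb{P}\left( \left| \frac{1}{n} \sum_{i=1}^n \bigl( X_i - \mathbb{E}[X_i] \bigr) \right| \ge t \right)
\le 2 \exp\left( - \frac{2 n t^2}{(b - a)^2} \right)
$$

A multi-line derivation with aligned equality signs:

\begin{align}
\| x + y \|^2 &= \langle x + y, x + y \rangle \\
              &= \| x \|^2 + 2 \langle x, y \rangle + \| y \|^2
\end{align}
```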

You can look here to familiarize yourself with the gather.town environment.

For completeness, the links for collaboration for each of the individual rooms are:

Question 1: Group A Hackmd Scribble (Code: 2Q9T); Group B Hackmd Scribble (Code: GHE9)
Question 2: Group A Hackmd Scribble (Code: 2D3W); Group B Hackmd Scribble (Code: 5RL6)
Question 3: Group A Hackmd Scribble (Code: A9CH); Group B Hackmd Scribble (Code: YWFH)
Question 4: Group A Hackmd Scribble (Code: RYMD); Group B Hackmd Scribble (Code: 2FEZ)