SML group at ETH | Konstantin Donhauser

I am an ETH AI Center Doctoral Fellow. My research interest is in High-Dimensional Statistics and more generally in the combination of Mathematics & Machine Learning. I’m part of the groups led by Fanny Yang and Afonso Bandeira.

Papers

Copyright-Protected Language Generation via Adaptive Model Fusion

Javier Abad, Konstantin Donhauser, Francesco Pinto*, and Fanny Yang*

International Conference on Learning Representations (ICLR), Oral, 2025

abstract arXiv pdf code

The risk of language models reproducing copyrighted material from their training data has led to the development of various protective measures. Among these, inference-time strategies that impose constraints via post-processing have shown promise in addressing the complexities of copyright regulation. However, they often incur prohibitive computational costs or suffer from performance trade-offs. To overcome these limitations, we introduce Copyright-Protecting Model Fusion (CP-Fuse), a novel approach that combines models trained on disjoint sets of copyrighted material during inference. In particular, CP-Fuse adaptively aggregates the model outputs to minimize the reproduction of copyrighted content, adhering to a crucial balancing property that prevents the regurgitation of memorized data. Through extensive experiments, we show that CP-Fuse significantly reduces the reproduction of protected material without compromising the quality of text and code generation. Moreover, its post-hoc nature allows seamless integration with other protective measures, further enhancing copyright safeguards. Lastly, we show that CP-Fuse is robust against common techniques for extracting training data.

Privacy-preserving data release leveraging optimal transport and particle gradient descent

Konstantin Donhauser*, Javier Abad*, Neha Hulkund, and Fanny Yang

International Conference on Machine Learning (ICML), 2024

abstract arXiv pdf poster code

We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government. Current state-of-the-art methods predominantly use marginal-based approaches, where a dataset is generated from private estimates of the marginals. In this paper, we introduce PrivPGD, a new generation method for marginal-based private data synthesis, leveraging tools from optimal transport and particle gradient descent. Our algorithm outperforms existing methods on a large range of datasets while being highly scalable and offering the flexibility to incorporate additional domain-specific constraints.

Detecting critical treatment effect bias in small subgroups

Piersilvio De Bartolomeis, Javier Abad, Konstantin Donhauser, and Fanny Yang

Conference on Uncertainty in Artificial Intelligence (UAI), 2024

abstract arXiv pdf slides code

Randomized trials are considered the gold standard for making informed decisions in medicine, yet they often lack generalizability to the patient populations in clinical practice. Observational studies, on the other hand, cover a broader patient population but are prone to various biases. Thus, before using an observational study for decision-making, it is crucial to benchmark its treatment effect estimates against those derived from a randomized trial. We propose a novel strategy to benchmark observational studies beyond the average treatment effect. First, we design a statistical test for the null hypothesis that the treatment effects estimated from the two studies, conditioned on a set of relevant features, differ up to some tolerance. We then estimate an asymptotically valid lower bound on the maximum bias strength for any subgroup in the observational study. Finally, we validate our benchmarking strategy in a real-world setting and show that it leads to conclusions that align with established medical knowledge.

Hidden yet quantifiable: A lower bound for confounding strength using randomized trials

Piersilvio De Bartolomeis*, Javier Abad*, Konstantin Donhauser, and Fanny Yang

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

abstract arXiv pdf poster code

In the era of fast-paced precision medicine, observational studies play a major role in properly evaluating new drugs in clinical practice. Yet, unobserved confounding can significantly compromise causal conclusions from observational data. We propose a novel strategy to quantify unobserved confounding by leveraging randomized trials. First, we design a statistical test to detect unobserved confounding with strength above a given threshold. Then, we use the test to estimate an asymptotically valid lower bound on the unobserved confounding strength. We evaluate the power and validity of our statistical test on several synthetic and semi-synthetic datasets. Further, we show how our lower bound can correctly identify the absence and presence of unobserved confounding in a real-world setting.

Certified private data release for sparse Lipschitz functions

Konstantin Donhauser, Johan Lokna, Amartya Sanyal, March Boedihardjo, Robert Hoenig, and Fanny Yang

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

abstract arXiv pdf

As machine learning has become more relevant for everyday applications, a natural requirement is the protection of the privacy of the training data. When the relevant learning questions are unknown in advance, or hyper-parameter tuning plays a central role, one solution is to release a differentially private synthetic data set that leads to similar conclusions as the original training data. In this work, we introduce an algorithm that enjoys fast rates for the utility loss for sparse Lipschitz queries. Furthermore, we show how to obtain a certificate for the utility loss for a large class of algorithms.

Tight bounds for maximum l1-margin classifiers

Stefan Stojanovic, Konstantin Donhauser, and Fanny Yang

Algorithmic Learning Theory (ALT), 2024

abstract arXiv pdf slides video

Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum l1-margin classifier, a.k.a. sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the l1-norm achieve improved statistical rates for hard sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum l1-margin classifier for a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match existing rates of order \|w^*\|_1^{2/3}/n^{1/3} for general ground truths. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order 1/\sqrt{\log(d/n)}. We are therefore the first to show benign overfitting for the maximum l1-margin classifier.

Strong inductive biases provably prevent harmless interpolation

Michael Aerni*, Marco Milanta*, Konstantin Donhauser, and Fanny Yang

International Conference on Learning Representations (ICLR), 2023

abstract arXiv pdf poster code

Classical wisdom suggests that estimators should avoid fitting noise to achieve good generalization. In contrast, modern overparameterized models can yield small test error despite interpolating noise – a phenomenon often called "benign overfitting" or "harmless interpolation". This paper argues that the degree to which interpolation is harmless hinges upon the strength of an estimator’s inductive bias, i.e., how heavily the estimator favors solutions with a certain structure: while strong inductive biases prevent harmless interpolation, weak inductive biases can even require fitting noise to generalize well. Our main theoretical result establishes tight non-asymptotic bounds for high-dimensional kernel regression that reflect this phenomenon for convolutional kernels, where the filter size regulates the strength of the inductive bias. We further provide empirical evidence of the same behavior for deep neural networks with varying filter sizes and rotational invariance.

Fast rates for noisy interpolation require rethinking the effects of inductive bias

Konstantin Donhauser, Nicolo Ruggeri, Stefan Stojanovic, and Fanny Yang

International Conference on Machine Learning (ICML), 2022

abstract arXiv pdf poster slides video

Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator. Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: Our results suggest that, while a stronger inductive bias encourages a simpler structure that is more aligned with the ground truth, it also increases the detrimental effect of noise. Specifically, for both linear regression and classification with a sparse ground truth, we prove that minimum \ell_p-norm and maximum \ell_p-margin interpolators achieve fast polynomial rates up to order 1/n for p > 1 compared to a logarithmic rate for p = 1. Finally, we provide experimental evidence that this trade-off may also play a crucial role in understanding non-linear interpolating models used in practice.

Tight bounds for minimum l1-norm interpolation of noisy data

Guillaume Wang*, Konstantin Donhauser*, and Fanny Yang

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

abstract arXiv pdf poster

We provide matching upper and lower bounds of order σ²/log(d/n) for the prediction error of the minimum ℓ1-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms when d≫n, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum ℓ2-norm interpolation, where asymptotic consistency can be achieved only when the features are effectively low-dimensional.

How rotational invariance of common kernels prevents generalization in high dimensions

Konstantin Donhauser, Mingqi Wu, and Fanny Yang

International Conference on Machine Learning (ICML), 2021

abstract arXiv pdf poster

Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for high-dimensional kernel regression for a number of specific assumptions on the data distribution. In this paper, we show that in high dimensions, the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) leads to inconsistent estimation unless the ground truth is a low-degree polynomial. Our lower bound on the generalization error holds for a wide range of distributions and kernels with different eigenvalue decays. This lower bound suggests that consistency results for kernel ridge regression in high dimensions generally require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.

Interpolation can hurt robust generalization even when there is no noise

Konstantin Donhauser*, Alexandru Tifrea*, Michael Aerni, Reinhard Heckel, and Fanny Yang

Neural Information Processing Systems (NeurIPS), 2021

abstract arXiv pdf workshop poster slides video

Numerous recent works show that overparameterization implicitly reduces variance for min-norm interpolators and max-margin classifiers. These findings suggest that ridge regularization has vanishing benefits in high dimensions. We challenge this narrative by showing that, even in the absence of noise, avoiding interpolation through ridge regularization can significantly improve generalization. We prove this phenomenon for the robust risk of both linear regression and classification and hence provide the first theoretical result on robust overfitting.

Preprints

Efficient Randomized Experiments Using Foundation Models

Piersilvio De Bartolomeis, Javier Abad, Guanbo Wang, Konstantin Donhauser, Raymond M. Duch, Fanny Yang, and Issa J. Dahabreh

arXiv preprint, 2025

abstract arXiv code

Randomized experiments are the preferred approach for evaluating the effects of interventions, but they are costly and often yield estimates with substantial uncertainty. On the other hand, in silico experiments leveraging foundation models offer a cost-effective alternative that can potentially attain higher statistical precision. However, the benefits of in silico experiments come with a significant risk: statistical inferences are not valid if the models fail to accurately predict experimental responses to interventions. In this paper, we propose a novel approach that integrates the predictions from multiple foundation models with experimental data while preserving valid statistical inference. Our estimator is consistent and asymptotically normal, with asymptotic variance no larger than the standard estimator based on experimental data alone. Importantly, these statistical properties hold even when model predictions are arbitrarily biased. Empirical results across several randomized experiments show that our estimator offers substantial precision gains, equivalent to a reduction of up to 20% in the sample size needed to match the same precision as the standard estimator based on experimental data alone.

Blog posts

Blog posts will hopefully follow soon.

Short C.V.

04/2021 -	PhD, ETH Zurich
10/2018 - 3/2021	Research Intern - SML Group, ETH Zurich
1/2018 - 6/2020	M.Sc. Electrical Engineering, ETH Zurich
10/2017 - 6/2020	B.Sc. Mathematics, ETH Zurich
10/2014 - 4/2018	B.Sc. Electrical Engineering, ETH Zurich

Contact information

You can find me on LinkedIn, Twitter and Google Scholar, or simply write me an email at konstantin.donhauser [at] ai.ethz.ch.