I am a PhD candidate in the Department of Computer Science at ETH Zürich. Prior to starting my PhD, I received my master's degree in electrical engineering and information technology from ETH Zürich.
Publications
-
How robust accuracy suffers from certified training with convex relaxations
Piersilvio De Bartolomeis,
Jacob Clarysse,
Amartya Sanyal,
and Fanny Yang
NeurIPS Workshop on Empirical Falsification (Long Talk)
2022
Adversarial attacks pose significant threats to deploying state-of-the-art classifiers in safety-critical applications. Two classes of methods have emerged to address this issue: empirical defences and certified defences. Although certified defences come with robustness guarantees, empirical defences such as adversarial training enjoy much higher popularity among practitioners. In this paper, we systematically compare the standard and robust error of these two robust training paradigms across multiple computer vision tasks. We show that in most tasks and for both 𝓁∞-ball and 𝓁2-ball threat models, certified training with convex relaxations suffers from worse standard and robust error than adversarial training. We further explore how the error gap between certified and adversarial training depends on the threat model and the data distribution. In particular, besides the perturbation budget, we identify as important factors the shape of the perturbation set and the implicit margin of the data distribution. We support our arguments with extensive ablations on both synthetic and image datasets.
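For intuition, the toy sketch below (in PyTorch, with placeholder names such as `model` and `eps`; it is not the code used in the paper) contrasts the two paradigms: an adversarial training loss built on a PGD attack, and a certified training loss based on interval bound propagation, the simplest convex relaxation of the 𝓁∞ threat model, for a single linear layer.

```python
import torch
import torch.nn.functional as F


def pgd_linf(model, x, y, eps, alpha=0.01, steps=10):
    """Projected gradient descent within an l_inf ball of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()   # ascend the loss
            delta.clamp_(-eps, eps)        # project back into the ball
    return delta.detach()


def adversarial_training_loss(model, x, y, eps):
    """Empirical defence: minimise the loss on PGD-perturbed inputs."""
    delta = pgd_linf(model, x, y, eps)
    return F.cross_entropy(model(x + delta), y)


def ibp_training_loss(weight, bias, x, y, eps):
    """Certified defence: interval bound propagation for one linear layer."""
    mid = x @ weight.t() + bias              # centre of the output box
    rad = eps * weight.abs().sum(dim=1)      # radius of the output box
    lower, upper = mid - rad, mid + rad
    # Worst-case logits: lower bound for the true class, upper bound otherwise.
    worst = torch.where(F.one_hot(y, mid.size(1)).bool(), lower, upper)
    return F.cross_entropy(worst, y)
```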
-
Margin-based sampling in high dimensions: When being active is less efficient than staying passive
Alexandru Tifrea*,
Jacob Clarysse*,
and Fanny Yang
International Conference on Machine Learning (ICML),
2023
It is widely believed that given the same labeling budget, active learning (AL) algorithms like margin-based active learning achieve better predictive performance than passive learning (PL), albeit at a higher computational cost. Recent empirical evidence suggests that this added cost might be in vain, as margin-based AL can sometimes perform even worse than PL. While existing works offer different explanations in the low-dimensional regime, this paper shows that the underlying mechanism is entirely different in high dimensions: we prove for logistic regression that PL outperforms margin-based AL even for noiseless data and when using the Bayes optimal decision boundary for sampling. Insights from our proof indicate that this high-dimensional phenomenon is exacerbated when the separation between the classes is small. We corroborate this intuition with experiments on 20 high-dimensional datasets spanning a diverse range of applications, from finance and histology to chemistry and computer vision.
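As an illustration, the following sketch (assuming scikit-learn and NumPy; it is not the experimental code) compares the two query strategies for logistic regression: passive learning draws points uniformly at random, while margin-based active learning queries the unlabeled point closest to the current decision boundary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def passive_query(rng, unlabeled_idx):
    """Passive learning: draw the next point uniformly at random."""
    return rng.choice(unlabeled_idx)


def margin_query(model, X, unlabeled_idx):
    """Margin-based AL: query the unlabeled point closest to the boundary."""
    margins = np.abs(model.decision_function(X[unlabeled_idx]))
    return unlabeled_idx[np.argmin(margins)]


def run(X, y, n_seed=10, budget=100, strategy="margin", seed=0):
    rng = np.random.default_rng(seed)
    # The random seed set is assumed to contain both classes.
    labeled = list(rng.choice(len(X), size=n_seed, replace=False))
    for _ in range(budget):
        model = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        unlabeled = np.setdiff1d(np.arange(len(X)), labeled)
        if strategy == "margin":
            labeled.append(margin_query(model, X, unlabeled))
        else:
            labeled.append(passive_query(rng, unlabeled))
    return LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
```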
-
Why adversarial training can hurt robust accuracy
Jacob Clarysse,
Julia Hörrmann,
and Fanny Yang
International Conference on Learning Representations (ICLR),
2023
Machine learning classifiers with high test accuracy often perform poorly under adversarial attacks. It is commonly believed that adversarial training alleviates this issue. In this paper, we demonstrate that, surprisingly, the opposite may be true: even though adversarial training helps when enough data is available, it may hurt robust generalization in the small sample size regime. We first prove this phenomenon for a high-dimensional linear classification setting with noiseless observations. Our proof provides explanatory insights that may also transfer to feature learning models. Further, we observe in experiments on standard image datasets that the same behavior occurs for perceptible attacks that effectively reduce class information, such as mask attacks and object corruptions.
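For concreteness, here is a minimal sketch (assuming PyTorch; names and parameters are illustrative and this is not the paper's implementation) of a perceptible mask attack: it occludes the square patch whose removal of class information most increases the classifier's loss.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def worst_case_mask(model, x, y, size=5, stride=2):
    """Exhaustively search mask positions on a single image x of shape (C, H, W)."""
    _, h, w = x.shape
    best_loss, best_img = -float("inf"), x
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            masked = x.clone()
            masked[:, i:i + size, j:j + size] = 0.0   # black out a square patch
            loss = F.cross_entropy(model(masked.unsqueeze(0)), y.view(1))
            if loss.item() > best_loss:
                best_loss, best_img = loss.item(), masked
    return best_img
```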
Research interests
My main interest currently lies in theoretical perspectives and methods for trustworthy machine learning, ranging from common image corruptions to adversarial robustness and compositions thereof. To be extended.
Blog posts
Coming soon!
jacobcl@inf.ethz.ch
CAB E62.1 ETH Zürich