I am a PhD candidate in the Department of Computer Science at ETH Zürich. Before starting my PhD, I received my Master's degree in Electrical Engineering and Information Technology, also at ETH Zürich.
Margin-based sampling in high dimensions: When being active is less efficient than staying passive
and Fanny Yang
International Conference on Machine Learning (ICML),
It is widely believed that given the same labeling budget, active learning (AL) algorithms like margin-based active learning achieve better predictive performance than passive learning (PL), albeit at a higher computational cost. Recent empirical evidence suggests that this added cost might be in vain, as margin-based AL can sometimes perform even worse than PL. While existing works offer different explanations in the low-dimensional regime, this paper shows that the underlying mechanism is entirely different in high dimensions: we prove for logistic regression that PL outperforms margin-based AL even for noiseless data and when using the Bayes optimal decision boundary for sampling. Insights from our proof indicate that this high-dimensional phenomenon is exacerbated when the separation between the classes is small. We corroborate this intuition with experiments on 20 high-dimensional datasets spanning a diverse range of applications, from finance and histology to chemistry and computer vision.
Why adversarial training can hurt robust accuracy
and Fanny Yang
International Conference on Learning Representations (ICLR),
Machine learning classifiers with high test accuracy often perform poorly under adversarial attacks. It is commonly believed that adversarial training alleviates this issue. In this paper, we demonstrate that, surprisingly, the opposite may be true: even though adversarial training helps when enough data is available, it may hurt robust generalization in the small sample size regime. We first prove this phenomenon for a high-dimensional linear classification setting with noiseless observations. Our proof provides explanatory insights that may also transfer to feature learning models. Further, we observe in experiments on standard image datasets that the same behavior occurs for perceptible attacks that effectively reduce class information, such as mask attacks and object corruptions.
Currently, my main interest lies in theoretical perspectives on and methods for trustworthy machine learning, ranging from common image corruptions to adversarial robustness and compositions thereof.
CAB E62.1, ETH Zürich