I am a postdoctoral researcher at the University of Cambridge, working in the Machine Learning Group with Richard Turner. I am also a Research Fellow at St. John’s College and a Branco Weiss Fellow, and my research is supported by a Postdoc.Mobility Fellowship from the Swiss National Science Foundation. My research focuses on the interface between deep learning and probabilistic modeling. I am particularly keen to develop models that are more interpretable and data-efficient, following the Bayesian paradigm. To this end, I mainly work on better priors and more efficient inference techniques for Bayesian deep learning. Beyond that, I am also interested in deep generative modeling, meta-learning, and PAC-Bayesian theory.
I did my undergraduate studies in Molecular Life Sciences at the University of Hamburg, where I worked on phylogeny inference for quickly mutating virus strains with Andrew Torda. I then went to ETH Zürich to study Computational Biology and Bioinformatics (in a joint program with the University of Zürich), focusing on systems biology and machine learning. My master’s studies were supported by an ETH Excellence Scholarship. In my master’s thesis, I applied deep learning to gene regulatory network inference under the supervision of Manfred Claassen, for which I received the Willi Studer Prize. During my master’s studies, I also spent some time in Jacob Hanna’s group at the Weizmann Institute of Science, working on multi-omics data analysis in stem cell research. I then did my PhD in Computer Science at ETH Zürich under the supervision of Gunnar Rätsch and Andreas Krause, where I was a member of the Biomedical Informatics group and the ETH Center for the Foundations of Data Science. I was supported by a PhD fellowship from the Swiss Data Science Center and was also an ELLIS PhD student. During my PhD studies, I visited and worked with Stephan Mandt at the University of California, Irvine, and Richard Turner at the University of Cambridge. Moreover, I completed internships at Disney Research Zürich, working with Romann Weber on deep learning for natural language understanding in the Machine Intelligence and Data Science team; at Microsoft Research Cambridge, working with Katja Hofmann on uncertainty quantification in deep learning in the Game Intelligence team; and at Google Brain, working with Efi Kokiopoulou and Rodolphe Jenatton on uncertainty estimation and out-of-distribution detection in the Reliable Deep Learning team. My Erdős–Bacon number is 6.
PhD in Machine Learning, 2021
ETH Zürich
MSc in Computational Biology and Bioinformatics, 2017
ETH Zürich
BSc in Molecular Life Sciences, 2015
University of Hamburg
We show that the empirical weight distributions of SGD-trained neural networks are heavy-tailed and correlated, and that incorporating these insights into Bayesian neural network priors can improve their performance and reduce the cold-posterior effect.
We provide a comprehensive review of recent advances in the choice of priors for Bayesian neural networks, variational autoencoders, and (deep) Gaussian processes.
We derive a novel PAC-Bayes bound for meta-learning with Bayesian models, which gives rise to a computationally efficient meta-learning method that outperforms existing approaches on a range of tasks, especially when the number of meta-tasks is small.
We show that introducing a repulsive force between the members of a deep ensemble can improve the ensemble’s diversity and performance, especially when this force is applied in function space, and that it can moreover guarantee asymptotic convergence to the true Bayes posterior.
We show that a Laplace-Generalized-Gauss-Newton approximation to the marginal likelihood of Bayesian neural networks can effectively be used for model selection and can often discover better hyperparameter settings than cross-validation.
We show that using a Gaussian process prior in the latent space of a variational autoencoder can improve time series imputation performance, while still allowing for computationally efficient inference through a variational Gauss-Markov process.