I am a PhD student in machine learning at the University of Oxford, supervised by Dino Sejdinović and Yee Whye Teh. My research interests lie at the intersection of deep learning and kernel methods. In particular, I am interested in
University of Oxford, Department of Statistics
ETH Zurich, Department of Mathematics
Performing exact posterior inference in complex generative models is often difficult or impossible because the likelihood function is expensive to evaluate or intractable. Approximate Bayesian computation (ABC) is an inference framework that constructs an approximation to the true likelihood based on the similarity between the observed and simulated data, as measured by a predefined set of summary statistics. Although the choice of appropriate problem-specific summary statistics crucially influences the quality of the likelihood approximation, and hence the quality of the posterior sample in ABC, there are only a few principled general-purpose approaches to selecting or constructing such statistics. In this paper, we develop a novel framework for this task based on kernel-based distribution regression: we model the functional relationship between data distributions and the optimal choice (with respect to a loss function) of summary statistics. We show that our approach can be implemented in a computationally and statistically efficient way using the random Fourier features framework for large-scale kernel learning. In addition, our framework shows superior performance compared to related methods on toy and real-world problems.
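To give a rough feel for the ingredients, here is a minimal sketch combining random Fourier features, a ridge regression from distribution embeddings to parameters (the learned summary statistic), and rejection ABC on a toy Gaussian model. Everything here is an illustrative assumption for exposition (toy simulator, hyperparameters, variable names), not the paper's actual implementation.

```python
import numpy as np

def rff_features(x, omega, b):
    # Random Fourier features approximating an RBF kernel:
    # z(x) = sqrt(2/D) * cos(omega @ x + b)
    D = omega.shape[0]
    return np.sqrt(2.0 / D) * np.cos(x @ omega.T + b)

def mean_embedding(sample, omega, b):
    # Empirical kernel mean embedding of a sample, approximated via RFF.
    return rff_features(sample, omega, b).mean(axis=0)

# --- Hypothetical toy setup: infer the mean of a Gaussian. ---
rng = np.random.default_rng(0)
D = 100                                 # number of random features (assumption)
omega = rng.normal(size=(D, 1))         # frequencies for a unit-bandwidth RBF kernel
b = rng.uniform(0, 2 * np.pi, size=D)

# Simulate training pairs (theta_i, dataset_i) from prior and model.
n_train, n_obs = 500, 50
thetas = rng.normal(0, 3, size=n_train)
embeddings = np.stack([
    mean_embedding(rng.normal(t, 1, size=(n_obs, 1)), omega, b)
    for t in thetas
])

# Ridge regression from distribution embeddings to parameters;
# the fitted map then acts as a learned summary statistic.
lam = 1e-3
W = np.linalg.solve(embeddings.T @ embeddings + lam * np.eye(D),
                    embeddings.T @ thetas)

def summary(sample):
    return mean_embedding(sample, omega, b) @ W

# Rejection ABC using the learned summary statistic.
observed = rng.normal(1.5, 1, size=(n_obs, 1))
s_obs = summary(observed)
proposals = rng.normal(0, 3, size=5000)
dists = np.array([
    abs(summary(rng.normal(t, 1, size=(n_obs, 1))) - s_obs)
    for t in proposals
])
eps = np.quantile(dists, 0.01)          # accept the closest 1% of simulations
posterior_sample = proposals[dists <= eps]
print("posterior mean ≈", posterior_sample.mean())
```

The Gaussian simulator is only a stand-in; in principle any generative model one can sample from could be slotted into the same loop.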
While deep neural networks have achieved state-of-the-art performance on many tasks across varied domains, they still remain black boxes whose inner workings are hard to interpret and understand. In this paper, we develop a novel method for efficiently capturing the behaviour of deep neural networks using kernels. In particular, we construct a hierarchy of increasingly complex kernels that encode individual hidden layers of the network. Furthermore, we discuss how our framework motivates a novel supervised weight initialization method that discovers highly discriminative features already at initialization.
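One simple way to make the layer-wise kernel idea concrete is to take, for each hidden layer l, the kernel induced by that layer's feature map, k_l(x, x') = ⟨h_l(x), h_l(x')⟩, where h_l is the activation at layer l. The sketch below does exactly this for a hypothetical two-hidden-layer ReLU network; the architecture, initialization, and names are assumptions for illustration, not the construction from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def layer_kernels(X, weights, biases):
    """Gram matrices of the kernels induced by each hidden layer:
    k_l(x, x') = <h_l(x), h_l(x')>, with h_l the activation at layer l."""
    H = X
    grams = []
    for W, b in zip(weights, biases):
        H = relu(H @ W + b)       # forward pass through one hidden layer
        grams.append(H @ H.T)     # kernel encoded by this layer
    return grams

# Hypothetical two-hidden-layer ReLU network on toy data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))                          # 8 inputs of dimension 5
weights = [rng.normal(size=(5, 16)) / np.sqrt(5),
           rng.normal(size=(16, 16)) / np.sqrt(16)]  # scaled Gaussian init
biases = [np.zeros(16), np.zeros(16)]

for l, K in enumerate(layer_kernels(X, weights, biases), start=1):
    print(f"layer {l}: Gram matrix shape {K.shape}")
```

Deeper layers compose more nonlinearities, so the resulting Gram matrices form a hierarchy of increasingly complex kernels over the same inputs.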
DeepMind, London, UK
Supervisors: Yoshua Bengio and Aaron Courville
MILA, University of Montreal, Canada