We are a research group at UCL’s Centre for Artificial Intelligence. Our research expertise is in data-efficient machine learning, probabilistic modeling, and autonomous decision making. Applications focus on robotics, climate science, and sustainable development.
If you are interested in joining the team, please check out our openings.
Machine learning, Gaussian processes, Reinforcement learning, Robotics, Meta learning
Machine learning, Robotics
Machine learning, Bayesian theory, Differential-geometric learning, Structural priors
Machine learning, Gaussian processes, Bayesian optimization, Practical approximate inference
Machine learning, Optimal transport, Gaussian processes
Machine learning, Gaussian processes, Meta learning, Structural priors, Variational inference
Machine learning, Gaussian processes, Bayesian optimization
Machine learning, Reinforcement learning, Optimal control, Copulas
Machine learning, Generative models, Large-scale deep learning, Variational inference, Information theory, Sparsity
Machine learning, Meta learning, Differential geometry, Reinforcement learning
Machine learning, Climate science, Traffic engineering
Machine learning, Robotics, Reinforcement Learning, Meta learning
Machine learning, Robotics
Machine learning, Climate science, Fluid mechanics, Geometric mechanics
Machine learning, Deep probabilistic models, Approximate inference
Machine learning, Community detection, Representation of graphs, Hyperbolic embeddings
Machine learning, Bayesian optimization, Mechanistic models, Model discrimination
Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a model’s success hinges upon its ability to faithfully represent predictive uncertainty. These problems typically exist as parts of larger frameworks, wherein quantities of interest are ultimately defined by integrating over posterior distributions. These quantities are frequently intractable, motivating the use of Monte Carlo methods. Despite substantial progress in scaling up Gaussian processes to large training sets, methods for accurately generating draws from their posterior distributions still scale cubically in the number of test locations. We identify a decomposition of Gaussian processes that naturally lends itself to scalable sampling by separating out the prior from the data. Building off of this factorization, we propose an easy-to-use and general-purpose approach for fast posterior sampling, which seamlessly pairs with sparse approximations to afford scalability both during training and at test time. In a series of experiments designed to test competing sampling schemes’ statistical properties and practical ramifications, we demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
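As a rough illustration of the idea in the abstract above, the sketch below draws posterior samples by first sampling from the prior and then applying a data-dependent correction (Matheron's rule). It is a minimal sketch under stated assumptions, not the authors' implementation: the RBF kernel, the function names, the jitter value, and the exact joint prior draw via a Cholesky factor are all choices made for this example. The exact prior draw is itself cubic in the number of points and is used here only to show the prior/update split.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel on 1-D inputs (an assumption for this sketch).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def matheron_posterior_samples(x_train, y_train, x_test,
                               noise_var=1e-2, num_samples=4, seed=0):
    """Draw GP posterior samples at x_test via Matheron's rule.

    posterior draw = prior draw at x_test
                     + K(x_test, X) (K(X, X) + noise_var I)^{-1}
                       (y - prior draw at X - simulated noise)
    """
    rng = np.random.default_rng(seed)
    n = x_train.shape[0]

    # Joint prior draw over training and test inputs. This exact draw is
    # used only for clarity; any prior sampler could be substituted.
    x_all = np.concatenate([x_train, x_test])
    k_all = rbf_kernel(x_all, x_all) + 1e-6 * np.eye(x_all.shape[0])  # jitter
    prior = np.linalg.cholesky(k_all) @ rng.standard_normal(
        (x_all.shape[0], num_samples))
    f_train, f_test = prior[:n], prior[n:]

    # Simulated observation noise for each sample path.
    eps = np.sqrt(noise_var) * rng.standard_normal((n, num_samples))

    # Data-dependent update applied to every prior sample path.
    k_xx = rbf_kernel(x_train, x_train) + noise_var * np.eye(n)
    k_sx = rbf_kernel(x_test, x_train)
    residual = y_train[:, None] - f_train - eps
    return f_test + k_sx @ np.linalg.solve(k_xx, residual)
```

The returned array has one column per sample path. The linear solve involves only the training inputs, so the update term is linear in the number of test locations once the prior draw is available.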
Gaussian processes are nonparametric Bayesian models that have been applied to regression and classification problems. One of the approaches to alleviate their cubic training cost is the use of local GP experts trained on subsets of the data. In particular, product-of-expert models combine the predictive distributions of local experts through a tractable product operation. While these expert models allow for massively distributed computation, their predictions can suffer from erratic behaviour of the mean or uncalibrated uncertainty quantification. By calibrating predictions via tempered softmax weighting, we provide a solution to these problems for multiple product-of-expert models, including the generalised product of experts and the robust Bayesian committee machine. Furthermore, we leverage the optimal transport literature and propose a new product-of-expert model that combines predictions of local experts by computing their Wasserstein barycenter, which can be applied to both regression and classification.
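To make the aggregation step in the abstract above concrete, the sketch below combines univariate Gaussian predictions from several experts at a single test point in two ways: a weighted product (generalised product of experts) and a 2-Wasserstein barycenter. It is a minimal sketch, not the authors' code. The function names are hypothetical, and the choice of scores fed into the tempered softmax is left to the caller, since the abstract does not specify it.

```python
import numpy as np

def tempered_softmax_weights(scores, temperature=1.0):
    # Hypothetical helper: softmax of temperature-scaled scores.
    # What the scores represent is an assumption left to the caller.
    z = temperature * np.asarray(scores, dtype=float)
    z = z - z.max()  # subtract the maximum for numerical stability
    w = np.exp(z)
    return w / w.sum()

def generalised_poe(means, variances, weights):
    # Weighted product of Gaussians: precisions add, scaled by the weights.
    means, variances, weights = map(np.asarray, (means, variances, weights))
    precision = np.sum(weights / variances)
    var = 1.0 / precision
    mean = var * np.sum(weights * means / variances)
    return mean, var

def wasserstein_barycenter_1d(means, variances, weights):
    # 2-Wasserstein barycenter of univariate Gaussians (weights sum to 1):
    # the means and the standard deviations are each averaged linearly.
    means, variances, weights = map(np.asarray, (means, variances, weights))
    mean = np.sum(weights * means)
    std = np.sum(weights * np.sqrt(variances))
    return mean, std ** 2
```

With equal weights, the product rule pulls the combined mean toward the expert with the smaller variance, whereas the barycenter averages means and standard deviations directly.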
We present a Bayesian non-parametric way of inferring stochastic differential equations for both regression tasks and continuous-time dynamical modelling. The work places strong emphasis on the stochastic part of the differential equation, also known as the diffusion, and on modelling it with Wishart processes. Further, we present a semi-parametric approach that allows the framework to scale to high dimensions. This leads us to a way of modelling both latent and autoregressive temporal systems with conditional heteroskedastic noise. Experimentally, we verify that modelling diffusion often improves performance and that this randomness in the differential equation can be essential to avoid overfitting.
Learning workable representations of dynamical systems is becoming an increasingly important problem in a number of application areas. By leveraging recent work connecting deep neural networks to systems of differential equations, we propose variational integrator networks, a class of neural network architectures designed to preserve the geometric structure of physical systems. This class of network architectures facilitates accurate long-term prediction, interpretability, and data-efficient learning, while still remaining highly flexible and capable of modeling complex behavior. We demonstrate that they can learn dynamical systems both from noisy observations in phase space and from image pixels within which the unknown dynamics are embedded.
The interpretation of Large Hadron Collider (LHC) data in the framework of Beyond the Standard Model (BSM) theories is hampered by the need to run computationally expensive event generators and detector simulators. Performing statistically convergent scans of high-dimensional BSM theories is consequently challenging, and in practice unfeasible for very high-dimensional BSM theories. We present here a new machine learning method that accelerates the interpretation of LHC data, by learning the relationship between BSM theory parameters and data. As a proof-of-concept, we demonstrate that this technique accurately predicts natural SUSY signal events in two signal regions at the High Luminosity LHC, up to four orders of magnitude faster than standard techniques. The new approach makes it possible to rapidly and accurately reconstruct the theory parameters of complex BSM theories, should an excess in the data be discovered at the LHC.
Efficient sampling from Gaussian process posteriors is relevant in practical applications. With Matheron’s rule we decouple the …
Products of Gaussian process experts commonly suffer from poor performance when experts are weak. We propose aggregations and weighting …
Our group had three papers accepted at ICML 2020. Well done to everyone, and congratulations on some great work!
Congratulations to Simon Olofsson for defending his PhD!
We are looking for a (Senior) Research Fellow at the intersection of climate science and machine learning.
We are looking for a (Senior) Research Fellow at the intersection of robotics and machine learning.