The Statistical Machine Learning group is a research group at UCL’s Centre for Artificial Intelligence. Our research expertise is in data-efficient machine learning, probabilistic modeling, and autonomous decision making.

If you are interested in joining the Statistical Machine Learning group, please check out our openings.

Learning workable representations of dynamical systems is becoming an increasingly important problem in a number of application areas. …

Gibbs sampling is a Markov Chain Monte Carlo (MCMC) method often used in Bayesian learning. It is widely believed that MCMC methods are …

Mathematics for Machine Learning is a book that motivates people to learn mathematical concepts. The book is not intended to cover …

Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a model’s success …

Robots are envisioned as capable machines who easily navigate and interact in a world built for humans. However, looking around us we …

The Earth’s system dynamics has an intrinsically multi-scale and nonlinear nature, which fundamentally affects the ability to …

To achieve long-term climate change goals, such as limiting global warming to 1.5 or 2°C, there must be a global effort to decide and …

Way too often, observations from weather stations fall outside of the ensemble generated by initial uncertainty in weather models. This …


Learning workable representations of dynamical systems is becoming an increasingly important problem in a number of application areas. By leveraging recent work connecting deep neural networks to systems of differential equations, we propose variational integrator networks, a class of neural network architectures designed to preserve the geometric structure of physical systems. This class of network architectures facilitates accurate long-term prediction, interpretability, and data-efficient learning, while still remaining highly flexible and capable of modeling complex behavior. We demonstrate that they can accurately learn dynamical systems both from noisy observations in phase space and from image pixels within which the unknown dynamics are embedded.
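To give a flavor of the structure-preserving updates the abstract refers to, here is a minimal sketch of a Störmer–Verlet (leapfrog) step, the kind of symplectic integrator a variational integrator network would hard-wire into its architecture. In the paper the force comes from a learned network; here a known potential stands in, and all function names are illustrative rather than taken from the paper's code.

```python
def leapfrog_step(q, p, grad_potential, dt=0.1, mass=1.0):
    """One Stormer-Verlet (leapfrog) step. A variational integrator
    network hard-wires this symplectic update into its architecture;
    here grad_potential stands in for the learned network."""
    p_half = p - 0.5 * dt * grad_potential(q)            # half kick
    q_next = q + dt * p_half / mass                      # drift
    p_next = p_half - 0.5 * dt * grad_potential(q_next)  # half kick
    return q_next, p_next

# Sanity check with a known system: a harmonic oscillator, U(q) = q^2 / 2,
# so grad U(q) = q. The total energy stays near-constant over a long
# rollout, which is the structure-preserving behavior being exploited.
grad_U = lambda q: q
q, p = 1.0, 0.0
energy0 = 0.5 * (p**2 + q**2)
for _ in range(1000):
    q, p = leapfrog_step(q, p, grad_U)
energy_drift = abs(0.5 * (p**2 + q**2) - energy0)
```

Because the update is symplectic, the energy error stays bounded rather than drifting, even over thousands of steps; a generic unconstrained network has no such guarantee.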

Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a model’s success hinges upon its ability to faithfully represent predictive uncertainty. These problems typically exist as parts of larger frameworks, where quantities of interest are ultimately defined by integrating over posterior distributions. However, these algorithms’ inner workings rarely allow for closed-form integration, giving rise to a need for Monte Carlo methods. Despite substantial progress in scaling up Gaussian processes to large training sets, methods for accurately generating draws from their posterior distributions still scale cubically in the number of test locations. We identify a decomposition of Gaussian processes that naturally lends itself to scalable sampling by enabling us to efficiently generate functions that accurately represent their posteriors. Building off of this factorization, we propose decoupled sampling, an easy-to-use and general-purpose approach for fast posterior sampling. Decoupled sampling works as a drop-in strategy that seamlessly pairs with sparse approximations to Gaussian processes to afford scalability both during training and at test time. In a series of experiments designed to test competing sampling schemes’ statistical behaviors and practical ramifications, we empirically show that functions drawn using decoupled sampling faithfully represent Gaussian process posteriors at a fraction of the usual cost.
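The core idea behind such a decomposition can be sketched with Matheron's rule: draw an approximate prior function in weight space (here random Fourier features for an RBF kernel) and correct it with a function-space update at the training points. This is a minimal numpy illustration under toy data and an assumed unit lengthscale, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ls=1.0):
    """RBF kernel matrix between 1-D input arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

# Toy training data
X = np.array([-2.0, 0.0, 1.5])
y = np.sin(X)
noise = 1e-6

# 1) Weight-space part: a prior function draw via random Fourier features.
n_feat = 2000
omega = rng.normal(size=n_feat)                 # RBF spectral samples, ls = 1
tau = rng.uniform(0, 2 * np.pi, size=n_feat)
phi = lambda x: np.sqrt(2.0 / n_feat) * np.cos(np.outer(x, omega) + tau)
w = rng.normal(size=n_feat)
f_prior = lambda x: phi(x) @ w

# 2) Function-space part: Matheron update correcting the prior draw.
Knn = rbf(X, X) + noise * np.eye(len(X))
eps = rng.normal(size=len(X)) * np.sqrt(noise)
v = np.linalg.solve(Knn, y - f_prior(X) - eps)
f_post = lambda xs: f_prior(xs) + rbf(xs, X) @ v

# The posterior draw (approximately) interpolates the training data.
residual = np.abs(f_post(X) - y).max()
```

The point of the decoupling is cost: once `v` is computed, evaluating the posterior sample `f_post` at new locations is linear in the number of test points, rather than cubic as in location-by-location sampling.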

The interpretation of Large Hadron Collider (LHC) data in the framework of Beyond the Standard Model (BSM) theories is hampered by the need to run computationally expensive event generators and detector simulators. Performing statistically convergent scans of high-dimensional BSM theories is consequently challenging, and in practice unfeasible for very high-dimensional BSM theories. We present here a new machine learning method that accelerates the interpretation of LHC data, by learning the relationship between BSM theory parameters and data. As a proof-of-concept, we demonstrate that this technique accurately predicts natural SUSY signal events in two signal regions at the High Luminosity LHC, up to four orders of magnitude faster than standard techniques. The new approach makes it possible to rapidly and accurately reconstruct the theory parameters of complex BSM theories, should an excess in the data be discovered at the LHC.
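The surrogate-modeling idea here can be illustrated in a few lines: run the expensive simulator on a modest scan of theory parameters, then fit a cheap regressor that maps parameters to predicted signal counts. Everything below is synthetic, including the stand-in "simulator" and the choice of a quadratic-plus-cubic least-squares fit; the paper uses a machine learning regressor on real event generator output.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for an expensive event generator + detector simulation:
# maps two toy theory parameters to an expected signal-event count.
# (Purely illustrative; the real pipeline is vastly more costly.)
def slow_simulator(theta):
    m1, m2 = theta
    return 50.0 * np.exp(-0.5 * m1) + 10.0 * m2**2

# Run the expensive simulator on a modest training scan...
thetas = rng.uniform(0.0, 2.0, size=(200, 2))
counts = np.array([slow_simulator(t) for t in thetas])

# ...and fit a cheap surrogate via least squares on polynomial features.
def features(T):
    m1, m2 = T[:, 0], T[:, 1]
    return np.column_stack([np.ones(len(T)), m1, m2, m1 * m2,
                            m1**2, m2**2, m1**3, m1**2 * m2])

beta, *_ = np.linalg.lstsq(features(thetas), counts, rcond=None)
surrogate = lambda T: features(T) @ beta

# The surrogate now evaluates unseen parameter points essentially for free.
approx = surrogate(np.array([[1.0, 1.0]]))[0]
```

The speedup comes from amortization: the simulator is called only to build the training scan, after which scanning a high-dimensional theory space reduces to cheap surrogate evaluations.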

Bayesian optimization is a sample-efficient approach to global optimization that relies on theoretically motivated value heuristics (acquisition functions) to guide its search process. Fully maximizing acquisition functions produces the Bayes’ decision rule, but this ideal is difficult to achieve since these functions are frequently non-trivial to optimize. This statement is especially true when evaluating queries in parallel, where acquisition functions are routinely non-convex, high-dimensional, and intractable. We first show that acquisition functions estimated via Monte Carlo integration are consistently amenable to gradient-based optimization. Subsequently, we identify a common family of acquisition functions, including EI and UCB, whose characteristics not only facilitate but justify use of greedy approaches for their maximization.
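The Monte Carlo estimation the abstract mentions can be sketched as follows: reparameterize posterior draws over a query batch as `y = mu + L @ eps`, so the estimated acquisition value becomes a deterministic, almost-everywhere differentiable function of the posterior parameters. This is a minimal sketch of parallel Expected Improvement (q-EI) with made-up posterior values, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(2)

def qEI_monte_carlo(mu, Sigma, best, n_samples=10000):
    """Monte Carlo estimate of parallel Expected Improvement.
    The reparameterization y = mu + eps @ L.T makes the estimator
    a deterministic function of (mu, Sigma) given fixed base samples,
    which is what enables gradient-based maximization in practice."""
    L = np.linalg.cholesky(Sigma)
    eps = rng.normal(size=(n_samples, len(mu)))   # fixed base samples
    y = mu + eps @ L.T                            # reparameterized draws
    improvement = np.maximum(y.max(axis=1) - best, 0.0)
    return improvement.mean()

# A toy batch of two candidate points with correlated posterior beliefs.
mu = np.array([0.0, 0.5])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
est = qEI_monte_carlo(mu, Sigma, best=1.0)
```

In a full Bayesian optimization loop one would differentiate this estimate with respect to the candidate locations (through `mu` and `Sigma`) and ascend it; the greedy, one-point-at-a-time construction the abstract justifies builds the batch by repeatedly maximizing such an estimator.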

We are looking for a (Senior) Research Fellow at the intersection of climate science and machine learning.

We are looking for a (Senior) Research Fellow at the intersection of robotics and machine learning.

Rasmus Larsen visits SML