Recent & Upcoming Talks

Oct 19, 2023

Maud Lemercier: Non-adversarial training of Neural SDEs with signature kernel scores

Neural SDEs are continuous-time generative models for sequential data. State-of-the-art performance for irregular time series generation has previously been obtained by training these models adversarially as GANs. However, as is typical for GAN architectures, training is notoriously unstable, often suffers from mode collapse, and requires specialised techniques such as weight clipping and gradient penalty to mitigate these issues. In this talk, I will introduce a novel class of scoring rules on path space based on signature kernels and use them as an objective for training Neural SDEs non-adversarially. The strict properness of such kernel scores and the consistency of the corresponding estimators provide existence and uniqueness guarantees for the minimiser. With this formulation, evaluating the generator-discriminator pair amounts to solving a system of linear path-dependent PDEs, which allows for memory-efficient adjoint-based backpropagation. Moreover, because the proposed kernel scores are well-defined for paths with values in infinite-dimensional spaces of functions, this framework can be easily extended to generate spatiotemporal data. This procedure permits conditioning on a rich variety of market conditions and significantly outperforms alternative ways of training Neural SDEs on a variety of tasks including the simulation of rough volatility models, the conditional probabilistic forecasts of real-world forex pairs where the conditioning variable is an observed past trajectory, and the mesh-free generation of limit order book dynamics.

Oct 5, 2023

Thomas Baldwin-McDonald: Bayesian Deep Learning with Physics-informed Gaussian Processes

Dynamical systems are ubiquitous across the natural sciences, with many physical and biological processes being driven on a fundamental level by differential equations. In particularly complex systems it is often infeasible to characterise all of the individual processes present and the interactions between them. Rather than attempt to fully describe such systems, latent force models (LFMs) specify a simplified mechanistic model which captures salient features of the dynamics present. This leads to a model which is able to readily extrapolate beyond the training input space, thereby retaining one of the key advantages of mechanistic modeling over purely data-driven techniques. However, modeling nonlinear dynamical systems presents an additional challenge, as shallow models such as LFMs are generally less capable of modeling the non-stationarities often present in nonlinear systems than deep probabilistic models such as deep Gaussian processes (DGPs).

Feb 8, 2023

Viacheslav Borovitskiy: Geometric Gaussian Processes

Gaussian processes (GPs) are often considered to be the gold standard in settings where well-calibrated predictive uncertainty is of utmost importance, such as decision making. It is important for applications to have a class of “general purpose” GPs. Traditionally, these are the stationary processes, e.g. RBF or Matérn GPs, at least for the usual vectorial inputs. For non-vectorial inputs, however, there is often no such class. This state of affairs hinders the use of GPs in a number of application areas ranging from robotics to drug design. In this talk, I will consider GPs taking inputs on a manifold, on a node set of a graph, or in a discrete “space” of graphs. I will discuss a framework for defining the appropriate general purpose GPs, as well as the analytic and numerical techniques that make them tractable.

Feb 3, 2023

Michel Tsamados: AI for polar remote sensing: making sense or making it up?

My background is in physics and my early work since being a postdoc and academic has been on model parameterizations of sea ice and physically based remote sensing techniques to retrieve sea ice motion, waves and currents. While I am still attached to finely tuned models and precise Earth Observation satellite engineering, I will review instead in this presentation some recent work where we used artificial intelligence to derive innovative satellite products. The unifying theme will be how to retrieve information on sea ice thickness from space. First, I will show how we derived 20+ years of sea ice roughness (which turns out to be a good proxy for sea ice thickness) from NASA’s Multi-Angle Imaging SpectroRadiometer (MISR) instrument, modelled using support vector regression techniques with airborne training data. These pan-Arctic maps seem ideally suited to provide information for safe shipping (i.e. for a certain RRS Sir David Attenborough). Second, I will present how combining deep learning techniques for effective surface classifications with state-of-the-art altimetry allowed us to produce the first year-round sea ice thickness map in the Arctic. Finally, I will introduce some recent work using Gaussian Processes to perform optimal interpolation of altimetry tracks, both to infer sea ice variability at enhanced spatio-temporal resolutions and to provide improved statistical forecasts of sea ice extent.

Feb 2, 2023

Marc Killpack: Modelling and Optimal Control for Uncertain Robotic Systems

Over the last several years, our research group has worked to develop control and modeling methods for large-scale, deformable, pneumatic, robot manipulators. In parallel, we have also worked to understand how teams of human agents successfully communicate intent and reach consensus while co-manipulating large objects (in terms of volume, or mass, or both). In this talk, I will present a brief overview of soft robot modelling and control, and human-robot co-manipulation problems. Then I will share approaches that we have used in optimal control and machine learning to improve on state-of-the-art methods. We expect these advances to be essential for improving the performance of our soft robots for real-world tasks such as servicing satellites or space stations and working near human collaborators. However, we also expect these results for control of large degree-of-freedom, nonlinear, uncertain systems to extend beyond the field of soft robotics and human-robot collaboration. Finally, I will outline open questions that I believe experts at the UCL Centre for Artificial Intelligence are primed to answer and that would hopefully lead to potential collaborations.

Nov 10, 2022

Kalesha Bullard: Multi-Agent Reinforcement Learning towards Zero-Shot (Emergent) Communication

Effective communication is an important skill for enabling information exchange and cooperation in multi-agent settings, in which AI agents coexist in shared environments with other agents (artificial or human). Indeed, emergent communication is now a vibrant field of research, with common settings involving discrete cheap-talk channels. One limitation of this setting, however, is that it does not allow for the emergent protocols to generalize beyond the training partners. Furthermore, the typical problem setting of discrete cheap-talk channels may be less appropriate for embodied agents that communicate implicitly through action. This talk presents research that investigates methods for enabling AI agents to learn general communication skills through interaction with other artificial agents. In particular, the talk will focus on my postdoctoral work in cooperative Multi-Agent Reinforcement Learning, investigating emergent communication protocols inspired by communication in more realistic settings. We present a novel problem setting and a general approach that allows for zero-shot communication (ZSC), i.e., emergence of communication protocols that can generalize to independently trained agents. We also explore and analyze specific difficulties associated with finding globally optimal ZSC protocols as the complexity of the communication task increases or the modality for communication changes (e.g. from symbolic communication to implicit communication through physical movement by an embodied artificial agent). Overall, this work opens up exciting avenues for learning general communication protocols in more complex domains.

Nov 3, 2022

Dominik Baumann: Safe reinforcement learning: global exploration and discrete contexts

Leveraging reinforcement learning algorithms to control dynamical systems has become an increasingly popular approach over the past years. An important difference between dynamical systems and, for instance, gaming environments, is that failures in dynamical systems are often critical. While a game can simply be restarted, failures in dynamical systems often result in damaging expensive hardware. Thus, algorithms have emerged that guarantee, with high probability, that the system will not incur any failures during exploration. In this talk, I will present two recent approaches that fall into this category. First, in exchange for their guarantees, safe learning algorithms can often only explore locally around an initially given safe policy. That way, they may fail to find the global optimum. To address this, I present a recent approach that allows for global exploration while retaining probabilistic safety guarantees. Second, most algorithms focus on regression from continuous sensor inputs to actions of the system. In reality, system dynamics are often affected by discrete “context” variables, such as whether the surface is frozen or wet, which they cannot measure directly. Thus, I present an approach for multi-class classification that provides frequentist guarantees and, therefore, can be used to classify discrete contexts in safe learning algorithms while still providing probabilistic guarantees. Apart from theoretical guarantees, I also show results from hardware experiments for both approaches.

Oct 13, 2022

Xiaoyu Lu: Additive Gaussian Processes Revisited

Gaussian Process (GP) models are a class of flexible non-parametric models that have rich representational power. By using a Gaussian process with additive structure, complex responses can be modelled whilst retaining interpretability. Previous work showed that additive Gaussian process models require high-dimensional interaction terms. We propose the orthogonal additive kernel (OAK), which imposes an orthogonality constraint on the additive functions, enabling an identifiable, low-dimensional representation of the functional relationship. We connect OAK to the functional ANOVA decomposition, and show improved convergence rates for sparse computation methods. With only a small number of additive low-dimensional terms, we demonstrate that the OAK model achieves similar or better predictive performance compared to black-box models, while retaining interpretability.

Yasemin Bekiroğlu

Sep 29, 2022

Dmitry Berenson: Learning Where to Trust Unreliable Dynamics Models for Motion Planning and Manipulation

The world outside our labs seldom conforms to the assumptions of our models. This is especially true for dynamics models used in control and motion planning for complex high-DOF systems like deformable objects. We must develop better models, but we must also accept that, no matter how powerful our simulators or how big our datasets, our models will sometimes be wrong. This talk will present our recent work on using unreliable dynamics models for motion planning and manipulation. Given a dynamics model, our methods learn where that model can be trusted given either batch data or online experience. These approaches allow imperfect dynamics models to be useful for a wide range of tasks in novel scenarios, while requiring much less data than baseline methods. This data-efficiency is a key requirement for scalable and flexible motion planning and manipulation capabilities.

Marc Deisenroth

Jul 20, 2022

Dan Roy: Admissibility is Bayes Optimality with Infinitesimals

We give an exact characterization of admissibility in statistical decision problems in terms of Bayes optimality in a so-called nonstandard extension of the original decision problem, as introduced by Duanmu and Roy. Unlike the consideration of improper priors or other generalized notions of Bayes optimality, the nonstandard extension is distinguished, in part, by having priors that can assign ‘infinitesimal’ mass in a sense that can be made rigorous using results from nonstandard analysis. With these additional priors, we find that, informally speaking, a decision procedure δ0 is admissible in the original statistical decision problem if and only if, in the nonstandard extension of the problem, the nonstandard extension of δ0 is Bayes optimal among the (extensions of) standard decision procedures with respect to a nonstandard prior that assigns at least infinitesimal mass to every standard parameter value. We use the above theorem to give further characterizations of admissibility: one related to Blyth’s method; one related to a condition due to Stein, which characterizes admissibility under some regularity assumptions; and finally, a characterization using finitely additive priors in decision problems meeting certain regularity requirements. Our results imply that Blyth’s method is a sound and complete method for establishing admissibility. Buoyed by this result, we revisit the univariate two-sample common-mean problem, and show that the Graybill–Deal estimator is admissible among a certain class of unbiased decision procedures. Joint work with Haosui Duanmu (HIT) and David Schrittesser (Toronto).