Recent & Upcoming Talks

William Gregory: Improving Arctic Sea Ice Predictability with Gaussian Processes

Arctic sea ice is a major component of the Earth’s climate system, as well as an integral platform for travel, subsistence, and habitat. Since the late 1970s, the launch of Earth-observation satellites has brought significant advances in our ability to closely monitor the state of the ice cover in the polar regions. With now over four decades of time-series data at our disposal, we have observed significant reductions in the spatial extent of Arctic sea ice, and more recently its thickness, directly in line with increasing anthropogenic CO2 emissions. The summer months in particular show the largest rate of decline in sea ice extent of any season, as well as the largest inter-annual variability, making seasonal to inter-annual predictions difficult. Advance predictions of summer ice conditions are important because this is when the ice cover is at its minimum extent and the Arctic opens up to a whole host of traffic, including coastal resupply vessels, eco-tourism, and the movement of local communities. This presentation explores Gaussian processes as a framework both for sea ice forecasting and for optimally combining and interpolating multiple satellite observation sets. In the first instance, the spatio-temporal patterns of variability in past ice conditions are captured with a complex-network framework, which then enters a Gaussian process regression forecast model in the form of a random-walk graph kernel, to predict regional and pan-Arctic (basin-wide) September sea ice extents with high skill. Following this, we will see how this work can be extended to spatial forecasts by adopting a multi-task learning approach.
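As a rough sketch of the Gaussian process regression machinery behind such a forecast (with a generic squared-exponential kernel standing in for the random-walk graph kernel described in the talk, and a synthetic yearly series in place of the observed extents):

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel; a stand-in here for the random-walk
    # graph kernel, which instead measures similarity between networks.
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X_train, y_train, X_test, noise=1e-2):
    # Standard GP regression: posterior mean and variance at test inputs.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_test, X_train)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy example: forecast the next value of a smooth synthetic yearly series.
years = np.arange(1979, 2020, dtype=float)[:, None]
extent = 7.5 - 0.05 * (years[:, 0] - 1979) + 0.3 * np.sin(years[:, 0])
mean, var = gp_predict(years, extent, np.array([[2020.0]]))
```

In the actual forecast model, each input is a network built from past sea ice conditions rather than a scalar year, but the posterior computation is the same.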
In the second application, Gaussian process regression is used to optimally combine (and interpolate) observations from three separate satellite altimeters in space and time, producing the first daily pan-Arctic observational data set of sea ice freeboard (the base product for deriving sea ice thickness). Following this, we will see how this work can be extended with computational speed-ups by using relevance vector machines. In both the forecasting and interpolation applications, the hyperparameters of the models are learned through the empirical Bayes, or type-II maximum likelihood, approach, which in the second application allows us to derive information about the spatio-temporal correlation length scales of Arctic sea ice thickness.
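The type-II maximum likelihood approach can be illustrated with a minimal sketch on synthetic data: kernel hyperparameters (here a lengthscale, signal variance, and noise variance, which are assumptions of this sketch rather than the talk's actual parameterization) are chosen by maximizing the log marginal likelihood of the observations:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marginal_likelihood(log_params, X, y):
    # Type-II maximum likelihood: maximize p(y | X, theta) over kernel
    # hyperparameters theta = (lengthscale, variance, noise).
    # Optimizing in log-space keeps all three positive.
    lengthscale, variance, noise = np.exp(log_params)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = variance * np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * y^T K^-1 y + 0.5 * log|K| + (n/2) * log(2*pi)
    return (0.5 * y @ alpha + np.sum(np.log(np.diag(L)))
            + 0.5 * len(y) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)
res = minimize(neg_log_marginal_likelihood, np.log([1.0, 1.0, 0.1]),
               args=(X, y), method="L-BFGS-B")
lengthscale, variance, noise = np.exp(res.x)
```

The fitted lengthscale is exactly the kind of quantity the abstract refers to: in the freeboard application, learned spatial and temporal lengthscales are interpretable as correlation scales of sea ice thickness.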

Geoff Pleiss: Understanding Neural Networks through Gaussian Processes, and Vice Versa

Neural networks and Gaussian processes represent different learning paradigms: the former are parametric and trained via empirical risk minimization, while the latter are non-parametric and employ Bayesian inference. Despite these differences, I will discuss how Gaussian processes can help us understand and improve neural network design. One example is our recent work investigating the effect of width on neural networks. We study a generalized class of models, Deep Gaussian Processes (DGPs), in which parametric layers are replaced with GP layers. Analysis techniques from Bayesian nonparametrics uncover surprising pathologies of wide models, introduce a new interpretation of feature learning, and demonstrate a loss of adaptability with increasing width. We empirically confirm that these findings hold for DGPs, Bayesian neural networks, and conventional neural networks alike. Time permitting, I will also discuss recent work that leverages insights from neural network training to improve Gaussian process scalability. Taking inspiration from deep learning libraries, we constrain ourselves to GP inference algorithms that use only matrix multiplication and other linear operations, procedures amenable to GPU acceleration and distributed computing. While these methods induce a slight bias, which we quantify and bound through a novel numerical analysis, we demonstrate that it can be eliminated through randomized truncation techniques and stochastic optimization.
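A minimal sketch of the matrix-multiplication-only idea: the GP linear solve can be done with conjugate gradients, which touches the kernel matrix only through matrix-vector products. (Truncating the iteration early is one source of the slight bias the abstract mentions; the NumPy stand-in below is an illustration, not the talk's implementation.)

```python
import numpy as np

def conjugate_gradients(matmul, b, tol=1e-8, max_iter=200):
    # Solve K x = b using only matrix-vector products with K:
    # the single primitive that GPU-friendly GP inference builds on.
    x = np.zeros_like(b)
    r = b - matmul(x)          # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        Kp = matmul(p)
        step = rs / (p @ Kp)
        x += step * p
        r -= step * Kp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# A synthetic positive-definite "kernel" matrix, accessed only via a matmul.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))
K = A @ A.T + 50 * np.eye(50)
b = rng.standard_normal(50)
x = conjugate_gradients(lambda v: K @ v, b)
```

Because each step needs only a matmul, the kernel matrix never has to be factorized, or even stored explicitly, which is what makes GPU acceleration and distributed computing straightforward.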