Schedule | Statistics and Machine Learning Interface Meeting

Workshop Schedule

10:00 - 10:15	Registration for the Workshop

10:15 - 10:25	Welcome and Introduction [slides]
	Neil Lawrence

10:25 - 11:05	Geostatistical model, covariance structure and cokriging [slides]
	Hans Wackernagel, Mines ParisTech
	Kriging was introduced as a statistical interpolation method for the design of computer experiments some twenty years ago. However, many aspects of the geostatistical methodology originally developed for natural resource estimation have been ignored when switching to this new context. This talk reviews concepts of multivariate geostatistics and in particular the estimation of components of spatial variation in the context of multiple correlated outputs. Applications to ocean model output and remotely sensed sea-surface temperature data are discussed.

11:05 - 11:25	Discussion

11:25 - 11:45	Coffee Break

11:45 - 12:25	Gaussian process emulation of multiple outputs [slides]
	Tony O'Hagan, Department of Probability and Statistics, University of Sheffield
	The use of GPs to model the outputs of complex simulators is well established. The GP trained on a sample of simulator outputs is known as an emulator, and allows tasks such as uncertainty analysis and calibration to be done with a minimum of expensive simulator runs. Simulators typically produce many outputs, or even whole time series or spatial fields of outputs, and the challenge then is to find effective ways to emulate multiple outputs. I will review the various emulation strategies that have been suggested, drawing connections between them and discussing the situations under which alternative methods would be appropriate.

12:25 - 12:45	Discussion

12:45 - 14:15	Lunch

14:15 - 14:55	Efficient Sparse Approximations for Convolution Processes [slides]
	Mauricio Alvarez, School of Computer Science, University of Manchester
	One approach to account for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform, it is possible to establish dependencies between output variables. However, efficient inference for this approach is usually a critical point. We present different sparse approximations for dependent output Gaussian processes constructed through the convolution formalism. Basically, we exploit the conditional independencies present naturally in the model, leading to forms of the covariance similar in spirit to the PITC and FITC approximations for a single output. We also present another set of variational approximations that provide a rigorous lower bound for the marginal likelihood of the model and introduce the concepts of variational inducing functions and variational inducing kernels to allow the latent functions to be white noise processes. Joint work with Neil Lawrence, David Luengo and Michalis Titsias.

14:55 - 15:15	Discussion

15:15 - 15:55	What should be transferred in transfer learning? [slides]
	Chris Williams, School of Informatics, University of Edinburgh
	I will start by discussing multi-task learning, and a number of ways in which transfer between tasks can take place, mainly in a co-kriging (or Gaussian process) framework. If there is time I will go into more detail on Multi-task Gaussian Process Learning of Robot Inverse Dynamics (joint work with Kian Ming Chai, Stefan Klanke, Sethu Vijayakumar).

15:55 - 16:15	Discussion

16:15 - 16:35	Tea Break

16:35 - 17:15	Latent Force Models and Multiple Output Gaussian Processes [slides]
	Neil D. Lawrence, School of Computer Science, University of Manchester
	We are used to dealing with the situation where we have a latent variable. Often we assume this latent variable to be independently drawn from a distribution, e.g. probabilistic PCA or factor analysis. This simplification is often extended for temporal data where tractable Markovian independence assumptions are used (e.g. Kalman filters or hidden Markov models). In this talk we will consider the more general case where the latent variable is a forcing function in a differential equation model. We will show how for some simple ordinary differential equations the latent variable can be dealt with analytically for particular Gaussian process priors over the latent force. In this talk we will introduce the general framework and present results in systems biology and motion capture.

17:15 - 17:35	Discussion

19:00 - 21:00	Workshop Dinner
	New Samsi Restaurant

Friday 24 July

10:00 - 10:40	Using prior knowledge in dynamic settings for multivariate Gaussian processes [slides]
	Dan Cornford, Computer Science, Aston University
	In this talk I will review the basic methods that physical scientists have been using for many years to work with huge, uncertain multivariate systems. I'll try and provide a context, starting with early work on constructing balanced fields in weather prediction models, move on to more modern dynamic methods based on 'ensembles' (which I will define) and finally briefly present some of our recent work on variational approaches to inference in stochastic dynamic models and show how this relates to the earlier works. I will try and suggest what I think the big open issues are, and how these might be addressed.

10:40 - 11:00	Discussion

11:00 - 11:20	Coffee Break

11:20 - 12:00	Bayes Linear Emulation of Computer Models
	Ian Vernon, Department of Mathematical Sciences, University of Durham
	I will introduce Bayes Linear Methodology, where expectation as opposed to probability is viewed as primitive. In this approach Bayes Theorem is replaced by the corresponding Bayes Linear Update for expectations, and we require prior specification of only expectations, variances and covariances of all quantities of interest. I will discuss how this can be applied naturally to the process of emulating computer model output, and describe an application involving the calibration of a Galaxy Formation simulation.

12:00 - 12:20	Discussion

12:20 - 13:20	Lunch

13:20 - 14:00	Calibrating the UVic climate model using principal component emulation [slides]
	Richard Wilkinson, Department of Probability and Statistics, University of Sheffield
	Uncertainties about potential feedbacks in the terrestrial carbon cycle are a key driver of the uncertainty in carbon cycle and climate projections. Here we analyze how oceanic, atmospheric and ice core carbon cycle observations improve key biogeochemical parameter estimates. We emulate the output of the UVic climate model using principal component analysis to reduce the dimension of the output before using Gaussian processes to model the reduced dimension model. We then reconstruct to the full space in order to calibrate the model, carefully accounting for code uncertainty as well as measurement, model and reconstruction error. Joint work with Nathan Urban.

14:00 - 14:20	Discussion

14:20 - 15:00	Multivariate Emulation: Is it Worth the Trouble? [slides]
	Tom Fricker, Department of Probability and Statistics, University of Sheffield
	An emulator is a statistical surrogate for an expensive computer model, used to obtain fast probabilistic predictions of the outputs. Gaussian processes are frequently used for this purpose. Typically the data used to train the emulator is isotopic: values for all outputs are available at all sampling points. In this scenario, independent univariate GP emulators often produce predictions of individual outputs that are at least as good as a multivariate GP emulator, and avoid the difficulty of specifying the cross-covariance structure. This raises the question: why bother with a multivariate specification? In this talk I shall present examples where, while independent emulators are indeed the best option if interest is only in the marginal predictions of individual outputs, important information is lost when we consider what the model-user wishes to actually do with the outputs. Then the multivariate specification becomes necessary, and I shall compare some different options that are available for the covariance structure.

15:00 - 15:20	Discussion

15:20 - 15:40	Tea Break

15:40 - 16:20	Generalization Errors and Learning Curves for Regression with Multi-task Gaussian Processes
	Kian Ming Chai, School of Informatics, University of Edinburgh, U.K.
	I will discuss how task correlations in multi-task Gaussian process (GP) regression affect the generalization error and the learning curve, concentrating on the asymmetric two-task case. Lower and upper bounds to the generalization error and the learning curve will be given.

16:20 - 16:40	Discussion

16:40 - 16:45	Closing Comments
	Neil Lawrence