Workshop Schedule

07:30 - 07:45 Introduction
  Neil D. Lawrence, University of Manchester
07:45 - 08:25 Geostatistics for Gaussian Processes [video] [slides]
  Hans Wackernagel, MINES-ParisTech
  Gaussian process methodology has inspired a number of stimulating new ideas in the area of machine learning. Kriging was introduced as a statistical interpolation method for the design of computer experiments some twenty years ago, yet some aspects of the geostatistical methodology originally developed for natural resource estimation have been ignored in the switch to this new context. This talk reviews concepts of geostatistics, in particular the estimation of components of spatial variation in the context of multiple correlated outputs.
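  As background for the kriging discussion, a minimal sketch of simple kriging, i.e. Gaussian process interpolation with a squared-exponential covariance; the length scale, nugget and data below are illustrative assumptions, not values from the talk.

```python
# Simple kriging / GP interpolation sketch; covariance choice and data are illustrative.
import numpy as np

def sq_exp_cov(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

# Observed locations and values (hypothetical).
x_obs = np.array([0.0, 1.0, 2.5, 4.0])
y_obs = np.array([0.2, 0.9, -0.3, 0.5])
x_new = np.linspace(0.0, 4.0, 50)

noise = 1e-4                                  # small nugget for numerical stability
K = sq_exp_cov(x_obs, x_obs) + noise * np.eye(len(x_obs))
K_star = sq_exp_cov(x_new, x_obs)

# Kriging predictor: conditional mean and variance of the GP at the new locations.
alpha = np.linalg.solve(K, y_obs)
mean = K_star @ alpha
var = sq_exp_cov(x_new, x_new).diagonal() - np.einsum(
    "ij,ji->i", K_star, np.linalg.solve(K, K_star.T))
```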
08:25 - 09:05 Borrowing strength, learning vector valued functions, and supervised dimension reduction [video] [slides]
  Sayan Mukherjee, Duke University
  We study the problem of supervised dimension reduction from the perspective of learning vector valued functions and multi-task or hierarchical modeling in a regularization framework. An algorithm is specified and empirical results are provided. In the second part of the talk, the same problem of supervised dimension reduction for a hierarchical model is revisited from a non-parametric Bayesian perspective.
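  To make the idea of supervised dimension reduction concrete, here is a hedged sketch in the spirit of gradient-based methods (not the algorithm from the talk): estimate the regression function by kernel ridge regression, form the expected gradient outer product, and keep its leading eigenvectors. Data, kernel and parameters are illustrative.

```python
# Gradient-outer-product sketch of supervised dimension reduction; all values illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 200, 10, 2                       # samples, ambient dimension, target dimension
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n)

sigma, lam = 2.0, 1e-2
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2 * sigma ** 2))
alpha = np.linalg.solve(K + lam * np.eye(n), y)   # kernel ridge regression coefficients

# Gradient of the kernel ridge estimate at each training point:
# grad f(x_j) = sum_i alpha_i K(x_j, x_i) (x_i - x_j) / sigma^2
A = alpha[None, :] * K
grads = (A @ X - A.sum(axis=1, keepdims=True) * X) / sigma ** 2

# Expected gradient outer product and its leading eigenvectors give the subspace.
G = grads.T @ grads / n
eigvals, eigvecs = np.linalg.eigh(G)
B = eigvecs[:, -d:]                        # estimated dimension-reduction directions
X_reduced = X @ B
```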
09:05 - 09:30 Coffee Break
09:30 - 10:10 Gaussian processes and process convolutions from a Bayesian Perspective [video] [slides]
  Dave Higdon, Los Alamos National Laboratory
10:10 - 10:30 Discussion session
15:30 - 16:10 Prior Knowledge and Sparse Methods for Convolved Multiple Outputs Gaussian Processes [video] [slides]
  Mauricio A. Alvarez, University of Manchester
  One approach to accounting for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform it is possible to establish dependencies between output variables. Two important aspects of this framework are how to introduce prior knowledge and how to perform efficient inference. By relating the convolution operation to dynamical systems, we can specify richer covariance functions for multiple outputs. We also present different sparse approximations for dependent output Gaussian processes in the context of structured covariances. Joint work with Neil Lawrence, David Luengo and Michalis Titsias.
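  A minimal numerical sketch of the convolution-process construction for multiple outputs: both outputs share one latent Gaussian process, each smoothed by its own kernel, which induces non-trivial cross-correlations. The kernel widths, grid and latent covariance are illustrative choices, not values from the talk.

```python
# Convolution-process sketch: two outputs built from one latent GP draw; values illustrative.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 500)
dx = x[1] - x[0]

# Latent GP u(x) with a squared-exponential covariance, sampled on the grid.
K_u = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.3 ** 2)
u = rng.multivariate_normal(np.zeros_like(x), K_u + 1e-6 * np.eye(len(x)))

def smoothing_kernel(width):
    """Gaussian smoothing kernel G_d used in the convolution f_d = G_d * u."""
    t = np.arange(-5 * width, 5 * width + dx, dx)
    g = np.exp(-0.5 * (t / width) ** 2)
    return g / g.sum()

# Two outputs obtained by convolving the same latent draw with different kernels.
f1 = np.convolve(u, smoothing_kernel(0.5), mode="same")
f2 = np.convolve(u, smoothing_kernel(1.5), mode="same")

# Empirical check that the construction couples the two outputs (they share the latent draw).
print("corr(f1, f2) =", np.corrcoef(f1, f2)[0, 1])
```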
16:10 - 16:50 Multi-Task Learning and Matrix Regularization [video] [slides]
  Andreas Argyriou, Toyota Technological Institute
  Multi-task learning extends the standard paradigm of supervised learning. In multi-task learning, samples for multiple related tasks are given and the goal is to learn a function for each task while also generalizing well (transferring learned knowledge) to new tasks. The applications of this paradigm are numerous, ranging from computer vision to collaborative filtering and bioinformatics, and it also relates to vector valued problems, multiclass and multiview learning, etc. I will present a framework for multi-task learning which is based on learning a common kernel for all tasks. I will also show how this formulation connects to the trace norm and group Lasso approaches. Moreover, the proposed optimization problem can be solved using an alternating minimization algorithm which is simple and efficient. It can also be "kernelized" by virtue of a multi-task representer theorem, which holds for a large family of matrix regularization problems and includes the classical representer theorem as a special case.
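  A hedged sketch of the kind of alternating minimization described in the abstract: the tasks share a structure matrix D, and the method alternates between per-task generalized ridge solves and a closed-form update of D. The synthetic data, regularization strength and iteration count are illustrative assumptions.

```python
# Alternating minimization sketch for multi-task feature learning; values illustrative.
import numpy as np

rng = np.random.default_rng(2)
p, n_tasks, n = 20, 5, 50
shared = rng.normal(size=(p, 2))                     # tasks share a 2-dim structure
tasks = []
for _ in range(n_tasks):
    w_true = shared @ rng.normal(size=2)
    X = rng.normal(size=(n, p))
    y = X @ w_true + 0.1 * rng.normal(size=n)
    tasks.append((X, y))

lam, eps = 0.1, 1e-6
D = np.eye(p) / p                                    # initial structure matrix, trace(D) = 1
W = np.zeros((p, n_tasks))
for _ in range(50):
    D_inv = np.linalg.inv(D + eps * np.eye(p))
    # Step 1: with D fixed, each task reduces to a generalized ridge regression.
    for t, (X, y) in enumerate(tasks):
        W[:, t] = np.linalg.solve(X.T @ X + lam * D_inv, X.T @ y)
    # Step 2: with W fixed, D = (W W^T)^{1/2} / trace((W W^T)^{1/2}).
    U, s, _ = np.linalg.svd(W, full_matrices=False)
    sqrt_WWt = (U * s) @ U.T
    D = sqrt_WWt / np.trace(sqrt_WWt)
```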
16:50 - 17:20 Coffee Break
17:20 - 18:00 Learning Vector Fields with Spectral Filtering [video] [slides]
  Lorenzo Rosasco, Massachusetts Institute of Technology and Università di Genova
  We present a class of regularized kernel methods for vector valued learning which are based on filtering the spectrum of the kernel matrix. The considered methods include Tikhonov regularization as a special case, as well as interesting alternatives such as vector valued extensions of L2 boosting. While preserving the good statistical properties of Tikhonov regularization, some of the new algorithms allow for a much faster implementation since they require only matrix-vector multiplications. We discuss the computational complexity of the different methods, taking into account the regularization parameter choice step. The results of our analysis are supported by numerical experiments.
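  A minimal sketch contrasting two spectral filters on a kernel matrix: Tikhonov regularization (one linear solve) and Landweber iteration, an L2-boosting-style filter that needs only matrix-vector multiplications, with early stopping playing the role of the regularization parameter. The kernel, step size and data are illustrative assumptions, and the scalar case stands in for the vector valued setting of the talk.

```python
# Spectral filtering sketch: Tikhonov vs. Landweber iteration; values illustrative.
import numpy as np

rng = np.random.default_rng(3)
n = 100
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

K = np.exp(-0.5 * (X[:, 0][:, None] - X[:, 0][None, :]) ** 2)   # Gaussian kernel matrix

# Tikhonov regularization (kernel ridge regression): alpha = (K + n*lam*I)^{-1} y.
lam = 1e-2
alpha_tik = np.linalg.solve(K + n * lam * np.eye(n), y)

# Landweber iteration: alpha <- alpha + eta * (y - K alpha) / n.
# Each step costs only one multiplication by K; the iteration count regularizes.
eta = n / np.linalg.norm(K, 2)          # step size bounded by the largest eigenvalue of K/n
alpha_lw = np.zeros(n)
for _ in range(200):
    alpha_lw = alpha_lw + eta * (y - K @ alpha_lw) / n

print("fit difference:", np.linalg.norm(K @ alpha_tik - K @ alpha_lw))
```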
18:00 - 18:30 Discussion session