journeesdusud

June 14	June 15	June 16
8:30 - 9:00 Welcome
9:00 - 10:30 V. Koltchinskii (course 1)	9:00 - 10:30 R. Willett (course 2)	9:00 - 10:00 R. Nickl
10:30 - 11:00 Coffee break	10:30 - 11:00 Coffee break	10:00 - 10:30 Coffee break
11:00 - 12:00 L. Cavalier	11:00 - 12:00 V. Spokoiny	10:30 - 11:30 P. Pudlo
12:00 - 14:30 Buffet lunch	12:00 - 14:30 Buffet lunch	11:30 - 12:30 G. Lugosi
14:30 - 16:00 R. Willett (course 1)	14:30 - 16:00 V. Koltchinskii (course 2)	12:30 - 14:00 Buffet lunch
16:00 - 16:30 Coffee break	16:00 - 16:30 Coffee break
16:30 - 17:30 L. Rosasco	16:30 - 17:30 T-M Pham-Ngoc
17:30 - 18:30 C. Tuleau-Malot	17:30 - 18:30 E. Arias-Castro

__________________________________________________________________________________

Detailed program

Tuesday, June 14

9:00 - 10:30 V. Koltchinskii (course 1) Low Rank Matrix Recovery: Regularization and Oracle Inequalities

We will discuss the problem of estimating a large matrix A based on independent measurements of n linear functionals of this matrix. If A is low rank, or can be well approximated by low rank matrices, the problem is to estimate A with an error that depends on the rank rather than on the overall size of the matrix. The basic examples of such problems are matrix completion, which has been intensively studied in recent years, especially in the case of noiseless measurements (e.g., Candes and Recht (2009); Candes and Tao (2009)), and the estimation of a density matrix in quantum state tomography (e.g., Gross (2009)). We will discuss the most popular low rank estimation method, based on nuclear norm penalization, as well as some other methods suitable for density matrix estimation (von Neumann entropy penalization and low rank recovery “without penalization”). Our goal is to establish low rank oracle inequalities in the spirit of recent work on von Neumann entropy penalization by Koltchinskii (2010) and on nuclear norm penalization by Koltchinskii, Lounici and Tsybakov (2011). The proofs of these inequalities rely on a variety of empirical process tools, including concentration inequalities, generic chaining bounds and noncommutative extensions of classical exponential bounds for sums of independent random variables.
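
For orientation, nuclear norm penalized least squares in the trace regression setting is commonly written as follows. The notation (X_j, Y_j, \xi_j, \varepsilon, S) is an assumption introduced here and does not come from the abstract; the estimators studied in the course may differ in their details.

```latex
Y_j = \operatorname{tr}(A^{\top} X_j) + \xi_j, \qquad j = 1, \dots, n,
\qquad
\hat{A}^{\varepsilon} \in \operatorname*{argmin}_{S}
\left\{ \frac{1}{n} \sum_{j=1}^{n} \bigl( Y_j - \operatorname{tr}(S^{\top} X_j) \bigr)^2
+ \varepsilon \, \| S \|_{1} \right\},
\qquad
\| S \|_{1} = \sum_{k} \sigma_k(S).
```

Here \sigma_k(S) are the singular values of S, so the penalty favors matrices of low rank in the same way an l1 penalty favors sparse vectors.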

11:00 - 12:00 L. Cavalier Inverse problems in statistics

Inverse problems appear in many fields. Some examples are: astronomy (blurred images of the Hubble satellite), econometrics (instrumental variables), financial mathematics (model calibration of the volatility), medical image processing (X-ray tomography) and quantum physics (quantum homodyne tomography). These are problems where the object to be reconstructed (a function) is observed only indirectly, through a linear operator A. One needs regularization methods in order to get a stable and accurate reconstruction. We present the framework of statistical inverse problems, where the data are corrupted by some stochastic error. This white noise model may be discretized in the spectral domain using the Singular Value Decomposition (SVD) when the operator A is compact. Several examples of inverse problems where the SVD is known are presented (circular deconvolution, tomography). We explain some basic issues regarding nonparametric statistics applied to inverse problems. Standard regularization methods are presented (projection, Landweber, Tikhonov, ...). The notion of optimal rate of convergence leads to an optimal choice of the tuning parameter. However, these optimal parameters are unachievable since they depend on the unknown smoothness of the function. This leads to more recent concepts like adaptive estimation and oracle inequalities. Data-driven selection procedures for the regularization parameter are discussed.
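
As a rough illustration of regularization in the spectral domain, the sketch below works in a sequence space model y_k = b_k * theta_k + noise, where b_k stands for the singular values of A. The decay rates, noise level, cut-off and Tikhonov parameter are all hypothetical choices made for this example, not values from the talk.

```python
import random

def spectral_cutoff(y, b, n_keep):
    """Projection estimator: invert the first n_keep coefficients, zero the rest."""
    return [y_k / b_k if k < n_keep else 0.0
            for k, (y_k, b_k) in enumerate(zip(y, b))]

def tikhonov(y, b, gamma):
    """Tikhonov estimator: inverted coefficient times the filter b_k^2 / (b_k^2 + gamma)."""
    return [b_k * y_k / (b_k ** 2 + gamma) for y_k, b_k in zip(y, b)]

def squared_error(estimate, truth):
    """Sum of squared coefficient errors."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))

# Hypothetical toy problem (every constant below is an illustrative assumption).
rng = random.Random(0)
n_coef = 200
noise_level = 0.01
b = [1.0 / (k + 1) for k in range(n_coef)]           # singular values decaying like 1/k
theta = [1.0 / (k + 1) ** 2 for k in range(n_coef)]  # smooth "true" coefficients
y = [b_k * t_k + noise_level * rng.gauss(0.0, 1.0) for b_k, t_k in zip(b, theta)]

naive = [y_k / b_k for y_k, b_k in zip(y, b)]        # unregularized inversion
print("naive      :", squared_error(naive, theta))
print("projection :", squared_error(spectral_cutoff(y, b, 10), theta))
print("Tikhonov   :", squared_error(tikhonov(y, b, 1e-2), theta))
```

The naive inversion divides the noise by small singular values, which is why its error is much larger than that of either regularized estimator. Here the tuning parameters are fixed by hand; choosing them from the data is the adaptive problem the abstract refers to.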

14:30 - 16:00 R. Willett (course 1) Signal Reconstruction from Poisson Data – Performance Bounds, Algorithms, and Physical Constraints

Many critical scientific and engineering applications rely upon the accurate reconstruction of spatially or temporally distributed phenomena from Poisson data. When the number of observed events is very small, accurately extracting knowledge from this data requires the development of both new computational methods and novel theoretical analysis frameworks. This task is particularly challenging since sensing is often indirect in nature, such as in compressed sensing or tomographic projections in medical imaging, resulting in complicated reconstruction problems. Furthermore, limited system resources, such as data acquisition time and sensor array size, lead to complex tradeoffs between sensing and processing. All of these issues combine to make accurate reconstruction a complicated task, involving a myriad of system-level and algorithm tradeoffs.

In this talk, I will describe a theoretical framework for assessing tradeoffs between reconstruction accuracy and system resources when the underlying intensity is sparse. The theory supporting these methods facilitates characterization of fundamental performance limits. Examples include lower bounds on the best achievable error performance in photon-limited image reconstruction and upper bounds on the data acquisition time required to achieve a target reconstruction accuracy. We will also see that compressed sensing with Poisson noise has very different properties than more conventional formulations. Finally, I will describe novel reconstruction algorithms which use a penalized negative Poisson log-likelihood objective function with nonnegativity constraints (since Poisson intensities are naturally nonnegative). The effectiveness of the theory and methods will be demonstrated for several important applications, including coded aperture imaging and medical image reconstruction.

16:30 - 17:30 L. Rosasco Spectral Methods for Computational Learning

In this talk we present a class of spectral methods to learn from high dimensional data sets arising in a wide variety of applications. The approach we propose can be applied to supervised learning (regression, classification, multiclass/multitask) as well as to unsupervised learning (set/support estimation). Empirically, the derived algorithms obtain state-of-the-art performance on both simulated and real data, while achieving optimal learning rates from a theoretical point of view. Interestingly, our analysis indicates a deep connection between inference and computational principles. The tools we build upon are spectral and analytical methods that highlight the relationships between learning theory and other fields in applied sciences, such as inverse problems, statistics and signal processing.

17:30 - 18:30 C. Tuleau-Malot Adaptive density estimation: a curse of support?

The estimation of a density on the real line is a classical problem, since it is at the core of many data preprocessing tasks. However, many works rely on the assumption that the support of the underlying density is a known compact set, and often that the density is bounded. With P. Reynaud-Bouret and V. Rivoirard, we developed a new adaptive method, based on wavelet thresholding, which makes as few assumptions as possible on the density and, in particular, no assumption on its support. In this presentation, after a review of the state of the art, a first part is devoted to the practical point of view. I compare, in practice, our method to some others, in particular to the Gaussian kernel estimator, to the root-unroot algorithm and to the method of Willett and Nowak. In this way, I show the influence of the support on the estimation. The second part is more theoretical and presents the main results obtained.

Wednesday, June 15

9:00 - 10:30 R. Willett (course 2) Signal Reconstruction from Poisson Data – Performance Bounds, Algorithms, and Physical Constraints

11:00 - 12:00 V. Spokoiny Parametric inference. Revisited

The classical parametric theory is based on the assumptions of parametric structure and of a large sample size (relative to the number of parameters). The talk discusses the parametric estimation and inference problem in the situation when the parametric model is possibly misspecified and the sample size is fixed. The main results describe the concentration properties of the (quasi) MLE and the coverage bounds of the likelihood based confidence sets. Corollaries about approximate efficiency and expansions of the MLE are presented. We also discuss extensions to penalized MLE, Bayes and semiparametric estimation.

14:30 - 16:00 V. Koltchinskii (course 2) Low Rank Matrix Recovery: Regularization and Oracle Inequalities

16:30 - 17:30 T-M Pham-Ngoc Spherical deconvolution: the dictionary approach and needlet thresholding algorithm

In this talk, we deal with the problem of spherical deconvolution. We present two different approaches to this problem: a thresholding algorithm based on the very well localized basis of needlets, and a well calibrated l1 criterion which allows us to consider an overcomplete dictionary built from needlets and spherical harmonics. We obtain theoretical performance guarantees for these two methods and we compare their practical performance.

17:30 - 18:30 E. Arias-Castro Cluster Detection in Networks using Percolation

We consider the task of detecting a salient cluster in a (sensor) network, i.e., an undirected graph with a random variable attached to each node. Motivated by recent research in environmental statistics and the drive to compete with the reigning scan statistic, we explore alternatives based on the percolative properties of the network. The first method is based on the size of the largest connected component after removing any node in the network whose value is lower than a given threshold. The second one is the upper level set scan test introduced by Patil and Taillie (2003). We establish their performance in an asymptotic decision theoretic framework where the network size increases. We make abundant use of percolation theory to derive our theoretical results and our theory is complemented with some numerical experiments. [Joint work with Geoffrey Grimmett (Cambridge)]
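
The first method can be sketched in a few lines: drop every node whose value is below the threshold, then measure the largest connected component among the survivors. The square lattice, the node values and the function names below are hypothetical and serve only to illustrate the statistic; calibrating the threshold and the rejection rule is the subject of the talk and is not attempted here.

```python
from collections import deque

def largest_surviving_component(neighbors, values, threshold):
    """Size of the largest connected component among nodes with value >= threshold."""
    alive = {v for v in neighbors if values[v] >= threshold}
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        # Breadth-first search restricted to surviving nodes.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in neighbors[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

def grid_graph(rows, cols):
    """Adjacency lists of a rows x cols square lattice (4-neighbour)."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            neighbors[(r, c)] = [(r + dr, c + dc)
                                 for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                                 if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return neighbors

graph = grid_graph(4, 4)
values = {node: 0.0 for node in graph}
for node in [(1, 1), (1, 2), (2, 1), (2, 2), (0, 3)]:
    values[node] = 3.0  # hypothetical elevated readings
print(largest_surviving_component(graph, values, threshold=1.0))  # prints 4
```

The 2 x 2 block of elevated nodes forms one component of size 4, while the isolated elevated node at (0, 3) contributes a component of size 1.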

Thursday, June 16

9:00 - 10:00 R. Nickl Adaptive Nonparametric Confidence Sets

I shall review some of the key problems with and results on adaptive nonparametric confidence sets, both for confidence balls and confidence bands. The various ways adaptive confidence statements are connected with nonparametric testing problems will be highlighted. I will discuss recent results that give necessary and sufficient conditions for the existence of adaptive confidence balls and bands, and explain the intimate link of such results to the study of certain minimax nonparametric hypothesis testing problems where both the null and alternative hypotheses are composite and infinite-dimensional.

10:30 - 11:30 P. Pudlo Approximate Bayesian computational (ABC) methods: an overview

Also known as likelihood-free methods, approximate Bayesian computational (ABC) methods have appeared in the past ten years as the most satisfactory approach to intractable likelihood problems, first in genetics and then in a broader spectrum of applications. However, these methods suffer to some degree from calibration difficulties that make them rather volatile in their implementation and thus render them suspicious to the users of more traditional Monte Carlo methods. We will review their recent developments and illustrate them on a population genetics experiment, where models based on coalescence processes do not allow computation of the likelihood.
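
The simplest member of this family is the ABC rejection sampler, sketched below: draw a parameter from the prior, simulate data, and keep the draw only if a summary of the simulated data falls close to the observed summary. The Gaussian toy model, the flat prior, the tolerance and all function names are assumptions made for illustration; they are not taken from the talk, and the tolerance is exactly the kind of calibration choice the abstract warns about.

```python
import random
import statistics

def abc_rejection(observed_summary, prior_sampler, simulator, summary,
                  tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lands within tolerance of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        fake_data = simulator(theta, rng)
        if abs(summary(fake_data) - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted

# Hypothetical toy model: Gaussian data with unknown mean and unit variance.
rng = random.Random(1)
posterior_draws = abc_rejection(
    observed_summary=2.0,                          # assumed observed sample mean
    prior_sampler=lambda r: r.uniform(-5.0, 5.0),  # flat prior on the mean
    simulator=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(20)],
    summary=statistics.fmean,
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)
print(len(posterior_draws), statistics.fmean(posterior_draws))
```

No likelihood is ever evaluated: the sampler only needs the ability to simulate from the model, which is what makes the approach usable for coalescent-type models.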

11:30 - 12:30 G. Lugosi Random geometric graphs in high dimensions.

Motivated by a statistical problem of testing dependencies, we introduce a model of random geometric graphs in high dimensions. We show that as the dimension grows, the graph becomes similar to an Erdős-Rényi random graph. We pay particular attention to the clique number of such graphs and show that it is very close to that of the corresponding Erdős-Rényi graph when the dimension is larger than log^3 n, where n is the number of vertices. The talk is based on joint work with Luc Devroye, András György, and Frederic Udina.
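
One way to simulate such a graph is sketched below. It assumes a specific construction that the abstract does not spell out: vertices are independent uniform points on the unit sphere, and two vertices are joined when their inner product is at least a threshold. With threshold 0 each edge appears with probability 1/2 by symmetry. The function names and parameter values are hypothetical.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Uniform point on the unit sphere, obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def geometric_graph(n_vertices, dim, threshold, rng):
    """Edge set {(i, j), i < j} joining points whose inner product is >= threshold."""
    points = [random_unit_vector(dim, rng) for _ in range(n_vertices)]
    edges = set()
    for i in range(n_vertices):
        for j in range(i + 1, n_vertices):
            if sum(a * b for a, b in zip(points[i], points[j])) >= threshold:
                edges.add((i, j))
    return edges

def edge_density(edges, n_vertices):
    """Fraction of the possible vertex pairs that are joined by an edge."""
    return len(edges) / (n_vertices * (n_vertices - 1) / 2)

rng = random.Random(2)
edges = geometric_graph(n_vertices=40, dim=50, threshold=0.0, rng=rng)
print(edge_density(edges, 40))  # should be near 1/2 when the threshold is 0
```

Comparing statistics such as the clique number of this graph with those of an Erdős-Rényi graph of the same edge density, for increasing dim, is a simple numerical way to explore the phenomenon described in the abstract.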