This page is no longer updated. Future seminars are here.



Matthew Nunes
University of Bath
Friday, June 19, 2020
Time 16:00 – zoom meeting
Title: Spectral analysis and stationarity tests for time series with missing values
Abstract
Quadratic forms are ubiquitous and intensively studied in statistics, often in time series analysis, including those formed from wavelet coefficients. Most wavelet methods in statistics assume regularly spaced, complete data, which does not always occur in real problems, where observations are sometimes missing, resulting in a non-regular design. To handle this, we use second-generation wavelets (lifting), which are explicitly designed to handle non-regular situations: we introduce a new estimator of the second-generation wavelet spectrum and show that it is consistent for an underlying locally stationary wavelet process whose observations are subject to a random dropout model. Our new estimator is then used to construct a new lifting-based stationarity test, with significance assessed by the bootstrap. A simulation study shows excellent results, not only on time series with missing observations but also in complete-data settings.

 ___

Prof. Sonja Petrovic
Illinois Institute of Technology
Friday, June 12, 2020
Time 15:00 – zoom meeting
Title: Algebraic statistics, tables, and networks
Abstract
Testing goodness of fit of a model for network data is a difficult problem that has received some attention recently from the statistical community. We will overview this problem from the point of view of contingency tables and exact testing, and illustrate it on a few examples.
The contingency table point of view follows the late Stephen Fienberg’s vision, and the developments highlight his affinity for contingency table problems and for reinterpreting models with a fresh look, which gave rise to a new approach for hypothesis testing of network models that are linear exponential families. The family of the so-called log-linear exponential random graph models turns out to be surprisingly broad and includes many popular models, such as degree-based models, stochastic blockmodels, and combinations of these.

 ___

Prof. Yulia R. Gel
The University of Texas at Dallas
Friday, May 29, 2020
Time 16:00 – zoom meeting
Title: On the Role of Higher-Order Topological Properties in the Functionality of Complex Networks
Abstract
The emerging machinery of topological data analysis and, particularly, persistent homology allows us to unveil some critical characteristics behind the organization of complex networks and the interactions of their components at multiscale levels, which are otherwise largely inaccessible with conventional analytical approaches.
The ultimate idea is to study properties of progressively finer simplicial complexes on graphs over a range of (dis)similarity thresholds, and then to assess which topological characteristics exhibit a longer lifespan (or persist) across multiple (dis)similarity thresholds. Such persistent features are more likely to be related to intrinsic network organization and functionality.
In turn, features with a shorter lifespan can be regarded as topological noise. In this talk we discuss how the geometry and topology of blockchain transaction networks, assessed with the machinery of persistent homology, can enhance our understanding of hidden mechanisms behind blockchain graph anomalies and associated crypto price dynamics.
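To make the “persist across thresholds” idea concrete, here is a minimal, self-contained sketch (not the machinery of the talk, which involves higher-order simplicial complexes): zero-dimensional persistence of a weighted graph, computed by a union-find over edges sorted by (dis)similarity. Long bars correspond to components that persist across many thresholds; short bars are topological noise.

```python
def zero_dim_persistence(n, weighted_edges):
    """0-dimensional persistence of a weighted graph: all components are
    born at threshold 0; as edges (w, u, v) arrive in order of increasing
    weight w, a merge of two components means one of them dies at w."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    deaths = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            deaths.append(w)            # one component dies at weight w
    survivors = n - len(deaths)         # components that never die
    return [(0.0, d) for d in deaths] + [(0.0, float("inf"))] * survivors

# toy network: two tight clusters joined by one weak (heavy) edge
edges = [(0.1, 0, 1), (0.2, 1, 2), (0.1, 3, 4), (0.9, 2, 3)]
print(zero_dim_persistence(5, edges))
# the bar (0, 0.9) persists across many thresholds: intrinsic structure
```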

 ___

Prof. Eric D. Kolaczyk
Boston University
Friday, May 22, 2020
Time 16:00 – zoom meeting
Title: Statistics 101 for network data objects
Abstract
It is becoming increasingly common to see large collections of network data objects — that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic techniques already standard for scalar and vector data. At the same time, principled extensions of familiar techniques to this context are nontrivial, given that networks are inherently non-Euclidean. We will present a number of results extending the notion of asymptotic inference for means to the contexts of various types of networks, i.e., both labeled and unlabeled, and either single or multilayer. These results rely on a combination of tools from geometry, probability theory, and statistical shape analysis. We will illustrate, drawing from various applications in bioinformatics, computational neuroscience, and social network analysis under privacy constraints.

 ___

Prof. Michael Wolf
University of Zurich (UZH)
Friday, May 15, 2020 (this talk was initially planned for March 13)
Time 15:00 – zoom meeting
The talk can be found here
Title: Shrinkage Estimation of Large Covariance Matrices: Keep It Simple, Statistician?
Abstract
Under rotation-equivariant decision theory, sample covariance matrix eigenvalues can be optimally shrunk by recombining sample eigenvectors with a (potentially nonlinear) function of the unobservable population covariance matrix. The optimal shape of this function reflects the loss/risk that is to be minimized.
We solve the problem of optimal covariance matrix estimation under a variety of loss functions motivated by statistical precedent, probability theory, and differential geometry. A key ingredient of our nonlinear shrinkage methodology is a new estimator of the angle between sample and population eigenvectors, without making strong assumptions on the population eigenvalues.
We also introduce a broad family of covariance matrix estimators that can handle all regular functional transformations of the population covariance matrix under large-dimensional asymptotics.
In addition, we compare via Monte Carlo simulations our methodology to two simpler ones from the literature, linear shrinkage and shrinkage based on the spiked covariance model.
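For context, here is a sketch of the first of those simpler benchmarks, linear (Ledoit-Wolf) shrinkage, which is available off the shelf in scikit-learn; the nonlinear methodology of the talk is not shown here. All settings are illustrative.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(0)
p, n = 50, 100                      # dimension comparable to sample size
X = rng.standard_normal((n, p))     # true covariance is the identity

sample_cov = np.cov(X, rowvar=False)
lw = LedoitWolf().fit(X)

# sample eigenvalues are badly overdispersed around 1; linear shrinkage
# pulls every eigenvalue toward the grand mean by a single intensity
print(np.linalg.eigvalsh(sample_cov)[[0, -1]])      # spread far from 1
print(np.linalg.eigvalsh(lw.covariance_)[[0, -1]])  # pulled back toward 1
print("shrinkage intensity:", lw.shrinkage_)
```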

 ___

Prof. Ernst Wit
Università della Svizzera italiana
Friday, May 8, 2020
Time 14:00 – zoom meeting
The talk can be found here
Title: COVID-19 and the perils of inferring epidemiological parameters from clinical data
Abstract
Knowing the infection fatality ratio (IFR) is of crucial importance for evidence-based epidemic management: for immediate planning; for balancing the life years saved against the life years lost due to the consequences of management; and for evaluating the ethical issues associated with the tacit willingness to pay substantially more for life years lost to the epidemic than for those lost to other diseases. Against this background, in an impressive paper, Verity et al. (2020) have rapidly assembled case data and used statistical modelling to infer the IFR for COVID-19.
Given the importance of the issues, the necessarily compromised nature of the data and the consequent heavy reliance on modelling assumptions, my collaborators and I present an in-depth statistical review of what has been done. We have attempted this, conscious that the circumstances require setting aside the usual standards of statistical nitpicking. Facilitated by Verity et al. (2020)’s exemplary provision of their code and data, we have attempted to identify to what extent the data are sufficiently informative about the IFR to play a greater role than the modelling assumptions, and have tried to identify those assumptions that appear to play a key role.
After having identified some of the weaknesses in the analysis, we propose a crude alternative Bayesian model to estimate the IFR, which results in lower values. Nevertheless, we do not believe that it is possible to model our way out of the deficiencies in the clinical data in order to estimate crucial epidemiological parameters. There is an urgent need to replace complex models of inadequate clinical data with simpler models using adequate epidemiological prevalence data based on appropriately designed random sampling.
This is joint work with Simon N. Wood, Matteo Fasiolo and Peter J. Green.
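As a toy illustration of the kind of simple model the last paragraph argues for, consider a crude Beta-Binomial IFR estimate from hypothetical random-sample prevalence data; the numbers and the model are illustrative only and are neither those of Verity et al. (2020) nor the authors' alternative model.

```python
from scipy import stats

# Hypothetical inputs: a random prevalence survey and a death count.
population = 1_000_000
sampled    = 10_000     # randomly sampled and tested for infection
positives  = 500        # -> crude prevalence estimate of 5%
deaths     = 600        # deaths attributed to the epidemic

infected = population * positives / sampled   # crude infected count
# Jeffreys Beta(1/2, 1/2) prior on the IFR with a Binomial likelihood
posterior = stats.beta(0.5 + deaths, 0.5 + infected - deaths)
print("posterior mean IFR:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))
```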

 ___

Prof. David Madigan
Columbia University
Friday, April 24, 2020
Time 16:00 – zoom meeting
The talk can be found here
Title: Towards honest inference from real-world healthcare data
Abstract
In practice, our learning healthcare system relies primarily on observational studies, generating one effect estimate at a time using customized study designs with unknown operating characteristics, and publishing – or not – one estimate at a time. When we investigate the distribution of estimates that this process has produced, we see clear evidence of its shortcomings, including an apparent overabundance of statistically significant effects.
We propose a standardized process for performing observational research that can be evaluated, calibrated and applied at scale to generate a more reliable and complete evidence base than previously possible. We demonstrate this new paradigm by generating evidence about all pairwise comparisons of 39 treatments for hypertension for a relevant set of 58 health outcomes, using nine large-scale health record databases from four countries.
In total, we estimate 1.3 million hazard ratios, each using a comparative effectiveness study design and propensity score stratification on par with current one-off observational studies in the literature. Moreover, the process enables us to employ negative and positive controls to evaluate and calibrate estimates, ensuring, for example, that the 95% confidence interval includes the true effect size 95% of the time. The result set consistently reflects current established knowledge where known, and its distribution shows no evidence of the faults of the current process.
Joint work with George Hripcsak, Patrick Ryan, Martijn Schuemie, and Marc Suchard.
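A rough sketch of what calibration against negative controls can look like, under a simple normal systematic-error model; this illustrates the general idea only, is not the authors' actual procedure, and all numbers are simulated.

```python
import numpy as np
from scipy.optimize import minimize

def fit_systematic_error(b, se):
    """MLE of a normal systematic-error model from negative controls:
    each log-estimate b_i ~ N(mu, tau^2 + se_i^2), true effect = 0."""
    def nll(params):
        mu, log_tau = params
        var = np.exp(2 * log_tau) + se**2
        return 0.5 * np.sum(np.log(var) + (b - mu)**2 / var)
    res = minimize(nll, x0=[0.0, np.log(0.1)])
    return res.x[0], np.exp(res.x[1])        # (mu, tau)

# simulated negative-control log hazard ratios and standard errors
rng = np.random.default_rng(1)
se = rng.uniform(0.05, 0.3, size=40)
b = rng.normal(0.1, np.sqrt(0.05**2 + se**2))  # systematic bias 0.1 baked in
mu, tau = fit_systematic_error(b, se)

# calibrate a new estimate: shift by mu, widen the interval by tau
b_new, se_new = 0.35, 0.12
half = 1.96 * np.sqrt(se_new**2 + tau**2)
print("calibrated 95% CI for log HR:", (b_new - mu - half, b_new - mu + half))
```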

 ___

Prof. Joe Guinness
Cornell University
Friday, March 27, 2020
Time 16:00 – zoom meeting
Title: Inverses of Matérn Covariances on Grids
Abstract
We conduct a theoretical and numerical study of the aliased spectral densities and inverse operators of Matérn covariance functions on regular grids. We apply our results to provide clarity on the properties of a popular approximation based on stochastic partial differential equations (SPDEs); we find that it can approximate the aliased spectral density and the covariance operator well as the grid spacing goes to zero, but it does not provide increasingly accurate approximations to the inverse operator as the grid spacing goes to zero. If a sparse approximation to the inverse is desired, we suggest instead selecting a KL-divergence-minimizing sparse approximation, and we demonstrate in simulations that these sparse approximations deliver accurate Matérn parameter estimates, while the SPDE approximation overestimates spatial dependence.
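For reference, the Matérn covariance and its (dense) inverse on a regular grid are straightforward to compute directly; a minimal sketch, assuming the standard (σ², ρ, ν) parametrization:

```python
import numpy as np
from scipy.special import gamma, kv

def matern(r, sigma2=1.0, rho=1.0, nu=1.5):
    """Matern covariance: C(r) = sigma2 * 2^(1-nu)/Gamma(nu)
       * (sqrt(2 nu) r / rho)^nu * K_nu(sqrt(2 nu) r / rho)."""
    r = np.asarray(r, dtype=float)
    scaled = np.sqrt(2 * nu) * r / rho
    out = np.full_like(r, sigma2)              # C(0) = sigma2
    nz = scaled > 0
    out[nz] = (sigma2 * 2**(1 - nu) / gamma(nu)
               * scaled[nz]**nu * kv(nu, scaled[nz]))
    return out

# covariance matrix on a regular 1-d grid with spacing h
h, n = 0.1, 200
grid = h * np.arange(n)
C = matern(np.abs(grid[:, None] - grid[None, :]), nu=1.5)
Q = np.linalg.inv(C)    # the dense inverse operator studied in the talk
```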

 ___

Dr. Ioannis Kosmidis
University of Warwick
Friday, March 6, 2020
Time 15:15 – Room CM 1 221
Title: Improved estimation of partially specified models
Abstract
Many popular methods for the reduction of estimation bias rely on an approximation of the bias function under the assumption that the model is correct and fully specified. Other bias reduction methods, like the bootstrap, the jackknife and indirect inference, require fewer assumptions to operate but are typically computer-intensive, requiring repeated optimization. We present a novel framework for reducing estimation bias that:
i) can deliver estimators with smaller bias than reference estimators, even for partially specified models, as long as estimation is through unbiased estimating functions;
ii) always results in closed-form bias-reducing penalties to the objective function if estimation is through the maximization of one, as with maximum likelihood and maximum composite likelihood;
iii) relies only on the estimating functions and/or the objective and their derivatives, greatly facilitating implementation for general modelling frameworks through numerical or automatic differentiation techniques and standard numerical optimization routines.
The bias-reducing penalized objectives closely relate to information criteria for model selection based on the Kullback-Leibler divergence, establishing, for the first time, a strong link between reduction of estimation bias and model selection. We also discuss the asymptotic efficiency properties of the new estimator, inference and model selection, and present illustrations in well-used, important modelling settings of varying complexity.
Related preprint:
http://arxiv.org/abs/2001.03786
Joint work with:
Nicola Lunardon, University of Milano-Bicocca, Milan, Italy
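A classical special case of such a closed-form bias-reducing penalty is Firth's (1993) adjusted likelihood for logistic regression, where the penalty is half the log-determinant of the Fisher information; a minimal sketch of that special case (not the general framework of the paper):

```python
import numpy as np
from scipy.optimize import minimize

def firth_logistic(X, y):
    """Maximize the Firth-penalized log-likelihood
    l(beta) + 0.5 * log det(X' W X),  W = diag(p(1-p)),
    a classical closed-form bias-reducing penalty for logistic regression."""
    def negloglik(beta):
        eta = X @ beta
        p = 1 / (1 + np.exp(-eta))
        loglik = np.sum(y * eta - np.log1p(np.exp(eta)))
        W = p * (1 - p)
        _, logdet = np.linalg.slogdet(X.T @ (W[:, None] * X))
        return -(loglik + 0.5 * logdet)
    return minimize(negloglik, np.zeros(X.shape[1]), method="BFGS").x

# completely separated data: plain ML diverges, the penalized fit does not
X = np.column_stack([np.ones(6), [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]])
y = np.array([0, 0, 0, 1, 1, 1])
print(firth_logistic(X, y))
```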

 ___

Dr. Pramita Bagchi
Volgenau School of Engineering, George Mason University
Thursday, November 28, 2019
Time 14:15 – Room MA 10
Title: A test for separability in covariance operators of random surfaces
Abstract
The assumption of separability is a simplifying and very popular assumption in the analysis of spatiotemporal or hypersurface data structures. It is often made in situations where the covariance structure cannot be easily estimated, for example because of a small sample size or because of computational storage problems. We propose a new and very simple test to validate this assumption. Our approach is based on a measure of separability which is zero in the case of separability and positive otherwise. We derive the asymptotic distribution of a corresponding estimate under the null hypothesis and the alternative, and develop an asymptotic and a bootstrap test, which are very easy to implement. In particular, the approach requires neither projections on subspaces generated by the eigenfunctions of the covariance operator nor distributional assumptions, as recently used by other works to construct tests for separability. We investigate the finite sample performance by means of a simulation study and also provide a comparison with the currently available methodology. Finally, the new procedure is illustrated by analyzing a data example.
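To make the flavour of such a measure concrete: separability of a covariance matrix is equivalent to a certain rearrangement of it having rank one (Van Loan-Pitsianis), so one simple measure that vanishes under separability and is positive otherwise is the energy in the trailing singular values of the rearrangement. This illustrates the idea only and is not the test statistic of the talk.

```python
import numpy as np

def separability_measure(C, p, q):
    """Measure that is zero iff the (pq x pq) covariance C is separable,
    i.e. C = C1 (x) C2: rearrange C so separability becomes rank-one-ness,
    then report the relative energy beyond the first singular value."""
    R = (C.reshape(p, q, p, q)     # entry [i,j,k,l] = C[i*q+j, k*q+l]
           .transpose(0, 2, 1, 3)  # -> [i,k,j,l] = C1[i,k] * C2[j,l]
           .reshape(p * p, q * q)) # rank one iff C is separable
    s = np.linalg.svd(R, compute_uv=False)
    return 1.0 - s[0]**2 / np.sum(s**2)

C1 = np.array([[2.0, 0.5], [0.5, 1.0]])
C2 = np.eye(3) + 0.3
print(separability_measure(np.kron(C1, C2), 2, 3))                 # ~0
print(separability_measure(np.kron(C1, C2) + 0.2 * np.eye(6), 2, 3))  # > 0
```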
—

Dr. Heather Battey
Imperial College London
Friday, November 15, 2019
Time 10:30 – Room GR A3 31
Title: Aspects of high-dimensional inference
Abstract
Statistical analysis when the number of unknown parameters is comparable with the number of independent observations may demand modification of maximum-likelihood-based methods. There are comparable difficulties with Bayesian analyses based on high-dimensional “flat” priors. This discursive talk will cover a number of perspectives on this situation, including the implications of sparsity and the role of different types of parameters.
—

Dr. Kaushik Jana
Imperial College London
Friday, September 27, 2019
Time 15:15 – Room CM 1 113
Title: The Statistical Face of a Region under Monsoon Rainfall in Eastern India
Abstract
A region under rainfall is a contiguous spatial area receiving positive precipitation at a particular time. The probabilistic behavior of such a region is an issue of interest in meteorological studies. A region under rainfall can be viewed as a shape object of a special kind, where scale and rotational invariance are not necessarily desirable attributes of a mathematical representation. For modeling variation in objects of this type, we propose an approximation of the boundary that can be represented as a real-valued function, and arrive at a further approximation through functional principal component analysis, after suitable adjustment for asymmetry and incompleteness in the data. The analysis of an open-access satellite data set on monsoon precipitation over the eastern Indian subcontinent explains most of the variation in the shapes of the regions under rainfall through a handful of interpretable functions that can be further approximated parametrically. The most important aspect of shape is found to be size, followed by contraction/elongation, mostly along two pairs of orthogonal axes. The different modes of variation are remarkably stable across calendar years and across different thresholds for the minimum size of the region.
Authors:
Kaushik Jana (Imperial College London), Debasis Sengupta, Subrata Kundu,
Arindam Chakraborty and Purnima Shaw
—

Prof. Andrew Harvey
University of Cambridge
Thursday, September 26, 2019
Time 14:15 – Room CM 1 100
Title: Modeling directional (circular) time series
Abstract
Circular observations pose special problems for time series modeling. This talk shows how the score-driven approach, developed primarily in econometrics, provides a natural solution to the difficulties and leads to a coherent and unified methodology for estimation, model selection and testing. The new methods are illustrated with hourly data on wind direction.
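To see why the score-driven approach is natural for circular data: the von Mises score with respect to the location is κ·sin(y − μ), which automatically respects the circle (an observation just across the ±π boundary pulls the location the short way around). A toy filter sketch, not the full model of the talk:

```python
import numpy as np

def vonmises_score_filter(y, kappa=2.0, lr=0.2):
    """Toy score-driven filter for a circular location mu_t: the von Mises
    log-density is kappa*cos(y - mu) - log(2*pi*I0(kappa)), so the score
    w.r.t. mu is kappa*sin(y - mu); the filtered location takes a small
    step along the score, wrapped back to (-pi, pi]."""
    mu = np.empty(len(y) + 1)
    mu[0] = y[0]
    for t, yt in enumerate(y):
        step = mu[t] + lr * kappa * np.sin(yt - mu[t])
        mu[t + 1] = np.angle(np.exp(1j * step))   # wrap to the circle
    return mu

# simulated wind directions: slow rotation plus von Mises noise
rng = np.random.default_rng(2)
truth = np.linspace(0, np.pi, 500)
y = np.angle(np.exp(1j * (truth + rng.vonmises(0, 4.0, 500))))
mu = vonmises_score_filter(y)
print("final filtered direction:", mu[-1], "truth:", truth[-1])
```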
—

Dr. Nina Miolane
Stanford University
Monday, September 9, 2019
Time 15:15 – Room MA10
Title: Learning submanifolds with geometric variational autoencoders
Abstract
Geometric statistics is a theory of statistics for data belonging to non-Euclidean spaces or manifolds. Such data naturally arise when computing with biomedical images. For example, the data space of brain connectomes computed from functional magnetic resonance imaging (fMRI) can be represented as a Riemannian manifold of symmetric positive definite (SPD) matrices equipped with an affine-invariant metric.
We are interested in dimensionality reduction methods for this type of data. Principal Component Analysis (PCA) on Euclidean spaces has been generalized to manifolds with, for example, Principal Geodesic Analysis (PGA), which learns a lower-dimensional “geodesic subspace” N that best captures the data variability. Nonlinear dimensionality reduction methods like the popular variational autoencoders (VAE), however, have not been generalized to manifolds.
We introduce the “geometric variational autoencoder” (gVAE), a method to learn a submanifold N of a Riemannian manifold M. On the one hand, it extends VAEs to Riemannian manifolds and adds a geometric prior. On the other hand, it extends PGA and its variants (i) by relaxing the geodesic constraint on the subspace N and (ii) by providing approximate posterior distributions of the lower-dimensional representations of the data. We present a Python package for geometric statistics, geomstats, which we use to implement gVAE on GPUs. We show results on simulated and real brain connectome data.
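As a small illustration of the geometry involved, the affine-invariant distance between two SPD matrices can be computed directly; a sketch using scipy rather than the geomstats package mentioned in the abstract:

```python
import numpy as np
from scipy.linalg import eigh, logm

def affine_invariant_distance(A, B):
    """Geodesic distance between SPD matrices under the affine-invariant
    metric: d(A, B) = || log(A^{-1/2} B A^{-1/2}) ||_F."""
    w, V = eigh(A)
    A_inv_sqrt = V @ np.diag(w**-0.5) @ V.T
    return np.linalg.norm(logm(A_inv_sqrt @ B @ A_inv_sqrt), "fro")

# two small "connectome-like" SPD matrices
rng = np.random.default_rng(3)
X = rng.standard_normal((20, 4)); A = X.T @ X / 20
Y = rng.standard_normal((20, 4)); B = Y.T @ Y / 20
print(affine_invariant_distance(A, B))
# invariance check: the distance is unchanged under congruence G A G'
G = rng.standard_normal((4, 4))
print(affine_invariant_distance(G @ A @ G.T, G @ B @ G.T))  # same value
```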
—

Prof. Ryan Tibshirani
Carnegie Mellon University
Thursday, September 5, 2019
Time 16:15 – Room MA11
Title: Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Abstract
Interpolators—estimators that achieve zero training error—have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. We study minimum L2-norm (“ridgeless”) interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors are obtained by applying a linear transform to a vector of i.i.d. entries, and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network. We recover—in a precise quantitative way—several phenomena that have been observed in large-scale neural networks and kernel machines, including the “double descent” behavior of the prediction risk, and the potential benefits of overparametrization.
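The double descent phenomenon is easy to reproduce numerically: fit minimum-L2-norm ("ridgeless") least squares on random one-layer ReLU features and sweep the number of features through the interpolation threshold. A minimal sketch, with all settings illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 200, 20
beta = rng.standard_normal(d) / np.sqrt(d)
Xtr = rng.standard_normal((n, d))
ytr = Xtr @ beta + 0.1 * rng.standard_normal(n)
Xte = rng.standard_normal((4000, d))
yte = Xte @ beta

def ridgeless_risk(p):
    """Test risk of the minimum-L2-norm ("ridgeless") fit on p random
    ReLU features -- a random one-layer network as in the abstract."""
    W = rng.standard_normal((d, p)) / np.sqrt(d)
    Ftr, Fte = np.maximum(Xtr @ W, 0), np.maximum(Xte @ W, 0)
    theta = np.linalg.pinv(Ftr) @ ytr   # least-norm interpolator if p >= n
    return np.mean((Fte @ theta - yte) ** 2)

for p in [50, 150, 190, 210, 400, 2000]:   # sweep across p = n = 200
    print(p, round(ridgeless_risk(p), 3))
# the risk typically peaks near the interpolation threshold p = n and
# descends again with further overparametrization ("double descent")
```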
This represents joint work with Trevor Hastie, Andrea Montanari, and Saharon Rosset.

Spring semester
Dr. Finn Lindgren
The University of Edinburgh
Friday, February 22, 2019
Time 15:15 – Room MA10
Title: Quantifying the uncertainty of contour maps
Abstract
Contour maps are ubiquitous for visualising estimated spatial fields, but the uncertainty associated with such maps has been given a surprisingly small amount of attention. The question is closely connected with the dual problem of constructing credible regions for excursion sets, which leads to a more stringent formulation of the problem. With computational implementations, we can answer questions such as “How many or how few contour levels is it reasonable to use, given the inherent uncertainty?”.
I will discuss these issues in particular in the context of Bayesian latent Gaussian random field models estimated with Integrated Nested Laplace Approximations.

Dr. David Kraus
Masaryk University
Thursday, February 28, 2019
Time 14:15 – Room MA12
Title: Regularized classification of functional data under incomplete observation
Abstract
Classification of functional data into two groups by linear classifiers is considered on the basis of one-dimensional projections of functions. Finding the best classifier is seen as an optimization problem that can be approximately solved by regularization methods, e.g., the conjugate gradient method with early stopping, the principal component method and the ridge method. We study the empirical version with finite training samples consisting of incomplete functions observed on different subsets of the domain, and show that the optimal, possibly zero, misclassification probability can be achieved in the limit along a possibly non-convergent empirical regularization path. We propose a domain extension and selection procedure that finds the best domain beyond the common observation domain of all curves. In a simulation study we compare the different regularization methods and investigate the performance of domain selection. Our methodology is illustrated on a medical data set, where we observe a substantial improvement of classification accuracy due to domain extension.
The talk is based on joint work with Marco Stefanucci.

Dr. Phyllis Wan
Erasmus University Rotterdam
Friday, March 15, 2019
Time 14:15 – Room CM 1 113
Title: Applications of distance covariance to time series
Abstract
In many statistical frameworks, goodness-of-fit tests are administered to the estimated residuals. In the time series setting, whiteness of the residuals is assessed using the sample autocorrelation function (ACF). In this talk, we apply the auto-distance covariance function (ADCV) to evaluate the serial dependence of the estimated residuals. Distance covariance can discriminate between dependence and independence of two random vectors. The limit behavior of the test statistic based on the ADCV is derived for a general class of time series models. One of the key aspects in this theory is adjusting for the dependence that arises due to parameter estimation. This adjustment has essentially the same form regardless of the model specification. We illustrate the results in simulated examples.
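Sample distance covariance itself is only a few lines: double-center the two pairwise distance matrices and average their elementwise product; the ADCV of a residual series applies this to the pairs (e_t, e_{t+h}) across lags h. A sketch of the basic statistic:

```python
import numpy as np

def distance_covariance(x, y):
    """Sample distance covariance of two 1-d samples (Szekely-Rizzo):
    double-center the pairwise distance matrices and average their
    elementwise product; zero in the population iff x and y are independent."""
    def centered(z):
        D = np.abs(z[:, None] - z[None, :])
        return D - D.mean(0) - D.mean(1)[:, None] + D.mean()
    A, B = centered(np.asarray(x, float)), centered(np.asarray(y, float))
    return np.sqrt(np.mean(A * B))

rng = np.random.default_rng(5)
x = rng.standard_normal(500)
print(distance_covariance(x, x**2))                      # dependent, yet uncorrelated
print(distance_covariance(x, rng.standard_normal(500)))  # near zero
```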

Dr. Guillaume Obozinski
Swiss Data Science Center, EPFL
Friday, April 12, 2019
Time 14:15 – Room CM 1 113
Title: Convex unmixing and learning the effect of latent variables in Gaussian Graphical models with unobserved variables
Abstract
The edge structure of the graph defining an undirected graphical model describes precisely the structure of dependence between the variables in the graph. In many applications, the dependence structure is unknown and it is desirable to learn it from data, often because it is a preliminary step to be able to ascertain causal effects. This problem, known as structure learning, is hard in general, but for Gaussian graphical models it is slightly easier, because the structure of the graph is given by the sparsity pattern of the precision matrix of the joint distribution, and because independence coincides with decorrelation.
A major difficulty, too often ignored in structure learning, is the fact that if some variables are not observed, the marginal dependence graph over the observed variables will possibly be significantly more complex and no longer reflect the direct dependencies that are potentially associated with causal effects. In this work, we consider a family of latent variable Gaussian graphical models in which the graph of the joint distribution between observed and unobserved variables is sparse, and the unobserved variables are conditionally independent given the others.
Prior work was able to recover the connectivity between observed variables, but could only identify the subspace spanned by the unobserved variables. We instead propose a convex optimization formulation based on structured matrix sparsity to estimate the full connectivity of the graph, including the unobserved variables, given the number of missing variables and a priori knowledge of their level of connectivity. Our formulation is supported by a theoretical result on the identifiability of the latent dependence structure for sparse graphs in the infinite-data limit, which is a particular instance of a more general result we prove for unmixing with convex norms. We propose an algorithm leveraging recent active set methods, which performs well in experiments on synthetic data.
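For orientation, the prior work referred to above is the sparse-plus-low-rank convex relaxation of Chandrasekaran et al., in which the marginal precision matrix of the observed variables splits into a sparse part (direct edges) minus a low-rank part (the effect of latent variables). A sketch of that baseline formulation in cvxpy (not the structured-sparsity formulation of the talk):

```python
import numpy as np
import cvxpy as cp

def latent_ggm(Sigma_hat, lam=0.2, gam=0.5):
    """Sparse-plus-low-rank latent-variable Gaussian graphical model:
    the marginal precision of the observed variables is S - L, with S
    sparse (direct edges among observed variables) and L PSD low-rank
    (the effect of the marginalized latent variables)."""
    d = Sigma_hat.shape[0]
    S = cp.Variable((d, d), symmetric=True)
    L = cp.Variable((d, d), PSD=True)
    R = S - L
    obj = (-cp.log_det(R) + cp.trace(Sigma_hat @ R)
           + lam * cp.sum(cp.abs(S)) + gam * cp.trace(L))
    cp.Problem(cp.Minimize(obj), [R >> 0]).solve()
    return S.value, L.value

rng = np.random.default_rng(6)
X = rng.standard_normal((500, 8))
S, L = latent_ggm(np.cov(X, rowvar=False))
print("sparse part (direct edges):\n", np.round(S, 2))
print("rank of latent part:", np.linalg.matrix_rank(L, tol=1e-3))
```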

Prof. Alan Welsh
ANU College of Science
Thursday, April 18, 2019
Time 15:15 – Room MA 12
Title: Using the Bootstrap in Generalized Regression Estimation
Abstract
We discuss a generalized regression estimation procedure that can lead to much improved estimators of general population characteristics, such as quantiles, variances, and coefficients of variation. The method is quite general and requires minimal assumptions, the main ones being that the asymptotic joint distribution of the target and auxiliary parameter estimators is multivariate normal, and that the population values of the auxiliary parameters are known. The assumption on the asymptotic joint distribution implies that the relationship between the estimated target and the estimated auxiliary parameters is approximately linear, with coefficients determined by their asymptotic covariance matrix. Use of the bootstrap to estimate these coefficients avoids the need for parametric distributional assumptions. First-order correct conditional confidence intervals based on asymptotic normality can be improved upon using quantiles of a conditional double bootstrap approximation to the distribution of the studentized target parameter estimate.
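A sketch of the core recipe, with hypothetical inputs: bootstrap the joint distribution of the target and auxiliary estimators, read off the regression coefficient from the estimated covariance matrix, and adjust the target estimate by the known-minus-estimated auxiliary discrepancy. (The double bootstrap refinement is omitted here.)

```python
import numpy as np

def greg_estimate(sample, target_stat, aux_stat, aux_known, B=1000, seed=0):
    """Generalized regression estimate of a target parameter using a known
    population value of an auxiliary parameter: the adjustment coefficient
    is the bootstrap regression coefficient of the target estimator on the
    auxiliary estimator (scalar auxiliary case)."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    boot = np.empty((B, 2))
    for b in range(B):
        s = sample[rng.integers(0, n, size=n)]
        boot[b] = target_stat(s), aux_stat(s)
    cov = np.cov(boot, rowvar=False)
    coef = cov[0, 1] / cov[1, 1]
    return target_stat(sample) + coef * (aux_known - aux_stat(sample))

# estimate a coefficient of variation when the population mean is known
rng = np.random.default_rng(7)
x = rng.gamma(4.0, 2.5, size=60)          # population mean 10, CV 0.5
cv = lambda s: np.std(s) / np.mean(s)
print("plain CV:", cv(x))
print("GREG CV :", greg_estimate(x, cv, np.mean, aux_known=10.0))
```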

Dr. Simon Barthelme
CNRS Grenoble
Friday, May 17, 2019
Time 14:15 – Room CM 1 104
Title: Determinantal Point Processes for data subsampling
Abstract
Determinantal Point Processes (DPPs) are a class of point processes that exhibit “repulsion”. This property can be leveraged to obtain high-diversity subsets, meaning that DPPs can be used to subsample various objects (surfaces, datasets, graphs, etc.) with relatively high fidelity.
In this talk I will introduce DPPs and explain their use in constructing “coresets”. A coreset is a small weighted subset of data that can be used for learning in lieu of the original data. A typical strategy for constructing a coreset is to use a rough heuristic that quantifies how important each datapoint is, and retain only the important (high-leverage) ones. We show that DPPs can be used to construct coresets with provable guarantees. Because the resulting sets are diverse, they can also be made smaller, speeding up inference. I’ll discuss applications to k-means.
Joint work with Nicolas Tremblay and Pierre-Olivier Amblard.
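A compact sketch of exact DPP sampling via the spectral algorithm of Hough et al. (select eigenvectors independently, then draw one item per selected eigenvector); the kernel and data below are illustrative:

```python
import numpy as np

def sample_dpp(L, rng):
    """Exact sampling from a DPP with likelihood kernel L: keep eigenvector
    k with probability lambda_k / (1 + lambda_k), then draw one item per
    kept eigenvector, projecting the basis after each draw."""
    lam, U = np.linalg.eigh(L)
    V = U[:, rng.random(len(lam)) < lam / (1 + lam)]
    items = []
    while V.shape[1] > 0:
        p = (V ** 2).sum(axis=1)
        i = rng.choice(len(p), p=p / p.sum())  # P(i) prop. to row norm of V
        items.append(i)
        j = np.argmax(np.abs(V[i]))            # a column with V[i, j] != 0
        V = V - np.outer(V[:, j], V[i] / V[i, j])  # zero out row i
        V = np.delete(V, j, axis=1)
        if V.shape[1]:
            V, _ = np.linalg.qr(V)             # re-orthonormalize
    return items

# diverse subsample of 2-d points under a Gaussian likelihood kernel
rng = np.random.default_rng(8)
X = rng.uniform(size=(80, 2))
D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
subset = sample_dpp(np.exp(-D2 / 0.05), rng)
print(len(subset), "points kept, spread out by repulsion")
```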

Prof. Rainer von Sachs
UC Louvain
Thursday, May 23, 2019
Time 14:15 – Room CM 1 113
Title: Intrinsic wavelet smoothing of curves and surfaces of Hermitian positive definite matrices
Abstract
In multivariate time series analysis, non-degenerate autocovariance and spectral density matrices are necessarily Hermitian and positive definite, and it is important to preserve these properties in any estimation procedure. Our main contribution is the development of intrinsic wavelet transforms and nonparametric wavelet regression for curves in the non-Euclidean space of Hermitian positive definite matrices. The primary focus is on the construction of intrinsic average-interpolation wavelet transforms in the space equipped with a natural invariant Riemannian metric. In addition, we derive the wavelet coefficient decay and linear wavelet thresholding convergence rates for intrinsically smooth curves of Hermitian positive definite matrices. The intrinsic wavelet transforms are computationally fast, and nonlinear wavelet thresholding captures localized features, such as cusps or kinks, in the matrix-valued curves. In the context of nonparametric spectral estimation, the intrinsic (linear or nonlinear) wavelet spectral estimator satisfies the important property that it is equivariant under a change of basis of the time series, in contrast to most existing approaches. The finite-sample performance of the intrinsic wavelet spectral estimator based on nonlinear tree-structured trace thresholding is benchmarked against several state-of-the-art nonparametric curve regression procedures in the Riemannian manifold by means of simulated time series data, and also a real (brain) data example.
Some extensions to treating time-varying spectral density matrices via intrinsic wavelet smoothing of surfaces on Riemannian manifolds are given, too.
This is joint work with Joris Chau (Université catholique de Louvain).
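One ingredient of such intrinsic transforms is replacing Euclidean averages by Riemannian ones, so that predicted values stay positive definite; a sketch of the geodesic midpoint of two SPD matrices under the affine-invariant metric (an illustration of the ingredient, not the authors' full average-interpolation scheme):

```python
import numpy as np
from scipy.linalg import eigh

def spd_sqrt(A):
    """Matrix square root of a symmetric positive definite matrix."""
    w, V = eigh(A)
    return V @ np.diag(np.sqrt(w)) @ V.T

def geodesic_midpoint(A, B):
    """Midpoint of A and B along the geodesic of the affine-invariant
    Riemannian metric: A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2)."""
    As = spd_sqrt(A)
    As_inv = np.linalg.inv(As)
    return As @ spd_sqrt(As_inv @ B @ As_inv) @ As

A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, -0.2], [-0.2, 3.0]])
M = geodesic_midpoint(A, B)
print(np.linalg.eigvalsh(M))   # strictly positive: the midpoint stays SPD
```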

Prof. Clément Hongler
EPFL MATH CSFT
Friday, May 24, 2019
Time 14:15 – Room CM 1 113
Title: Neural Tangent Kernel and Applications
Abstract
The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I’ll introduce this formalism, give a number of results on the neural tangent kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization features.
Based on joint work with Arthur Jacot and Franck Gabriel.
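The empirical NTK is simply the Gram matrix of parameter gradients of the network output; for a one-hidden-layer ReLU network it can be written in closed form. A minimal numpy sketch (the architecture and scaling are illustrative):

```python
import numpy as np

def empirical_ntk(x1, x2, W, a):
    """Empirical neural tangent kernel of a one-hidden-layer network
    f(x) = a . relu(W x) / sqrt(m): the inner product of the parameter
    gradients of f at x1 and x2, summed over both layers."""
    m = len(a)
    h1, h2 = W @ x1, W @ x2
    g1, g2 = (h1 > 0).astype(float), (h2 > 0).astype(float)
    # gradient w.r.t. a:  relu(W x) / sqrt(m)
    k_a = np.maximum(h1, 0) @ np.maximum(h2, 0) / m
    # gradient w.r.t. W:  a_j * relu'(h_j) * x / sqrt(m)
    k_W = (x1 @ x2) * np.sum(a**2 * g1 * g2) / m
    return k_a + k_W

rng = np.random.default_rng(9)
d, m = 5, 50_000                    # a very wide hidden layer
W = rng.standard_normal((m, d))
a = rng.standard_normal(m)
x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
print(empirical_ntk(x1, x2, W, a))
# at large width the NTK concentrates and stays (nearly) fixed during
# training, so gradient descent behaves like kernel regression
```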
