Statistics Seminar

    • ___
      • Prof. Eric D. Kolaczyk

        Boston University

        Friday, May 22, 2020

        Time 16:00 – Zoom meeting

        Title: Statistics 101 for network data objects

        Abstract

        It is becoming increasingly common to see large collections of network data objects — that is, data sets in which a network is viewed as a fundamental unit of observation. As a result, there is a pressing need to develop network-based analogues of even many of the most basic techniques already standard for scalar and vector data. At the same time, principled extensions of familiar techniques to this context are nontrivial, given that networks are inherently non-Euclidean. We will present a number of results extending the notion of asymptotic inference for means to the contexts of various types of networks, i.e., both labeled and unlabeled, and either single- or multi-layer. These results rely on a combination of tools from geometry, probability theory, and statistical shape analysis. We will illustrate these results, drawing from various applications in bioinformatics, computational neuroscience, and social network analysis under privacy.

    • ___
      • Prof. David Madigan

        Columbia University

        Friday, April 24, 2020

        Time 16:00 – Zoom meeting

        Title: Towards honest inference from real-world healthcare data

        Abstract

        In practice, our learning healthcare system relies primarily on observational studies generating one effect estimate at a time using customized study designs with unknown operating characteristics and publishing – or not – one estimate at a time. When we investigate the distribution of estimates that this process has produced, we see clear evidence of its shortcomings, including an apparent over-abundance of statistically significant effects.
        We propose a standardized process for performing observational research that can be evaluated, calibrated and applied at scale to generate a more reliable and complete evidence base than previously possible. We demonstrate this new paradigm by generating evidence about all pairwise comparisons of 39 treatments for hypertension for a relevant set of 58 health outcomes using nine large-scale health record databases from four countries.
        In total, we estimate 1.3M hazard ratios, each using a comparative effectiveness study design and propensity score stratification on par with current one-off observational studies in the literature. Moreover, the process enables us to employ negative and positive controls to evaluate and calibrate estimates ensuring, for example, that the 95% confidence interval includes the true effect size 95% of the time. The result set consistently reflects current established knowledge where known, and its distribution shows no evidence of the faults of the current process.
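        The calibration idea described above can be sketched numerically: fit a systematic-error distribution to negative-control estimates (drug–outcome pairs whose true effect is null), then recenter and widen a new confidence interval accordingly. The numbers and the normal bias model below are illustrative assumptions, not the authors' OHDSI implementation.

```python
import math
import statistics

# Hypothetical negative-control estimates: log hazard ratios for pairs where
# the true effect is known to be null (values made up for illustration).
negative_controls = [0.15, -0.05, 0.22, 0.10, 0.31, -0.02, 0.18, 0.09, 0.25, 0.05]

# Assume residual systematic error is Normal(mu, sigma^2), estimated from controls.
mu = statistics.mean(negative_controls)
sigma = statistics.stdev(negative_controls)

def calibrated_ci(estimate, se, z=1.96):
    """Recenter by the estimated bias and widen the CI by the systematic error."""
    total_se = math.sqrt(se ** 2 + sigma ** 2)
    centre = estimate - mu
    return centre - z * total_se, centre + z * total_se

# A new study estimate of 0.40 (log scale) with standard error 0.10:
lo, hi = calibrated_ci(0.40, 0.10)
```

        The calibrated interval is wider than the naive one, which is what restores the nominal 95% coverage when systematic error is present.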

        Joint work with George Hripcsak, Patrick Ryan, Martijn Schuemie, and Marc Suchard.

    • ___
      • Prof. Joe Guinness

        Cornell University

        Friday, March 27, 2020

        Time 16:00 – Zoom meeting

        Title: Inverses of Matérn Covariances on Grids

        Abstract

        We conduct a theoretical and numerical study of the aliased spectral densities and inverse operators of Matérn covariance functions on regular grids. We apply our results to provide clarity on the properties of a popular approximation based on stochastic partial differential equations; we find that it can approximate the aliased spectral density and the covariance operator well as the grid spacing goes to zero, but it does not provide increasingly accurate approximations to the inverse operator as the grid spacing goes to zero. If a sparse approximation to the inverse is desired, we suggest instead to select a KL-divergence-minimizing sparse approximation and demonstrate in simulations that these sparse approximations deliver accurate Matérn parameter estimates, while the SPDE approximation over-estimates spatial dependence.
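        The KL criterion mentioned above is easy to state for zero-mean Gaussians. A minimal sketch, using the simplest Matérn member (smoothness 1/2, i.e. the exponential covariance) on a 1-D grid, where the process is Markov and the exact precision matrix is tridiagonal, so a tridiagonal sparse inverse is essentially KL-optimal; all grid settings are illustrative, not the paper's.

```python
import numpy as np

def matern_half(h, rho=1.0):
    # Matérn covariance with smoothness nu = 1/2 is the exponential model.
    return np.exp(-np.abs(h) / rho)

n = 50
x = np.arange(n) * 0.1                          # regular 1-D grid
C = matern_half(x[:, None] - x[None, :])        # covariance matrix

def gaussian_kl(C, B):
    """KL( N(0, C) || N(0, B) ), the criterion minimized over sparse candidates."""
    n = C.shape[0]
    _, ldC = np.linalg.slogdet(C)
    _, ldB = np.linalg.slogdet(B)
    return 0.5 * (np.trace(np.linalg.inv(B) @ C) - n + ldB - ldC)

# In 1-D with nu = 1/2 the process is Markov, so the exact precision matrix is
# tridiagonal; truncating the inverse to its band therefore loses almost nothing.
P = np.linalg.inv(C)
P_tri = np.triu(np.tril(P, 1), -1)              # keep the tridiagonal band only
B = np.linalg.inv(P_tri)
kl = gaussian_kl(C, B)
```

        In higher dimensions or for other smoothness values the exact inverse is no longer sparse, which is where the choice between SPDE-type and KL-minimizing approximations becomes substantive.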

    • ___
      • Prof. Michael Wolf

        UZH

        Friday, March 13, 2020

        Time 14:15 – Room CM 1 221

        Title: Shrinkage Estimation of Large Covariance Matrices: Keep It Simple, Statistician?

        Abstract

        Under rotation-equivariant decision theory, sample covariance matrix eigenvalues can be optimally shrunk by recombining sample eigenvectors with a (potentially nonlinear) function of the unobservable population covariance matrix. The optimal shape of this function reflects the loss/risk that is to be minimized.

        We solve the problem of optimal covariance matrix estimation under a variety of loss functions motivated by statistical precedent, probability theory, and differential geometry. A key ingredient of our nonlinear shrinkage methodology is a new estimator of the angle between sample and population eigenvectors, without making strong assumptions on the population eigenvalues.

        We also introduce a broad family of covariance matrix estimators that can handle all regular functional transformations of the population covariance matrix under large-dimensional asymptotics.

        In addition, we compare via Monte Carlo simulations our methodology to two simpler ones from the literature, linear shrinkage and shrinkage based on the spiked covariance model.
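        As a toy illustration of the "simple" end of this comparison, linear shrinkage pulls the sample covariance toward a scaled identity. The shrinkage weight below is a fixed illustrative value, not the optimal data-driven Ledoit–Wolf choice, and the setup (true covariance equal to the identity) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 60, 30
X = rng.standard_normal((n, p))        # true covariance: the identity
S = X.T @ X / n                        # noisy sample covariance

# Linear shrinkage in the spirit of Ledoit–Wolf: pull S toward a scaled identity.
# rho is a fixed illustrative weight, not the optimal estimated one.
rho = 0.5
mu = np.trace(S) / p
S_shrunk = rho * mu * np.eye(p) + (1 - rho) * S

err = np.linalg.norm(S - np.eye(p))            # Frobenius error, raw estimate
err_shrunk = np.linalg.norm(S_shrunk - np.eye(p))
```

        Even this crude recipe beats the raw sample covariance when p is comparable to n; nonlinear shrinkage replaces the single weight rho with a per-eigenvalue transformation.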

    • ___
      • Dr. Ioannis Kosmidis

        Warwick University

        Friday, March 6, 2020

        Time 15:15 – Room CM 1 221

        Title: Improved estimation of partially-specified models

        Abstract

        Many popular methods for the reduction of estimation bias rely on an approximation of the bias function under the assumption that the model is correct and fully specified. Other bias reduction methods, like the bootstrap, the jackknife and indirect inference require fewer assumptions to operate but are typically computer-intensive, requiring repeated optimization.

        We present a novel framework for reducing estimation bias that:

        i) can deliver estimators with smaller bias than reference estimators even for partially-specified models, as long as estimation is through unbiased estimating functions;

        ii) always results in closed-form bias-reducing penalties to the objective function if estimation is through the maximization of one, like maximum likelihood and maximum composite likelihood; and

        iii) relies only on the estimating functions and/or the objective and their derivatives, greatly facilitating implementation for general modelling frameworks through numerical or automatic differentiation techniques and standard numerical optimization routines.

        The bias-reducing penalized objectives closely relate to information criteria for model selection based on the Kullback-Leibler divergence, establishing, for the first time, a strong link between reduction of estimation bias and model selection. We also discuss the asymptotic efficiency properties of the new estimator, inference and model selection, and present illustrations in well-used, important modelling settings of varying complexity.

        Related preprint:
        http://arxiv.org/abs/2001.03786

        Joint work with:
        Nicola Lunardon, University of Milano-Bicocca, Milan, Italy

    • ___
      • Dr. Pramita Bagchi

        Volgenau School of Engineering

        Thursday, November 28, 2019

        Time 14:15 – Room MA 10

        Title: A test for separability in covariance operators of random surfaces

        Abstract

        The assumption of separability is a simplifying and very popular assumption in the analysis of spatio-temporal or hypersurface data structures. It is often made in situations where the covariance structure cannot be easily estimated, for example because of a small sample size or because of computational storage problems. We propose a new and very simple test to validate this assumption. Our approach is based on a measure of separability which is zero in the case of separability and positive otherwise. We derive the asymptotic distribution of a corresponding estimate under the null hypothesis and the alternative and develop an asymptotic and a bootstrap test, both very easy to implement. In particular, the approach requires neither projections onto subspaces generated by the eigenfunctions of the covariance operator nor the distributional assumptions recently used in other works to construct tests for separability. We investigate the finite-sample performance by means of a simulation study and also provide a comparison with the currently available methodology. Finally, the new procedure is illustrated by analyzing a data example.
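        A separability measure of the kind described, zero exactly when the covariance factorizes, can be built from the van Loan–Pitsianis rearrangement: a covariance is a Kronecker product precisely when its rearranged matrix has rank one. The sketch below uses that characterization with made-up factors; it is not the authors' test statistic.

```python
import numpy as np

def rearrange(C, p, q):
    # van Loan–Pitsianis rearrangement: C = A kron B exactly when R has rank one.
    R = np.empty((p * p, q * q))
    for i in range(p):
        for j in range(p):
            R[i * p + j] = C[i * q:(i + 1) * q, j * q:(j + 1) * q].reshape(-1)
    return R

def separability_measure(C, p, q):
    # Energy in singular values beyond the first: zero iff C is separable.
    s = np.linalg.svd(rearrange(C, p, q), compute_uv=False)
    return float(np.sum(s[1:] ** 2))

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); A = A @ A.T   # "spatial" covariance factor
B = rng.standard_normal((4, 4)); B = B @ B.T   # "temporal" covariance factor
C_sep = np.kron(A, B)                          # separable by construction
C_nonsep = C_sep + np.diag(np.linspace(0.0, 1.0, 12))
```

        The measure vanishes (up to rounding) on the Kronecker product and is strictly positive once the non-constant diagonal perturbation breaks separability.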

      • Dr. Heather Battey

        Imperial College London

        Friday, November 15, 2019

        Time 10:30 – Room GR A3 31

        Title: Aspects of high-dimensional inference

        Abstract

        Statistical analysis when the number of unknown parameters is comparable with the number of independent observations may demand modification of maximum-likelihood-based methods. There are comparable difficulties with Bayesian analyses based on high-dimensional “flat” priors. This discursive talk will cover a number of perspectives on this situation, including the implications of sparsity and the role of different types of parameters.

      • Dr. Kaushik Jana

        Imperial College London

        Friday, September 27, 2019

        Time 15:15 – Room CM 1 113

        Title: The Statistical Face of a Region under Monsoon Rainfall in Eastern India

        Abstract

        A region under rainfall is a contiguous spatial area receiving positive precipitation at a particular time. The probabilistic behavior of such a region is an issue of interest in meteorological studies. A region under rainfall can be viewed as a shape object of a special kind, where scale and rotational invariance are not necessarily desirable attributes of a mathematical representation. For modeling variation in objects of this type, we propose an approximation of the boundary that can be represented as a real-valued function, and arrive at further approximation through functional principal component analysis, after suitable adjustment for asymmetry and incompleteness in the data. The analysis of an open-access satellite data set on monsoon precipitation over the Eastern Indian subcontinent leads to an explanation of most of the variation in shapes of the regions under rainfall through a handful of interpretable functions that can be further approximated parametrically. The most important aspect of shape is found to be size, followed by contraction/elongation, mostly along two pairs of orthogonal axes. The different modes of variation are remarkably stable across calendar years and across different thresholds for minimum size of the region.

        Authors:
        Kaushik Jana (Imperial College London), Debasis Sengupta, Subrata Kundu,
        Arindam Chakraborty and Purnima Shaw

      • Prof. Andrew Harvey

        University of Cambridge

        Thursday, September 26, 2019

        Time 14:15 – Room CM 1 100

        Title: Modeling directional (circular) time series

        Abstract

        Circular observations pose special problems for time series modeling. This article shows how the score-driven approach, developed primarily in econometrics, provides a natural solution to the difficulties and leads to a coherent and unified methodology for estimation, model selection and testing. The new methods are illustrated with hourly data on wind direction.

      • Dr. Nina Miolane

        Stanford University

        Monday, September 9, 2019

        Time 15:15 – Room MA10

        Title: Learning submanifolds with geometric variational autoencoders

        Abstract

        Geometric statistics is a theory of statistics for data belonging to non-Euclidean spaces or manifolds. Such data naturally arise when computing with biomedical images. For example, the data space of brain connectomes computed from functional magnetic resonance imaging (fMRI) can be represented as a Riemannian manifold of symmetric positive definite (SPD) matrices equipped with an affine-invariant metric.

        We are interested in dimensionality reduction methods for this type of data. Principal Component Analysis (PCA) on Euclidean spaces has been generalized to manifolds with, for example, Principal Geodesic Analysis (PGA) which learns a lower-dimensional “geodesic subspace” N that best captures the data variability. Non-linear dimensionality reduction methods like the popular variational autoencoders (VAE), however, have not been generalized to manifolds.

        We introduce the “geometric variational autoencoder” (gVAE), a method to learn a submanifold N of a Riemannian manifold M. On the one hand, it extends VAEs to Riemannian manifolds and adds a geometric prior. On the other hand, it extends PGA and its variants (i) by relaxing the geodesic constraint on the subspace N and (ii) by providing approximate posterior distributions of the lower-dimensional representations of the data. We present a Python package for geometric statistics, geomstats, that we use to implement gVAE on GPUs. We show results on simulated and real brain connectomes data.

      • Prof. Ryan Tibshirani

        Carnegie Mellon University

        Thursday, September 5, 2019

        Time 16:15 – Room MA11

        Title: Surprises in High-Dimensional Ridgeless Least Squares Interpolation

        Abstract

        Interpolators—estimators that achieve zero training error—have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. We study minimum L2 norm (“ridgeless”) interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors are obtained by applying a linear transform to a vector of i.i.d. entries, and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network. We recover—in a precise quantitative way—several phenomena that have been observed in large-scale neural networks and kernel machines, including the “double descent” behavior of the prediction risk, and the potential benefits of overparametrization.
        This represents work with Trevor Hastie, Andrea Montanari, and Saharon Rosset.
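        The "ridgeless" estimator itself is one line: the minimum-L2-norm interpolator is the pseudoinverse solution. The following toy simulation (all settings illustrative, not the paper's models) reproduces the double-descent shape qualitatively: risk spikes near the interpolation threshold p ≈ n and falls again in the overparametrized regime.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p_big = 50, 200
beta = rng.standard_normal(p_big) / np.sqrt(p_big)   # signal spread over all features
X_full = rng.standard_normal((n, p_big))
y = X_full @ beta + 0.5 * rng.standard_normal(n)

def ridgeless_risk(p, n_test=2000):
    # Minimum-L2-norm ("ridgeless") solution via the Moore–Penrose pseudoinverse;
    # it interpolates the training data whenever p >= n.
    X = X_full[:, :p]
    beta_hat = np.linalg.pinv(X) @ y
    X_test = rng.standard_normal((n_test, p_big))
    pred = X_test[:, :p] @ beta_hat
    return float(np.mean((X_test @ beta - pred) ** 2))

# Fitting with the first p of 200 features: risk peaks near p ~ n = 50.
risks = {p: ridgeless_risk(p) for p in (10, 45, 100, 200)}
```

        On a typical draw, the risk at p = 45 (just below the threshold) exceeds the risk both at p = 10 and at p = 200, the qualitative "double descent" pattern.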

      • Spring semester

        Dr. Finn Lindgren

        The University of Edinburgh

        Friday, February 22, 2019

        Time 15:15 – Room MA10

        Title: Quantifying the uncertainty of contour maps

        Abstract

        Contour maps are ubiquitous for visualising estimated spatial fields, but the uncertainty associated with such maps has been given a surprisingly small amount of attention. The question is closely connected with the dual problem of constructing credible regions for excursion sets, which leads to a more stringent formulation of the problem. With computational implementations, we can answer questions such as “How many or how few contour levels is it reasonable to use, given the inherent uncertainty?”.
        I will discuss these issues in particular in the context of Bayesian latent Gaussian random field models estimated with Integrated Nested Laplace Approximations.

      • Dr. David Kraus

        Masaryk University

        Thursday, February 28, 2019

        Time 14:15 – Room MA12

        Title: Regularized classification of functional data under incomplete observation

        Abstract

        Classification of functional data into two groups by linear classifiers is considered on the basis of one-dimensional projections of functions. Finding the best classifier is seen as an optimization problem that can be approximately solved by regularization methods, e.g., the conjugate gradient method with early stopping, the principal component method and the ridge method. We study the empirical version with finite training samples consisting of incomplete functions observed on different subsets of the domain and show that the optimal, possibly zero, misclassification probability can be achieved in the limit along a possibly non-convergent empirical regularization path. We propose a domain extension and selection procedure that finds the best domain beyond the common observation domain of all curves. In a simulation study we compare the different regularization methods and investigate the performance of domain selection. Our methodology is illustrated on a medical data set, where we observe a substantial improvement of classification accuracy due to domain extension.

        The talk is based on joint work with Marco Stefanucci.

      • Dr. Phyllis Wan

        Erasmus University Rotterdam

        Friday, March 15, 2019

        Time 14:15 – Room CM 1 113

        Title: Applications of distance covariance to time series

        Abstract

        In many statistical frameworks, goodness-of-fit tests are administered to the estimated residuals. In the time series setting, whiteness of the residuals is assessed using the sample autocorrelation function (ACF). In this talk, we apply the auto-distance covariance function (ADCV) to evaluate the serial dependence of the estimated residuals. Distance covariance can discriminate between dependence and independence of two random vectors. The limit behavior of the test statistic based on the ADCV is derived for a general class of time series models. One of the key aspects in this theory is adjusting for the dependence that arises due to parameter estimation. This adjustment has essentially the same form regardless of the model specification. We illustrate the results in simulated examples.
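        A sample version of distance covariance is short enough to state in full. The sketch below (Székely-style double centering; example data hypothetical) shows why it detects dependence that correlation-based diagnostics such as the ACF miss: x and x² are uncorrelated yet strongly dependent.

```python
import numpy as np

def dcov_sq(x, y):
    # Sample (squared) distance covariance of two 1-D samples via double centering.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return float((A * B).mean())

rng = np.random.default_rng(2)
x = rng.standard_normal(500)
y_indep = rng.standard_normal(500)
y_dep = x ** 2                    # uncorrelated with x, but strongly dependent

d_ind = dcov_sq(x, y_indep)       # near zero
d_dep = dcov_sq(x, y_dep)         # clearly positive
```

        In the residual-diagnostic setting of the talk, this statistic is computed on estimated residuals at various lags, which is where the adjustment for parameter estimation becomes necessary.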

      • Dr. Guillaume Obozinski

        Swiss Data Science Center, EPFL

        Friday, April 12, 2019

        Time 14:15 – Room CM 1 113

        Title: Convex unmixing and learning the effect of latent variables in Gaussian Graphical models with unobserved variables

        Abstract

        The edge structure of the graph defining an undirected graphical model describes precisely the structure of dependence between the variables in the graph. In many applications, the dependence structure is unknown and it is desirable to learn it from data, often because it is a preliminary step to be able to ascertain causal effects. This problem, known as structure learning, is hard in general, but for Gaussian graphical models it is slightly easier because the structure of the graph is given by the sparsity pattern of the precision matrix of the joint distribution, and because independence coincides with decorrelation. A major difficulty too often ignored in structure learning is the fact that if some variables are not observed, the marginal dependence graph over the observed variables will possibly be significantly more complex and no longer reflect the direct dependencies that are potentially associated with causal effects. In this work, we consider a family of latent variable Gaussian graphical models in which the graph of the joint distribution between observed and unobserved variables is sparse, and the unobserved variables are conditionally independent given the others. Prior work was able to recover the connectivity between observed variables, but could only identify the subspace spanned by unobserved variables, whereas we propose a convex optimization formulation based on structured matrix sparsity to estimate the complete connectivity of the complete graph including unobserved variables, given the knowledge of the number of missing variables, and a priori knowledge of their level of connectivity. Our formulation is supported by a theoretical result of identifiability of the latent dependence structure for sparse graphs in the infinite data limit, which is a particular instance of a more general result we prove for unmixing with convex norms. 
        We propose an algorithm leveraging recent active set methods, which performs well in the experiments on synthetic data.
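        The "sparse plus low-rank" structure driving such formulations comes straight from the Schur complement: marginalizing hidden variables out of a sparse joint precision matrix leaves the observed block's sparse precision minus a term whose rank equals the number of hidden variables. A minimal numerical illustration (graph and edge weights hypothetical):

```python
import numpy as np

# Sparse joint precision over 4 observed variables (0–3) and 1 hidden variable (4).
K = np.eye(5)
K[4, :4] = K[:4, 4] = 0.4         # hidden variable touches every observed one
K[0, 1] = K[1, 0] = 0.3           # a single direct edge among observed variables

# Marginalizing the hidden variable: the observed precision is a Schur complement,
# a sparse matrix minus a term of rank = number of hidden variables (1 here).
K_oo, K_oh, K_hh = K[:4, :4], K[:4, 4:], K[4:, 4:]
K_marg = K_oo - K_oh @ np.linalg.inv(K_hh) @ K_oh.T
```

        Variables 2 and 3 share no direct edge in the joint graph, yet their marginal precision entry is nonzero: the hidden variable induces apparent dependence, which is exactly what the convex sparse-plus-low-rank decomposition tries to undo.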

      • Prof. Alan Welsh

        ANU College of Science

        Thursday, April 18, 2019

        Time 15:15 – Room MA 12

        Title: Using the Bootstrap in Generalized Regression Estimation

        Abstract

        We discuss a generalized regression estimation procedure that can lead to much improved estimators of general population characteristics, such as quantiles, variances, and coefficients of variation. The method is quite general and requires minimal assumptions, the main ones being that the asymptotic joint distribution of the target and auxiliary parameter estimators is multivariate normal, and that the population values of the auxiliary parameters are known. The assumption on the asymptotic joint distribution implies that the relationship between the estimated target and the estimated auxiliary parameters is approximately linear with coefficients determined by their asymptotic covariance matrix. Use of the bootstrap to estimate these coefficients avoids the need for parametric distributional assumptions. First-order correct conditional confidence intervals based on asymptotic normality can be improved upon using quantiles of a conditional double bootstrap approximation to the distribution of the studentized target parameter estimate.

      • Dr. Simon Barthelme

        CNRS Grenoble

        Friday, May 17, 2019

        Time 14:15 – Room CM 1 104

        Title: Determinantal Point Processes for data sub-sampling

        Abstract

        Determinantal Point Processes (DPPs) are a class of point processes that exhibit “repulsion”. This property can be leveraged to obtain high-diversity subsets, meaning that DPPs can be used to sub-sample various objects (surfaces, datasets, graphs, etc.) with relatively high fidelity.

        In this talk I will introduce DPPs and explain their use in constructing “coresets”. A coreset is a small weighted subset of data that can be used for learning, in lieu of the original data. A typical strategy for constructing a coreset is to use a rough heuristic that quantifies how important each datapoint is, and retain only the important (high-leverage) ones. We show that DPPs can be used to construct coresets with provable guarantees. Because the resulting sets are diverse, they can also be made smaller, speeding up inference. I’ll discuss applications to k-means.
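        The repulsion property has a compact algebraic form: under an L-ensemble DPP, P(S) ∝ det(L_S), so subsets of similar items (nearly collinear feature vectors) receive small determinants. A tiny enumerable example with made-up feature vectors:

```python
import numpy as np
from itertools import combinations

# L-ensemble DPP on 3 items: P(S) proportional to det(L_S). Items 0 and 1 are
# nearly identical, so the DPP strongly avoids selecting them together.
feats = np.array([[1.0, 0.0],
                  [0.9, 0.1],
                  [0.0, 1.0]])
L = feats @ feats.T

Z = np.linalg.det(L + np.eye(3))              # normalizer over all subsets

def prob(S):
    S = list(S)
    return float(np.linalg.det(L[np.ix_(S, S)])) / Z

total = sum(prob(S) for r in range(4) for S in combinations(range(3), r))
p_similar = prob((0, 1))      # near-duplicates together: tiny
p_diverse = prob((0, 2))      # a diverse pair: much larger
```

        The subset probabilities sum to one via the identity sum over S of det(L_S) = det(L + I), and the near-duplicate pair is orders of magnitude less likely than the diverse pair, which is the mechanism behind DPP-based sub-sampling.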

        Joint work with Nicolas Tremblay and Pierre-Olivier Amblard.

      • Prof. Rainer von Sachs

        UC Louvain

        Thursday, May 23, 2019

        Time 14:15 – Room CM 1 113

        Title: Intrinsic wavelet smoothing of curves and surfaces of Hermitian positive definite matrices

        Abstract

        In multivariate time series analysis, non-degenerate autocovariance and spectral density matrices are necessarily Hermitian and positive definite, and it is important to preserve these properties in any estimation procedure. Our main contribution is the development of intrinsic wavelet transforms and nonparametric wavelet regression for curves in the non-Euclidean space of Hermitian positive definite matrices. The primary focus is on the construction of intrinsic average-interpolation wavelet transforms in the space equipped with a natural invariant Riemannian metric. In addition, we derive the wavelet coefficient decay and linear wavelet thresholding convergence rates of intrinsically smooth curves of Hermitian positive definite matrices. The intrinsic wavelet transforms are computationally fast, and nonlinear wavelet thresholding captures localized features, such as cusps or kinks, in the matrix-valued curves. In the context of nonparametric spectral estimation, the intrinsic (linear or nonlinear) wavelet spectral estimator satisfies the important property that it is equivariant under a change of basis of the time series, in contrast to most existing approaches. The finite-sample performance of the intrinsic wavelet spectral estimator based on nonlinear tree-structured trace thresholding is benchmarked against several state-of-the-art nonparametric curve regression procedures in the Riemannian manifold by means of simulated time series data, and also a real (brain) data example.
        Some extensions to treating time-varying spectral density matrices via intrinsic wavelet smoothing of surfaces on Riemannian manifolds are given, too.

        This is joint work with Joris Chau (Université catholique de Louvain).

      • Prof. Clément Hongler

        EPFL MATH CSFT

        Friday, May 24, 2019

        Time 14:15 – Room CM 1 113

        Title: Neural Tangent Kernel and Applications

        Abstract

        The Neural Tangent Kernel is a new way to understand gradient descent in deep neural networks, connecting them with kernel methods. In this talk, I’ll introduce this formalism, give a number of results on the neural tangent kernel, and explain how they give us insight into the dynamics of neural networks during training and into their generalization properties.
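        The object itself is concrete: the empirical NTK is the Gram matrix of parameter gradients, Θ(x, x′) = ∇θ f(x) · ∇θ f(x′). A small sketch with a hypothetical two-input network, using finite-difference gradients to stay dependency-free:

```python
import numpy as np

rng = np.random.default_rng(3)
# Tiny one-hidden-layer network (2 inputs, 8 hidden units), parameters flattened.
theta0 = np.concatenate([
    (rng.standard_normal((8, 2)) / np.sqrt(2)).ravel(),
    rng.standard_normal(8) / np.sqrt(8),
])

def f(theta, x):
    W1 = theta[:16].reshape(8, 2)
    w2 = theta[16:]
    return float(w2 @ np.tanh(W1 @ x))

def grad(theta, x, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        g[i] = (f(tp, x) - f(tm, x)) / (2 * eps)
    return g

xs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 0.1])]
G = np.stack([grad(theta0, x) for x in xs])
K = G @ G.T        # 3x3 empirical NTK Gram matrix: K[i, j] = grad f(x_i) . grad f(x_j)
```

        By construction this Gram matrix is symmetric and positive semidefinite; the theory in the talk concerns its infinite-width limit, where it becomes deterministic and constant during training.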

        Based on joint work with Arthur Jacot and Franck Gabriel.
