Advanced Topics in Data Sciences

 

Instructor

Prof. Volkan Cevher

 

Description

This course describes theory and methods to address three key challenges in data sciences: estimation, prediction, and computation. We use convex analysis and methods as a common connecting theme, and illustrate the main ideas with concrete applications from machine learning and signal processing.

Learning outcomes

By the end of the course, students must be able to follow advanced research in machine learning and optimization.

Prerequisites

Multivariable calculus, linear algebra, and probability theory.

 

Outline

The course consists of the following topics:

Lecture 1: Overview of the course. Learning-based compressive subsampling. Introduction to submodularity, examples, and properties.

Lecture 2: Discrete optimization. Submodular function maximization. The greedy algorithm. Relations between submodularity and convexity. The Lovász extension. (Scribe)
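
The greedy algorithm mentioned in Lecture 2 is simple enough to illustrate directly. Below is a minimal Python sketch of greedy selection for monotone submodular maximization under a cardinality constraint, the setting in which the greedy rule attains the classical (1 - 1/e) approximation; the coverage objective and the name greedy_max are assumptions made for this example, not course material.

    # Greedy selection for monotone submodular maximization under |S| <= k.
    def greedy_max(f, ground_set, k):
        S = set()
        for _ in range(k):
            # Add the element with the largest marginal gain f(S + e) - f(S).
            e_best = max(ground_set - S, key=lambda e: f(S | {e}) - f(S))
            S.add(e_best)
        return S

    # Illustrative coverage objective: f(S) = number of items covered by the sets in S.
    coverage = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"a"}}
    f = lambda S: len(set().union(*[coverage[i] for i in S]))
    print(greedy_max(f, set(coverage), k=2))  # selects elements 3 and 1, covering all 5 items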
     
Lecture 3: Introduction to structured sparsity. Convex relaxation by biconjugation. Structured sparsity via submodular functions. (Scribe)

Lecture 4: Integer and linear programming. Structured sparsity via totally unimodular constraints. Robust submodular function maximization. The Saturate algorithm. (Scribe)

Lecture 5: Stochastic optimization. Stochastic subgradient method. Stochastic proximal method. Stochastic accelerated methods. (Scribe)
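
To make the stochastic subgradient method of Lecture 5 concrete, here is a minimal NumPy sketch applied to an l2-regularized hinge loss with a Pegasos-style 1/(lambda*t) step size; the synthetic data and iteration count are illustrative assumptions rather than anything prescribed by the course.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, lam = 200, 5, 0.1
    X = rng.normal(size=(n, d))               # synthetic features
    y = np.sign(X @ rng.normal(size=d))       # synthetic labels in {-1, +1}

    w = np.zeros(d)
    for t in range(1, 2001):
        i = rng.integers(n)                   # sample one training example
        g = lam * w                           # gradient of the l2 regularizer
        if y[i] * (X[i] @ w) < 1:             # hinge loss is active at this point
            g = g - y[i] * X[i]               # add a subgradient of the hinge term
        w = w - g / (lam * t)                 # diminishing step size 1/(lam * t)
    print("training error:", np.mean(np.sign(X @ w) != y))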
     
Lecture 6: Variance reduction. Coordinate descent methods for smooth objectives. (Scribe)
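
Coordinate descent on a smooth objective (Lecture 6) also admits a very short sketch. The snippet below runs randomized coordinate descent with exact coordinate minimization on the quadratic f(x) = 0.5 x'Ax - b'x with A positive definite; the problem data and iteration count are assumptions chosen for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    d = 10
    M = rng.normal(size=(d, d))
    A = M @ M.T + d * np.eye(d)           # positive definite matrix
    b = rng.normal(size=d)

    x = np.zeros(d)
    for _ in range(500):
        i = rng.integers(d)               # pick a coordinate uniformly at random
        grad_i = A[i] @ x - b[i]          # partial derivative of f along coordinate i
        x[i] -= grad_i / A[i, i]          # exact minimization along that coordinate
    print("residual:", np.linalg.norm(A @ x - b))  # approaches 0, i.e. x tends to A^{-1} b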
     
Lecture 7: Coordinate descent methods for composite functions. Coordinate descent primal-dual algorithms. Randomized linear algebra. Randomized matrix decompositions. Comparison to classical methods. (Scribe)

Lecture 8: Randomized linear algebra. Row extraction method. Power method. Column selection methods. Stochastic quasi-Newton method. (Scribe)
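
The power method listed in Lecture 8 can be illustrated in a few lines. The sketch below estimates the leading eigenpair of a symmetric positive semidefinite test matrix by repeated multiplication and renormalization; the matrix and iteration count are assumptions made for this example.

    import numpy as np

    rng = np.random.default_rng(2)
    d = 50
    M = rng.normal(size=(d, d))
    A = M @ M.T                           # symmetric positive semidefinite test matrix

    v = rng.normal(size=d)
    v /= np.linalg.norm(v)
    for _ in range(300):
        v = A @ v                         # apply the matrix
        v /= np.linalg.norm(v)            # renormalize to unit length
    lam = v @ A @ v                       # Rayleigh quotient estimate of the top eigenvalue
    print(lam, np.linalg.eigvalsh(A)[-1]) # compare with the exact largest eigenvalue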
 
 
Lecture 9: Introduction to statistical learning theory. Statistics vs. statistical learning. Approximation error. Excess risk, Hoeffding's inequality. (Scribe)
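
For reference, Hoeffding's inequality from Lecture 9 has the following standard statement for independent bounded random variables (given here as a LaTeX snippet):

    % Hoeffding's inequality: if X_1, ..., X_n are independent with
    % a_i <= X_i <= b_i and S_n = X_1 + ... + X_n, then for every t > 0,
    \[
      \mathbb{P}\bigl( S_n - \mathbb{E}[S_n] \ge t \bigr)
      \;\le\; \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right).
    \]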
 
 
Lecture 10: Concentration of measure inequalities. Chernoff-type bounds. Azuma-Hoeffding inequality. Bounded differences inequality. (Scribe)
 
 
Lecture 11: Uniform convergence in statistical learning theory. Classical VC theory for binary classification. (Scribe)

Lecture 12: Project presentations.