Slides 2014


The 2014 course consists of the following topics

Lecture 1 “Objects in Space”: Definitions of norms, inner products, and metrics for vector, matrix and tensor objects.
  Basics of complexity theory.
Lecture 2 Maximum likelihood principle as a motivation for convex optimization. 
  Fundamental structures in convex analysis, such as cones, smoothness, and conjugation.
Lecture 3 Unconstrained, smooth minimization techniques.
  Gradient methods.
  Variable metric algorithms.
  Time-data tradeoffs in ML estimation.
Lecture 4 Convex geometry of linear inverse problems. 
  Structured data models (e.g., sparse and low-rank) and convex gauge functions and formulations that encourage these structures.
  Computational aspects of gauge functions.
Lecture 5
Composite convex minimization. Regularized M-estimators. 
  Time-data tradeoffs in linear inverse problems.
Lecture 6 Convex demixing.
  Statistical dimension.
  Phase transitions in convex minimization.
  Smoothing approaches for non-smooth convex minimization.
Lecture 7 Constrained convex minimization-I.
  Introduction to convex duality.
  Classical solution methods (the augmented Lagrangian method, alternating minimization algorithm, alternating direction method of multipliers, and the Frank-Wolfe method) and their deficiencie.
Lecture 8 Constrained convex minimization-II. 
  Variational gap characterizations and dual smoothing.
  Scalable, black-box optimization techniques.
  Time data-tradeoffs for linear inverse problems.
Lecture 9 Classical black-box convex optimization techniques. 
  Linear programming, semidefinite programming, and the interior point method (IPM).
  Hierarchies of classical formulations.
  Time and space complexity of the IPM.
Lecture 10 Time-data tradeoffs in machine learning.
Lecture 11 Convex methods for Big Data I: Randomized coordinate descent methods.
  The Page Rank problem and Nesterov’s solution.
  Composite formulations.
Lecture 12 Convex methods for Big Data II: Stochastic gradient descent methods.
  Least squares: conjugate gradients vs. a simple stochastic gradient method.
  Dual and gradient averaging schemes.
  Stochastic mirror descent.
Lecture 13 Randomized linear algebra routines for scalable convex optimization.
  Probabilistic algorithms for constructing approximate low-rank matrix decompositions.
  Subset selection approaches.
  Theoretical approximation guarantees.
Lecture 14 Role of parallel and distributed computing.  
  How to avoid communication bottlenecks and synchronization.
  Consensus methods.
  Memory lock-free, decentralized, and asynchronous algorithms.