Prof. Volkan Cevher
Convex optimization offers a unified framework for obtaining numerical solutions to data analytics problems with provable statistical guarantees of correctness at well-understood computational costs.
To this end, this course reviews recent advances in convex optimization and statistical analysis in the wake of Big Data. We provide an overview of the emerging convex data models and their statistical guarantees, and describe scalable numerical solution techniques such as stochastic, first-order, and primal-dual methods. Throughout the course, we put the mathematical concepts into action with large-scale applications from machine learning, signal processing, and statistics.
By the end of the course, the students are expected to understand the so-called time-data tradeoffs in data analytics. In particular, the students must be able to:
- Choose an appropriate convex formulation for a data analytics problem at hand
- Estimate the underlying data size requirements for the correctness of its solution
- Implement an appropriate convex optimization algorithm based on the available computational platform
- Decide on a meaningful level of optimization accuracy for stopping the algorithm
- Characterize the time required for their algorithm to obtain a numerical solution with the chosen accuracy
Previous coursework in calculus, linear algebra, and probability is required. Familiarity with optimization is useful.
The course consists of the following topics. The lecture outline below is indicative and may be modified from one year to the next.
|Lecture 1||Introduction. The role of models and data. Maximum-likelihood formulations. Sample complexity bounds for estimation and prediction.|
|Lecture 2||The role of computation. Challenges to optimization algorithms. Optimality measures. Structures in optimization. Gradient descent. Convergence rate of gradient descent.|
|Lecture 3||Optimality of convergence rates. Accelerated gradient descent. Concept of total complexity. Stochastic gradient descent.|
|Lecture 4||Concise signal models. Compressive sensing. Sample complexity bounds for estimation and prediction. Challenges to optimization algorithms for non-smooth optimization.|
|Lecture 5||Introduction to proximal operators. Proximal gradient methods. Linear minimization oracles. Conditional gradient method for constrained optimization.|
|Lecture 6||Time-data trade-offs. Variance reduction for improving trade-offs.|
|Lecture 7||A mathematical introduction to deep learning. Challenges in deep learning theory and practice.|
|Lecture 8||Double descent curves and over-parameterization. Implicit regularization.|
|Lecture 9||Structures in non-convex optimization. Optimality measures. Escaping saddle points. Adaptive gradient methods.|
|Lecture 10||Adversarial machine learning and generative adversarial networks (GANs). Wasserstein GAN. Difficulty of minimax optimization.|
|Lecture 11||Primal-dual optimization I: Fundamentals of minimax problems. Pitfalls of gradient descent-ascent approach.|
|Lecture 12||Primal-dual optimization II: Extragradient method. Chambolle-Pock algorithm. Stochastic primal-dual methods.|
|Lecture 13||Primal-dual optimization III: Lagrangian gradient methods. Lagrangian conditional gradient methods.|
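As a flavor of the algorithmic content, the gradient descent method from Lecture 2 can be sketched in a few lines. The snippet below is a minimal illustration, not taken from the course materials: it minimizes a smooth convex quadratic f(x) = ½ xᵀAx − bᵀx with the standard 1/L step size, where L is the smoothness constant (here, the largest eigenvalue of A); all variable names are illustrative.

```python
import numpy as np

def gradient_descent(A, b, x0, num_iters):
    """Gradient descent on f(x) = 0.5 * x^T A x - b^T x, gradient A x - b."""
    L = np.linalg.eigvalsh(A).max()  # smoothness constant of f
    x = x0.copy()
    for _ in range(num_iters):
        x = x - (1.0 / L) * (A @ x - b)  # gradient step with step size 1/L
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)            # symmetric positive definite, so f is convex
b = rng.standard_normal(5)

x_star = np.linalg.solve(A, b)     # exact minimizer for comparison
x_hat = gradient_descent(A, b, np.zeros(5), 500)
print(np.linalg.norm(x_hat - x_star))
```

For strongly convex problems such as this quadratic, the iterates converge linearly to the minimizer; characterizing such rates, and their optimality, is one of the learning objectives above.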