Instructor: Dmitriy Drusvyatskiy
Lecture: MW 2:30-3:50 PM
Office hours: W 1:00-2:00 PM and by appointment
Office: PDL C-434
Email: ddrusv at uw.edu
Lecture location: Raitt Hall (RAI) 116
This is an introductory course in high-dimensional probability and statistical learning. The main focus will be on the concentration phenomenon in high dimensions and its consequences for inference and classification from limited data. In particular, we will use the developed techniques to analyze algorithms for various statistical inverse problems, such as community detection, sparse recovery, low-rank matrix completion, phase retrieval, etc. This course is appropriate for anyone with a working knowledge of linear algebra, probability, and mathematical analysis.
We will primarily use elements from the following two texts:
Roman Vershynin, “High-dimensional probability: An introduction with applications in data science.”
Martin J. Wainwright, “High-dimensional statistics: A non-asymptotic viewpoint.”
Weekly problem sets with solutions written in LaTeX (100% of the grade). Since there is no TA for this course, grading will be on the following scale:
0 (did little to no work),
1 (did at least 50% of the homework),
2 (did at least 80% of the homework).
You may work together on problem sets, but you must write up your own solutions. You must also cite any resources which helped you obtain your solutions.
The following is a very rough schedule for the first 6-7 weeks of the course. We will decide what to cover in the last three weeks, depending on the students’ interests.
Lectures 1-3: Basic Concentration (Notes)
Chernoff bounds
Sub-Gaussian Random Variables and Hoeffding's Inequality
Sub-Exponential Random Variables and Bernstein's Inequality
Application: dimensionality reduction with Johnson-Lindenstrauss lemma
McDiarmid's Inequality
Robust Mean Estimation (median of means)
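The Johnson-Lindenstrauss lemma above is easy to see numerically: projecting n points through a random Gaussian map into roughly log(n)/eps^2 dimensions preserves all pairwise distances up to a 1 ± eps factor. A minimal sketch in NumPy (the dimensions and seed are illustrative choices, not from the course materials):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 50, 1000, 400  # n points in R^d, projected down to R^k

X = rng.standard_normal((n, d))
# Random Gaussian projection, scaled by 1/sqrt(k) so distances are preserved in expectation
P = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ P.T

def pairwise_sq_dists(Z):
    # Squared Euclidean distances via the Gram matrix identity ||a-b||^2 = ||a||^2 + ||b||^2 - 2<a,b>
    G = Z @ Z.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2 * G

mask = ~np.eye(n, dtype=bool)                      # ignore the zero diagonal
ratios = pairwise_sq_dists(Y)[mask] / pairwise_sq_dists(X)[mask]
print(ratios.min(), ratios.max())                  # all ratios close to 1
```

With k = 400 and n = 50 the squared-distance distortion stays well within the range the lemma predicts for eps on the order of sqrt(log(n)/k).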
Lecture 3: Random Vectors in High Dimensions (Notes)
Concentration of the norm
Isotropic Distributions
Similarity of Normal and Spherical Distributions
Sub-Gaussian and Sub-Exponential Random Vectors
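Concentration of the norm is easy to observe by simulation: for g ~ N(0, I_n), the norm ||g||_2 concentrates around sqrt(n) with fluctuations of constant order, independent of n. A minimal sketch (sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 1000, 2000

# Each row is an independent standard Gaussian vector in R^n
G = rng.standard_normal((trials, n))
norms = np.linalg.norm(G, axis=1)

# The mean is close to sqrt(n), while the standard deviation stays O(1)
print(norms.mean(), np.sqrt(n), norms.std())
```

The standard deviation here is near 1/sqrt(2) regardless of n, which is the constant-order fluctuation the concentration result predicts.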
Lectures 4-6: Uniform Laws and Introduction to Statistical Learning (Notes)
What is the sample complexity of learning?
Uniform law via Rademacher complexity
Glivenko-Cantelli Theorem
Upper bounds on Rademacher complexity
Sample complexity of binary classification with VC dimension
Learning linear classes
Stable learning rules generalize
Learning without uniform convergence with convex losses
Lectures 7-11: Metric entropy and Matrix Concentration (Notes)
Nets, covering numbers, and metric entropy
Eigenvalues and singular values of sub-Gaussian random matrices
Matrix concentration
Controlling the operator norm of a random matrix by row/column norms
Applications: community detection, Gaussian mixture model, (sparse) covariance estimation, matrix completion
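A central fact from this part of the course: an m x n matrix with i.i.d. standard Gaussian entries (m >= n) has largest singular value close to sqrt(m) + sqrt(n) and smallest singular value close to sqrt(m) - sqrt(n). A quick simulation (dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2000, 500

A = rng.standard_normal((m, n))
s = np.linalg.svd(A, compute_uv=False)  # singular values, sorted in decreasing order

print(s[0], np.sqrt(m) + np.sqrt(n))    # largest singular value vs. sqrt(m) + sqrt(n)
print(s[-1], np.sqrt(m) - np.sqrt(n))   # smallest singular value vs. sqrt(m) - sqrt(n)
```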
Lecture 12: Quadratic forms and mean estimation (Notes)
mean estimation for multivariate Gaussians
Hanson-Wright inequality
norm concentration without isotropy
distance of a random point to a subspace
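The last item above can be checked by simulation: the distance from a standard Gaussian point in R^n to a fixed k-dimensional subspace concentrates around sqrt(n - k). Since the Gaussian distribution is rotation invariant, we may take the subspace spanned by the first k coordinates (an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, trials = 500, 100, 1000

# Distance to span(e_1, ..., e_k) is just the norm of the last n - k coordinates
G = rng.standard_normal((trials, n))
dists = np.linalg.norm(G[:, k:], axis=1)

print(dists.mean(), np.sqrt(n - k))  # concentrates near sqrt(n - k)
```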
Lectures 13-15: Sub-Gaussian Processes (Notes)
Concentration of Lipschitz functions and the isoperimetric inequality
Basics of Gaussian processes
Gaussian comparison inequalities: Slepian and Sudakov-Fernique
Sudakov's minoration inequality
Gaussian width and random projections of sets
Chaining and Dudley's integral inequality
Improved Uniform Laws (VC dimension and Glivenko-Cantelli)
Generic chaining bound and Talagrand's comparison inequality
Chevet's inequality
Lectures 16-17: Matrix deviation inequality and consequences (Notes)
Matrix deviation inequality
Random Sections
Lectures 17-19: Sparse and low-rank recovery (Notes)
High dimensional signal recovery
Signal recovery based on the M* bound
Exact recovery based on the escape theorem
Deterministic conditions for recovery
Recovery with noise (LASSO/BPDN)
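As a minimal illustration of the sparse recovery theme, the LASSO objective can be minimized by ISTA (proximal gradient descent with soft-thresholding). The sketch below uses illustrative dimensions and a hand-picked regularization parameter, not settings from the course:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, s = 100, 400, 5                     # measurements, ambient dimension, sparsity

# Gaussian sensing matrix with (approximately) unit-norm columns
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[:s] = 1.0
y = A @ x_true                            # noiseless measurements

# ISTA for the LASSO objective 0.5*||Ax - y||^2 + lam*||x||_1
lam = 0.01
step = 0.9 / np.linalg.norm(A, 2) ** 2    # step size below 1/L, L = ||A||_op^2
x = np.zeros(d)
for _ in range(2000):
    x = x - step * (A.T @ (A @ x - y))    # gradient step on the quadratic part
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)  # soft-thresholding

print(np.linalg.norm(x - x_true))         # small recovery error
```

With n = 100 Gaussian measurements of a 5-sparse vector in R^400, this is comfortably inside the regime where the recovery guarantees discussed above apply.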
HW 1 (due Friday October 4th; slide it under my door):
Reading: Wainwright (Chapter 1) and Vershynin (Chapter 2, Sections 3.1-3.4)
Optional Reading: Wainwright (Chapter 2) and Vershynin (Sections 3.5-3.8)
Exercises: Vershynin (1.3.3, 2.2.9, 2.5.10, 2.5.11, 3.3.5, 3.4.4)
HW 2 (due Friday October 25th; slide it under my door):
Reading: Wainwright (Chapter 4)
Exercises: Wainwright (4.1, 4.2, 4.3, 4.6, 4.15, 4.17)
HW 3 (due Friday November 8th; slide it under my door):
Reading: Wainwright (Sections 5.1, 6.4, 6.5), Vershynin (Chapter 4 and Sections 5.4-5.6). There is a lot of overlap between the two books; you can skim the overlapping sections.
Exercises: Vershynin (4.4.6, 4.5.2, 4.6.2, 4.6.4, 4.7.3, 4.7.6, 5.4.11, 5.4.15)
HW 4 (due Friday November 22nd; slide it under my door):
Reading: Vershynin (Sections 5.1-5.3, Chapter 6, Sections 7.1-7.4).
Exercises: Vershynin (5.1.8, 5.1.13, 5.1.14, 5.2.3, 7.2.4, 7.2.6, 7.2.12, 7.4.2)
HW 5 (due Friday December 6th; slide it under my door):
Reading: Vershynin (Sections 7.5-7.8, Chapter 8).
Exercises: Vershynin (7.6.6, 7.6.9, 8.1.7, 8.5.6, 8.6.4, 8.6.5)
HW 6 (due Monday December 16th; slide it under my door):
Reading: Vershynin (Chapters 9, 10), Wainwright (Sections 7.1-7.4).
Exercises: Vershynin (9.1.8, 9.4.8, 10.4.8, 10.5.12)