1. Concentration for sums of random matrices
Let $X$ be a random real matrix of size $d \times d$. In other words, we have some probability distribution on the space of all $d \times d$ matrices, and we let $X$ be a matrix obtained by sampling from that distribution. Alternatively, we can think of $X$ as a matrix whose entries are real-valued random variables (that are not necessarily independent).
As usual, the expectation of $X$ is simply the weighted average of the possible matrices that $X$ could be, i.e., $\mathrm{E}[X] = \sum_A \Pr[X = A] \cdot A$. Alternatively, we can think of $\mathrm{E}[X]$ as the matrix whose entries are the expectations of the entries of $X$.
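The two descriptions of the expectation can be compared on a toy example. The sketch below is only an illustration: the three-outcome distribution `outcomes` and both function names are made up for this purpose, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical distribution over symmetric 2x2 matrices: (probability, matrix).
# The off-diagonal entries are always equal, so the entries are not independent.
outcomes = [
    (Fraction(1, 2), [[1, 0], [0, 1]]),
    (Fraction(1, 4), [[0, 2], [2, 0]]),
    (Fraction(1, 4), [[4, -2], [-2, 0]]),
]

def expectation_as_weighted_sum(dist):
    """First view: E[X] is the sum over outcomes A of Pr[X = A] * A."""
    d = len(dist[0][1])
    total = [[Fraction(0)] * d for _ in range(d)]
    for p, a in dist:
        for i in range(d):
            for j in range(d):
                total[i][j] += p * a[i][j]
    return total

def expectation_entrywise(dist):
    """Second view: entry (i, j) of E[X] is the expectation of the scalar X[i][j]."""
    d = len(dist[0][1])
    return [[sum(p * a[i][j] for p, a in dist) for j in range(d)]
            for i in range(d)]

mean_a = expectation_as_weighted_sum(outcomes)
mean_b = expectation_entrywise(outcomes)
print(mean_a == mean_b)
```

Both functions return the same matrix, which is just the statement that a weighted average of matrices is computed entry by entry.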
Many concentration results are known for matrices whose entries are independent random variables from certain real-valued distributions (e.g., Gaussian or subgaussian). In fact, in Lecture 8 on Compressed Sensing, we proved concentration of the singular values of a matrix whose entries are independent Gaussians. In this lecture, we will look at random matrices whose entries are not independent, and we will obtain concentration results by summing multiple independent copies of those matrices.
1.1. The Ahlswede-Winter Inequality
The Chernoff bound is a very powerful tool for proving concentration for sums of independent, real-valued random variables. Today we will prove the Ahlswede-Winter inequality, which is a generalization of the Chernoff bound for proving concentration for sums of independent, matrix-valued random variables.
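Let $X_1, \ldots, X_k$ be random, independent, symmetric matrices of size $d \times d$. Define the partial sums $S_j = \sum_{i=1}^{j} X_i$. We would like to analyze the probability that all eigenvalues of $S_k$ are at most $t$ (i.e., $S_k \preceq tI$). For any $\lambda > 0$, this is equivalent to all eigenvalues of $e^{\lambda S_k}$ being at most $e^{\lambda t}$ (i.e., $e^{\lambda S_k} \preceq e^{\lambda t} I$). If this event fails to hold then certainly $\operatorname{tr} e^{\lambda S_k} > e^{\lambda t}$, since all eigenvalues of $e^{\lambda S_k}$ are non-negative. Thus, applying Markov's inequality to the non-negative random variable $\operatorname{tr} e^{\lambda S_k}$, we have bounded the probability that some eigenvalue of $S_k$ is greater than $t$ as follows:
$$\Pr[S_k \not\preceq tI] \;\le\; \Pr[\operatorname{tr} e^{\lambda S_k} > e^{\lambda t}] \;\le\; \mathrm{E}[\operatorname{tr} e^{\lambda S_k}] \,/\, e^{\lambda t}. \qquad (1)$$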
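The step from eigenvalues to the trace of a matrix exponential can be sanity-checked on a single fixed matrix. The following is a rough sketch rather than a general-purpose routine: the matrix `S`, the values of `lam` and `t`, and all helper names are illustrative assumptions, the matrix exponential is a truncated Taylor series (adequate only for small matrices of modest norm), and the eigenvalue formula is specific to symmetric 2x2 matrices.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated Taylor series I + A + A^2/2! + ... for e^A."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def lambda_max_2x2(s):
    """Largest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = s[0][0], s[0][1], s[1][1]
    return (a + c) / 2 + math.hypot((a - c) / 2, b)

# Illustrative values: S has eigenvalues +sqrt(5) and -sqrt(5).
S = [[1.0, 2.0], [2.0, -1.0]]
lam, t = 0.5, 2.0

top = lambda_max_2x2(S)
tr_exp = trace(mat_exp([[lam * x for x in row] for row in S]))

# If some eigenvalue of S exceeds t, the trace of e^{lam S} exceeds e^{lam t}.
event_eigen = top > t
event_trace = tr_exp > math.exp(lam * t)
print(event_eigen, event_trace)
```

The trace of $e^{\lambda S}$ is the sum of the values $e^{\lambda \mu}$ over the eigenvalues $\mu$ of $S$, and each of those terms is positive, so the trace is at least $e^{\lambda \mu_{\max}}$. That is the only fact the reduction uses.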
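Now let us observe a useful property of the trace. Since it is linear, it commutes with expectation:
$$\mathrm{E}[\operatorname{tr} X] \;=\; \mathrm{E}\Big[\sum_{i} X_{i,i}\Big] \;=\; \sum_{i} \mathrm{E}[X_{i,i}] \;=\; \operatorname{tr} \mathrm{E}[X].$$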
The proof of the Ahlswede-Winter inequality is very similar to the proof of the Chernoff bound; one just has to be a bit careful to do the matrix algebra properly. As in the proof of the Chernoff bound, the main technical step is to bound the expectation in (1) by a product of expectations that each involve a single $X_i$, because those individual expectations are much easier to analyze. This is where the Golden-Thompson inequality (Theorem 17 in the Notes on Symmetric Matrices) is needed.
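For symmetric matrices $A$ and $B$, the Golden-Thompson inequality states that $\operatorname{tr} e^{A+B} \le \operatorname{tr}(e^{A} e^{B})$, with equality when $A$ and $B$ commute. As a rough numerical illustration, the sketch below evaluates both sides for one non-commuting pair and one commuting pair. The matrices and helper names are arbitrary choices made for this example, and the matrix exponential is again a truncated Taylor series that is only suitable for small inputs.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated Taylor series I + A + A^2/2! + ... for e^A."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, a)]
        result = mat_add(result, term)
    return result

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def golden_thompson_sides(a, b):
    """Return (tr e^{A+B}, tr(e^A e^B)) for symmetric matrices A and B."""
    lhs = trace(mat_exp(mat_add(a, b)))
    rhs = trace(mat_mul(mat_exp(a), mat_exp(b)))
    return lhs, rhs

# Illustrative symmetric matrices that do not commute.
A = [[1.0, 0.5], [0.5, 0.0]]
B = [[0.0, -1.0], [-1.0, 0.5]]
lhs, rhs = golden_thompson_sides(A, B)

# A commuting pair (A and 2A), where the two sides should agree.
twice_A = [[2 * x for x in row] for row in A]
lhs_c, rhs_c = golden_thompson_sides(A, twice_A)

print(lhs, rhs)
print(lhs_c, rhs_c)
```

In the scalar Chernoff proof the identity $e^{a+b} = e^{a} e^{b}$ is used without comment. That identity can fail for matrices that do not commute, and the Golden-Thompson inequality is the one-sided substitute that still holds after taking the trace.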
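$$\begin{aligned}
\mathrm{E}[\operatorname{tr} e^{\lambda S_k}]
&= \mathrm{E}[\operatorname{tr} e^{\lambda X_k + \lambda S_{k-1}}] \\
&\le \mathrm{E}[\operatorname{tr}(e^{\lambda X_k} e^{\lambda S_{k-1}})] && \text{(Golden-Thompson)} \\
&= \mathrm{E}_{X_1,\ldots,X_{k-1}}\big[\mathrm{E}_{X_k}[\operatorname{tr}(e^{\lambda X_k} e^{\lambda S_{k-1}})]\big] && \text{(independence)} \\
&= \mathrm{E}_{X_1,\ldots,X_{k-1}}\big[\operatorname{tr}(\mathrm{E}_{X_k}[e^{\lambda X_k}] \, e^{\lambda S_{k-1}})\big] && \text{(trace and expectation commute)} \\
&\le \mathrm{E}_{X_1,\ldots,X_{k-1}}\big[\|\mathrm{E}[e^{\lambda X_k}]\| \cdot \operatorname{tr} e^{\lambda S_{k-1}}\big] \\
&= \|\mathrm{E}[e^{\lambda X_k}]\| \cdot \mathrm{E}[\operatorname{tr} e^{\lambda S_{k-1}}],
\end{aligned}$$
where the last inequality follows from Corollary 14 in the Notes on Symmetric Matrices. Applying this inequality inductively, we get
$$\mathrm{E}[\operatorname{tr} e^{\lambda S_k}] \;\le\; \prod_{i=1}^{k} \|\mathrm{E}[e^{\lambda X_i}]\| \cdot \operatorname{tr} e^{\lambda \cdot 0} \;=\; d \cdot \prod_{i=1}^{k} \|\mathrm{E}[e^{\lambda X_i}]\|,$$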
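since $e^{\lambda \cdot 0} = I$ and $\operatorname{tr} I = d$. Combining this with (1), we obtain
$$\Pr[S_k \not\preceq tI] \;\le\; d \, e^{-\lambda t} \prod_{i=1}^{k} \|\mathrm{E}[e^{\lambda X_i}]\|.$$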
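To see a bound of the form $d \, e^{-\lambda t} \prod_i \|\mathrm{E}[e^{\lambda X_i}]\|$ in action, the sketch below compares it with an exactly computed tail probability in a toy setting. Everything specific here is an assumption made for illustration: each $X_i$ is taken to be $\varepsilon_i A$ for a fixed symmetric matrix `A` and an independent uniform sign $\varepsilon_i$, the constants `K` and `D` play the roles of $k$ and $d$, and the helper names are invented. The exact probability is obtained by enumerating all $2^k$ sign patterns, so no random sampling is involved.

```python
import itertools
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated Taylor series I + A + A^2/2! + ... for e^A."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def scale(a, c):
    return [[c * x for x in row] for row in a]

def lambda_max_2x2(s):
    """Largest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = s[0][0], s[0][1], s[1][1]
    return (a + c) / 2 + math.hypot((a - c) / 2, b)

# Toy model (an assumption for this sketch): X_i = eps_i * A with uniform signs.
A = [[0.6, 0.8], [0.8, -0.6]]  # symmetric, eigenvalues +1 and -1
K, D = 6, 2                    # number of summands and matrix dimension

def exact_tail(t):
    """Exact Pr[largest eigenvalue of S_K > t], enumerating all sign patterns."""
    hits = sum(1 for signs in itertools.product((-1, 1), repeat=K)
               if lambda_max_2x2(scale(A, sum(signs))) > t)
    return hits / 2 ** K

def aw_bound(lam, t):
    """D * e^{-lam t} * prod_i ||E[e^{lam X_i}]|| for the toy model."""
    plus, minus = mat_exp(scale(A, lam)), mat_exp(scale(A, -lam))
    mean_exp = [[(plus[i][j] + minus[i][j]) / 2 for j in range(D)]
                for i in range(D)]
    # mean_exp is symmetric positive semidefinite, so its operator norm
    # equals its largest eigenvalue.
    norm = lambda_max_2x2(mean_exp)
    return D * math.exp(-lam * t) * norm ** K

for lam in (0.5, 1.0, 1.2):
    print(lam, exact_tail(5.0), aw_bound(lam, 5.0))
```

The bound holds for every $\lambda > 0$, so in an application one is free to choose the $\lambda$ that makes the right-hand side smallest, exactly as in the scalar Chernoff argument.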