## Lecture 13: The Ahlswede-Winter Inequality

1. Useful versions of the Ahlswede-Winter Inequality

Theorem 1 Let ${Y}$ be a random, symmetric, positive semi-definite ${d \times d}$ matrix such that ${{\mathrm E}[ Y ] = I}$. Suppose ${\lVert Y \rVert \leq R}$ for some fixed scalar ${R \geq 1}$. Let ${Y_1, \ldots, Y_k}$ be independent copies of ${Y}$ (i.e., independently sampled matrices with the same distribution as ${Y}$). For any ${\epsilon \in (0,1)}$, we have

$\displaystyle {\mathrm{Pr}}\Bigg[ (1-\epsilon) I \:\preceq\: \frac{1}{k} \sum_{i=1}^k Y_i \:\preceq\: (1+\epsilon) I \Bigg] ~\geq~ 1 - 2d \cdot \exp( - \epsilon^2 k / 4 R ).$

This event is equivalent to the sample average ${\frac{1}{k} \sum_{i=1}^k Y_i}$ having minimum eigenvalue at least ${1-\epsilon}$ and maximum eigenvalue at most ${1+\epsilon}$.
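As a quick numerical illustration (not part of the notes), here is a small numpy experiment with a toy distribution chosen so the hypotheses are easy to verify: ${Y = d \cdot e_j e_j^{\mathsf T}}$ for a uniformly random standard basis vector ${e_j}$, which satisfies ${{\mathrm E}[Y] = I}$ and ${\lVert Y \rVert = d}$, so the theorem applies with ${R = d}$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5        # dimension; for this toy distribution R = ||Y|| = d
k = 20000    # number of independent samples
# Y = d * e_j e_j^T for uniform j, so E[Y] = I and ||Y|| = d.
js = rng.integers(0, d, size=k)
avg = np.zeros((d, d))
np.fill_diagonal(avg, d * np.bincount(js, minlength=d) / k)
eigs = np.linalg.eigvalsh(avg)
# Theorem 1 with eps = 0.1 bounds the failure probability by
# 2d * exp(-eps^2 k / (4R)) = 10 * exp(-10), about 5e-4 here.
print(eigs.min(), eigs.max())  # both should be close to 1
```

With these parameters each eigenvalue of the sample average is a rescaled binomial proportion, so the observed deviation from 1 should be on the order of ${10^{-2}}$.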

Proof: We apply the Ahlswede-Winter inequality with ${ X_i = \big(Y_i - {\mathrm E}[ Y_i ] \big) / R}$. Note that ${{\mathrm E}[ X_i ] = 0}$, ${\lVert X_i \rVert \leq 1}$, and

$\displaystyle \begin{array}{rcl} {\mathrm E}[ X_i^2 ] &=& \frac{1}{R^2} {\mathrm E}\big[ \big(Y_i - {\mathrm E}[Y_i] \big)^2 \big] \\ &=& \frac{1}{R^2} \Big( {\mathrm E}[ Y_i^2 ] - {\mathrm E}[Y_i]^2 \Big) \\ &\preceq& \frac{1}{R^2} {\mathrm E}[ Y_i^2 ] \qquad(\mathrm{since}~ {\mathrm E}[ Y_i ]^2 \succeq 0)\\ &\preceq& \frac{1}{R^2} {\mathrm E}[ \lVert Y_i \rVert \cdot Y_i ] \qquad(\mathrm{since}~ Y_i^2 \preceq \lVert Y_i \rVert \cdot Y_i ~\mathrm{for}~ Y_i \succeq 0)\\ &\preceq& \frac{1}{R} {\mathrm E}[ Y_i ]. \end{array}$

Finally, since ${0 \preceq {\mathrm E}[Y_i] \preceq I}$, we get

$\displaystyle \lambda_{\mathrm{max}} \big( {\mathrm E}[X_i^2] \big) ~\leq~ 1/R. \ \ \ \ \ (1)$

Now we use Claim 15 from the Notes on Symmetric Matrices, together with the inequalities

$\displaystyle \begin{array}{rcl} 1 + x &\leq& e^x \quad\forall x \in {\mathbb R} \\ e^x &\leq& 1 + x + x^2 \quad\forall x \in [-1,1]. \end{array}$
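As a sanity check (not part of the proof), both scalar inequalities can be verified numerically on a grid:

```python
import numpy as np

# 1 + x <= e^x holds for all real x; check on a wide grid.
xs = np.linspace(-5.0, 5.0, 4001)
assert np.all(1 + xs <= np.exp(xs) + 1e-12)

# e^x <= 1 + x + x^2 holds on [-1, 1]; check on a fine grid.
xs = np.linspace(-1.0, 1.0, 4001)
assert np.all(np.exp(xs) <= 1 + xs + xs**2 + 1e-12)
print("both inequalities hold on the grids")
```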

Since ${\lVert X_i \rVert \leq 1}$, for any ${\lambda \in [0,1]}$, we have ${ e^{\lambda X_i} \preceq I + \lambda X_i + \lambda^2 X_i^2 }$, and so

$\displaystyle {\mathrm E}[ e^{\lambda X_i} ] ~\preceq~ {\mathrm E}[ I + \lambda X_i + \lambda^2 X_i^2 ] ~=~ I + \lambda^2 {\mathrm E}[ X_i^2 ] ~\preceq~ e^{ \lambda^2 {\mathrm E}[ X_i^2 ] },$

where the equality uses ${{\mathrm E}[X_i] = 0}$ and the last step uses ${I + A \preceq e^A}$ for symmetric ${A}$.

Thus by (1) we have

$\displaystyle \lVert {\mathrm E}[ e^{\lambda X_i} ] \rVert ~\leq~ \lVert e^{ \lambda^2 {\mathrm E}[ X_i^2 ] } \rVert ~\leq~ e^{ \lambda^2 / R }.$

The same analysis also shows that ${\lVert {\mathrm E}[ e^{-\lambda X_i} ] \rVert \leq e^{ \lambda^2 / R }}$. Substituting these two bounds into the basic Ahlswede-Winter inequality from the previous lecture, we obtain

$\displaystyle {\mathrm{Pr}}\Bigg[~ \Big\lVert \sum_{i=1}^k \frac{1}{R} \big(Y_i - {\mathrm E}[Y_i] \big) \Big\rVert >t ~\Bigg] ~\leq~ 2d \cdot e^{-\lambda t} \prod_{i=1}^k e^{ \lambda^2 / R} ~=~ 2d \cdot \exp( -\lambda t + k \lambda^2 / R ).$

Substituting ${t = k \epsilon / R}$ and ${\lambda = \epsilon/2}$ (which lies in ${[0,1]}$ since ${\epsilon \in (0,1)}$), we get

$\displaystyle {\mathrm{Pr}}\Bigg[~ \Big\lVert \frac{1}{R} \sum_{i=1}^k Y_i - \frac{k}{R} {\mathrm E}[Y_i] \Big\rVert > \frac{k \epsilon}{R} ~\Bigg] ~\leq~ 2d \cdot \exp( - k \epsilon^2 / 4R ).$

Multiplying the matrix inside the norm by ${R/k}$ and using the fact that ${{\mathrm E}[Y_i]=I}$, the event becomes ${\big\lVert \frac{1}{k} \sum_{i=1}^k Y_i - I \big\rVert > \epsilon}$. So we have bounded the probability that some eigenvalue of the sample average ${\frac{1}{k} \sum_{i=1}^k Y_i}$ is less than ${1-\epsilon}$ or greater than ${1+\epsilon}$, as required. $\Box$
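The exponent arithmetic in the final substitution can be sanity-checked directly: with ${t = k\epsilon/R}$ and ${\lambda = \epsilon/2}$, the exponent ${-\lambda t + k\lambda^2/R}$ becomes ${-k\epsilon^2/(2R) + k\epsilon^2/(4R) = -k\epsilon^2/(4R)}$. A minimal check in plain Python (the sample values are arbitrary):

```python
# Verify -lam*t + k*lam^2/R == -k*eps^2/(4R) when t = k*eps/R, lam = eps/2.
k, R, eps = 100, 3.0, 0.2   # arbitrary sample values
t, lam = k * eps / R, eps / 2
exponent = -lam * t + k * lam**2 / R
assert abs(exponent - (-k * eps**2 / (4 * R))) < 1e-12
print("exponent matches -k*eps^2/(4R)")
```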

Corollary 2 Let ${Z}$ be a random, symmetric, positive semi-definite ${d \times d}$ matrix. Define ${U := {\mathrm E}[ Z ]}$ and suppose ${Z \preceq R \cdot U}$ for some scalar ${R \geq 1}$. Let ${Z_1, \ldots, Z_k}$ be independent copies of ${Z}$. For any ${\epsilon \in (0,1)}$, we have

$\displaystyle {\mathrm{Pr}}\Bigg[ (1-\epsilon) U \:\preceq\: \frac{1}{k} \sum_{i=1}^k Z_i \:\preceq\: (1+\epsilon) U \Bigg] ~\geq~ 1 - 2d \cdot \exp( - \epsilon^2 k / 4 R ).$
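Before the proof, a numerical illustration of the corollary (a toy numpy sketch, not from the notes): take ${Z = m \, a_j a_j^{\mathsf T}}$ for a uniformly random row ${a_j}$ of a fixed matrix, so ${U = {\mathrm E}[Z] = \sum_j a_j a_j^{\mathsf T}}$ and ${Z \preceq R \cdot U}$ holds with ${R = m \max_j a_j^{\mathsf T} U^{-1} a_j}$ (i.e., ${m}$ times the largest leverage score).

```python
import numpy as np

rng = np.random.default_rng(2)
d, m, k = 3, 10, 50000
A = rng.standard_normal((m, d))    # rows a_1, ..., a_m (arbitrary data)
U = A.T @ A                        # U = E[Z] for Z = m * a_j a_j^T, j uniform
# Smallest R with Z <= R*U: m times the maximum leverage score.
R = m * max(A[j] @ np.linalg.solve(U, A[j]) for j in range(m))
js = rng.integers(0, m, size=k)
avg = (m / k) * (A[js].T @ A[js])  # sample average (1/k) * sum_i Z_i
# Check (1-eps)U <= avg <= (1+eps)U by whitening with U^{-1/2}:
evals, evecs = np.linalg.eigh(U)
W = evecs @ np.diag(evals**-0.5) @ evecs.T
rel = np.linalg.eigvalsh(W @ avg @ W)
print(R, rel.min(), rel.max())     # whitened eigenvalues should be near 1
```

The whitened eigenvalues `rel` are exactly the eigenvalues of ${U^{-1/2} \big(\frac{1}{k}\sum_i Z_i\big) U^{-1/2}}$, so the event in the corollary is `rel` lying in ${[1-\epsilon, 1+\epsilon]}$.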

Proof: Let ${U^{+/2} := (U^+)^{1/2}}$ denote the square root of the pseudoinverse of ${U}$. Let ${I_{\mathrm{im}~U}}$ denote the orthogonal projection onto the image of ${U}$. Define the random, positive semi-definite matrices

$\displaystyle Y ~:=~ U^{+/2} \cdot Z \cdot U^{+/2} \qquad\mathrm{and}\qquad Y_i ~:=~ U^{+/2} \cdot Z_i \cdot U^{+/2}.$

Because ${0 \preceq Z_i \preceq R \cdot U}$, we have ${\mathrm{im}(Z_i) \subseteq \mathrm{im}(U)}$. So Claim 16 in the Notes on Symmetric Matrices implies

$\displaystyle (1-\epsilon) U \:\preceq\: \frac{1}{k} \sum_{i=1}^k Z_i \:\preceq\: (1+\epsilon) U \qquad\Longleftrightarrow\qquad (1-\epsilon) I_{\mathrm{im}~U} \:\preceq\: \frac{1}{k} \sum_{i=1}^k Y_i \:\preceq\: (1+\epsilon) I_{\mathrm{im}~U}.$

We would like to use Theorem 1 to obtain our desired bound. We just need to check that the hypotheses of the theorem are satisfied. By Fact 6 from the Notes on Symmetric Matrices, we have

$\displaystyle Y ~=~ U^{+/2} \cdot Z \cdot U^{+/2} ~\preceq~ U^{+/2} \cdot (R \cdot U) \cdot U^{+/2} ~=~ R \cdot I_{\mathrm{im}~U},$

showing that ${\lVert Y \rVert \leq R}$. Next,

$\displaystyle {\mathrm E}[Y] ~=~ U^{+/2} \cdot {\mathrm E}[Z] \cdot U^{+/2} ~=~ U^{+/2} \cdot U \cdot U^{+/2} ~=~ I_{\mathrm{im}~U}.$

So the hypotheses of Theorem 1 are almost satisfied, with the small issue that ${{\mathrm E}[Y]}$ is not actually the identity, but merely the identity on the image of ${U}$. But, one may check that the proof of Theorem 1 still goes through as long as every eigenvalue of ${{\mathrm E}[Y]}$ is either ${0}$ or ${1}$, i.e., ${{\mathrm E}[Y]}$ is an orthogonal projection matrix. The details are left as an exercise. $\Box$
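The whitening step in this proof is easy to check numerically. The sketch below (my own toy example, not from the notes) builds a rank-deficient ${U}$, forms ${U^{+/2}}$ by inverting only the square roots of the nonzero eigenvalues, and verifies that ${U^{+/2} \cdot U \cdot U^{+/2}}$ is exactly the orthogonal projection ${I_{\mathrm{im}~U}}$:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 4, 2
# Hypothetical rank-deficient example: U = B B^T has rank r < d,
# so the pseudoinverse square root U^{+/2} is genuinely needed.
B = rng.standard_normal((d, r))
U = B @ B.T
evals, evecs = np.linalg.eigh(U)
keep = evals > 1e-10
# U^{+/2}: invert square roots of nonzero eigenvalues, zero out the rest.
U_phalf = evecs[:, keep] @ np.diag(evals[keep] ** -0.5) @ evecs[:, keep].T
P = evecs[:, keep] @ evecs[:, keep].T   # orthogonal projection onto im(U)
EY = U_phalf @ U @ U_phalf              # plays the role of E[Y] in the proof
assert np.allclose(EY, P, atol=1e-8)
print("U^{+/2} U U^{+/2} equals the projection onto im(U)")
```

In particular, every eigenvalue of ${EY}$ is ${0}$ or ${1}$, which is exactly the condition under which the proof of Theorem 1 still goes through.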