A Table of Exponential Inequalities
August 23, 2023
\[\newcommand{\Pr}{\mathbb{P}} \newcommand{\E}{\mathbb{E}} \newcommand{\Re}{\mathbb{R}} \newcommand{\Xbar}{\overline{X}} \newcommand{\Var}{\mathbb{V}}\]
I forget useful exponential inequalities all the time and I'm sick of looking them up. (Pf) stands for proof and will jump you to the proof of the given inequality. If I've omitted your favorite inequality, let me know.
I considered writing a big list of concentration inequalities, but then I realized that most of the time what I really wanted was the inequality underlying the concentration inequality, which is typically some sort of exponential inequality. So here we are. Plus, I do a lot of work with martingales, and writing things out like I've done here makes the resulting supermartingale obvious.
Are there papers that do this more formally and completely? Yes, e.g., this one. But I needed something for myself that was easy and interpretable.
Throughout, let \(X\) be a random variable with mean \(\mu\). Set \(\Xbar = X-\mu\).
Name | Condition | Bound |
---|---|---|
Hoeffding (Pf) | \(X\in [a,b]\) | \(\forall\lambda\in\Re\quad \E\exp\{\lambda \Xbar - \frac{\lambda^2}{8}(b-a)^2\}\leq 1\) |
Self-bounding (Pf) | MGF exists | \(\forall \lambda\in\Re\quad \E \exp\{\lambda X - \log \E \exp(\lambda X)\}= 1\) |
Bennett (Pf) | \(X \leq H\), \(\E[X^2]\leq v^2<\infty\) | \(\forall\lambda\in (0,\frac{1}{v})\quad \E\exp\{\lambda\Xbar - \frac{v^2}{H^2}(e^{\lambda H} - \lambda H -1)\}\leq 1\) |
One-sided Bernstein I (Pf) | \(\vert X \vert \leq H\) | \(\forall \vert\lambda \vert\leq \frac{1}{2H}\quad \E\exp\{\lambda \Xbar - (e-2)\lambda^2\Var(X)\}\leq 1\) |
One-sided Bernstein II (Pf) | \(X\geq 0\) and \(\E[X^2]<\infty\) | \(\forall \lambda>0 \quad \E\exp\{-\lambda\Xbar - \lambda^2\E[X^2]/2\}\leq 1\) |
Delyon (Pf) | \(\E[X^2] < \infty\) | \(\forall \lambda\in \Re\quad \E[\exp\{\lambda\Xbar - \frac{\lambda^2}{6}(\Xbar^2 + 2\E[\Xbar^2])\}]\leq 1\) |
Fan (Pf) | \(X\geq -1\), \(\E[X]\leq 0\) | \(\forall \lambda\in[0,1)\quad \E\exp\{\lambda X + X^2(\log(1 - \lambda) + \lambda)\}\leq 1\) |
Catoni (Pf) | \(\E[\vert X\vert^p]<\infty\), \(p>1\) | \(\forall\lambda>0\quad \E\exp\{\phi_p(\lambda \Xbar) - \frac{\lambda^p}{p}\E[\vert \Xbar\vert^p]\} \leq 1\), where \(\phi_p\) is any function satisfying \(\phi_p(x) \leq \log(1 + x + \vert x\vert^p/p)\) |
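These bounds are easy to sanity-check numerically. As a quick illustration (a throwaway sketch, not part of any proof; the function name is mine), here is a Python check of the Hoeffding row for a Bernoulli(\(p\)) variable, where \(a=0\), \(b=1\) and the expectation can be computed exactly:

```python
import math

# Hoeffding row for X ~ Bernoulli(p), so a = 0, b = 1, mu = p:
# E exp{lam*(X - mu) - lam^2/8} should be <= 1 for every real lam.
def hoeffding_lhs(p, lam):
    # Exact expectation over the two outcomes of X.
    mgf = p * math.exp(lam * (1 - p)) + (1 - p) * math.exp(-lam * p)
    return mgf * math.exp(-lam ** 2 / 8)

for p in [0.1, 0.5, 0.9]:
    for lam in [-5.0, -1.0, -0.1, 0.1, 1.0, 5.0]:
        assert hoeffding_lhs(p, lam) <= 1 + 1e-12, (p, lam)
print("Hoeffding row holds for the Bernoulli examples")
```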
Proof of Hoeffding
Just note that a bounded random variable \(X\in[a,b]\) is \((b-a)/2\)-sub-Gaussian, and apply the definition of sub-Gaussianity. It can also be proved more directly; see Wikipedia.
Proof of Self-Bounding
Clear by definition:
\[\begin{align} \E\exp\{\lambda X - \log \E \exp(\lambda X)\} &= \E\exp\{\lambda X\}\exp\{-\log \E\exp\{\lambda X\}\} \\ &= \E\exp\{\lambda X\}(\E\exp\{\lambda X\})^{-1} = 1. \end{align}\]
Proof of Bennett
Fix \(\lambda>0\) and set \(\psi(x) = e^x - x -1\). Notice that the function \(\psi(x)/x^2\) is monotonically increasing. (Technically, at \(x=0\), one should continuously extend the function; the left and right limits are both \(1/2\), so this works out. Plot the function if you don't believe me.) Since \(X\leq H\) and \(\lambda>0\), we have \(\lambda X\leq \lambda H\), so
\[\frac{1}{(\lambda X)^2}\psi(\lambda X) \leq \frac{1}{(\lambda H)^2}\psi(\lambda H).\]Rearranging,
\[\begin{equation} \label{eq:bennett-I} e^{\lambda X} \leq \frac{X^2}{H^2}\psi(\lambda H) + \lambda X + 1, \end{equation}\]so
\[\begin{equation} \label{eq:bennett-ii} \E e^{\lambda X} \leq \frac{\E X^2}{H^2}\psi(\lambda H) + \lambda \mu + 1 \leq \frac{v^2}{H^2}\psi(\lambda H) + \lambda \mu + 1, \tag{1} \end{equation}\]where \(v>0\) is a bound on the second moment, i.e., \(\E[X^2]\leq v^2<\infty\), and the second inequality uses \(\psi(\lambda H)\geq 0\) (which follows from \(e^x \geq 1 + x\)). Note that \(\mu^2 = (\E X)^2 \leq \E X^2 \leq v^2\) by Jensen's inequality, so \(|\mu|\leq v\).
Since \(\frac{v^2}{H^2}\psi(\lambda H) + \lambda\mu \geq \lambda\mu \geq -\lambda v> -1\) for \(0<\lambda< 1/v\), the right-hand side of \eqref{eq:bennett-ii} is positive, and deploying the inequality \(\log(1 +x )\leq x\) for \(x>-1\), we have
\[\log \E e^{\lambda X} \leq \frac{v^2}{H^2}\psi(\lambda H) + \lambda\mu.\]Subtracting \(\lambda \mu = \log e^{\lambda \mu}\) from both sides, we obtain
\[\log \E \frac{e^{\lambda X}}{e^{\lambda \mu}} \leq \frac{v^2}{H^2}\psi(\lambda H) + \lambda\mu - \lambda\mu = \frac{v^2}{H^2}\psi(\lambda H),\]which, after exponentiating and rearranging, is the desired inequality.
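If you'd rather compute than plot, here is a quick grid check (a throwaway sketch) of the claim that \(\psi(x)/x^2\) is monotonically increasing, with the continuous extension \(1/2\) at zero:

```python
import math

# g(x) = psi(x)/x^2 with psi(x) = e^x - x - 1, continuously
# extended to g(0) = 1/2; the claim is that g is nondecreasing.
def g(x):
    if x == 0:
        return 0.5
    return (math.exp(x) - x - 1) / x ** 2

xs = [k / 100 for k in range(-1000, 1001)]  # grid on [-10, 10], includes 0
vals = [g(x) for x in xs]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
print("psi(x)/x^2 is nondecreasing on the grid")
```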
Proof of one-sided Bernstein I
The inequality \(e^x \leq 1 + x + (e-2)x^2\) holds for all \(x\leq 1\). Applying this with \(x = \lambda\Xbar\) and taking expectations we obtain
\[\begin{align} \E[e^{\lambda \Xbar}] \leq 1 + \E[\lambda \Xbar] + (e-2)\lambda^2 \Var(X) = 1 + (e-2)\lambda^2 \Var(X) \leq \exp\{(e-2) \lambda^2\Var(X)\}. \end{align}\]Rearranging gives the result. Note that we needed the condition on \(\lambda\) to ensure that \(\lambda \Xbar\leq 1\): since \(\vert X\vert \leq H\) implies \(\vert \mu\vert\leq H\), we have \(\lambda\Xbar \leq \vert\lambda\vert \vert X-\mu\vert\leq \vert\lambda \vert(\vert X\vert + \vert\mu\vert) \leq 2H\vert\lambda\vert \leq 1\).
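The pointwise inequality \(e^x \leq 1 + x + (e-2)x^2\) for \(x\leq 1\) can itself be spot-checked on a grid (equality holds at \(x=0\) and \(x=1\)); a throwaway sketch:

```python
import math

# Check e^x <= 1 + x + (e-2)*x^2 on a grid of x in [-20, 1].
for k in range(21001):
    x = -20 + 0.001 * k
    assert math.exp(x) <= 1 + x + (math.e - 2) * x ** 2 + 1e-12, x
print("e^x <= 1 + x + (e-2)x^2 holds on [-20, 1]")
```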
Proof of one-sided Bernstein II
Let \(Z=-X\) and put \(\psi(s)= e^s - s -1\). Set
\[g(s) = \begin{cases} \psi(s)/s^2,&s\neq 0,\\ 1/2,&s=0. \end{cases}\]Note that \(g\) is simply the continuous extension of \(\psi(s)/s^2\) at \(s=0\): indeed, \(\lim_{s\to0^+} \psi(s)/s^2=\lim_{s\to0^-}\psi(s)/s^2=1/2\). Note also that \(g\) is nondecreasing on \(\Re\). Therefore, for all \(s\leq 0\), \(\psi(s) = s^2 g(s) \leq s^2 g(0) = \frac{s^2}{2}.\) Since \(Z\leq 0\) and \(\lambda>0\), we may take \(s=\lambda Z\) to obtain \(\psi(\lambda Z) \leq (\lambda Z)^2/2\). Taking expectations, \(\E[e^{\lambda Z}] - \lambda \E[Z] - 1 \leq \frac{\lambda^2}{2}\E[Z^2],\) and
\[\begin{align} \E[\exp(\lambda(Z-\E[Z]))] &\leq e^{-\lambda \E[Z]} ( 1 + \lambda \E[Z] + \lambda^2\E[Z^2]/2) \\ &\leq e^{-\lambda \E[Z]}\exp(\lambda \E[Z] + \lambda^2\E[Z^2]/2) = \exp(\lambda^2\E[Z^2]/2). \end{align}\]Replacing \(Z\) with \(-X\) completes the proof.
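The crux here is the pointwise bound \(\psi(s)\leq s^2/2\) for \(s\leq 0\); a quick grid check (a throwaway sketch):

```python
import math

# The key pointwise bound: psi(s) = e^s - s - 1 <= s^2/2 for s <= 0.
for k in range(3001):
    s = -30 + 0.01 * k  # grid on [-30, 0]
    assert math.exp(s) - s - 1 <= s ** 2 / 2 + 1e-12, s
print("psi(s) <= s^2/2 holds on [-30, 0]")
```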
Proof of Delyon
Delyon (2009) shows that for all \(x\in \Re\), \(\exp( x - x^2/6) \leq 1 + x + x^2/3\). Setting \(x = \lambda\Xbar\) and taking expectations gives
\[\begin{align} \E[\exp\{\lambda\Xbar - \lambda^2\Xbar^2/6\}] &\leq 1 + \lambda\E[\Xbar] + \lambda^2\E[\Xbar^2]/3 \\ &= 1 + \lambda^2\E[\Xbar^2]/3 \leq \exp\{\E[\lambda^2\Xbar^2]/3\}, \end{align}\]using that \(1+x\leq e^x\). Rearranging gives the result.
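Delyon's pointwise inequality is also easy to check on a grid (a throwaway sketch; near \(x=0\) the two sides agree to third order, so the margin there is tiny):

```python
import math

# Delyon's pointwise inequality: exp(x - x^2/6) <= 1 + x + x^2/3 for real x.
for k in range(10001):
    x = -50 + 0.01 * k  # grid on [-50, 50]
    assert math.exp(x - x ** 2 / 6) <= 1 + x + x ** 2 / 3 + 1e-9, x
print("exp(x - x^2/6) <= 1 + x + x^2/3 holds on the grid")
```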
Proof of Fan
See original Fan et al. paper here. Note that the function
\[f(x) = \frac{\log(1+x) -x}{x^2/2}\]is non-decreasing for all \(x>-1\). (Continuously extend the function at \(x=0\).) Take \(x=-\lambda\) and \(y = \lambda X\). Since \(X\geq -1\) and \(\lambda\in[0,1)\), we have \(y = \lambda X\geq -\lambda = x > -1\). Therefore \(f(x)\leq f(y)\), which reads
\[f(-\lambda) = \frac{\log(1-\lambda)+\lambda}{\lambda^2/2} \leq f(\lambda X) = \frac{\log(1 + \lambda X) -\lambda X}{(\lambda X)^2/2},\]which, after multiplying through by \((\lambda X)^2/2\geq 0\), rearranges to
\[\lambda X + X^2(\log(1-\lambda) + \lambda)\leq \log(1 + \lambda X).\]Exponentiating and taking expectations gives
\[\E\exp\{\lambda X + X^2(\log(1-\lambda) +\lambda)\}\leq \E[1 + \lambda X] \leq 1,\]as claimed.
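The pointwise inequality here can be spot-checked too (a throwaway sketch; equality holds at \(x=-1\) and \(x=0\)):

```python
import math

# Fan et al.'s pointwise inequality:
# lam*x + x^2*(log(1 - lam) + lam) <= log(1 + lam*x)
# for x >= -1 and lam in [0, 1).
for lam in [0.1, 0.5, 0.9, 0.99]:
    for k in range(4001):
        x = -1 + 0.01 * k  # grid on [-1, 39]
        lhs = lam * x + x ** 2 * (math.log(1 - lam) + lam)
        assert lhs <= math.log(1 + lam * x) + 1e-12, (lam, x)
print("Fan's pointwise inequality holds on the grid")
```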
Proof of Catoni
Let \(\E\vert \Xbar\vert^p\leq v^p<\infty\). By definition of \(\phi_p\),
\[\begin{align} \E\exp\bigg\{\phi_p(\lambda \Xbar) - \frac{\lambda^pv^p}{p} \bigg\} &\leq \E\bigg(1 + \lambda \Xbar + \frac{\vert\lambda \Xbar\vert^p}{p}\bigg)\exp\bigg(- \frac{\lambda^pv^p}{p}\bigg) \\ &= \E\bigg(1 + \frac{\vert\lambda \Xbar\vert^p}{p}\bigg)\exp\bigg(- \frac{\lambda^pv^p}{p}\bigg)\\ &\leq \exp\bigg(\frac{\lambda^p \E\vert\Xbar\vert^p}{p}\bigg)\exp\bigg(- \frac{\lambda^pv^p}{p}\bigg)\leq 1, \end{align}\]as desired.
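To see the bound in action, here is an exact check (a throwaway sketch; the names are mine) for a centered two-point variable \(\Xbar\in\{-a,b\}\) with \(\Pr(\Xbar = b) = a/(a+b)\), using the maximal admissible choice \(\phi_p(x) = \log(1 + x + \vert x\vert^p/p)\). Note that \(1 + x + \vert x\vert^p/p \geq 1/p > 0\) for \(p>1\) (the minimum is at \(x=-1\)), so the log is always well-defined.

```python
import math

# Exact check of the Catoni bound for centered Xbar in {-a, b},
# with P(Xbar = b) = a/(a+b) so that E[Xbar] = 0. We use the maximal
# admissible phi_p(x) = log(1 + x + |x|^p/p); its argument is at
# least 1/p > 0 for p > 1, so the log is well-defined everywhere.
def phi(x, p):
    return math.log(1 + x + abs(x) ** p / p)

def catoni_lhs(a, b, p, lam):
    pb = a / (a + b)                    # P(Xbar = b)
    pa = b / (a + b)                    # P(Xbar = -a)
    moment = pb * b ** p + pa * a ** p  # E|Xbar|^p
    expectation = (pb * math.exp(phi(lam * b, p))
                   + pa * math.exp(phi(-lam * a, p)))
    return expectation * math.exp(-lam ** p * moment / p)

for p in [1.2, 1.5, 2.0]:
    for lam in [0.01, 0.1, 1.0, 5.0]:
        assert catoni_lhs(1.0, 3.0, p, lam) <= 1 + 1e-12, (p, lam)
print("Catoni bound holds for the two-point examples")
```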