The mixture approach to sub-Gaussian self-normalized bounds
August 08, 2025
In this section we show that the method of mixtures, the proof technique used by \citet{abbasi2011improved} and \citet{pena2008self}, can also be used to prove Theorem~\ref{thm:sub-gaussian}. As noted in that section, existing proofs assume that $V_t$ is predictable rather than merely adapted. Here we show that the same proof technique can accommodate an adapted variance process.
Let $(S_t,V_t)$ be a sub-$\psi_N$ process, implying that for all $\theta\in\Re^d$,
\[
M_t(\theta) = \exp\left\{ \la \theta, S_t\ra - \psi_N(1)\la \theta, V_t\theta\ra\right\} \leq N_t(\theta),
\]
where $(N_t(\theta))$ is a nonnegative supermartingale. (To see why we can consider all $\theta\in\Re^d$ instead of $\theta\in\dsphere$, see Section~\ref{proof:sub-gaussian} below.) Let $\nu$ be a Gaussian distribution with mean $0$ and covariance $U_0^{-1}$, and consider the mixture process
\begin{equation}
M_t = \int_{\Re^d} \exp\left\{ \la \theta, S_t\ra - \psi_N(1)\la \theta, V_t\theta\ra \right\}\nu(\d\theta).
\end{equation}
To compute $M_t$, notice that, since $\psi_N(1) = \tfrac{1}{2}$, we can write
\begin{equation}
\la\theta, S_t \ra - \psi_N(1) \la \theta, V_t\theta\ra = \frac{1}{2}|S_t|_{V_t^{-1}}^2 - \frac{1}{2}|\theta - V_t^{-1}S_t|_{V_t}^2,
\end{equation}
and
\begin{equation}
| \theta - V_t^{-1}S_t|_{V_t}^2 +\la \theta, U_0\theta\ra = |\theta - (U_0 + V_t)^{-1}S_t|_{U_0 + V_t}^2 +|S_t|_{V_t^{-1}}^{2} -|S_t|_{(U_0 + V_t)^{-1}}^2.
\end{equation}
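As a quick numerical sanity check (separate from the proof), the sketch below verifies both identities on randomly generated positive definite matrices, taking $\psi_N(1)=\tfrac{1}{2}$; the helpers `rand_spd` and `sqnorm` are ad hoc names introduced here, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

def rand_spd(dim):
    # random symmetric positive definite matrix
    A = rng.standard_normal((dim, dim))
    return A @ A.T + dim * np.eye(dim)

def sqnorm(x, A):
    # squared weighted norm |x|_A^2 = x^T A x
    return float(x @ A @ x)

V, U0 = rand_spd(d), rand_spd(d)
S, theta = rng.standard_normal(d), rng.standard_normal(d)
Vinv, Winv = np.linalg.inv(V), np.linalg.inv(U0 + V)  # Winv = (U0 + V)^{-1}

# first identity: <theta, S> - (1/2) <theta, V theta>
lhs1 = theta @ S - 0.5 * sqnorm(theta, V)
rhs1 = 0.5 * sqnorm(S, Vinv) - 0.5 * sqnorm(theta - Vinv @ S, V)

# second identity: completing the square against the prior term <theta, U0 theta>
lhs2 = sqnorm(theta - Vinv @ S, V) + sqnorm(theta, U0)
rhs2 = sqnorm(theta - Winv @ S, U0 + V) + sqnorm(S, Vinv) - sqnorm(S, Winv)

assert np.isclose(lhs1, rhs1) and np.isclose(lhs2, rhs2)
```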
Hence, writing out the density of $\nu$,
\begin{align}
M_t &= \frac{\exp\left(\frac{1}{2}|S_t|_{V_t^{-1}}^2\right) }{\sqrt{(2\pi)^d \det(U_0^{-1})}} \int_{\Re^d} \exp\left(- \frac{1}{2}| \theta - V_t^{-1}S_t|_{V_t}^2 - \frac{1}{2}\la \theta, U_0\theta\ra\right) \d\theta \\
&= \frac{\exp\left(\frac{1}{2}|S_t|_{V_t^{-1}}^2\right) }{\sqrt{(2\pi)^d \det(U_0^{-1})}} \int_{\Re^d} \exp\left(- \frac{1}{2}\left(|\theta - (U_0 + V_t)^{-1}S_t|_{U_0 + V_t}^2 +|S_t|_{V_t^{-1}}^{2} -|S_t|_{(U_0 + V_t)^{-1}}^2\right)\right) \d\theta \\
&= \frac{\exp\left(\frac{1}{2}|S_t|_{(U_0 + V_t)^{-1}}^2\right)}{\sqrt{(2\pi)^d \det(U_0^{-1})}} \int_{\Re^d} \exp\left(- \frac{1}{2}|\theta - (U_0 + V_t)^{-1}S_t|_{U_0 + V_t}^2 \right) \d\theta \\
&= \frac{\exp\left(\frac{1}{2}|S_t|_{(U_0 + V_t)^{-1}}^2\right)}{\sqrt{(2\pi)^d \det(U_0^{-1})}} \sqrt{(2\pi)^d \det((U_0 + V_t)^{-1})} \\
&= \sqrt{\frac{\det(U_0)}{\det(U_0 + V_t)}} \exp\left(\frac{1}{2}|S_t|_{(U_0 + V_t)^{-1}}^2\right).
\end{align}
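This closed form can likewise be checked numerically: a Monte Carlo average of the integrand over draws from $\nu = \mathcal{N}(0, U_0^{-1})$ should match the right-hand side up to sampling error. A minimal sketch, again taking $\psi_N(1)=\tfrac{1}{2}$ and using arbitrary choices of $S_t$, $V_t$, and $U_0$:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
A = rng.standard_normal((d, d)); V = A @ A.T + d * np.eye(d)
B = rng.standard_normal((d, d)); U0 = B @ B.T + d * np.eye(d)
S = rng.standard_normal(d)
W = U0 + V

# closed form: sqrt(det U0 / det(U0 + V)) * exp(0.5 * |S|^2_{(U0+V)^{-1}})
closed = np.sqrt(np.linalg.det(U0) / np.linalg.det(W)) * np.exp(0.5 * S @ np.linalg.solve(W, S))

# Monte Carlo estimate of the mixture integral, theta ~ N(0, U0^{-1})
thetas = rng.multivariate_normal(np.zeros(d), np.linalg.inv(U0), size=200_000)
quad = np.einsum('ij,jk,ik->i', thetas, V, thetas)  # theta^T V theta, row-wise
mc = np.exp(thetas @ S - 0.5 * quad).mean()

print(closed, mc)  # the two values should agree to within Monte Carlo error
```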
Since $\int N_t(\theta)\, \nu(\d\theta)$ is a nonnegative supermartingale by Fubini's theorem, $M_t$ is upper bounded by a nonnegative supermartingale. We may thus apply Ville's inequality to obtain $P(M_\tau \geq 1/\delta) \leq \E[M_1]\delta\leq \delta$ for any stopping time $\tau$. In other words, with probability at least $1-\delta$, $\log M_\tau \leq \log(1/\delta)$, which translates to
\[
\frac{1}{2}|S_\tau|^2_{(U_0 + V_\tau)^{-1}} \leq \frac{1}{2}\log\left(\frac{\det (U_0 + V_\tau)}{\det U_0}\right) + \log(1/\delta),
\]
which is precisely Theorem~\ref{thm:sub-gaussian}.
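For instance, in dimension $d=1$ with $U_0 = u_0 > 0$ (a direct instantiation, not part of the theorem statement), the bound rearranges to
\[
|S_\tau| \leq \sqrt{(u_0 + V_\tau)\left(\log\left(\frac{u_0 + V_\tau}{u_0}\right) + 2\log(1/\delta)\right)},
\]
which makes the $\sqrt{V_\tau \log V_\tau}$ growth typical of self-normalized bounds explicit.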