Statistical Inference

By Vlad Feinberg - August 17, 2020

This week, we tackle the multiple comparisons problem: when testing multiple hypotheses, randomness in the hypothesis evaluation procedure means we need to be careful about spurious findings. False discovery rate (FDR) control provides a new framework for what we should optimize in this setting, while controlling the error incurred from multiplicity.

**Materials**

Martingale proof of Benjamini-Hochberg (BH) from Storey, Taylor, and Siegmund 2004 (for this reading, set $\lambda=0$ and only focus on Theorems 1 and 2)

**Why Benjamini-Hochberg?**

- Classical statistical control for false discoveries, such as controlling the family-wise error rate (FWER), demands limiting the probability of making even a single false discovery in an experimental procedure.
- Strategies for controlling FWER, such as Holm-Bonferroni, are very conservative. By changing the objective from a count to a rate, namely that the proportion of false discoveries among all discoveries made should be small, Benjamini and Hochberg were able to find a more powerful method.
- Since this kind of control differs from FWER but still provides an interpretable guarantee useful to experimenters, FDR is worth keeping in mind when looking for discoveries in more lenient settings.

**Nuggets**

STS Theorem 1. Mean-based estimator upper bounds FDR.

- Suppose we reject p-values at a constant level $t$. For $m$ hypotheses we'd expect at most $mt$ rejections (if all hypotheses are null). Dividing this average by $R_t$, the number of rejected hypotheses with p-values less than $t$, yields an estimator whose expectation upper bounds the *actual* average FDR $\mathbb{E}[V_t/R_t]$, where $V_t$ is the true number of rejected nulls, by a convexity argument and Jensen's inequality.
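The domination above is easy to see in simulation. Below is a hedged sketch (not from the original reading; the mixed null/non-null split and the Beta(0.1, 1) alternative distribution are my own arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, m0, t = 1000, 800, 0.05  # m0 of m hypotheses are null (illustrative sizes)
n_sims = 5000

# Null p-values are Uniform(0, 1); non-null p-values are stochastically smaller.
p_null = rng.uniform(size=(n_sims, m0))
p_alt = rng.beta(0.1, 1.0, size=(n_sims, m - m0))  # a convenient alternative

V_t = (p_null <= t).sum(axis=1)                       # false rejections
R_t = np.maximum(V_t + (p_alt <= t).sum(axis=1), 1)   # total rejections (clamped)

fdr_hat = m * t / R_t   # the mean-based estimator at the fixed level t
fdp = V_t / R_t         # realized false discovery proportion
print(fdr_hat.mean(), fdp.mean())  # the estimator's mean dominates the FDR
```

Note the clamp `max(R_t, 1)`, a standard convention so the ratio is defined when nothing is rejected.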

BH Procedure

- Based on the estimator above $\mathbb{E}\hat{\mathrm{FDR}}_t=\mathbb{E}[mt/R_t]\ge \mathbb{E}[V_t/R_t]$, one reasonable guess for an adaptive rejection rate that controls FDR at level $\alpha$ would be $\hat t = \sup\{t|\hat{\mathrm{FDR}}_{t}\le \alpha\}$, since we're controlling an upper bound.
- By a visual proof (see the notes), rejecting at level $\hat {t}$ is equivalent to the usual BH algorithm.
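Both forms of the procedure can be sketched in a few lines. This is a hedged illustration, not the STS reference code; function names and the simulated p-values are my own:

```python
import numpy as np

def bh_reject(p, alpha):
    """Classic Benjamini-Hochberg step-up: reject the k smallest p-values,
    where k is the largest index with p_(k) <= alpha * k / m."""
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

def threshold_reject(p, alpha):
    """Equivalent form: reject p-values at level t_hat = sup{t : FDR_hat_t <= alpha}.
    The sup is attained (up to the same rejection set) at an observed p-value."""
    m = len(p)
    candidates = np.sort(p)
    R = np.arange(1, m + 1)  # rejections when thresholding at each sorted p-value
    ok = m * candidates / R <= alpha
    if not ok.any():
        return np.zeros(m, dtype=bool)
    t_hat = candidates[np.nonzero(ok)[0].max()]
    return p <= t_hat

rng = np.random.default_rng(1)
p = rng.uniform(size=200)
assert (bh_reject(p, 0.1) == threshold_reject(p, 0.1)).all()
```

The final assertion is the point: thresholding at $\hat t$ and the usual step-up scan over sorted p-values reject exactly the same set (ties aside, which occur with probability zero for continuous p-values).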

STS Theorem 2. BH actually controls FDR $\mathrm{FDR}=\mathbb{E}[V_{\hat t}/R_{\hat t}]\le\alpha$ when p-values are independent.

- This isn't trivial because Theorem 1 only holds for nonrandom $t$.
- The ratio of the true number of type-I errors to its expected count, $M_t=V_t/(mt)$, is a martingale in reverse time as $t$ falls from 1 to 0, due to the independence mentioned above. As a result, by the Optional Stopping Theorem, $\mathbb{E}M_{\hat t}=\mathbb{E} M_1\le 1$.
- Noticing that we can rewrite the true $\mathrm{FDR}=\mathbb{E}[\hat{\mathrm{FDR}}_{\hat t}M_{\hat t}]$, where the first factor is at most $\alpha$ by construction of $\hat t$, we get $\mathrm{FDR}\le\alpha\,\mathbb{E}M_{\hat t}\le\alpha$ by the optional stopping bound above, and we have control!
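A quick Monte Carlo sanity check of the Theorem 2 guarantee (again a hedged sketch, not from the reading; the sizes and the Beta(0.1, 1) alternative are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, m, m0, n_sims = 0.1, 100, 80, 2000  # illustrative sizes

fdps = []
for _ in range(n_sims):
    # First m0 indices are null (Uniform), the rest are non-null.
    p = np.concatenate([rng.uniform(size=m0),
                        rng.beta(0.1, 1.0, size=m - m0)])
    # BH step-up at level alpha.
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = order[:k]
    false_rejections = (rejected < m0).sum()
    fdps.append(false_rejections / max(k, 1))

fdr = np.mean(fdps)  # should land near alpha * m0 / m, below alpha
print(fdr)
```

For independent continuous p-values, BH's FDR is in fact exactly $\alpha m_0/m \le \alpha$, which is why the empirical average lands below the nominal level rather than at it.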

**Raw Notes**

*Corrections*

The definition of a stopping time $\tau$ is that the indicator $1\{\tau\le t\}$ is $\mathcal{F}_t$-measurable, i.e., $\mathbb{E}[1\{\tau\le t\}\mid\mathcal{F}_t]=1\{\tau\le t\}$; the measurability condition applies to the indicator, not to $\tau$ itself.

*If you like applying these kinds of methods to practical ML problems, join our team.*