Statistical Inference

Fixed-X knockoffs | November 9, 2020

By John Hallman

This week we cover a novel method for False Discovery Rate control in variable selection — the Fixed-X Knockoff filter by Rina Barber and Emmanuel Candès.

Materials

Why knockoff filters?

  • Knockoff filters are a new collection of procedures that control FDR, and what's really cool is that they generally work under a very different model than traditional procedures. In particular, Model-X Knockoff filters, which build on fixed-X knockoffs, work for a much broader range of distributions and do not even require the calculation of p-values! In exchange, they require greater knowledge about the generating distribution of the covariates.
  • Knockoff filters were a candidate for our FDR control procedure earlier this year, and we may still find a use case for them in the future.

Nuggets

  • Theorem 3: The following Knockoff+ procedure given in the paper above controls FDR at a given level $q$:
    • For features $X = [X_1, \ldots, X_p]$, where $X^{\top}X = \Sigma$ and $X$ is normalized so that $\Sigma_{jj} = 1, \forall j$, construct knockoffs $\tilde{X}_j$ for each $X_j$ (see the construction sketch after this list) such that:
      • $\tilde{X}^{\top} \tilde{X} = \Sigma$
      • $X^{\top} \tilde{X} = \Sigma - \text{diag}(s)$ for some vector $s$.
      • Intuition: we want to construct knockoff features that mimic the correlation structure of the original variables, while removing their effect on the response $y$.
    • Compute test statistics $W = (W_1, \ldots, W_p)$ for each (feature, knockoff) pair with the following properties:
      • Sufficiency: $W$ only depends on $X, y$ via the Gram matrix and the feature-response inner products, i.e. $\exists f$ such that $W = f([X, \tilde{X}]^{\top} [X, \tilde{X}], [X, \tilde{X}]^{\top} y)$
      • Anti-symmetry: $W_j([X, \tilde{X}]_{\text{swap}(S)}, y) = \pm W_j([X, \tilde{X}], y)$, with $+$ if $j \notin S$ and $-$ if $j \in S$, where $[X, \tilde{X}]_{\text{swap}(S)}$ denotes, for any subset $S \subseteq [p]$, the matrix obtained by swapping each column $X_j$ with its knockoff $\tilde{X}_j$ for all $j \in S$.
      • Intuition: the above properties ensure that for null features, our test statistics cannot distinguish the original features from their knockoffs, and in particular the sign of the statistic is $\pm 1$ with equal probability.
    • Compute the threshold $T = \min \left\{ t \in W : \frac{1 + |\{j : W_j \leq -t\}|}{|\{j : W_j \geq t\}| \lor 1} \leq q \right\}$ and select all features for which $W_j \geq T$ (see the selection sketch after this list).
      • Intuition: the count of large negative statistics estimates the number of nulls among the large positive ones, so choosing the smallest threshold at which this estimated proportion falls below $q$ controls the FDR.
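To make the construction step concrete, here is a minimal NumPy sketch of the equicorrelated fixed-X knockoff construction (the paper also describes an SDP-based choice of $s$). The function name and the use of `scipy.linalg.null_space` are our own illustrative choices; we assume $n \geq 2p$ so that an orthogonal complement of the column span of $X$ with at least $p$ dimensions exists.

```python
import numpy as np
from scipy.linalg import null_space


def equicorrelated_knockoffs(X):
    """Sketch of the equicorrelated fixed-X knockoff construction.

    Assumes X has full column rank and n >= 2p.
    """
    n, p = X.shape
    # Normalize columns so that Sigma_jj = 1 for all j.
    X = X / np.linalg.norm(X, axis=0)
    Sigma = X.T @ X
    Sigma_inv = np.linalg.inv(Sigma)

    # Equicorrelated choice: s_j = min(2 * lambda_min(Sigma), 1) for all j.
    lam_min = np.linalg.eigvalsh(Sigma)[0]
    S = np.diag(np.full(p, min(2.0 * lam_min, 1.0)))

    # Find C with C^T C = 2 diag(s) - diag(s) Sigma^{-1} diag(s) (PSD by choice of s).
    A = 2.0 * S - S @ Sigma_inv @ S
    eigval, eigvec = np.linalg.eigh(A)
    C = np.diag(np.sqrt(np.clip(eigval, 0.0, None))) @ eigvec.T

    # U: n x p orthonormal matrix orthogonal to the column span of X.
    U = null_space(X.T)[:, :p]

    # X_tilde satisfies X_tilde^T X_tilde = Sigma and X^T X_tilde = Sigma - diag(s).
    X_tilde = X @ (np.eye(p) - Sigma_inv @ S) + U @ C
    return X, X_tilde
```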
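And a sketch of the statistic and selection steps, using a lasso coefficient-difference statistic $W_j = |\hat{b}_j| - |\hat{b}_{j+p}|$ at a fixed regularization level, which is one of several statistics satisfying sufficiency and anti-symmetry. The regularization value `alpha` and the function name are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso


def knockoff_plus_select(X, X_tilde, y, q=0.1, alpha=0.01):
    """Knockoff+ selection with a lasso coefficient-difference statistic."""
    n, p = X.shape
    XX = np.hstack([X, X_tilde])

    # Lasso fit on the augmented design at a fixed (hypothetical) penalty alpha.
    b = Lasso(alpha=alpha, fit_intercept=False).fit(XX, y).coef_
    W = np.abs(b[:p]) - np.abs(b[p:])

    # Knockoff+ threshold:
    # T = min{ t : (1 + #{W_j <= -t}) / max(#{W_j >= t}, 1) <= q }.
    T = np.inf
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            T = t
            break

    return np.where(W >= T)[0]  # indices of selected features
```

Chaining the two sketches, `X_n, X_tilde = equicorrelated_knockoffs(X)` followed by `selected = knockoff_plus_select(X_n, X_tilde, y, q=0.1)` would return the indices of the selected features at target FDR level 0.1.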

Raw Notes

fixed-knockoffs.pdf

If you like applying these kinds of methods to practical ML problems, join our team.

