By John Hallman - April 6, 2021
This week we discuss Model-X Knockoffs by Emmanuel Candes and company, an extension of their previous paper Fixed-X Knockoffs which we covered a few months back.
Why Model-X Knockoff filters?
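(1) Knockoff generation: for MX Knockoffs to work, the knockoff variables $\tilde{X}$ must be generated such that the following exchangeability and independence properties hold w.r.t. the original variables $X$, the knockoff variables $\tilde{X}$, and the response $y$:
$$(X, \tilde{X})_{\text{swap}(S)} \overset{d}{=} (X, \tilde{X}) \quad \text{for every subset } S \subseteq \{1, \dots, p\}, \qquad \tilde{X} \perp y \mid X,$$
where $\text{swap}(S)$ refers to switching columns between the original and the knockoff variables for each index $j \in S$. Note that the paper says little about how to generate an $\tilde{X}$ that satisfies the above, which is easier said than done.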
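One setting where a valid set of knockoffs has a known closed form is when the rows of the design matrix are multivariate Gaussian with known covariance. The sketch below assumes mean-zero rows with a known correlation matrix `Sigma` and uses an equicorrelated diagonal matrix; the function name `gaussian_knockoffs` and the 0.999 shrink factor are illustrative choices, not something taken from this post.

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, rng):
    """Sample knockoffs for rows X_i ~ N(0, Sigma), Sigma a correlation matrix.

    Assumption: Sigma is known exactly. In practice it has to be estimated.
    """
    p = Sigma.shape[0]
    lam_min = np.linalg.eigvalsh(Sigma)[0]       # eigenvalues come back ascending
    # Equicorrelated choice s = min(1, 2 * lam_min), shrunk slightly so the
    # conditional covariance below stays numerically positive definite.
    s = 0.999 * min(1.0, 2.0 * lam_min)
    D = s * np.eye(p)
    Sigma_inv_D = np.linalg.solve(Sigma, D)      # Sigma^{-1} D
    cond_mean = X - X @ Sigma_inv_D              # E[X_knock | X], row-wise
    cond_cov = 2.0 * D - D @ Sigma_inv_D         # Cov[X_knock | X]
    cond_cov = (cond_cov + cond_cov.T) / 2.0     # symmetrize against round-off
    L = np.linalg.cholesky(cond_cov)
    return cond_mean + rng.standard_normal(X.shape) @ L.T

rng = np.random.default_rng(0)
Sigma = np.array([[1.0, 0.5, 0.25],
                  [0.5, 1.0, 0.5],
                  [0.25, 0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(3), Sigma, size=20000)
X_knock = gaussian_knockoffs(X, Sigma, rng)

# Empirical covariance of the concatenated matrix [X, X_knock].
joint_cov = np.cov(np.hstack([X, X_knock]), rowvar=False)
```

If the construction is right, the knockoff block of `joint_cov` should be close to `Sigma`, and the cross block should match `Sigma` off the diagonal, which is what makes swapping a column with its knockoff leave the joint distribution unchanged in the Gaussian case.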
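(2) Feature scoring: for MX Knockoffs to work, the feature scoring procedure $w_j$ for each variable $j$ must be anti-symmetric under swapping. That is, for any subset $S$ of indices, $w_j([X, \tilde{X}]_{\text{swap}(S)}, y) = \pm\, w_j([X, \tilde{X}], y)$, with + if $j \notin S$ and - if $j \in S$.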
The paper finds that lasso difference (and most other lasso-based scoring mechanisms) works quite well here.
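To make the anti-symmetry requirement concrete, here is a minimal sketch that uses a simpler stand-in for the lasso difference: the difference of absolute marginal correlations. The helper names `swap_columns` and `corr_diff_stat` are hypothetical, and the "knockoffs" here are just random placeholders, since the check only concerns the sign-flip behavior of the score.

```python
import numpy as np

def swap_columns(X, X_knock, S):
    """Exchange column j of X with column j of X_knock for every j in S."""
    X_sw, Xk_sw = X.copy(), X_knock.copy()
    idx = list(S)
    X_sw[:, idx] = X_knock[:, idx]
    Xk_sw[:, idx] = X[:, idx]
    return X_sw, Xk_sw

def corr_diff_stat(X, X_knock, y):
    """W_j = |X_j^T y| - |X_knock_j^T y| (a stand-in for the lasso difference)."""
    return np.abs(X.T @ y) - np.abs(X_knock.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))
X_knock = rng.standard_normal((50, 4))   # placeholder, NOT valid knockoffs
y = X[:, 0] + 0.1 * rng.standard_normal(50)

S = [1, 3]
W = corr_diff_stat(X, X_knock, y)
W_swapped = corr_diff_stat(*swap_columns(X, X_knock, S), y)
```

Comparing `W` and `W_swapped` entry by entry, the scores at indices in `S` flip sign and the others are unchanged, which is exactly the property required of the scoring procedure.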
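(3) If (1) and (2) hold, then we can select variables while controlling FDR at the $q$-level if we pick all variables whose feature scores $W_j$ are at or above the threshold
$$\tau = \min\left\{ t > 0 : \frac{1 + \#\{j : W_j \le -t\}}{\#\{j : W_j \ge t\}} \le q \right\}.$$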
Intuitively, this controls FDR because we can use the number of variables with negative feature scores as an approximation for the number of false discoveries, since this corresponds to a knockoff variable being considered "more significant" than the original variable.
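The selection step can be sketched in a few lines. This assumes the threshold is the smallest $t > 0$ for which (1 + number of scores at or below $-t$) divided by (number of scores at or above $t$) is at most $q$, i.e. the "knockoff+" form; the function name `knockoff_threshold` and the toy scores are made up for illustration.

```python
import numpy as np

def knockoff_threshold(W, q, offset=1):
    """Smallest t among the |W_j| with estimated FDP <= q, or inf if none.

    offset=1 is assumed to correspond to the "knockoff+" threshold.
    """
    W = np.asarray(W, dtype=float)
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:
        # Negative scores stand in for false discoveries among the positives.
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf

# Toy feature scores (illustrative only).
W = np.array([5, 4, 3, 2.5, 2, -1, 1.5, 0.5, -0.3, 3.5])
tau = knockoff_threshold(W, q=0.2)
selected = np.flatnonzero(W >= tau)

# A stricter level can leave nothing selected: tau is then infinite.
tau_strict = knockoff_threshold(W, q=0.1)
selected_strict = np.flatnonzero(W >= tau_strict)
```

Only the values $|W_j|$ need to be tried as candidate thresholds, because the estimated false discovery proportion changes only at those points.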
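Briefly, given samples $(X_i, y_i)_{i=1}^{n}$ and a feature statistic $T_j$ for some index $j$, consider the following procedure for accepting/rejecting variable $j$:
For $k = 1, \dots, K$, create a new matrix $X^{(k)}$ by replacing each entry $X_{ij}$ of the $j$th column with a new value sampled from the conditional distribution $\mathcal{L}(X_j \mid X_{-j} = X_{i,-j})$.
Then, compute the (one-sided) p-value
$$P_j = \frac{1}{K+1}\left[ 1 + \sum_{k=1}^{K} \mathbf{1}\left\{ T_j(X^{(k)}, y) \ge T_j(X, y) \right\} \right].$$
Informally, we measure the regularized proportion of times the original matrix $X$ scores no higher w.r.t. $T_j$ than the "knockoff" matrices $X^{(k)}$.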
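As a rough sketch of this resampling procedure, the code below assumes the simplest possible setting: independent standard normal features, so the conditional distribution of a column given the others is just N(0, 1). The function `crt_pvalue`, the statistic `abs_corr`, and the sampler `sample_independent_normal` are hypothetical names chosen for illustration.

```python
import numpy as np

def crt_pvalue(X, y, j, stat, sample_column, K, rng):
    """One-sided p-value: (1 + #{k : T_j(X^(k), y) >= T_j(X, y)}) / (K + 1)."""
    t_obs = stat(X, y, j)
    exceed = 0
    for _ in range(K):
        X_k = X.copy()
        X_k[:, j] = sample_column(X, j, rng)   # redraw column j given the rest
        if stat(X_k, y, j) >= t_obs:
            exceed += 1
    return (1 + exceed) / (K + 1)

def abs_corr(X, y, j):
    """Illustrative statistic T_j: absolute inner product of column j with y."""
    return abs(X[:, j] @ y)

def sample_independent_normal(X, j, rng):
    """Assumption: features are independent N(0, 1), so X_j | X_{-j} ~ N(0, 1)."""
    return rng.standard_normal(X.shape[0])

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = 2.0 * X[:, 0] + rng.standard_normal(200)   # only feature 0 matters

p0 = crt_pvalue(X, y, 0, abs_corr, sample_independent_normal, K=199, rng=rng)
p1 = crt_pvalue(X, y, 1, abs_corr, sample_independent_normal, K=199, rng=rng)
```

The smallest attainable p-value is 1/(K + 1), so `K` bounds the resolution of the test, and the statistic is recomputed `K` times for every variable tested.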
This procedure is interesting because it's a much more straightforward approach to knockoff variable selection, although it doesn't deal with variable correlation and generally suffers from performance issues.
If you like applying these kinds of methods to practical ML problems, join our team.