TY - GEN

T1 - Sparse regression as a sparse eigenvalue problem

AU - Moghaddam, Baback

AU - Gruber, Amit

AU - Weiss, Yair

AU - Avidan, Shai

PY - 2008

Y1 - 2008

N2 - We extend the l0-norm "subspectral" algorithms developed for sparse-LDA [5] and sparse-PCA [6] to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly-efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to ×10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) complexity that limited the previous algorithms' utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes even more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix techniques. Our Greedy Sparse Least Squares (GSLS) algorithm generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS [3], forward-GSLS is shown to be not only more efficient and accurate but more flexible in terms of choice of regularization.

AB - We extend the l0-norm "subspectral" algorithms developed for sparse-LDA [5] and sparse-PCA [6] to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly-efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to ×10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) complexity that limited the previous algorithms' utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes even more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix techniques. Our Greedy Sparse Least Squares (GSLS) algorithm generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS [3], forward-GSLS is shown to be not only more efficient and accurate but more flexible in terms of choice of regularization.

UR - http://www.scopus.com/inward/record.url?scp=52949139076&partnerID=8YFLogxK

U2 - 10.1109/ITA.2008.4601051

DO - 10.1109/ITA.2008.4601051

M3 - Conference contribution

AN - SCOPUS:52949139076

SN - 1424426707

SN - 9781424426706

T3 - 2008 Information Theory and Applications Workshop - Conference Proceedings, ITA

SP - 219

EP - 225

BT - 2008 Information Theory and Applications Workshop - Conference Proceedings, ITA

T2 - 2008 Information Theory and Applications Workshop - ITA

Y2 - 27 January 2008 through 1 February 2008

ER -