Abstract
Digital databases can be represented by matrices, where rows (say) correspond to numerical sensor readings, or features, and columns correspond to data points. Recent data analysis methods describe the local geometry of the data points using a weighted affinity graph, whose vertices correspond to data points. We consider two geometries, or graphs, one on the rows and one on the columns, such that the data matrix is smooth with respect to the “tensor product” of the two geometries. This is achieved by an iterative procedure that constructs a multiscale partition tree on each graph. We use the recently introduced notion of Haar-like bases induced by the trees to obtain Tensor-Haar-like bases for the space of matrices, and show that an ℓp entropy condition on the expansion coefficients of the database, viewed as a function on the product of the geometries, implies both smoothness and efficient reconstruction. We apply this methodology to analyze, de-noise and compress a term-document database. We use the same methodology to compress matrices of potential operators of unknown charge distribution geometries and to organize Laplacian eigenvectors, where the data matrix is the “expansion in Laplace eigenvectors” operator.
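As a rough illustration of the Tensor-Haar-like expansion mentioned in the abstract, the following sketch builds an orthonormal Haar-like basis from a binary partition tree on the rows and another on the columns, expands a small matrix in the tensor product of the two bases, and evaluates an ℓp sum of the coefficients. It is a minimal sketch under stated assumptions, not the chapter's procedure: the partition trees are fixed by hand rather than constructed iteratively from affinity graphs, they are assumed to be binary, and the names `leaves`, `haar_like_basis`, `tensor_coefficients` and `reconstruct`, as well as the toy matrix `M`, are hypothetical.

```python
import math

def leaves(tree):
    """Return the point indices under a nested-tuple partition tree."""
    if isinstance(tree, int):
        return [tree]
    return [i for child in tree for i in leaves(child)]

def haar_like_basis(tree, n):
    """Orthonormal Haar-like basis for a binary partition tree on n points.

    Assumption: every internal node (folder) has exactly two children.
    """
    basis = [[1.0 / math.sqrt(n)] * n]  # constant function on the root folder

    def visit(node):
        if isinstance(node, int):
            return
        left, right = node
        left_pts, right_pts = leaves(left), leaves(right)
        a, b = len(left_pts), len(right_pts)
        v = [0.0] * n
        # Piecewise constant on the two children, zero mean, unit norm.
        for i in left_pts:
            v[i] = math.sqrt(b / (a * (a + b)))
        for i in right_pts:
            v[i] = -math.sqrt(a / (b * (a + b)))
        basis.append(v)
        visit(left)
        visit(right)

    visit(tree)
    return basis

def tensor_coefficients(M, row_basis, col_basis):
    """Coefficients of M in the tensor product of the row and column bases."""
    return [[sum(u[r] * M[r][s] * v[s]
                 for r in range(len(u)) for s in range(len(v)))
             for v in col_basis] for u in row_basis]

def reconstruct(coeffs, row_basis, col_basis):
    """Rebuild the matrix from its tensor-basis coefficients."""
    n_rows, n_cols = len(row_basis[0]), len(col_basis[0])
    return [[sum(coeffs[i][j] * row_basis[i][r] * col_basis[j][s]
                 for i in range(len(row_basis)) for j in range(len(col_basis)))
             for s in range(n_cols)] for r in range(n_rows)]

# Hypothetical toy data: a 4 x 3 matrix that is constant on the row folders
# {0, 1}, {2, 3} and on the column folders {0, 1}, {2}.
row_tree = ((0, 1), (2, 3))
col_tree = ((0, 1), 2)
M = [[1.0, 1.0, 2.0],
     [1.0, 1.0, 2.0],
     [3.0, 3.0, 4.0],
     [3.0, 3.0, 4.0]]

row_basis = haar_like_basis(row_tree, 4)
col_basis = haar_like_basis(col_tree, 3)
coeffs = tensor_coefficients(M, row_basis, col_basis)

p = 1.0  # exponent of the l^p sum of coefficients (illustrative choice)
entropy = sum(abs(c) ** p for row in coeffs for c in row)
nonzero = sum(1 for row in coeffs for c in row if abs(c) > 1e-9)
M_hat = reconstruct(coeffs, row_basis, col_basis)
```

Because the toy matrix is constant on the coarse folders of both trees, only coefficients attached to the coarse basis functions can be nonzero, which is the kind of sparsity that a small ℓp sum is meant to capture; the full matrix is still recovered exactly from the coefficients.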
Original language | English |
---|---|
Title of host publication | Applied and Numerical Harmonic Analysis |
Publisher | Springer International Publishing |
Pages | 161-197 |
Number of pages | 37 |
Edition | 9780817680947 |
State | Published - 2011 |
Externally published | Yes |
Publication series
Name | Applied and Numerical Harmonic Analysis |
---|---|
Number | 9780817680947 |
ISSN (Print) | 2296-5009 |
ISSN (Electronic) | 2296-5017 |
Bibliographical note
Publisher Copyright: © 2011, Springer Science+Business Media, LLC.
Keywords
- Data Matrix
- Fast Multipole Method
- Iterative Procedure
- Partition Tree
- Potential Operator