Abstract
If we pick $n$ random points uniformly in $[0,1]^d$ and connect each point to its $c_d \log n$ nearest neighbors, where $d \geq 2$ is the dimension and $c_d$ is a constant depending on the dimension, then it is well known that the graph is connected with high probability. We prove that it suffices to connect every point to $c_{d,1} \log\log n$ points chosen randomly among its $c_{d,2} \log n$ nearest neighbors to ensure a giant component of size $n - o(n)$ with high probability. This construction yields a much sparser random graph, with $\sim n \log\log n$ instead of $\sim n \log n$ edges, that has comparable connectivity properties. This result has non-trivial implications for problems in data science where an affinity matrix is constructed: instead of connecting each point to its $k$ nearest neighbors, one can often pick $k' \ll k$ random points out of the $k$ nearest neighbors and connect only to those without sacrificing the quality of results. This approach can simplify and accelerate computation; we illustrate this with experimental results in spectral clustering of large-scale datasets.
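The construction described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `random_subset_knn_graph` and `largest_component_size` are hypothetical, the constants multiplying $\log n$ and $\log\log n$ are arbitrary choices (the abstract does not state them), the neighbor search is brute force, and edges are treated as undirected.

```python
import math
import random
from collections import deque


def random_subset_knn_graph(points, k, k_prime, rng):
    """Connect each point to k_prime neighbors sampled uniformly from its k nearest."""
    n = len(points)
    adjacency = [set() for _ in range(n)]
    for i, p in enumerate(points):
        # Brute-force search for the k nearest neighbors, excluding the point itself.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        nearest = others[:k]
        # Keep only a random subset of size k_prime (sampled without replacement).
        for j in rng.sample(nearest, k_prime):
            adjacency[i].add(j)
            adjacency[j].add(i)  # assumption: edges are undirected
    return adjacency


def largest_component_size(adjacency):
    """Size of the largest connected component, found by breadth-first search."""
    n = len(adjacency)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adjacency[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best


rng = random.Random(0)
n, d = 400, 2
points = [tuple(rng.random() for _ in range(d)) for _ in range(n)]

# Assumed constants: the factor 2 in front of each logarithm is arbitrary.
k = math.ceil(2 * math.log(n))                  # size of the neighbor pool
k_prime = math.ceil(2 * math.log(math.log(n)))  # random neighbors actually kept

graph = random_subset_knn_graph(points, k, k_prime, rng)
num_edges = sum(len(neighbors) for neighbors in graph) // 2
print("edges:", num_edges, "largest component:", largest_component_size(graph))
```

Comparing the printed component size with $n$ for increasing $n$ gives a rough empirical feel for the giant-component statement, although a small experiment of this kind says nothing about the constants required by the theorem.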
Original language | English |
---|---|
Pages (from-to) | 458-476 |
Number of pages | 19 |
Journal | Journal of Applied Probability |
Volume | 57 |
Issue number | 2 |
State | Published - 1 Jun 2020 |
Externally published | Yes |
Keywords
- connectivity
- k-nn graph
- random graph
- sparsification