TY - JOUR
T1 - Memory Complexity of Estimating Entropy and Mutual Information
AU - Berg, Tomer
AU - Ordentlich, Or
AU - Shayevitz, Ofer
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
AB - We observe an infinite sequence of independent and identically distributed random variables X_1, X_2, ... drawn from an unknown distribution p over [n], and our goal is to estimate the entropy H(p) = -E[log p(X)] to within an ϵ-additive error. To that end, at each time point we are allowed to update a finite-state machine with S states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity S∗ of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least 1 - δ asymptotically, uniformly in p. Specifically, we show that there exist universal constants C_1 and C_2 such that S∗ ≤ C_1 · n(log n)^4/(ϵ^2 δ) for ϵ not too small, and S∗ ≥ C_2 · max{n, (log n)/ϵ} for ϵ not too large. The upper bound is proved using approximate counting to estimate the logarithm of p, and a finite-memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.
KW - entropy estimation
KW - finite memory algorithms
KW - Memory complexity
KW - mutual information estimation
KW - sample complexity
UR - http://www.scopus.com/inward/record.url?scp=86000462622&partnerID=8YFLogxK
U2 - 10.1109/TIT.2025.3547871
DO - 10.1109/TIT.2025.3547871
M3 - Article
AN - SCOPUS:86000462622
SN - 0018-9448
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
ER -