TY - GEN
T1 - Balls and bins
T2 - 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011
AU - Celis, L. Elisa
AU - Reingold, Omer
AU - Segev, Gil
AU - Wieder, Udi
PY - 2011
Y1 - 2011
N2 - A fundamental fact in the analysis of randomized algorithms is that when n balls are hashed into n bins independently and uniformly at random, with high probability each bin contains at most O(log n / log(log n)) balls. In various applications, however, the assumption that a truly random hash function is available is not always valid, and explicit functions are required. In this paper we study the size of families (or, equivalently, the description length of their functions) that guarantee a maximal load of O(log n / log(log n)) with high probability, as well as the evaluation time of their functions. Whereas such functions must be described using Ω(log n) bits, the best upper bound was formerly O(log^2 n / log(log n)) bits, which is attained by O(log n / log(log n))-wise independent functions. Traditional constructions of the latter offer an evaluation time of O(log n / log(log n)), which according to Siegel's lower bound [FOCS '89] can be reduced only at the cost of significantly increasing the description length. We construct two families that guarantee a maximal load of O(log n / log(log n)) with high probability. Our constructions are based on two different approaches, and exhibit different trade-offs between the description length and the evaluation time. The first construction shows that O(log n / log(log n))-wise independence can in fact be replaced by "gradually increasing independence", resulting in functions that are described using O(log n log(log n)) bits and evaluated in time O(log n log(log n)). The second construction is based on derandomization techniques for space-bounded computations combined with a tailored construction of a pseudorandom generator, resulting in functions that are described using O(log^{3/2} n) bits and evaluated in time O(√(log n)). The latter can be compared to Siegel's lower bound stating that O(log n / log(log n))-wise independent functions that are evaluated in time O(√(log n)) must be described using Ω(2^{√(log n)}) bits.
AB - A fundamental fact in the analysis of randomized algorithms is that when n balls are hashed into n bins independently and uniformly at random, with high probability each bin contains at most O(log n / log(log n)) balls. In various applications, however, the assumption that a truly random hash function is available is not always valid, and explicit functions are required. In this paper we study the size of families (or, equivalently, the description length of their functions) that guarantee a maximal load of O(log n / log(log n)) with high probability, as well as the evaluation time of their functions. Whereas such functions must be described using Ω(log n) bits, the best upper bound was formerly O(log^2 n / log(log n)) bits, which is attained by O(log n / log(log n))-wise independent functions. Traditional constructions of the latter offer an evaluation time of O(log n / log(log n)), which according to Siegel's lower bound [FOCS '89] can be reduced only at the cost of significantly increasing the description length. We construct two families that guarantee a maximal load of O(log n / log(log n)) with high probability. Our constructions are based on two different approaches, and exhibit different trade-offs between the description length and the evaluation time. The first construction shows that O(log n / log(log n))-wise independence can in fact be replaced by "gradually increasing independence", resulting in functions that are described using O(log n log(log n)) bits and evaluated in time O(log n log(log n)). The second construction is based on derandomization techniques for space-bounded computations combined with a tailored construction of a pseudorandom generator, resulting in functions that are described using O(log^{3/2} n) bits and evaluated in time O(√(log n)). The latter can be compared to Siegel's lower bound stating that O(log n / log(log n))-wise independent functions that are evaluated in time O(√(log n)) must be described using Ω(2^{√(log n)}) bits.
UR - http://www.scopus.com/inward/record.url?scp=84863334551&partnerID=8YFLogxK
U2 - 10.1109/FOCS.2011.49
DO - 10.1109/FOCS.2011.49
M3 - Conference contribution
AN - SCOPUS:84863334551
SN - 9780769545714
T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
SP - 599
EP - 608
BT - Proceedings - 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011
Y2 - 22 October 2011 through 25 October 2011
ER -