TY - GEN
T1 - Metrics for mass-count disparity
AU - Feitelson, Dror G.
PY - 2006
Y1 - 2006
N2 - Mass-count disparity is the technical underpinning of the "mice and elephants" phenomenon - that most samples are small, but a few are huge - which may be the most important attribute of heavy-tailed distributions. We propose to visualize this phenomenon by plotting the conventional distribution and the mass distribution together in the same plot. This then leads to a natural quantification of the effect based on the distance between the two distributions. Such a quantification addresses this important phenomenon directly, taking the full distribution into account, rather than focusing on the mathematical properties of the tail of the distribution. In particular, it shows that the Pareto distribution with tail index 1 < a < 2 actually has a relatively low mass-count disparity; the effects often observed are the result of combining some other distribution with a Pareto tail.
AB - Mass-count disparity is the technical underpinning of the "mice and elephants" phenomenon - that most samples are small, but a few are huge - which may be the most important attribute of heavy-tailed distributions. We propose to visualize this phenomenon by plotting the conventional distribution and the mass distribution together in the same plot. This then leads to a natural quantification of the effect based on the distance between the two distributions. Such a quantification addresses this important phenomenon directly, taking the full distribution into account, rather than focusing on the mathematical properties of the tail of the distribution. In particular, it shows that the Pareto distribution with tail index 1 < a < 2 actually has a relatively low mass-count disparity; the effects often observed are the result of combining some other distribution with a Pareto tail.
UR - http://www.scopus.com/inward/record.url?scp=34547616113&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:34547616113
SN - 0769525733
SN - 9780769525730
T3 - Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS
SP - 61
EP - 68
BT - Proceedings - 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2006
T2 - 14th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2006
Y2 - 11 September 2006 through 14 September 2006
ER -