Abstract
We draw attention to how national statistical agencies, in the name of protecting the confidentiality of personal data, have limited public access to spatial data on COVID-19, and to the large disparities in how that access has been limited. In doing so, we distinguish between absolute confidentiality, in which the probability of detection is 1; relative confidentiality, in which this probability is less than 1; and collective confidentiality, which refers to the probability of detecting at least one person. In spatial data, the probability of personal detection is less than 1, and the probability of collective detection varies directly with this probability and with COVID-19 morbidity. Statistical agencies have been concerned with relative and collective confidentiality, which they implement through truncation, where spatial data are not made public for zones with small populations, and censoring, where exact data are not made public for zones where morbidity is low. Granular spatial data are essential for epidemiological research into COVID-19. We argue that, in their reluctance to make these data available to the public, data security officers (DSOs) have unreasonably prioritized data protection over freedom of information. We also argue that, by attaching importance to relative and collective confidentiality, they have over-indulged in data truncation and censoring. We highlight the need for legislation concerning relative and collective confidentiality, and for regulation of DSO practices regarding data truncation and censoring.
Original language | English |
---|---|
Pages (from-to) | 791-809 |
Number of pages | 19 |
Journal | Journal of Official Statistics |
Volume | 37 |
Issue number | 4 |
State | Published - 1 Dec 2021 |
Bibliographical note
Publisher Copyright: © 2020 Michael Beenstock et al., published by Sciendo.
Keywords
- Spatial COVID-19 data
- collective confidentiality
- data censoring
- data truncation
- relative confidentiality