A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples

29 Pages. Posted: 12 Mar 2019. Last revised: 8 Jan 2023.

Raj Chetty

Harvard University

John Friedman

Brown University

Date Written: March 2019

Abstract

We develop a simple method to reduce privacy loss when disclosing statistics such as OLS regression estimates based on samples with small numbers of observations. We focus on the case where the dataset can be broken into many groups (“cells”) and one is interested in releasing statistics for one or more of these cells. Building on ideas from the differential privacy literature, we add noise to the statistic of interest in proportion to the statistic's maximum observed sensitivity, defined as the maximum change in the statistic from adding or removing a single observation across all the cells in the data. Intuitively, our approach permits the release of statistics in arbitrarily small samples by adding sufficient noise to the estimates to protect privacy. Although our method does not offer a formal privacy guarantee, it generally outperforms widely used methods of disclosure limitation such as count-based cell suppression both in terms of privacy loss and statistical bias. We illustrate how the method can be implemented by discussing how it was used to release estimates of social mobility by Census tract in the Opportunity Atlas. We also provide a step-by-step guide and illustrative Stata code to implement our approach.
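
As a rough illustration of the idea described in the abstract, the following Python sketch computes a per-cell statistic, finds the largest change caused by dropping any single observation in any cell, and adds noise scaled to that maximum. It is not the authors' implementation (the paper supplies Stata code). It makes several simplifying assumptions: the statistic is a cell mean rather than an OLS estimate, sensitivity is measured only by removing an observation, and noise is drawn from a Laplace distribution whose scale is the maximum observed sensitivity divided by a hypothetical privacy parameter `epsilon`. All function names are illustrative.

```python
import random


def leave_one_out_sensitivity(values):
    """Largest absolute change in the mean from dropping one observation."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations per cell")
    total = sum(values)
    full_mean = total / n
    return max(abs(full_mean - (total - v) / (n - 1)) for v in values)


def max_observed_sensitivity(cells):
    """Maximum leave-one-out sensitivity across all cells in the data."""
    return max(leave_one_out_sensitivity(v) for v in cells.values())


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_cell_means(cells, epsilon, seed=None):
    """Return each cell's mean plus noise proportional to the max sensitivity.

    Assumption: noise scale = max observed sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = max_observed_sensitivity(cells) / epsilon
    return {
        name: sum(v) / len(v) + laplace_noise(scale, rng)
        for name, v in cells.items()
    }


if __name__ == "__main__":
    # Made-up example data: two small cells.
    example = {"cell_a": [1.0, 2.0, 3.0], "cell_b": [10.0, 10.0]}
    print(max_observed_sensitivity(example))
    print(noisy_cell_means(example, epsilon=1.0, seed=0))
```

In this sketch a smaller `epsilon` means more noise, and every cell receives noise on the same scale because the scale is set by the most sensitive cell in the data.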

Suggested Citation

Chetty, Raj and Friedman, John, A Practical Method to Reduce Privacy Loss When Disclosing Statistics Based on Small Samples (March 2019). NBER Working Paper No. w25626, Available at SSRN: https://ssrn.com/abstract=3350397

Raj Chetty (Contact Author)

Harvard University

1875 Cambridge Street
Cambridge, MA 02138
United States

John Friedman

Brown University

Box 1860
Providence, RI 02912
United States
