Identification and Formal Privacy Guarantees

70 Pages. Posted: 17 Jul 2020. Last revised: 12 Oct 2022

Tatiana Komarova

Department of Economics, University of Manchester

Denis Nekipelov

University of Virginia - Department of Economics

Date Written: October 11, 2022

Abstract

Empirical economic research relies on highly sensitive individual-level datasets, while public individual-level data (e.g., from social networks, public government records, and directories) are increasingly available. Together these create privacy risks, as adversaries may re-identify anonymized records in sensitive research datasets. To address such risks, computer science research has proposed differential privacy (DP), a formal criterion for evaluating the non-disclosure guarantees of released statistics, together with the methodology to ensure such guarantees.
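
As a point of reference for readers unfamiliar with DP, the following is a minimal sketch of the standard Laplace mechanism applied to the mean of bounded data. It is not taken from the paper: the function names `laplace_noise` and `dp_mean`, the clipping bounds, and the replace-one-record notion of neighboring datasets are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    Uses the fact that the difference of two i.i.d. exponential draws
    with mean `scale` follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_mean(values, lower, upper, epsilon, rng=None):
    """Return a noisy mean of `values` clipped to [lower, upper].

    Hypothetical helper: under the replace-one-record neighboring relation
    with a fixed sample size n, the clipped mean has sensitivity
    (upper - lower) / n, so Laplace noise with scale sensitivity / epsilon
    gives epsilon-differential privacy for the released number.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    if len(values) == 0:
        raise ValueError("values must be non-empty")
    if rng is None:
        rng = random.Random()

    # Clip each record so that a single record has bounded influence.
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    data = [0.2, 0.4, 0.6, 0.8]
    # A smaller epsilon means stronger privacy and a noisier release.
    print(dp_mean(data, 0.0, 1.0, epsilon=0.5, rng=random.Random(0)))
```

The released value is a random variable even when the data are held fixed, which is the feature of DP outputs that matters for identification.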

While previous work on DP has focused on guarantees for specific data statistics, the impact of DP on the identification of parameters of interest determined from the population distribution has not been studied. This paper fills that gap.

We find that a broad class of population parameters is not identified. Moreover, these parameters are not even partially identified, i.e., one cannot construct a set that would contain their population values. Such parameters can only be characterized as elements of random sets, so analyzing their population properties requires the toolkit of random set theory. Identification becomes possible if the target parameter can be deterministically mapped within the random set. In that case, a full exploration of the support of the distribution of the random set of weak limits of differentially private estimators can allow the data curator to select a sequence of instances of differentially private estimators that is guaranteed to converge to the target parameter in probability. We provide a decision-theoretic approach to this selection.

Our results indicate that extending formal privacy guarantees to socio-economic datasets requires further work on integrating data analysis with results and concepts from random set theory, as well as with techniques for partial identification and inference.

Keywords: Differential privacy, average treatment effect, regression discontinuity, random sets, identification

JEL Classification: C35, C14, C25, C13

Suggested Citation

Komarova, Tatiana and Nekipelov, Denis, Identification and Formal Privacy Guarantees (October 11, 2022). Available at SSRN: https://ssrn.com/abstract=3635824 or http://dx.doi.org/10.2139/ssrn.3635824

Tatiana Komarova (Contact Author)

Department of Economics, University of Manchester

Arthur Lewis Building
Oxford Road
Manchester, M13 9PL
United Kingdom

Denis Nekipelov

University of Virginia - Department of Economics

237 Monroe Hall
P.O. Box 400182
Charlottesville, VA 22904-418
United States
