Shades of Gray: Seeing the Full Spectrum of Practical Data De-Identification

37 Pages. Posted: 3 Apr 2016. Last revised: 20 May 2016.

Jules Polonetsky

Future of Privacy Forum

Omer Tene

International Association of Privacy Professionals (IAPP)

Kelsey Finch

Future of Privacy Forum

Date Written: April 1, 2016

Abstract

One of the most hotly debated issues in privacy and data security is the notion of identifiability of personal data and its technological corollary, de-identification. De-identification is the process of removing personally identifiable information from data collected, stored and used by organizations. Once viewed as a silver bullet allowing organizations to reap the benefits of data while minimizing privacy and data security risks, de-identification has come under intense scrutiny, with academic research papers and popular media reports highlighting its shortcomings.

At the same time, organizations around the world necessarily continue to rely on a wide range of technical, administrative and legal measures to reduce the identifiability of personal data to enable critical uses and valuable research while providing protection to individuals’ identity and privacy.

The debate around the contours of the term personally identifiable information, which triggers a set of legal and regulatory protections, continues to rage. Scientists and regulators frequently refer to certain categories of information as “personal” even as businesses and trade groups define them as “de-identified” or “non-personal.” The stakes in the debate are high. While not foolproof, de-identification techniques unlock value by enabling important public and private research, allowing for the maintenance and use – and, in certain cases, sharing and publication – of valuable information, while mitigating privacy risk.

This paper proposes parameters for calibrating legal rules to data depending on multiple gradations of identifiability, while also assessing other factors such as an organization’s safeguards and controls, as well as the data’s sensitivity, accessibility and permanence. It builds on emerging scholarship that suggests that, rather than treating data as a black-or-white dichotomy, policymakers should view data in various shades of gray, and it provides guidance on where to place important legal and technical boundaries between categories of identifiability. It urges the development of policy that creates incentives for organizations to avoid explicit identification and deploy elaborate safeguards and controls, while at the same time maintaining the utility of data sets.

Keywords: privacy, data protection, anonymity, de-identification, personal data, PII

JEL Classification: K10, K20, K30

Suggested Citation

Polonetsky, Jules and Tene, Omer and Finch, Kelsey, Shades of Gray: Seeing the Full Spectrum of Practical Data De-Identification (April 1, 2016). Santa Clara Law Review, Forthcoming, Available at SSRN: https://ssrn.com/abstract=2757709

Jules Polonetsky

Future of Privacy Forum

United States

Omer Tene (Contact Author)

International Association of Privacy Professionals (IAPP)

Pease International Tradeport
75 Rochester Ave., Suite 4
Portsmouth, NH 03801
United States

Kelsey Finch

Future of Privacy Forum

United States
