Human Rights Texts: Converting Human Rights Primary Source Documents into Data

28 Pages. Posted: 8 Oct 2014. Last revised: 14 Sep 2015.

Christopher J. Fariss

University of Michigan at Ann Arbor - Department of Political Science

Fridolin Linder

Pennsylvania State University

Zachary Jones

Pennsylvania State University

Charles Crabtree

University of Michigan - Political Science, Students

Megan Biek

Pennsylvania State University

Ana-Sophia Ross

Pennsylvania State University

Taranamol Kaur

University of California, San Diego (UCSD)

Michael Tsai

University of California, San Diego (UCSD)

Date Written: May 16, 2015

Abstract

We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In addition to the digitized text, we also make available and describe document-term matrices, which are datasets that systematically organize the word counts from each unique document by each unique term within the corpus of human rights documents. To contextualize the importance of this corpus, we describe the development of coding procedures in the human rights community and several existing categorical indicators that have been created by human coding of the human rights documents contained in the corpus. We then discuss how the new human rights corpus and the existing human rights datasets can be used with a variety of statistical analyses and machine learning algorithms to help scholars understand how human rights practices and reporting have evolved over time. We close with a discussion of our plans for dataset maintenance, updating, and availability.
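To make the idea of a document-term matrix concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the tokenizer (lowercased alphabetic runs), the function names `tokenize` and `document_term_matrix`, and the two toy sentences are all assumptions introduced here for illustration, and the published matrices may well use different preprocessing.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed, simple scheme)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix), where matrix[i][j] is the count of term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every unique term in the corpus, sorted for a stable column order.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for terms that a document does not contain.
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical toy documents; these sentences are not taken from the corpus.
docs = [
    "Reports of torture and arbitrary detention continued.",
    "No reports of torture were received.",
]
vocab, dtm = document_term_matrix(docs)

for term, column in zip(vocab, zip(*dtm)):
    print(f"{term:<10} {column}")
```

Each row of `dtm` corresponds to one document and each column to one term in `vocab`, which matches the abstract's description of word counts organized "from each unique document by each unique term". For a corpus of realistic size, a sparse representation would likely be preferable to nested lists.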

Suggested Citation

Fariss, Christopher J. and Linder, Fridolin and Jones, Zachary and Crabtree, Charles and Biek, Megan and Ross, Ana-Sophia and Kaur, Taranamol and Tsai, Michael, Human Rights Texts: Converting Human Rights Primary Source Documents into Data (May 16, 2015). Available at SSRN: https://ssrn.com/abstract=2502980 or http://dx.doi.org/10.2139/ssrn.2502980

Christopher J. Fariss (Contact Author)

University of Michigan at Ann Arbor - Department of Political Science

Ann Arbor, MI 48109
United States

Fridolin Linder

Pennsylvania State University

University Park
State College, PA 16802
United States

Zachary Jones

Pennsylvania State University

University Park
State College, PA 16802
United States

Charles Crabtree

University of Michigan - Political Science, Students

Ann Arbor, MI 48109
United States

Megan Biek

Pennsylvania State University

University Park
State College, PA 16802
United States

Ana-Sophia Ross

Pennsylvania State University

University Park
State College, PA 16802
United States

Taranamol Kaur

University of California, San Diego (UCSD)

9500 Gilman Drive
Mail Code 0502
La Jolla, CA 92093-0112
United States

Michael Tsai

University of California, San Diego (UCSD)

9500 Gilman Drive
Mail Code 0502
La Jolla, CA 92093-0112
United States
