Translation from Narrative Text to Standard Codes Variables with Stata
20 Pages. Posted: 14 Jul 2009. Last revised: 3 Sep 2009
Date Written: July 10, 2009
Abstract
This paper describes screening, a new Stata data-management command that examines the content of complex narrative text variables to identify one or more user-defined keywords. The command is therefore particularly useful for string data contaminated with abbreviations and/or mistakes. Although the main purpose of screening is identification, a rich set of options allows direct translation from the original narrative string to a user-defined standard coding scheme. Moreover, the command is flexible enough to facilitate merging information from different sources and to extract substrings from the desired variable.
Keywords: screening, keyword-matching, narrative text variables, standard coding schemes
JEL Classification: C81, C87, C88