Discovering Context: Classifying Tweets Through a Semantic Transform Based on Wikipedia
FOUNDATIONS OF AUGMENTED COGNITION: DIRECTING THE FUTURE OF ADAPTIVE SYSTEMS, HCI International July 9-14, 2011, Orlando, FL
10 Pages. Posted: 17 Apr 2012
Date Written: July 1, 2011
Abstract
By mapping messages into a large context, we can compute the distances between them, and then classify them. We test this conjecture on Twitter messages: Messages are mapped onto their most similar Wikipedia pages, and the distances between pages are used as a proxy for the distances between messages. This technique yields more accurate classification of a set of Twitter messages than alternative techniques using string edit distance and latent semantic analysis.
Keywords: text classification, Wikipedia, semantics, context, cognition, latent semantic analysis