Data Exploration by Representative Region Selection: Axioms and Convergence

Alexander S. Estes, Michael O. Ball, David J. Lovell (2021) Data Exploration by Representative Region Selection: Axioms and Convergence. Mathematics of Operations Research 46(3):970-1007.

44 Pages. Posted: 9 Sep 2019. Last revised: 20 Oct 2021.

Alexander Estes

University of Maryland

Michael O. Ball

University of Maryland - Decision and Information Technologies Department

David Lovell

University of Maryland

Date Written: September 6, 2019

Abstract

We present a new type of unsupervised learning problem in which we find a small set of representative regions that approximates a larger dataset. These regions may be presented to a practitioner, along with additional information, to help the practitioner explore the dataset. An advantage of this approach is that it does not rely on the cluster structure of the data. We formally define this problem, and we present axioms that should be satisfied by functions that measure the quality of representatives. We provide a quality function that satisfies all of these axioms. Using this quality function, we formulate two optimization problems for finding representatives. We provide convergence results for a general class of methods, and we show that these results apply to several specific methods, including methods derived from the solution of the optimization problems formulated in this paper. We provide an example of how representative regions may be used to explore a dataset.

Keywords: Representative Region Selection, Unsupervised learning, Data exploration, Density estimation, Consistency

JEL Classification: C02, C13, C44, C55

Suggested Citation

Estes, Alexander and Ball, Michael O. and Lovell, David, Data Exploration by Representative Region Selection: Axioms and Convergence (September 6, 2019). Mathematics of Operations Research 46(3):970-1007 (2021). Available at SSRN: https://ssrn.com/abstract=3005997 or http://dx.doi.org/10.2139/ssrn.3005997

Alexander Estes (Contact Author)

University of Maryland

College Park
College Park, MD 20742
United States

Michael O. Ball

University of Maryland - Decision and Information Technologies Department

Robert H. Smith School of Business
4313 Van Munching Hall
College Park, MD 20815
United States
301-405-2227 (Phone)
301-405-8655 (Fax)

David Lovell

University of Maryland

College Park
College Park, MD 20742
United States
