LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts

7 Pages
Posted: 21 Jun 2018

Michael James Bommarito

273 Ventures; Licensio, LLC; Stanford Center for Legal Informatics; Michigan State College of Law; Bommarito Consulting, LLC

Daniel Martin Katz

Illinois Tech - Chicago Kent College of Law; Bucerius Center for Legal Technology & Data Science; Stanford CodeX - The Center for Legal Informatics; 273 Ventures

Eric Detterman

LexPredict, LLC

Date Written: June 6, 2018

Abstract

LexNLP is an open source Python package focused on natural language processing and machine learning for legal and regulatory text. The package includes functionality to (i) segment documents, (ii) identify key text such as titles and section headings, (iii) extract over eighteen types of structured information like distances and dates, (iv) extract named entities such as companies and geopolitical entities, (v) transform text into features for model training, and (vi) build unsupervised and supervised models such as word embedding or tagging models. LexNLP includes pre-trained models based on thousands of unit tests drawn from real documents available from the SEC EDGAR database as well as various judicial and regulatory proceedings. LexNLP is designed for use in both academic research and industrial applications.
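
As a rough illustration of item (iii), structured extraction of values such as distances and dates can be sketched with plain regular expressions. The sketch below uses only the Python standard library. It does not call LexNLP and should not be read as LexNLP's implementation or API. The function names `extract_distances` and `extract_dates`, the unit table, and the single "Month D, YYYY" date pattern are all assumptions made for this example.

```python
import re
from datetime import date

# Hypothetical unit table for this sketch only; real legal text uses many more forms.
_UNITS = {
    "mile": "mile", "miles": "mile",
    "kilometer": "kilometer", "kilometers": "kilometer", "km": "kilometer",
    "foot": "foot", "feet": "foot",
    "meter": "meter", "meters": "meter",
}

# A number (optionally with thousands separators and a decimal part) followed by a unit.
# Longer unit spellings are listed first so "meters" is preferred over "meter".
_DISTANCE_RE = re.compile(
    r"\b((?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?)\s*("
    + "|".join(sorted(_UNITS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

_MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Only the "Month D, YYYY" form is handled here (an assumption of this sketch).
_DATE_RE = re.compile(
    r"\b(" + "|".join(_MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b",
    re.IGNORECASE,
)


def extract_distances(text):
    """Return a list of (value, normalized_unit) pairs found in text."""
    results = []
    for match in _DISTANCE_RE.finditer(text):
        value = float(match.group(1).replace(",", ""))
        unit = _UNITS[match.group(2).lower()]
        results.append((value, unit))
    return results


def extract_dates(text):
    """Return a list of datetime.date objects for 'Month D, YYYY' mentions."""
    results = []
    for match in _DATE_RE.finditer(text):
        month = _MONTHS[match.group(1).lower()]
        try:
            results.append(date(int(match.group(3)), month, int(match.group(2))))
        except ValueError:
            # Skip impossible calendar dates such as "February 30, 2018".
            continue
    return results


if __name__ == "__main__":
    sample = ("The premises are located within 5 miles of the site. "
              "This Agreement is dated June 6, 2018.")
    print(extract_distances(sample))
    print(extract_dates(sample))
```

Hand-written patterns like these miss many variants found in real contracts and filings, such as spelled-out numbers, abbreviated months, and other date orderings. Coverage of that kind is the gap a dedicated package is meant to fill.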

Keywords: natural language processing, legal, regulatory, machine learning, segmentation, extraction, open source, Python

JEL Classification: C19, C53, C55, C38, C45, C63, C88

Suggested Citation

Bommarito, Michael James and Katz, Daniel Martin and Detterman, Eric, LexNLP: Natural Language Processing and Information Extraction For Legal and Regulatory Texts (June 6, 2018). Available at SSRN: https://ssrn.com/abstract=3192101 or http://dx.doi.org/10.2139/ssrn.3192101

Licensio, LLC

Okemos, MI 48864
United States

Stanford Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

Michigan State College of Law

318 Law College Building
East Lansing, MI 48824-1300
United States

Bommarito Consulting, LLC

MI 48098
United States

Daniel Martin Katz (Contact Author)

Illinois Tech - Chicago Kent College of Law

565 W. Adams St.
Chicago, IL 60661-3691
United States

HOME PAGE: http://www.danielmartinkatz.com/

Bucerius Center for Legal Technology & Data Science

Jungiusstr. 6
Hamburg, 20355
Germany

HOME PAGE: http://legaltechcenter.de/

Stanford CodeX - The Center for Legal Informatics

559 Nathan Abbott Way
Stanford, CA 94305-8610
United States

HOME PAGE: http://law.stanford.edu/directory/daniel-katz/

273 Ventures

HOME PAGE: http://273ventures.com

Eric Detterman

LexPredict, LLC

MI
United States

Paper statistics

Downloads: 2,263
Abstract Views: 8,864
Rank: 12,368