Master's Thesis Johannes Muhr
Design, Prototypical Implementation, and Evaluation of an Active Machine Learning Service in the Context of Legal Text Classification
Abstract
Legal texts are now produced, stored digitally, and retrieved for later work in such quantities that classifying these documents and processing their content by hand has become infeasible. This study addresses that need by implementing a microservice (LexML) for legal document and norm classification that applies the concept of active machine learning. Following an evaluation of possible solutions for (legal) text classification and (active) machine learning in the existing literature, LexML was implemented using Apache Spark MLlib as the machine learning framework. The implementation builds on the existing functionality of the legal data-science environment Lexia. Various classifiers and query strategies were implemented and evaluated using German legal data. Overall, active learning strategies outperform traditional machine learning in terms of learning speed and maximum accuracy. The results of the document and norm classification experiments differ greatly: for document classification, Naïve Bayes and Multi-Layer Perceptron outperform Logistic Regression, whereas for norm classification Logistic Regression is clearly superior to the other two.
| Attribute | Value |
|---|---|
| Title (de) | Design, Prototypische Implementierung und Evaluation eines Active Machine Learning Dienstes im Kontext der Klassifizierung von Rechtstexten |
| Title (en) | Design, Prototypical Implementation, and Evaluation of an Active Machine Learning Service in the Context of Legal Text Classification |
| Project | Lexalyze - Interdisciplinary Research Program |
| Type | Master's Thesis |
| Status | completed |
| Student | Johannes Muhr |
| Advisor | Dr. Bernhard Waltl |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.01.2017 |
| Sebis Contributor Agreement signed on | 07.12.2016 |
| Checklist filled | Yes |
| Submission date | 15.07.2017 |