Master's Thesis Tom Schamberger
Customizable Anonymization of German Legal Court Rulings using Domain-specific Named Entity Recognition
Abstract
In the legal domain, published court decisions play a vital role for legal researchers and developers of data-driven legal software. However, original court rulings usually contain sensitive information, so whether these documents can be published depends heavily on the underlying anonymization process. Legal documents are mainly anonymized manually by trained employees, a process generally considered inefficient and error-prone. Additionally, previous research has shown that generalized automated anonymization fails to adapt to the vastly different anonymization standards of individual courts. Interviews with employees from different courts further suggest that judges prefer customizable anonymization solutions that can be flexibly adapted to case-specific requirements.
In this work, we propose and evaluate a customizable approach that automatically anonymizes legal court decisions using predefined configurations. The approach uses a trained machine learning model to detect domain-specific named entities in the text paragraphs of original court rulings and masks the sensitive entity types according to the predefined rules. The detected entity types were chosen specifically for this anonymization task and may be extended in future work.
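
The configuration-driven masking step described above can be illustrated with a minimal sketch. It is not the implementation from the thesis: the `Entity` class, the `anonymize` function, the entity labels `PERSON` and `CITY`, the placeholder strings, and the example sentence are all hypothetical, and the entity spans are supplied by hand where the thesis would use the output of the trained recognition model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A detected span; stands in for the output of an NER model (assumption)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # hypothetical entity type, e.g. "PERSON"


def anonymize(text, entities, config):
    """Mask every entity whose label appears in the configuration.

    `config` maps an entity label to its placeholder; labels that are
    missing from the mapping are left untouched.
    """
    result = text
    # Replace from right to left so earlier offsets remain valid.
    for ent in sorted(entities, key=lambda e: e.start, reverse=True):
        placeholder = config.get(ent.label)
        if placeholder is not None:
            result = result[:ent.start] + placeholder + result[ent.end:]
    return result


def find_entity(text, surface, label):
    """Build an Entity from the first occurrence of `surface` (demo helper)."""
    start = text.index(surface)
    return Entity(start, start + len(surface), label)


text = "Max Mustermann wohnt in Berlin."
entities = [
    find_entity(text, "Max Mustermann", "PERSON"),
    find_entity(text, "Berlin", "CITY"),
]

# Two hypothetical court-specific configurations.
strict_config = {"PERSON": "[PERSON]", "CITY": "[ORT]"}
lenient_config = {"PERSON": "[PERSON]"}

print(anonymize(text, entities, strict_config))
print(anonymize(text, entities, lenient_config))
```

In this sketch the same detected entities produce different outputs depending on the configuration passed in, which mirrors the idea of adapting one detection model to the anonymization standards of individual courts.
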
| Attribute | Value |
|---|---|
| Title (de) | Anpassbare Anonymisierung deutscher Gerichtsurteile mithilfe domänenspezifischer Named Entity Recognition |
| Title (en) | Customizable Anonymization of German Legal Court Rulings using Domain-specific Named Entity Recognition |
| Project | Semantic Analysis of Court Rulings |
| Type | Master's Thesis |
| Status | completed |
| Student | Tom Schamberger |
| Advisor | Ingo Glaser |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.03.2021 |
| Sebis Contributor Agreement signed on | 15.03.2021 |
| Checklist filled | Yes |
| Submission date | 15.09.2021 |