Sentence Boundary Detection in German Legal Documents
Sentence boundary detection in German legal texts is a task that standard NLP systems handle poorly or not at all, since they are sometimes overwhelmed by more complex structures such as lists, paragraph structures, and citations. In this paper, we evaluate the performance of these systems and adapt methods directly to the legal domain.
We created an annotated dataset of over 50,000 sentences drawn from various German legal documents, which can be used for further research within the community. Our neural network and conditional random field models perform significantly better on this data than the existing systems we tested. This paper thus contradicts the assumption that the problem of sentence segmentation is already solved.
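To illustrate why citations trip up generic segmenters, the following minimal sketch contrasts a naive punctuation-based splitter with a variant that skips a few legal abbreviations. It is not the paper's method, which relies on conditional random fields and neural networks. The function names, the abbreviation list, and the example sentence are assumptions made purely for illustration.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; not taken from the paper.
LEGAL_ABBREVIATIONS = {"abs.", "art.", "nr.", "vgl.", "rn."}


def naive_split(text):
    """Split after every '.', '!' or '?' that is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def abbreviation_aware_split(text):
    """Split on sentence-final punctuation unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = (
            token[-1] in ".!?" and token.lower() not in LEGAL_ABBREVIATIONS
        )
        if ends_sentence:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep trailing text that lacks final punctuation
        sentences.append(" ".join(current))
    return sentences


# Illustrative input: two sentences, the first containing a statute citation.
example = "Der Anspruch folgt aus § 823 Abs. 1 BGB. Die Klage ist begründet."

print(naive_split(example))               # breaks the citation after "Abs."
print(abbreviation_aware_split(example))  # keeps the citation intact
```

Even this small heuristic remains brittle: enumerations such as "1." or abbreviations missing from the list would still produce wrong boundaries, which is consistent with the case for models trained on annotated legal data.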
| Attribute | Value |
|---|---|
| Address | Virtual |
| Authors | Ingo Glaser, Sebastian Moser, Prof. Dr. Florian Matthes |
| Citation | Glaser, I.; Moser, S.; Matthes, F.: Sentence Boundary Detection in German Legal Documents, ICAART: International Conference on Agents and Artificial Intelligence, Virtual, 2021 |
| Key | Gl21b |
| Research project | Semantic Analysis of Court Rulings |
| Title | Sentence Boundary Detection in German Legal Documents |
| Type of publication | Conference |
| Year | 2021 |