TUM sebis at GermEval 2022: A Hybrid Model Leveraging Gaussian Processes and Fine-Tuned XLM-RoBERTa for German Text Complexity Analysis
The task of quantifying the complexity of written language is an interesting endeavor, particularly for the opportunity it offers to aid language learners. In this pursuit, the question of what exactly about natural language contributes to its complexity (or lack thereof) is a worthwhile point of investigation. We propose a hybrid approach that uses shallow models to capture linguistic features while leveraging a fine-tuned embedding model to encode the semantics of the input text. By harmonizing these two methods, we achieve competitive scores on the given metric and demonstrate improvements over either method alone. In addition, we show the effectiveness of Gaussian processes in the training of shallow models for text complexity analysis.
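The sketch below is a minimal illustration of the hybrid idea described in the abstract, not the authors' exact pipeline: a Gaussian process regressor is fit on shallow linguistic features, and its predictions are blended with complexity scores from a fine-tuned XLM-RoBERTa regression model (represented here by a placeholder array). The feature set, kernel choice, and equal-weight blending are illustrative assumptions.

```python
# Minimal sketch of a hybrid complexity model, assuming:
# - shallow surface features as complexity proxies,
# - an RBF + white-noise GP kernel,
# - a 50/50 blend with transformer-based predictions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def shallow_features(sentence: str) -> list:
    """Simple surface features often used as proxies for text complexity."""
    words = sentence.split()
    return [
        len(words),                                   # sentence length in tokens
        float(np.mean([len(w) for w in words])),      # mean word length
        sum(len(w) > 6 for w in words) / len(words),  # share of long words
    ]

# Toy training data: German sentences with complexity ratings (illustrative).
train_sentences = [
    "Der Hund läuft im Park.",
    "Die Quantifizierung sprachlicher Komplexität erfordert differenzierte Merkmale.",
]
train_scores = np.array([1.5, 5.0])

X_train = np.array([shallow_features(s) for s in train_sentences])
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, train_scores)

# Predictions from the shallow GP model ...
test_sentences = ["Die Verfassungsmäßigkeit der Regelung ist umstritten."]
X_test = np.array([shallow_features(s) for s in test_sentences])
gp_pred = gp.predict(X_test)

# ... blended with scores from a fine-tuned XLM-RoBERTa regression head
# (placeholder values; the actual weighting scheme is an assumption).
xlmr_pred = np.array([4.2])
hybrid_pred = 0.5 * gp_pred + 0.5 * xlmr_pred
print(hybrid_pred)
```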
| Attribute | Value |
|---|---|
| Address | Potsdam, Germany |
| Authors | Juraj Vladika, Stephen Meisenbacher, Florian Matthes |
| Citation | Juraj Vladika, Stephen Meisenbacher, and Florian Matthes. 2022. TUM sebis at GermEval 2022: A Hybrid Model Leveraging Gaussian Processes and Fine-Tuned XLM-RoBERTa for German Text Complexity Analysis. In Proceedings of the GermEval 2022 Workshop on Text Complexity Assessment of German Text, pages 51–56, Potsdam, Germany. Association for Computational Linguistics. |
| Key | Vl22a |
| Research project | |
| Title | TUM sebis at GermEval 2022: A Hybrid Model Leveraging Gaussian Processes and Fine-Tuned XLM-RoBERTa for German Text Complexity Analysis |
| Type of publication | Conference |
| Year | 2022 |
| Publication URL | https://aclanthology.org/2022.germeval-1.9/ |
| Acronym | |
| Project | |
| Team members | |