Master's Thesis Meruyert Zhakpekova
Abstract:
The thesis investigates the use of Differential Privacy and Text Simplification techniques to protect the privacy of individuals while still allowing further analysis or modeling of text data. The study addresses authorship attribution problems in domains where anonymity is critical, such as privacy-preserving data sharing or whistleblower protection. The main assumptions are that integrating Text Simplification models with Differential Privacy techniques improves privacy guarantees without compromising the original text's accuracy, and that it minimizes the risk of re-identification. The study seeks to improve the understanding of the relationship between text simplification and privacy, thus enabling the development of more effective and robust models in the future.
Research questions:
1. How can the fine-tuning of large text simplification models be leveraged as a basis for authorship obfuscation?
2. To what extent is it feasible to integrate Differential Privacy techniques into the proposed pipeline, and at which stage is noise addition optimal?
3. How can the effectiveness of the proposed approach be evaluated from a privacy standpoint through both manual and automatic means?
| Attribute | Value |
|---|---|
| Title (de) | Autorenschaftsverschleierung mithilfe von Differential Privacy und Textvereinfachungstechniken |
| Title (en) | Authorship Obfuscation using Differential Privacy and Text Simplification Techniques |
| Project | |
| Type | Master's Thesis |
| Status | completed |
| Student | Meruyert Zhakpekova |
| Advisor | Stephen Meisenbacher |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.05.2023 |
| Sebis Contributor Agreement signed on | 27.04.2023 |
| Checklist filled | Yes |
| Submission date | 15.11.2023 |