Master's Thesis Jakob Zmrzlikar
Differential Privacy in Transformer Architectures
Abstract
This thesis explores Differential Privacy (DP) techniques in Transformer-based architectures. We identify a gap in existing research by focusing on privacy guarantees for inference data rather than for training data. To ensure Differential Privacy of inference data, we propose two new privacy-preserving mechanisms for Transformer architectures, input perturbation and layer perturbation, based on adding random noise to the input data and to the encoder, respectively. We provide proofs of the DP guarantees both methods give for inference data. Furthermore, we offer insights into the practical limitations of these techniques on NLP tasks. We find that under the input perturbation mechanism, model accuracy decreases with lower values of the privacy budget, but the magnitude of this decrease depends significantly on the underlying dataset. Additionally, we identify the necessity of using certifiably robust architectures for the layer perturbation method, focusing on a Lipschitz-continuous modification of the BERT model. We identify significant challenges in pretraining Lipschitz-continuous BERT architectures, and we provide empirical estimates of the Lipschitz constants of this model under certain conditions.
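To make the two mechanisms concrete, the following is a minimal PyTorch sketch, assuming a generic encoder-based classifier. The Gaussian-mechanism calibration, the per-row clipping, and all names (`gaussian_sigma`, `clip_rows`, `input_perturbation`, `layer_perturbation`) are illustrative assumptions, not the implementation from the thesis.

```python
import math
import torch

def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    # Classical Gaussian-mechanism calibration for (epsilon, delta)-DP
    # (valid for epsilon <= 1; tighter analyses exist).
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

def clip_rows(x: torch.Tensor, clip_norm: float) -> torch.Tensor:
    # Bound the L2 norm of each embedding row so the sensitivity is known.
    norms = x.norm(dim=-1, keepdim=True).clamp(min=1e-12)
    return x * (clip_norm / norms).clamp(max=1.0)

def input_perturbation(embeddings: torch.Tensor, epsilon: float,
                       delta: float, clip_norm: float = 1.0) -> torch.Tensor:
    # Mechanism 1: add calibrated Gaussian noise directly to the (clipped)
    # input embeddings before the encoder sees them. Treating each token
    # embedding independently is a simplification; the worst-case L2
    # distance between two clipped rows is 2 * clip_norm.
    x = clip_rows(embeddings, clip_norm)
    sigma = gaussian_sigma(2 * clip_norm, epsilon, delta)
    return x + sigma * torch.randn_like(x)

def layer_perturbation(encoder_output: torch.Tensor, lipschitz_const: float,
                       epsilon: float, delta: float,
                       clip_norm: float = 1.0) -> torch.Tensor:
    # Mechanism 2: add noise after the encoder. The noise scale must
    # account for how much the encoder can amplify input differences,
    # i.e. its Lipschitz constant.
    sensitivity = 2 * clip_norm * lipschitz_const
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return encoder_output + sigma * torch.randn_like(encoder_output)
```

Note the design point the abstract hinges on: for layer perturbation, the noise scale is calibrated to the encoder's Lipschitz constant, which is why a certifiably Lipschitz-continuous BERT variant is required.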
Research questions
- In what ways has DP been applied to transformer-based models in existing literature?
- How can DP be achieved via input perturbations for Transformer-based NLP models, and how does this compare to DP-SGD in terms of the privacy guarantees provided?
- How can DP be achieved via layer perturbations for Transformer-based NLP models, and how does this compare to DP-SGD in terms of the privacy guarantees provided? (A minimal DP-SGD sketch follows this list.)
- How do input and layer perturbations impact model accuracy on downstream tasks?
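Since two of the questions compare against DP-SGD, the following is a minimal sketch of a single DP-SGD step (per-example gradient clipping plus Gaussian noise, in the style of Abadi et al.). It assumes a PyTorch model with a per-example loss; the hyperparameter names are illustrative, not the thesis setup.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y,
                clip_norm: float = 1.0, noise_multiplier: float = 1.0,
                lr: float = 0.1) -> None:
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for i in range(batch_x.shape[0]):
        # Per-example gradient, clipped to bound any one example's influence.
        loss = loss_fn(model(batch_x[i:i + 1]), batch_y[i:i + 1])
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-12)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(params, summed):
            # Gaussian noise calibrated to the clipping norm: this protects
            # the *training* data, whereas the mechanisms above target
            # the *inference* data.
            noise = noise_multiplier * clip_norm * torch.randn_like(s)
            p.add_(-lr * (s + noise) / batch_x.shape[0])
```

Contrasting this with the inference-time mechanisms highlights the gap the thesis addresses: DP-SGD randomizes the training procedure, so its guarantee says nothing about inputs submitted at inference time.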
| Attribute | Value |
|---|---|
| Title (de) | Differential Privacy in Transformer-Architekturen |
| Title (en) | Differential Privacy in Transformer Architectures |
| Project | |
| Type | Master's Thesis |
| Status | completed |
| Student | Jakob Zmrzlikar |
| Advisor | Stephen Meisenbacher |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.05.2023 |
| Sebis Contributor Agreement signed on | 08.05.2023 |
| Checklist filled | Yes |
| Submission date | 15.11.2023 |