Master's Thesis Jakob Zmrzlikar
Differential Privacy in Transformer Architectures
Abstract
This thesis explores Differential Privacy (DP) techniques in Transformer-based architectures. We identify a gap in existing research by focusing on privacy guarantees for inference data rather than for training data. To ensure Differential Privacy of inference data, we propose two new privacy-preserving mechanisms for Transformer architectures, input perturbation and layer perturbation, based on adding random noise to the input data and to the encoder, respectively. We provide proofs of the DP guarantees both methods give for inference data. Furthermore, we offer insights into the practical limitations of these techniques on NLP tasks. We find that under the input perturbation mechanism, model accuracy decreases with lower values of the privacy budget, but the magnitude of this decrease depends significantly on the underlying dataset. Additionally, we identify the necessity of using certifiably robust architectures for the layer perturbation method, focusing on a Lipschitz-continuous modification of the BERT model. We identify significant challenges in pretraining Lipschitz-continuous BERT architectures, and we provide empirical estimates of the Lipschitz constants of this model under certain conditions.
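To make the two mechanisms concrete, the following is a minimal PyTorch sketch, assuming a generic encoder-based classifier. The Gaussian-mechanism calibration, the per-row clipping, and all names (`gaussian_sigma`, `clip_rows`, `input_perturbation`, `layer_perturbation`) are illustrative assumptions, not the implementation from the thesis.

```python
import math
import torch

def gaussian_sigma(sensitivity: float, epsilon: float, delta: float) -> float:
    # Classical Gaussian-mechanism calibration for (epsilon, delta)-DP
    # (valid for epsilon <= 1; tighter analyses exist).
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

def clip_rows(x: torch.Tensor, clip_norm: float) -> torch.Tensor:
    # Bound the L2 norm of each embedding row so the sensitivity is known.
    norms = x.norm(dim=-1, keepdim=True).clamp(min=1e-12)
    return x * (clip_norm / norms).clamp(max=1.0)

def input_perturbation(embeddings: torch.Tensor, epsilon: float,
                       delta: float, clip_norm: float = 1.0) -> torch.Tensor:
    # Mechanism 1: add calibrated Gaussian noise directly to the (clipped)
    # input embeddings before the encoder sees them. Treating each token
    # embedding independently is a simplification; the worst-case L2
    # distance between two clipped rows is 2 * clip_norm.
    x = clip_rows(embeddings, clip_norm)
    sigma = gaussian_sigma(2 * clip_norm, epsilon, delta)
    return x + sigma * torch.randn_like(x)

def layer_perturbation(encoder_output: torch.Tensor, lipschitz_const: float,
                       epsilon: float, delta: float,
                       clip_norm: float = 1.0) -> torch.Tensor:
    # Mechanism 2: add noise after the encoder. The noise scale must
    # account for how much the encoder can amplify input differences,
    # i.e. its Lipschitz constant.
    sensitivity = 2 * clip_norm * lipschitz_const
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return encoder_output + sigma * torch.randn_like(encoder_output)
```

Note the design point the abstract hinges on: for layer perturbation, the noise scale is calibrated to the encoder's Lipschitz constant, which is why a certifiably Lipschitz-continuous BERT variant is required.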
Research questions
- In what ways has DP been applied to transformer-based models in existing literature?
- How can DP be achieved via input perturbations for Transformer-based NLP models, and how does this compare to DP-SGD in terms of the privacy guarantees provided?
- How can DP be achieved via layer perturbations for Transformer-based NLP models, and how does this compare to DP-SGD in terms of the privacy guarantees provided? (A minimal DP-SGD sketch follows this list.)
- How do input and layer perturbations impact model accuracy on downstream tasks?
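Since two of the questions compare against DP-SGD, the following is a minimal sketch of a single DP-SGD step (per-example gradient clipping plus Gaussian noise, in the style of Abadi et al.). It assumes a PyTorch model with a per-example loss; the hyperparameter names are illustrative, not the thesis setup.

```python
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y,
                clip_norm: float = 1.0, noise_multiplier: float = 1.0,
                lr: float = 0.1) -> None:
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for i in range(batch_x.shape[0]):
        # Per-example gradient, clipped to bound any one example's influence.
        loss = loss_fn(model(batch_x[i:i + 1]), batch_y[i:i + 1])
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-12)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    with torch.no_grad():
        for p, s in zip(params, summed):
            # Gaussian noise calibrated to the clipping norm: this protects
            # the *training* data, whereas the mechanisms above target
            # the *inference* data.
            noise = noise_multiplier * clip_norm * torch.randn_like(s)
            p.add_(-lr * (s + noise) / batch_x.shape[0])
```

Contrasting this with the inference-time mechanisms highlights the gap the thesis addresses: DP-SGD randomizes the training procedure, so its guarantee says nothing about inputs submitted at inference time.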
| Attribute | Value |
|---|---|
| Title (de) | Differential Privacy in Transformer-Architekturen |
| Title (en) | Differential Privacy in Transformer Architectures |
| Project | |
| Type | Master's Thesis |
| Status | completed |
| Student | Jakob Zmrzlikar |
| Advisor | Stephen Meisenbacher |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.05.2023 |
| Sebis Contributor Agreement signed on | 08.05.2023 |
| Checklist filled | Yes |
| Submission date | 15.11.2023 |