Open Topics
We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is given below.
If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.
Trajectory Prediction for Traffic Data
Type: Interdisciplinary Project (IDP) / Hiwi / Guided Research / Master's Thesis
Prerequisites:
 Strong knowledge in machine learning
 Very good coding skills
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
Description:
Machine Learning plays a pivotal role in the optimization of infrastructure planning, enabling data-driven decision-making for urban development. It allows for a detailed analysis of individual agent behaviors, facilitating informed interventions in city planning. By modeling individual trajectories on street network graphs, we can extract valuable insights into individual mobility patterns and congestion scenarios. This research project is centered around two questions:
 1. Trajectory Modeling: Can machine learning techniques be employed to generate realistic trajectories on street network data?
 2. Path Prediction: Given a source and destination pair within a street network, can we accurately predict the most likely path an agent would take?
Our objective is to develop and apply geometric deep learning methods, e.g. simplicial complex networks, to various traffic datasets, with the ultimate aim of predicting routes based on recently collected mobility data in Munich.
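To make the path prediction question concrete, the sketch below computes the cheapest route between a source and a destination on a toy street network using Dijkstra's algorithm. This is only a classical baseline that a learned model could be compared against, not the method pursued in the project; the graph, the edge weights (read as travel times) and the function name `shortest_path` are illustrative assumptions.

```python
import heapq


def shortest_path(graph, source, target):
    """Return (cost, path) of the cheapest route, or (inf, []) if unreachable.

    graph: dict mapping node -> dict of neighbor -> non-negative edge weight.
    """
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        d, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry
        visited.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                prev[neighbor] = node
                heapq.heappush(queue, (new_d, neighbor))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Hypothetical toy street network: nodes are intersections,
# weights are assumed travel times along directed street segments.
streets = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"C": 1.0, "D": 4.0},
    "C": {"D": 1.0},
    "D": {},
}
cost, path = shortest_path(streets, "A", "D")
print(cost, path)
```

Real agents often deviate from the cheapest route, which is one reason a data-driven predictor of the most likely path can be more informative than such a baseline.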
Contact: Dominik Fuchsgruber
Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI
Type: Master's Thesis / Guided Research / Hiwi
Prerequisites:
 Strong knowledge in machine learning
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
Description:
The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
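As a rough illustration of two of the techniques named above, the sketch below applies magnitude pruning followed by symmetric 8-bit quantization to a flat list of weights. It is a minimal pure-Python sketch under simplifying assumptions (a single weight vector, one global scale); the function names and the `sparsity` parameter are hypothetical and not taken from any particular library or from the project itself.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [v * scale for v in quantized]


weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6]
pruned = magnitude_prune(weights, sparsity=0.5)
quantized, scale = quantize_int8(pruned)
restored = dequantize(quantized, scale)
print(pruned)
print(quantized)
```

Pruning reduces the number of non-zero parameters, while quantization reduces the bits stored per parameter; both trade some accuracy for a smaller memory footprint.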
Contact: Bertrand Charpentier
References:
 The Efficiency Misnomer
 A Gradient Flow Framework for Analyzing Network Pruning
 Distilling the Knowledge in a Neural Network
 A Survey of Quantization Methods for Efficient Neural Network Inference
Deep Generative Models
Type: Master's Thesis / Guided Research
Prerequisites:
 Strong machine learning and probability theory knowledge
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
 Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
 Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory
Description:
With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.
Contact: Marcel Kollovieh, Marten Lienen, David Lüdke
Molecule Generation / Deep Graph Generation
Type: Master's Thesis / Guided Research
Prerequisites:
 Strong machine learning knowledge
 Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
 Knowledge of graph neural networks (e.g. GCN, MPNN)
 No formal education in chemistry, physics or biology needed!
Description:
The generation of molecular structures through machine learning models has become increasingly important in various fields, such as drug discovery, material science, and chemistry. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Molecule generation, or more generally, deep graph generation, lies at the intersection of generative models, like diffusion-based models or variational autoencoders, and graph representation learning through e.g. graph neural networks. The focus of projects surrounding this topic is model development with an emphasis on downstream tasks (e.g. molecule optimization) and a better understanding of the limitations of existing models.
Contact: Johanna Sommer, Leon Hetzel
References:
 MAGNet: Motif-Agnostic Generation of Molecules from Shapes
 Learning to Extend Molecular Scaffolds with Structural Motifs
 A Survey on Deep Graph Generation: Methods and Applications
 Junction Tree Variational Autoencoder for Molecular Graph Generation
Graph Structure Learning
Type: Guided Research / Hiwi
Prerequisites:
 Strong machine learning knowledge
 Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
 Knowledge of graph neural networks (e.g. GCN, MPNN)
 Optional: Knowledge of graph theory and mathematical optimization
Description:
Graph deep learning is a powerful ML concept that enables the generalisation of successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results in a vast range of applications spanning the social sciences, biomedicine, particle physics, computer vision, graphics and chemistry. One of the major limitations of most current graph neural network architectures is that they often rely on the assumption that the underlying graph is known and fixed. However, this assumption does not always hold, as the graph may be noisy, partially unknown, or even completely unknown. In the case of noisy or partially available graphs, it would be useful to jointly learn an optimised graph structure and the corresponding graph representations for the downstream task. On the other hand, when the graph is completely absent, it would be useful to infer it directly from the data. This is particularly interesting in inductive settings where some of the nodes were not present at training time. Furthermore, learning a graph can become an end in itself, as the inferred structure can provide complementary insights with respect to the downstream task. In this project, we aim to investigate solutions and devise new methods to construct an optimal graph structure based on the available (unstructured) data.
Contact: Filippo Guerranti
References:
 A Survey on Graph Structure Learning: Progress and Opportunities
 Differentiable Graph Module (DGM) for Graph Convolutional Networks
 Learning Discrete Structures for Graph Neural Networks
 NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification
Graph Neural Networks
Type: Master's thesis / Bachelor's thesis / guided research
Prerequisites:
 Strong machine learning knowledge
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
 Knowledge of graph neural networks (e.g. GCN, MPNN)
 Knowledge of graph/network theory
Description:
Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representations using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
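The message passing idea described above can be sketched in a few lines. The example below performs one round of mean aggregation on a small undirected graph without any learned weights, so it is a simplification of what an actual GNN layer (e.g. GCN or MPNN) does; the function name, the equal 0.5/0.5 mixing of self and neighbor information, and the toy graph are assumptions made for illustration.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats (node representation)
    edges: list of (u, v) pairs
    Each node's new representation averages its own features with the
    mean of its neighbors' features. No learned parameters are involved.
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, h in features.items():
        nbrs = neighbors[node]
        if nbrs:
            # Message: element-wise mean over the neighbors' features.
            msg = [
                sum(features[n][d] for n in nbrs) / len(nbrs)
                for d in range(len(h))
            ]
        else:
            msg = h  # isolated node keeps its own representation
        updated[node] = [0.5 * a + 0.5 * m for a, m in zip(h, msg)]
    return updated


# Hypothetical path graph 0 - 1 - 2 with two-dimensional features.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, 1), (1, 2)]
new_features = message_passing_step(features, edges)
print(new_features)
```

After one step, each node only sees its direct neighbors; stacking several steps widens the receptive field, which relates to the question of going beyond direct neighbors raised above.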
Contact: Simon Geisler
References:
 Semi-supervised classification with graph convolutional networks
 Relational inductive biases, deep learning, and graph networks
 Diffusion Improves Graph Learning
 Weisfeiler and Leman go neural: Higher-order graph neural networks
 Reliable Graph Neural Networks via Robust Aggregation
Physics-aware Graph Neural Networks
Type: Master's thesis / guided research
Prerequisites:
 Strong machine learning knowledge
 Proficiency with Python and deep learning frameworks (JAX or PyTorch)
 Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
 Optional: Knowledge of machine learning on molecules and quantum chemistry
Description:
Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast number of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.
Contact: Nicholas Gao
References:
 Directional Message Passing for Molecular Graphs
 Neural message passing for quantum chemistry
 Learning to Simulate Complex Physics with Graph Networks
 Ab initio solution of the many-electron Schrödinger equation with deep neural networks
 Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
 Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds
Robustness Verification for Deep Classifiers
Type: Master's thesis / Guided research
Prerequisites:
 Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
 Strong background in mathematical optimization (preferably in a machine learning setting)
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
 (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data
Description: Recent work shows that deep classifiers are vulnerable to adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the possibilities of deploying promising classification methods based on neural nets in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
Contact: Aleksei Kuvshinov
References:
 Certified Adversarial Robustness via Randomized Smoothing
 Formal guarantees on the robustness of a classifier against adversarial manipulation
 Towards deep learning models resistant to adversarial attacks
 Provable defenses against adversarial examples via the convex outer adversarial polytope
 Certified defenses against adversarial examples
 Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks
 Provable robustness of ReLU networks via maximization of linear regions
Uncertainty Estimation in Deep Learning
Type: Master's Thesis / Guided Research
Prerequisites:
 Strong knowledge in machine learning
 Strong knowledge in probability theory
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
Description:
Safe prediction is a key feature in many intelligent systems. Classically, Machine Learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty bring knowledge about undecidable and uncommon situations. The uncertainty view can be a substantial help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve the uncertainty estimation in ML models in various types of tasks.
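One common way to separate the two kinds of uncertainty mentioned above is an entropy-based decomposition over an ensemble of classifiers: the entropy of the averaged prediction (total) equals the average entropy of the members (aleatoric) plus their disagreement (epistemic, the mutual information). The sketch below illustrates this under the assumption that an ensemble is available; it is one of several possible approaches and is not claimed to be the method used in the project. The function names and example probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty of an ensemble into two parts.

    member_probs: list of class-probability lists, one per ensemble member.
    Returns (total, aleatoric, epistemic) with total = aleatoric + epistemic.
    """
    n = len(member_probs)
    num_classes = len(member_probs[0])
    mean = [sum(p[c] for p in member_probs) / n for c in range(num_classes)]
    total = entropy(mean)                                   # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / n   # expected member entropy
    epistemic = total - aleatoric                           # disagreement between members
    return total, aleatoric, epistemic


# Members agree that the input is ambiguous: uncertainty is aleatoric.
ambiguous = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members are individually confident but disagree: uncertainty is epistemic.
unfamiliar = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
print(ambiguous)
print(unfamiliar)
```

Both toy inputs have the same total uncertainty, yet the decomposition attributes it to different sources, which is what makes the distinction useful for detecting uncommon inputs.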
Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier
References:
 Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
 Predictive Uncertainty Estimation via Prior Networks
 Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
 Evidential Deep Learning to Quantify Classification Uncertainty
 Weight Uncertainty in Neural Networks
Hierarchies in Deep Learning
Type: Master's Thesis / Guided Research
Prerequisites:
 Strong machine learning knowledge
 Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
Description:
Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to exhibit natural and meaningful organizations in the data and also to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance and understanding of Deep Learning models.
Contact: Marcel Kollovieh, Bertrand Charpentier
References:
 Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
 Hierarchical Graph Representation Learning with Differentiable Pooling
 Gradient-based Hierarchical Clustering
 Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space