Jobs / Theses

For any application, please write to jobs-gagneurlab(at)in.tum.de and include your CV, your transcripts if you are still studying, a publication list if you have one, and a brief yet convincing motivation statement relating your interests to specific research topics of ours (either from our previous publications or from the projects described here). Please also state your preferred start and end dates. If possible, provide evidence of your programming skills (e.g. GitHub repositories, or by sending shareable code). Our team is international, so please apply in English.

Postdoc

Check out the TUM Global Postdoc Fellowship. It offers up to 10 international young scientists the opportunity to conduct research with a host at TUM for up to 2 years within the framework of the fellowship and thus continue their career at TUM. The deadline is at the end of March each year. Check the requirements and reach out to us at jobs-gagneurlab(at)in.tum.de (same documents as above, plus a project proposal).

PhD theses

Munich Center for Machine Learning (MCML)

As part of the MCML, we are looking for a PhD student to work on either the "Evolutionary Regulatory Genomics" or the "Leveraging genetic interactions for phenotype prediction" project (see below). Apply through the MCML website. Mind the deadlines.

Graduate School Munich School of Data Science (MuDS)

We are a core lab of MuDS, a graduate school to promote Data Science and its application. MuDS offers joint projects for PhD students, each designed by two partners – a domain-specific application partner and a methodological partner. This ensures that candidates receive methodological as well as application-specific training. Check the MuDS website for the application schedule.

Graduate School of Quantitative Biosciences Munich (QBM)

We are proud members of QBM, a graduate school funded by the German Excellence Initiative to promote quantitative biosciences. Students selected by QBM will get their own stipend, receive extra training from the graduate school, and conduct interdisciplinary research under the direction of two labs with complementary expertise. Check the QBM website for the application schedule and specific projects.

IDP, Guided research, Bachelor, and Master theses

We are constantly looking for highly motivated students in bioinformatics, physics and/or applied mathematics for projects ranging from IDPs, which we can offer given our secondary affiliation in Medicine, to Master theses. Quantitative minds with a strong interest in biology, or biologists with computational skills who are eager to understand biology at the genome level, will fit our team. See this video for an overview of our research and projects: https://tinyurl.com/y7betk96.

Open projects

Apply for the project you find most appealing. The objectives of most projects can be adjusted to fit an IDP, a Guided research, a Bachelor thesis or a Master thesis. You will typically be mentored by a PhD student or Postdoc working on the topic.

Multi-tissue modeling of gene expression from DNA sequence by deep learning
The goal of this master thesis is to develop deep learning models that capture the regulatory code and its modulation across tissues. The focus will be on modeling human promoters.
Prediction of cancer driver genes
The goal of this master thesis is to develop novel machine learning models integrating DNA and RNA-sequencing data to identify novel cancer driver genes. You will leverage a unique dataset of 5,000 genomes and transcriptomes from our collaborator MLL (https://www.mll.com/en/science/5000-genome-project.html).
Algorithms for the detection of aberrant expression as causes of rare disease
The goal of these two companion master theses is to develop algorithms to identify aberrant gene expression events in omics datasets. The first thesis focuses on single-cell RNA-sequencing data. The challenges are the very high dimensionality of the data and the difficult nature of the noise (low counts). The second thesis focuses on proteomics data, where the difficulty lies in handling data that are "missing not at random". The methods are applied to identify causes of rare diseases. Techniques include machine learning (autoencoders) and statistical modeling.
When the outlier is the signal: Integrative multi-omics analysis of Amyotrophic lateral sclerosis (ALS)
Our lab won the Kaggle Challenge "End ALS" by combining AI methods for outlier detection and gene network analysis, yielding potential new genes involved in amyotrophic lateral sclerosis. See Task 1 at https://www.kaggle.com/alsgroup/end-als/discussion/242637.
In this master thesis, you will analyse the complete AnswerALS dataset (https://www.answerals.org/) to discover new potential targets and develop an integrative, reproducible pipeline.
Gene activation analysis
We have already been successful in identifying outliers in gene expression. Our method, OUTRIDER, requires that the gene is expressed in most of the samples and fails for genes with very low (or null) expression, which we usually discard. Nevertheless, it is well established that activation of proto-oncogenes drives carcinogenesis. The goal of this thesis is to develop a statistical method to properly capture cases of gene activation. You will leverage a unique dataset of 5,000 leukemia transcriptomes from our collaborator MLL (https://www.mll.com/en/science/5000-genome-project.html).
De novo peptide sequencing
Highly accurate de novo peptide sequencing (DNPS), i.e. determining peptide amino acid sequences solely from tandem mass spectra, will make proteomics amenable to applications including genotyping, cancer surveillance, pathogen surveillance, immuno-oncology, metagenomics, and paleogenomics. We are investigating innovative AI approaches to DNPS. Master theses on this topic are possible with the computational mass spectrometry team. More details here: https://www.mdsi.tum.de/en/mdsi/research/funding-support/seed-funds/dl4dnps/

Post-transcriptional regulation of gene expression

The process of gene expression from DNA to proteins is extremely complex and intricate. So far, most efforts have focused on transcription, but RNA levels alone can only explain ~50% of the variability of protein levels, which indicates that post-transcriptional regulatory mechanisms remain unknown. The goal of this thesis is to develop machine learning models that leverage ribosome profiling and proteomics datasets in order to better understand the process of protein synthesis.
Evolutionary Regulatory Genomics

We are working towards self-supervised models leveraging large sequence datasets spanning hundreds of millions of years of evolution to study gene regulation. Applications include modeling of gene regulatory code from RNA synthesis to protein abundance. Projects may involve improving or interpreting current models or designing and implementing new approaches. Relevant publication: Species-aware DNA language models capture regulatory elements and their evolution

Thus, we are looking for students who have either:
- bioinformatics experience or applied deep learning skills (ideally both).
- deep experience in CUDA or Triton programming and deep learning background, as we are seeking to adapt and improve approaches like FlashAttention (https://github.com/Dao-AILab/flash-attention) to models in genomics.
Leveraging genetic interactions for phenotype prediction

Individual traits result from combinatorial gene interactions within pathways. Genetic variations in some genes can significantly alter these traits, posing the challenge of deciphering this complex map to identify responsible genes. The goal of this project is to develop novel machine learning models that can leverage gene function information and capture gene interactions, to predict changes in phenotype and identify the genes causing them. This project involves working with large-scale genome sequencing data sets, such as the UK Biobank.

Relevant publication: DeepRVAT, a neural network-based approach that learns burden scores from rare variants, annotations, and phenotypes

Open research assistant positions (HiWi)

Working as a research assistant on a mini-job basis (between 8 and 20 hours a week) is an awesome way to develop your skills and contribute to the field very early in your career (after a couple of study semesters). It also lets you and our team get to know each other, and you can start making progress on your future thesis.

We are looking for a HiWi to join our oncology team for variant annotation and cancer driver gene prediction using large datasets. You will learn how to work with reproducible workflows, machine learning models, and big data.
Further projects and theses can be defined according to the skills and interests of the applicants. Do not hesitate to contact us!

Further bioinformatics projects in Munich

A spin-off of our lab, OmicsDiscoveries, is offering 3 IDPs for (bio)informatics students and 1 project study for administration students. Please apply using the instructions in the respective documents and not to jobs-gagneurlab@in.tum.de:

1. Creating Sashimi plots in the cloud

2. Prototyping an interactive web interface to analyse RNA-seq data

3. Automating the cloud infrastructure deployment of an RNA-seq pipeline

Check the Bioinformatik-muenchen website.

Check the Institute of Computational Biology website from the Helmholtz Zentrum München.

Informatics 29 Computational Molecular Medicine

Technische Universität München

Prof. Dr. Julien Gagneur