Master's Thesis Timo Kühne
From Documentation to Process Models: An Agentic Approach to Rediscover Causal Processes in Unstructured Data
Abstract
Unstructured clinical documentation contains rich narratives about how a patient’s condition evolves and how clinicians make decisions over time. Yet classical process mining methods need structured event logs with explicit activities, timestamps, and identifiers. This thesis investigates how a multi-agent AI system can reconstruct causal and temporal processes from a single patient’s unstructured records (e.g., clinical notes) and translate them into process-oriented representations that reflect both medical guidelines and real-world clinical practice.
The central idea is to extract an event series for one patient and then decompose this series into subprocesses grounded in distinct causal mechanisms. For example, one segment of the process may correspond to routine diagnostic procedures aligned with standards of practice, whereas another segment may correspond to guideline-defined treatment pathways such as those for lung cancer. By comparing discovered event segments with external medical knowledge, the system aims to differentiate guideline-grounded causal chains from undocumented clinical behaviors.
The thesis adopts the FHIR standard whenever feasible to ensure interoperability and to enable evaluation on clinical texts that already include FHIR-based annotations. These annotations can be used as ground truth by removing them during inference and assessing whether the agentic system can recover the underlying causal-temporal structure. Methodologically, the thesis implements a multi-agent, divide-and-conquer architecture in which specialized agents handle event detection, temporal reasoning, causal inference, knowledge grounding, and model validation. Rather than assuming a fixed pipeline, the work compares alternative coordination patterns and extraction strategies.
The expected contributions are: (i) a literature overview of causal process discovery from unstructured clinical text; (ii) a reference implementation of a multi-agent system capable of reconstructing patient-level causal processes and clustering them into guideline-grounded subprocesses; and (iii) an empirical evaluation on FHIR-compatible datasets that assesses the accuracy, robustness, and practical usefulness of the approach. As future work, the patient-level method could be extended to multi-patient cohorts to reveal population-level deviations from guidelines and support continuous clinical process improvement.
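To make the divide-and-conquer idea in the abstract more concrete, the following minimal sketch chains two hypothetical agents over a shared patient state. It is only an illustration under stated assumptions: the names `Event`, `PatientState`, `detect_events`, `order_events`, and `run_pipeline` are invented here, a keyword lookup stands in for LLM-based extraction, and the sequential chain is just one of the coordination patterns the thesis intends to compare.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    label: str
    day: int  # assumption: note index used as a relative timestamp


@dataclass
class PatientState:
    notes: list[str]
    events: list[Event] = field(default_factory=list)
    # (source label, relation type, target label)
    relations: list[tuple[str, str, str]] = field(default_factory=list)


# Each agent reads the shared state and returns it with its own additions.
Agent = Callable[[PatientState], PatientState]


def detect_events(state: PatientState) -> PatientState:
    # Placeholder for an event-detection agent: a keyword lookup
    # stands in for a model call (hypothetical vocabulary).
    vocabulary = {"ct scan": "CT scan", "biopsy": "biopsy",
                  "chemotherapy": "chemotherapy"}
    for day, note in enumerate(state.notes):
        for keyword, label in vocabulary.items():
            if keyword in note.lower():
                state.events.append(Event(label, day))
    return state


def order_events(state: PatientState) -> PatientState:
    # Placeholder for a temporal-reasoning agent: sort by timestamp and
    # link each event to its successor with a "before" relation.
    state.events.sort(key=lambda e: e.day)
    for earlier, later in zip(state.events, state.events[1:]):
        state.relations.append((earlier.label, "before", later.label))
    return state


def run_pipeline(state: PatientState, agents: list[Agent]) -> PatientState:
    # Sequential coordination: every agent sees the output of the previous one.
    for agent in agents:
        state = agent(state)
    return state


notes = [
    "CT scan of the chest performed.",
    "Biopsy confirmed carcinoma.",
    "Chemotherapy started.",
]
result = run_pipeline(PatientState(notes), [detect_events, order_events])
print(result.relations)
```

Agents for causal inference, knowledge grounding, and model validation would plug into the same `Agent` signature; alternative coordination patterns would replace `run_pipeline` rather than the agents themselves.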
RQ1: How can existing methods for event extraction, temporal reasoning, and causal inference from clinical text be adapted to reconstruct a single patient’s causal process from unstructured documentation?
RQ2: How can a multi-agent, divide-and-conquer architecture reconstruct a single patient’s causal-temporal event series from unstructured documentation and segment it into subprocesses grounded in clinical guidelines and standards of practice?
RQ3: How well does the proposed multi-agent system recover clinically meaningful causal-temporal relations when evaluated against FHIR-annotated ground truth?
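One way the comparison against held-out annotations in RQ3 could be scored is with set-based precision, recall, and F1 over relation triples. The sketch below is an assumption about a possible metric, not the evaluation protocol of the thesis; the function name `relation_scores` and the example triples are hypothetical.

```python
def relation_scores(predicted, gold):
    """Precision, recall, and F1 over exact-match relation triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical triples: gold would come from the removed annotations,
# predicted from the agentic system.
gold = {
    ("CT scan", "before", "biopsy"),
    ("biopsy", "causes", "chemotherapy"),
    ("chemotherapy", "before", "follow-up"),
}
predicted = {
    ("CT scan", "before", "biopsy"),
    ("biopsy", "causes", "chemotherapy"),
    ("CT scan", "causes", "follow-up"),
}
scores = relation_scores(predicted, gold)
print(scores)
```

Exact matching is deliberately strict; a relaxed variant that gives partial credit for the right event pair with the wrong relation type would be a separate design decision.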
Attributes of this Student Project
| Title (de) | From Documentation to Process Models: An Agentic Approach to Rediscover Causal Processes in Unstructured Data |
| Title (en) | From Documentation to Process Models: An Agentic Approach to Rediscover Causal Processes in Unstructured Data |
| Project | |
| Type | Master's Thesis |
| Status | started |
| Student | Timo Kühne |
| Advisor | Jonas Gottal |
| Supervisor | Prof. Dr. Florian Matthes |
| Start Date | 15.12.2025 |
| Sebis Contributor Agreement signed on | No |
| Checklist filled | No |
| Submission date | 15.06.2026 |
| Kick-off presentation slides | |
| Final presentation slides | |
| Thesis PDF | |