Machine Learning Model Compression for Embedded Artificial Intelligence
Type: Bachelor, Master
Date: Immediately
Supervisor: Niclas Kannengießer, Kevin Armbruster, Alexander Blatt
Deploying modern machine learning (ML) models on embedded systems (e.g., microcontrollers, edge devices, and Internet-of-Things platforms) poses significant challenges due to the limited computational resources, memory, and energy budgets of such devices. Naively deploying large-scale ML models trained for cloud environments on these small chips leads to poor runtime performance, high latency, or excessive energy consumption. To make ML feasible in constrained environments, software architects and AI engineers require tooling and conceptual frameworks to assess and apply model compression techniques, such as pruning, quantization, knowledge distillation, and neural architecture search. These techniques aim to reduce model complexity while preserving accuracy and operational robustness on embedded hardware. In particular, novel modeling and analysis methods are needed to understand and predict the impact of various compression techniques on performance, accuracy, and hardware efficiency before deployment. Research in this area can provide critical insights to guide the design and selection of appropriate compression strategies tailored to embedded scenarios.

Within the context of enabling embedded ML through model compression, several thesis topics for Bachelor's and Master's students are available. These topics require minimal implementation and focus mainly on theoretical modeling, literature analysis, and conceptual design.
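To give a concrete feel for one of the techniques named above, the following is a minimal sketch of post-training 8-bit affine quantization of a single weight tensor. It is illustrative only: the function names are our own, and real embedded toolchains typically use calibrated, per-channel schemes.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Affine post-training quantization: map floats onto the int8 grid."""
    qmin, qmax = -128, 127
    # Stretch the observed float range [w.min(), w.max()] over [qmin, qmax].
    scale = (w.max() - w.min()) / (qmax - qmin)
    zero_point = int(round(qmin - w.min() / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return scale * (q.astype(np.float32) - zero_point)

if __name__ == "__main__":
    w = np.random.randn(256, 256).astype(np.float32)  # toy weight matrix
    q, scale, zero_point = quantize_int8(w)
    w_hat = dequantize(q, scale, zero_point)
    # int8 storage is 4x smaller than float32; the reconstruction error
    # below is the accuracy side of the size/accuracy trade-off.
    print("mean abs quantization error:", np.abs(w - w_hat).mean())
```

The printed reconstruction error makes the central trade-off of compression tangible: memory savings are bought with a controlled loss of numerical precision. The available thesis topics are listed below.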
- Challenges and Solution Approaches in Model Compression: Identify the compression techniques used in industry for different ML tasks, and explore the challenges practitioners face and how they address them.
- Trade-offs in Model Compression: Uncover trade-offs in model compression, such as between accuracy, energy consumption, inference time, and model size. The trade-offs can be conceptualized and/or empirically investigated.
- Guidelines for Selection of Compression Techniques: Propose a set of guidelines that help select compression techniques for ML models, considering the target hardware. Identify key influencing factors (e.g., model size, candidate compression techniques, and hardware constraints) that drive this selection.
- On-Device Training: Uncover techniques for efficient on-device training, such as quantization, low-resource backpropagation, and efficient memory usage, as well as techniques for preventing catastrophic forgetting and for automated training monitoring.
- Federated Learning/Split Learning for Embedded Devices: Lay out efficient federated learning strategies for embedded-device-to-server connections. The focus is on data protection, overcoming hardware restrictions, and outlining suitable network architectures and training strategies (a minimal aggregation sketch follows this list).
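To make the federated setting of the last topic concrete, here is a minimal sketch of federated averaging (FedAvg) over simulated clients. The toy least-squares model, the client data, and all function names are illustrative assumptions, not a prescribed design.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """One client's local training: plain gradient descent on least squares."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg_round(w_global, clients):
    """Server step: average client models, weighted by local sample counts."""
    n_total = sum(len(y) for _, y in clients)
    updates = [local_sgd(w_global.copy(), X, y) * (len(y) / n_total)
               for X, y in clients]
    return np.sum(updates, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = np.array([2.0, -1.0, 0.5])
    # Three simulated embedded clients, each holding a private local dataset.
    clients = []
    for n in (40, 60, 100):
        X = rng.normal(size=(n, 3))
        clients.append((X, X @ w_true + 0.01 * rng.normal(size=n)))
    w = np.zeros(3)
    for _ in range(20):  # communication rounds
        w = fedavg_round(w, clients)
    print("recovered weights:", np.round(w, 2))
```

Note that only model weights cross the device-to-server connection; the raw data never leaves the simulated clients, which is the data-protection property this topic investigates.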
If you are interested in embedded systems, machine learning, and software architecture, these thesis topics are a great opportunity to contribute to a rapidly evolving field with growing industrial relevance.
- Practical Impact: Your work will support engineers in designing deployable and efficient ML models for embedded devices.
- Research-Driven Design: Gain experience in concept development, trade-off analysis, and theoretical modeling without the need to implement or train ML models.
- Cutting-Edge Topic: Embedded ML is a fast-growing area of great relevance for autonomous systems, IoT, and wearable devices.

Interested in one of the topics, or do you have your own ideas related to embedded AI and model compression? Do not hesitate to reach out to us. In cooperation with AITAD, we offer supervision from both industry and research.
Contact
Niclas Kannengießer (niclas.kannengiesser@tum.de)
Kevin Armbruster (kevin.armbruster@tum.de)
Alexander Blatt (a.blatt@aitad.de)
Recommended Readings:
- Jin, D., Kannengießer, N., Rank, S., & Sunyaev, A. (2024). Collaborative distributed machine learning. ACM Computing Surveys, 57(4), 1–36. https://doi.org/10.1145/3704807
- Thapa, C., Mahawaga Arachchige, P. C., Camtepe, S., & Sun, L. (2022). SplitFed: When federated learning meets split learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), 8485–8493. https://doi.org/10.1609/aaai.v36i8.20825
- Zhu, X., Li, J., Liu, Y., Ma, C., & Wang, W. (2024). A survey on model compression for large language models. Transactions of the Association for Computational Linguistics, 12, 1556–1577. https://doi.org/10.1162/tacl_a_00704