Master's thesis presentation. Thomas is advised by Santiago Narvaez Rivas, Katharina Donauer (BMW Group) and Prof. Dr. Hans-Joachim Bungartz.
Thomas Beranek: AI-Assisted Scheduling for ADAS Traffic Simulations
This master's thesis designs, implements, and evaluates a cloud-based architecture with artificial intelligence (AI) assisted scheduling for cost-efficiently running advanced driver assistance system (ADAS) traffic simulations with the open-source openPASS simulator. The development of ADAS and other automated driving features requires thorough safety testing, which cannot be achieved by on-the-road testing alone, and issues should ideally be discovered as early as possible in the development process. Large-scale traffic simulations test these features in virtual vehicle models over millions to billions of kilometers at a fraction of the time and cost of on-the-road testing. The compute resources required are unknown for new simulations, unevenly distributed, and peak unexpectedly with demand, so cloud computing combined with batch processing is increasingly preferred for running these simulations. Resource requirements for batch jobs are specified at submission time, and charges are based on these specified requirements rather than on actual usage.

We introduce a resource usage prediction approach that combines linear regression models with a vector database built on the Amazon Titan Text Embeddings V2 model to predict the memory usage and execution time of new openPASS simulations. Our scheduling approach uses these AI-assisted predictions to combine simulations into batch jobs of optimal size and to select the right compute environment before submitting them to AWS Batch.

Using only the 64 example configurations supplied with openPASS and a 50% train/test split, the combined approach achieved a mean absolute percentage error (MAPE) of 7.4% for memory usage and 74.5% for simulation duration. Further analysis revealed a correlation between prediction accuracy and the matching score of the vector database results, which increases as the database fills. From a matching score of 0.85, which applies to the majority of the test dataset, we obtained a MAPE of 1.8% for memory usage and 21.7% for simulation duration. Integrated into our architecture, the simulation time of complete batch jobs exceeded the target by at most 13.3%, and the absolute percentage error including incomplete jobs never exceeded 17.4%. Due to the minimum memory requirements of AWS Batch compute environments, the memory usage predictions were not applicable in our implementation.
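The abstract describes the combined prediction approach only at a high level. As a rough illustration, the following Python sketch shows one possible shape of such a predictor; the class and function names, the choice of numeric features, the cosine-similarity lookup, and the 0.85-threshold fallback logic are assumptions for illustration, not the thesis implementation.

```python
# Sketch only: a combined resource predictor in the spirit of the abstract.
# Function names, features, and the fallback logic are illustrative assumptions.
import json
from dataclasses import dataclass

import boto3
import numpy as np
from sklearn.linear_model import LinearRegression

bedrock = boto3.client("bedrock-runtime")  # assumes AWS credentials and region are configured


def embed(config_text: str) -> np.ndarray:
    """Embed a textual description of a simulation configuration with Titan Text Embeddings V2."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": config_text}),
    )
    return np.array(json.loads(response["body"].read())["embedding"])


@dataclass
class KnownRun:
    """A previously executed simulation with its measured resource usage."""
    embedding: np.ndarray
    memory_mib: float
    duration_s: float


class ResourcePredictor:
    """Nearest-neighbour lookup over past runs, with linear regression as a fallback."""

    def __init__(self, known_runs: list[KnownRun], features: np.ndarray,
                 memory_mib: np.ndarray, duration_s: np.ndarray):
        self.known_runs = known_runs
        self.memory_model = LinearRegression().fit(features, memory_mib)
        self.duration_model = LinearRegression().fit(features, duration_s)

    def predict(self, config_text: str, numeric_features: np.ndarray,
                min_score: float = 0.85) -> tuple[float, float]:
        if self.known_runs:
            query = embed(config_text)
            # Cosine similarity stands in for the vector database's matching score.
            scores = [float(query @ run.embedding /
                            (np.linalg.norm(query) * np.linalg.norm(run.embedding)))
                      for run in self.known_runs]
            best = int(np.argmax(scores))
            if scores[best] >= min_score:
                # Confident match: reuse the measurements of the most similar past simulation.
                return self.known_runs[best].memory_mib, self.known_runs[best].duration_s
        # Otherwise fall back to the linear regression models over numeric features.
        x = numeric_features.reshape(1, -1)
        return float(self.memory_model.predict(x)[0]), float(self.duration_model.predict(x)[0])
```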
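The predicted requirements then have to be attached to a batch job at submission. The sketch below shows how such a submission to AWS Batch could look; the job queue and job definition names and the 4096 MiB memory floor are placeholder assumptions, with the floor standing in for the compute-environment minimum mentioned at the end of the abstract.

```python
# Sketch only: submitting one simulation batch job with predicted resources to AWS Batch.
# The job queue, job definition, and memory floor are placeholders, not the thesis setup.
import math

import boto3

batch = boto3.client("batch")


def submit_simulation_job(job_name: str, predicted_memory_mib: float,
                          vcpus: int = 2, min_memory_mib: int = 4096) -> str:
    """Submit an openPASS simulation job, respecting the compute environment's memory floor."""
    # Charges follow the requested resources, so we request the predicted memory,
    # rounded up to the environment's minimum (which can dominate small predictions).
    memory_mib = max(min_memory_mib, math.ceil(predicted_memory_mib))
    response = batch.submit_job(
        jobName=job_name,
        jobQueue="openpass-simulation-queue",        # placeholder name
        jobDefinition="openpass-simulation-jobdef",  # placeholder name
        containerOverrides={
            "resourceRequirements": [
                {"type": "MEMORY", "value": str(memory_mib)},
                {"type": "VCPU", "value": str(vcpus)},
            ]
        },
    )
    return response["jobId"]
```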