Jonna Matthiesen

Deep Learning Researcher
Embedl
Room: Congress Hall, Room H1/H2
Time: To be released, 15:00
Theme: To be released, LLM Compression
Difficulty: D2
Evaluating Retrieval-Augmented Generation Systems: Challenges and Practices

Generative AI, particularly Large Language Models (LLMs) and Transformers, is rapidly reshaping applications by enabling real-time situational awareness, autonomous decision-making, and efficient data processing. However, conventional cloud-based deployment often proves challenging due to connectivity constraints, cybersecurity risks, and high operational costs.

This talk outlines a systematic approach to deploying advanced generative AI models directly on resource-constrained devices—such as autonomous vehicles, drones, and IoT devices—to ensure autonomy, security, and real-time performance.

We review state-of-the-art systems-on-chip (SoCs) from leading manufacturers, examining their capabilities and limitations for executing generative AI workloads. To overcome hardware constraints, we explore model compression techniques such as quantization, pruning, and knowledge distillation, and discuss the trade-offs between maintaining high-fidelity performance and achieving practical, real-world viability.

Drawing on a recent real-world case study, we present results demonstrating how knowledge distillation and targeted fine-tuning enable small language models (SLMs), such as Meta’s Llama 3.2, to run on Qualcomm Snapdragon SoCs without requiring massive computational resources.

We will share the engineering techniques and optimizations necessary to adapt next-generation models for resource-constrained environments.

Attendees will learn how to:

Bio

Jonna Matthiesen is a deep learning researcher specializing in AI optimization for defense, automotive, and IoT applications. With expertise in hardware-aware neural architecture search, model compression, and inference optimization, she focuses on making large language models (LLMs) deployable in resource-constrained environments such as embedded systems and edge devices. She holds a bachelor’s degree in Mathematics from Kiel University, Germany, and a master’s degree in Applied Data Science from Gothenburg University, Sweden. In 2023, Jonna joined Embedl, a company dedicated to efficient deep learning in automotive, defense, and IoT.
