Retrieval Augmented Generation for Threat Intelligence
Recent advances in generative AI and Large Language Models (LLMs) have changed how the world thinks about AI and which problems it can solve. LLMs have been positioned as an interface to data and knowledge in many shapes and forms, making that knowledge queryable in natural language. In practice, however, this comes with several critical challenges: How can you leverage real-time data and not just what the LLM was trained on? How can you make the answers reliable? How can you make such a system fast enough? In this talk, we will dive into how we at Recorded Future have built a retrieval augmented generation (RAG) system that uses LLMs as an interface to our threat intelligence graph. The end goal of this system is for a user to get reliable, fact-based answers to threat intelligence questions posed in natural language. In particular, we will discuss how we approached the fundamental questions above.
With a mission to turn state-of-the-art ML methods and research into useful things, Aron has spent the last 10 years building ML and AI products. Aron has a PhD in mathematics and has worked in various fields, including the life sciences and the financial industry. Currently, he works at the intersection of NLP, generative AI, graphs, and AI engineering, with the goal of automating the threat intelligence lifecycle for cyber and geopolitical security.