In an era where data is exploding in volume and complexity, investigation teams face the challenge of extracting meaningful, actionable insights from vast, diverse sources. Whether an investigation relates to regulatory compliance, fraud or due diligence, the ability to search, summarise, analyse and report efficiently is critical. This is where Large Language Models (LLMs) and Generative AI (GenAI) are revolutionising how we think about investigations and analysis work.
At their core, LLMs are trained on enormous datasets, and possess an incredible ability to understand, generate and interpret natural language. This makes them ideal for supercharging traditional analytical methods across a range of investigative needs. If we walk through the various stages of an investigation, we can see clearly where using AI (in different forms) can not only support, but accelerate, our understanding of data and the generation of insights.
LLMs go beyond keyword-based search. They can understand the semantic meaning of queries, retrieving and ranking results based on intent, context and relevance. These skills have a range of benefits in an investigative context.
For example, a long-standing challenge for investigators is disambiguation - or resolving data to an entity of interest. Because an LLM can consume and interpret contextual information, the search results are themselves contextualised, and those most likely to relate to the entity in question are surfaced first.
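To make this concrete, below is a minimal sketch of embedding-based semantic retrieval in Python. It uses the open-source sentence-transformers library purely as an illustration; the model name, documents and query are invented placeholders rather than part of any particular product.

```python
# A minimal sketch of semantic (embedding-based) retrieval: passages are ranked by
# contextual similarity to the query rather than by keyword overlap.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # compact general-purpose embedding model

documents = [
    "John Smith, a director of Acme Holdings, attended the Zurich meeting.",
    "A John Smith was fined in Leeds for an unrelated traffic offence.",
    "Acme Holdings filed its annual accounts three months late.",
]
query = "John Smith's role at Acme Holdings"

# Encode query and documents into the same vector space, then rank by cosine similarity.
doc_embeddings = model.encode(documents, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, doc_embeddings)[0]

for score, doc in sorted(zip(scores.tolist(), documents), reverse=True):
    print(f"{score:.2f}  {doc}")
```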
Similarly, an LLM can summarise large bodies of information in real time and critically compare these summaries over time. This enables investigators to focus on insights instead of sifting through noise, therefore saving hours of manual review.
Having collected contextualised open-source data, LLMs unlock next-generation analytics that move far beyond traditional natural language processing (NLP) techniques. LLMs can transform raw, unstructured data into structured intelligence, automatically identifying entities including people, locations, organisations, products, dates and more. They can also establish connections between entities, building networks of influence or association.
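As an illustration, the sketch below asks an LLM to return entities and relationships as JSON. It assumes the OpenAI Python client simply because it is widely known; the model name, prompt wording and sample text are placeholders, and any comparable model or library could be substituted.

```python
# A hedged sketch of LLM-based entity and relationship extraction into structured JSON.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

text = (
    "Maria Lopez, CFO of Borealis Trading Ltd, transferred funds to "
    "Northlight Ventures on 12 March 2024."
)

prompt = (
    "Extract the entities (people, organisations, dates) and the relationships "
    "between them from the text below. Respond with JSON using the keys "
    "'entities' and 'relationships'.\n\n" + text
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    response_format={"type": "json_object"},
)

structured = json.loads(response.choices[0].message.content)
print(structured["entities"])
print(structured["relationships"])
```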
With this information in hand, the investigator can use knowledge graphs to represent entities and their relationships, providing a structured and rich context that LLMs can query and traverse to uncover complex relationships and perform multi-hop reasoning.
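A minimal sketch of that idea, assuming the extraction step above has already produced (subject, relation, object) triples, might load them into a graph library such as networkx and traverse it for multi-hop connections; the triples here are invented.

```python
# Load extracted (subject, relation, object) triples into a directed graph and
# answer a multi-hop question by finding a path between two entities of interest.
import networkx as nx

triples = [
    ("Maria Lopez", "officer_of", "Borealis Trading Ltd"),
    ("Borealis Trading Ltd", "paid", "Northlight Ventures"),
    ("Northlight Ventures", "owned_by", "K. Aliyev"),
]

graph = nx.DiGraph()
for subject, relation, obj in triples:
    graph.add_edge(subject, obj, relation=relation)

# Multi-hop question: how is Maria Lopez connected to K. Aliyev?
path = nx.shortest_path(graph, "Maria Lopez", "K. Aliyev")
print(" -> ".join(path))  # Maria Lopez -> Borealis Trading Ltd -> Northlight Ventures -> K. Aliyev
```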
Extracting entities and relationships may be considered table stakes for any analysis solution, but the level of accuracy and sophistication behind an LLM is impressive - particularly when dealing with unstructured content. The same is true of sentiment analysis, where an LLM can readily pick up on nuanced emotional tone in communications, and go further by detecting trends across large volumes of text. While this level of sentiment analysis is powerful when applied to bodies of text, it can reveal even more insights on a network visualisation. A network chart might show us who is connected to (or follows) whom, but we do not always know the true context behind a relationship. Analysing the sentiment behind social posts can provide deeper insight into a relationship or connection, particularly where there is risk to be found.
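One simple way to sketch this in Python is to score each post with an off-the-shelf sentiment classifier and attach the result to the corresponding edge in the network; the handles, posts and the choice of the Hugging Face transformers pipeline are illustrative assumptions.

```python
# Attach sentiment scores to relationships in a small social network so that a
# visualisation can show not just who is connected, but how they interact.
import networkx as nx
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # default English sentiment model

# Each tuple: (author, target, text of a post mentioning the target)
posts = [
    ("@user_a", "@user_b", "Great working with you again, thanks for the support!"),
    ("@user_a", "@user_c", "You never delivered what you promised. Avoid."),
]

graph = nx.DiGraph()
for author, target, text in posts:
    result = sentiment(text)[0]  # e.g. {'label': 'NEGATIVE', 'score': 0.99}
    graph.add_edge(author, target, sentiment=result["label"], confidence=result["score"])

for author, target, data in graph.edges(data=True):
    print(author, "->", target, data)
```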
Clustering, and the detection of patterns and anomalies, are also areas where LLMs can support better analysis in investigations. The ability of an LLM to group data into thematic clusters means it can effectively identify anomalies, outliers or suspicious content. For example, clustering business ownership records might allow us to detect common ownership patterns that potentially indicate shell companies.
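The same pattern can be sketched with nothing more than a grouping of company records: companies sharing a director and registered address form a cluster, and unusually large clusters are worth a closer look. The records and the threshold of three are invented for illustration.

```python
# Surface common-ownership clusters from company records: many otherwise unrelated
# companies sharing one director and address can indicate a possible shell structure.
import pandas as pd

records = pd.DataFrame([
    {"company": "Alpha Ltd",  "director": "J. Smith", "address": "1 High St"},
    {"company": "Beta Ltd",   "director": "J. Smith", "address": "1 High St"},
    {"company": "Gamma Ltd",  "director": "J. Smith", "address": "1 High St"},
    {"company": "Delta GmbH", "director": "A. Weber", "address": "Bahnhofstr. 4"},
])

clusters = records.groupby(["director", "address"])["company"].apply(list)
suspicious = clusters[clusters.apply(len) >= 3]  # flag clusters of three or more companies
print(suspicious)
```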
The real power of these technologies when it comes to investigations is analysing sentiment, topics and clusters of data over time, which can help detect early warning signs or pinpoint previously undetected issues. Of course, a human can do much of this work, but it can be time-consuming and error-prone. An LLM can do all of this in a matter of minutes.
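As a minimal sketch of the time dimension, monthly sentiment scores (produced by any classifier, including an LLM) can be smoothed and checked against a threshold to raise an early-warning flag; the scores, dates and the -0.3 threshold are invented.

```python
# Track average sentiment over time and flag a sustained negative shift.
import pandas as pd

timeline = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-31", "2024-02-29", "2024-03-31", "2024-04-30", "2024-05-31"]),
    "sentiment": [0.4, 0.3, -0.1, -0.4, -0.5],  # e.g. monthly averages from a classifier
}).set_index("date")

timeline["rolling"] = timeline["sentiment"].rolling(window=2).mean()  # smooth month-to-month noise
alerts = timeline[timeline["rolling"] < -0.3]  # sustained drop below the threshold
print(alerts)
```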
For all investigators, the key question is often: can I rely on the information and intelligence that is being surfaced? Ultimately, human judgment is key to successful investigations. Yet we must recognise that LLMs can support the investigator’s workflow and highlight areas where human input is needed. For example, they can rapidly cross-check information across multiple sources and validate the consistency of a narrative or report that might form part of an intelligence picture. In doing this, the LLM will also highlight discrepancies, so an investigator can carry out additional manual analysis and validation. By working hand-in-hand with the investigator, LLMs can enhance investigation impact while preserving critical credibility and accuracy.
We now shift our attention from analysis to action. While LLMs extract, interpret and organise information, Generative AI (GenAI) builds on that intelligence to create new content and support decision-making. Used together, LLMs and GenAI can generate tailored reports based on the same data for different audiences - for example, executive summaries for the C-suite, or detailed briefs for analysts and legal professionals. GenAI ensures consistency while adapting tone, detail and focus to each consumer of the information.
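A hedged sketch of that workflow might loop over the audiences and ask the model to restyle the same findings for each one. As before, the OpenAI Python client, the model name and the findings are assumptions for illustration only.

```python
# Generate audience-specific reports from the same underlying findings.
from openai import OpenAI

client = OpenAI()

findings = (
    "Three companies registered at the same address share a single director, "
    "and payments between them spike at each quarter end."
)

audiences = {
    "executive": "a two-sentence executive summary for the C-suite",
    "analyst": "a detailed brief for analysts, listing the evidence and open questions",
}

for audience, style in audiences.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": f"Write {style} based on these findings:\n{findings}",
        }],
    )
    print(f"--- {audience} ---\n{response.choices[0].message.content}\n")
```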
GenAI can not only put together case timelines, but also be leveraged to automatically document follow-up tasks and actions based on the information that has been found. This is particularly powerful for a junior investigator, where follow-up actions might suggest escalating an investigation to an individual with a particular specialism or gathering more information from a specific source. The speed at which GenAI can summarise content, suggest next steps and present contextualised reports enables the investigator to quickly generate and test hypotheses, ultimately saving them hours.
In this piece, I’ve laid out a host of ways that LLMs and GenAI can be incorporated into investigations - not simply for technology’s sake, but to make a real impact on our ability to solve complex problems quickly.
It’s crucial that we recognise the potential of these technologies, rather than fearing them. Yet, to do so, we first need to understand them. As investigators, we must educate ourselves on the types of AI and their uses so that we can leverage them to get ahead of criminals today.