Testing the boundaries of OSINT technology: The journey to automation

In my early career, I was fortunate enough to call myself an investigator. Drawn to the concept of pulling on threads and mapping out networks, I spent my waking hours connecting people, objects, locations and events to uncover a bigger picture. Yet early on, one thing became abundantly clear: many of the initial steps of an investigation - from collecting and securing data, to filtering and refining to increase its relevance and remove noise - were repetitive, time-consuming, and sometimes error-prone.

These steps are the foundation of an investigation, and it is important to get them right. Learning to carry them out properly formed an important part of my development as an investigator, but, as I grew in experience, I realised that they weren’t always the best use of my time or expertise. They took my attention away from the higher-value, complex investigative work where my experience counted most. I needed to spend more of my time connecting the dots, detecting patterns in datasets, testing hypotheses, and, ultimately, generating insights.

The argument for automation

An investigation, at its core, is a systematic process of collecting, verifying, and examining data or information to resolve a question, test a hypothesis, establish facts, or uncover patterns. While every investigation has unique aspects, the foundational steps, such as data collection, triage, filtering, and timeline analysis, are surprisingly consistent across cases.

One motivation for my switch from being an investigator to building investigations technology was to explore whether these early investigative steps could be automated - and to what extent that automation would improve the overall effectiveness of investigation teams. I was searching for a collaboration between human and machine, following a hypothesis I firmly believe in: that more than 70% of investigative steps should be automated so that we can spend more time investigating and less time preparing.

Understanding the 70%

To many seasoned investigators, this hypothesis might seem like a big claim. To justify it, let me share some examples. Take a sanctions evasion investigation. These cases typically relate to an entity: a person, a company, or perhaps a vessel, an aircraft or a financial asset such as a crypto wallet. With this scope, data collection will almost always draw on the same sources, including corporate records, ownership data, adverse news, sanctions data and legal records. We will seek to use this information to understand the immediate network around a target entity. In this early stage, we are only collecting data and conducting some basic analysis, preparing (hopefully highly relevant) information for further investigation. These initial steps can be automated so that the next steps, which do require human judgment, can be performed more effectively, making use of a richer and more refined data set.
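
The early steps described above (collect from a fixed set of sources, then triage the results) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the `Record` type, the collector functions, the entity name and the sample rows are all hypothetical placeholders, and a real collector would query an actual data source rather than return hard-coded rows.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Record:
    """One collected item. Hypothetical shape, for illustration only."""
    source: str   # e.g. "corporate_records", "adverse_news"
    entity: str   # entity name as it appears in the source
    summary: str
    when: date


# A collector is any function that returns records for a target entity.
Collector = Callable[[str], Iterable[Record]]


def collect(target: str, collectors: List[Collector]) -> List[Record]:
    """Run every collector for the target and pool the results."""
    pooled: List[Record] = []
    for collector in collectors:
        pooled.extend(collector(target))
    return pooled


def triage(records: Iterable[Record], target: str) -> List[Record]:
    """Drop exact duplicates and records that never mention the target,
    then order what is left by date to give a basic timeline."""
    needle = target.lower()
    seen = set()
    kept = []
    for record in records:
        if record in seen:
            continue
        seen.add(record)
        if needle in record.entity.lower() or needle in record.summary.lower():
            kept.append(record)
    return sorted(kept, key=lambda r: r.when)


# Placeholder collectors with made-up rows (assumption: stand-ins for real sources).
def corporate_records(target: str) -> List[Record]:
    return [Record("corporate_records", target, "Director appointed", date(2021, 3, 1))]


def adverse_news(target: str) -> List[Record]:
    return [
        Record("adverse_news", target, "Named in press report", date(2020, 6, 15)),
        Record("adverse_news", target, "Named in press report", date(2020, 6, 15)),  # exact duplicate
        Record("adverse_news", "Unrelated Co", "Noise from a loose search", date(2022, 1, 1)),
    ]


timeline = triage(
    collect("Example Shipping Ltd", [corporate_records, adverse_news]),
    "Example Shipping Ltd",
)
for record in timeline:
    print(record.when, record.source, record.summary)
```

The judgment calls stay with the investigator: which collectors to run, and whether a simple name match is strict enough for the case. The matching here is a naive substring test and does not attempt identity resolution.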

The same is true of many other use cases. Take the sale of counterfeit goods: here, we need to collect marketplace and forum data, dark web data and publicly available social network content to identify the goods being sold and the individuals behind them. Again, the investigator can direct and influence data collection, but automation technologies mean that there is theoretically no need for human intervention to collect and present the data in a digestible form. If this initial stage is automated, investigators can spend more time making decisions, using their experience to analyse the case and formulate conclusions about it.

The art of the (im)possible

Despite my drive to explore this type of automation, the reality was that much of the technology was too rigid, too shallow, too expensive or lacked the nuance required to navigate the real-world messiness that investigators must deal with. Problems with identity resolution, a lack of confidence in data collection at scale and an inability to tie all the investigative workstreams together in a single workflow and investigative view all proved significant blockers. Whilst I remained unwilling to compromise on investigation quality, investigative automation was more promise than practice.

That is, until today. Thanks to breakthroughs in AI, large language models (LLMs), agentic automation, and graph-based reasoning, we finally have systems that can work alongside investigators - not just as tools, but as collaborative partners. The growth of AI is unavoidable: it is a global market valued at USD 279 billion in 2024 and projected to reach USD 1,811 billion by 2030 (a CAGR of 35.9%). By contrast, the global investigations and security market is expected to grow from USD 538 billion in 2024 at a CAGR of just 8.6% over the same period.

AI is changing the game – is the OSINT community ready?

There is no disputing that the progress of technology has accelerated and created opportunities for investigators. However, the purpose of this series is not to suggest we should replace investigators with AI.

Instead, I want to spotlight the importance of the topic and provoke discussion in this area. In subsequent articles, I will map out further opportunities with AI technology and highlight where we have gaps. As an OSINT community, we need to work together to take control of this technology wave. I hope to provide a forum for us to do just that.
