How reliable is Open Source Intelligence?

Written by Blackdot Solutions

Introduction

Is OSINT reliable? It’s largely drawn from unregulated, biased sources – so can organisations really trust the insights it provides?

Today, OSINT investigators draw much of their data from online sources. However, the internet is increasingly saturated with disinformation. Anyone can post anything online, regardless of whether it’s true. Because of this, many organisations assume OSINT is unreliable and opt to use curated databases instead.

But is this assumption true? Could researchers actually be missing vital insights without OSINT?

In this article, we explore this assumption and whether there’s any truth to it. Read on to learn why investigators should use a range of internet data, rather than relying solely on curated databases.

Why OSINT is reliable

Scale of reliability

It’s important to remember that all sources exist on a scale of reliability. Publicly available internet data isn’t all inherently unreliable. Likewise, investigators can’t assume information is reliable just because it doesn’t come from open sources.

In a banking context, for example, internal data within Know Your Customer (KYC) files isn’t always accurate. Recent enforcement orders have shown that KYC processes are subject to human error. Sometimes, poorly trained or overworked staff may collect inaccurate information or not take the appropriate steps to verify it. In certain cases, the information collected might not even meet regulations, making it even less reliable.

More widely, even commonly trusted official corporate records databases are notorious for inaccuracies. National corporate registries are publicly available in many countries, such as the UK and Australia, where they form an important part of any due diligence or corporate investigation. Despite this, corporate records contain many errors. What’s more, in countries where the data is self-reported, like the UK, it may be completely untrue or even fraudulent.

Of course, these sources do also contain reliable and accurate information. However, it’s clear that information can’t be neatly and confidently categorised as ‘reliable’ or ‘unreliable’ based solely on its source. All information exists on a scale of reliability and must be taken in context and weighed against other findings.

Therefore, it’s natural to ask ‘is OSINT reliable?’, since you should be assessing the reliability of any source you encounter.

Quality public data

Open source data is a very broad category, and it’s not particularly helpful to paint it all with the same brush. Within the OSINT world are many high-quality, reliable sources that investigators and researchers should consider.

In certain jurisdictions like the United States, official records like court judgments and campaign contribution information are readily available through government websites. In some states, marital records, criminal records, and bankruptcy records can also be accessed. These are official sources managed by a government body, as well as being publicly accessible data.

In addition to government listings, quality media publications are generally among the more trustworthy sources of information. Journalists who report and write for respectable publications must abide by the ethical rules of journalism and meet high standards before publishing. This means information from these outlets is usually high quality. However, it’s always important to assess potential bias for yourself – as we’ve seen, even trustworthy ‘official’ sources can be fallible.

Well-researched news stories are useful as they can point to emerging scandals, violations, or reputational issues. For example, the Financial Times helped expose the now-insolvent electronic payment processor and financial services provider Wirecard for fraud well before the German regulator – the Federal Financial Supervisory Authority (BaFin) – took notice. Quality investigative journalism also appears beyond the publications’ own websites, from social media posts to high-standard, well-sourced leak investigations such as those from the International Consortium of Investigative Journalists.

Examples of high-quality open sources include:

Court judgments
Criminal records
Marital records
Bankruptcy records
Trademark or patent registrations
Campaign contributions
Corporate records
Trade names
Well-respected media publications (e.g. Financial Times, New York Times, Wall Street Journal)
Leaked databases or investigations from reputable sources (e.g. Pandora Papers, Panama Papers)

The investigative jigsaw

Even less reliable open sources can form an essential part of your investigative process, as long as any information discovered is considered in context. Here, it’s important to remember that open source data isn’t the same thing as open source intelligence. Whilst one individual piece of data might seem unreliable, a good investigator can often join it up with other information to draw reliable intelligence. The answer to ‘is OSINT reliable?’ largely depends on the investigator’s expertise in dealing with open source data.

For example, you might find a social media profile of your subject that indicates a lavish lifestyle which contrasts with existing internal information. Although the social media profile is not a reliable source of information on its own, it can back up a suspicion and lead to a targeted investigation of your subject’s source of wealth. Similarly, information from less reliable sources can help guide you on where to look next during your research. Blog posts or forum discussions that indicate your subject has dealings with a company in a particular jurisdiction may prompt you to check the official registry and follow the corporate record trail.

Unique insights in unreliable sources

Less regulated or ‘unreliable’ data sources can often reveal crucial information not available through other channels. Criminals wanting to evade detection are hardly going to leave obvious traces in official places. However, they will likely make use of the internet. This means that OSINT is essential to forming a fuller picture of criminal activity.

Likewise, OSINT can uncover potential allegations or scandals before convictions or official rulings, signalling reputational risk. While claims might be unsubstantiated, it’s essential to consider the findings in context before dismissing them.

To make the most of these sources, here are some things you can ask to assess reliability:

Does this information support or conflict with what I already know?
How can I verify this claim (or parts of this claim) with a more official or reputable source?
How does this information fit into the current narrative of my subject?
Is this information time-stamped? If so, how does it fit into my subject’s known timeline?
Does this information correlate with any hunch or existing suspicion?
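As a rough illustration of the time-stamp question above, the following minimal Python sketch checks a dated finding against a subject’s known timeline. The names `Finding`, `TimelineEntry` and `timeline_context` are hypothetical and chosen purely for illustration; they do not describe any particular tool, and a real investigation would need far richer context than a date comparison.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Finding:
    """A single piece of open source data (hypothetical structure)."""
    claim: str
    source: str
    observed_on: Optional[date]  # None if the information is not time-stamped


@dataclass
class TimelineEntry:
    """A known period in the subject's history (hypothetical structure)."""
    start: date
    end: date
    description: str


def timeline_context(finding: Finding, timeline: List[TimelineEntry]) -> List[TimelineEntry]:
    """Return the known timeline entries that cover the finding's date.

    An undated finding cannot be placed on the timeline, so it returns
    an empty list and needs verifying another way.
    """
    if finding.observed_on is None:
        return []
    return [e for e in timeline if e.start <= finding.observed_on <= e.end]
```

A finding that matches no timeline entry is not necessarily false; it may simply point to a gap in what is already known about the subject.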

Forming suspicions

In the same way that OSINT findings can lead to your next clue, they help corroborate suspicions and form your overall assessment. Any incriminating online information about your subject should play a role in the investigatory process. For instance, public forums or less mainstream publications may contain personal opinions which are less reliable than other sources, but they can still raise or confirm suspicions about potential nefarious activity which you should be aware of.

It’s worth remembering that unlike prosecutors, private companies should act on and report suspicions rather than prove things beyond a reasonable doubt. OSINT can give you enough information to form evidence-based judgments, whether that involves combining internal sources with government sources, social media findings, or online publications. Regulators expect regulated institutions to take immediate action when holding suspicions, not wait for indisputable proof.

Practical guidelines for assessing reliability

So, is OSINT reliable?

As we’ve seen, this can only be answered on a case-by-case basis. A good OSINT investigator’s work can certainly be reliable. Such an investigator is familiar with open source investigation best practices and knows that all sources exist on a scale of reliability. Additionally, they’ll know how different sources can be combined for reliable outcomes. Here are some practical guidelines for assessing the reliability of open sources, to help you lead reliable OSINT investigations:

Analyse the source. Does the source seem obviously partisan, biased, or politically motivated (e.g. a state-controlled media outlet like Russia Today)? For media sources, is there information on journalistic best practices or fact-checking?
Assess the source’s historical accuracy. Is it a longstanding publication with a good track record (e.g. the Financial Times) or a relatively new and unproven publication?
Cross-reference findings with other sources. Targeted searches may reveal that your new findings are also reported by a more reliable source.
Use technology. Many tools speed up research and improve investigations by automatically differentiating between reliable and less reliable sources and by looking for information shared across several sources. This allows analysts to investigate more efficiently and accurately.
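To make the cross-referencing guideline more concrete, here is a minimal Python sketch of how findings might be grouped by claim to see how many distinct sources report each one. It is an illustration under stated assumptions, not a description of how Videris or any other tool works: the `Observation` structure, the `corroboration` function and the 0.0 to 1.0 reliability rating (assigned by the analyst) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Observation:
    """One source reporting one claim (hypothetical structure)."""
    claim: str          # the statement, normalised by the analyst
    source: str         # where it was found
    reliability: float  # analyst-assigned rating from 0.0 (low) to 1.0 (high)


def corroboration(observations: Iterable[Observation]) -> Dict[str, dict]:
    """Summarise, per claim, how many distinct sources report it and the
    highest reliability rating among those sources."""
    by_claim: Dict[str, Dict[str, float]] = defaultdict(dict)
    for obs in observations:
        # Count each source once per claim, keeping its highest rating,
        # so repeated posts from one source do not look like corroboration.
        previous = by_claim[obs.claim].get(obs.source, 0.0)
        by_claim[obs.claim][obs.source] = max(previous, obs.reliability)
    return {
        claim: {"sources": len(sources), "best_reliability": max(sources.values())}
        for claim, sources in by_claim.items()
    }
```

A claim reported only by a single low-rated source is a prompt for a targeted search rather than a conclusion, while one that also appears in a higher-rated source carries more weight in the overall assessment. Note that this sketch treats every named source as separate; it cannot tell when several sources are simply repeating one another.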

How Videris can help

OSINT tools like Blackdot’s Videris help investigators streamline their research and investigation in a single interface. It allows investigators to search across multiple disparate data sources (e.g. search engines, news sites, social media and corporate records) to quickly identify relevant information on their subject.

With capabilities that screen high data volumes and rank sources for relevance, investigators can speed up their processes and improve investigation outcomes.

Book a demo today.
