Much like dark matter in physics, dark data makes up much of the universe of information assets at most organisations, and marketers need to tap it.
Today, organisations realise they have a big data problem, but they don’t know just how big it is. Every day, organisations across the world collect data from various sources and use it to understand customer behaviour and grow their business.
For example, when a user clicks on a link on a website, data, in the form of cookies, is generated. This data is stored and later analysed to determine that user’s browsing behaviour and buying patterns.
7.5 sextillion gigabytes of data is generated worldwide every single day.
Data falls into two categories:
- Structured: data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets. Companies use structured data to build customer behaviour models and understand buying patterns.
- Unstructured: data that does not fit any category in a company’s database. For example, a customer’s complaint about a product that is collected as information but cannot be processed.
Unstructured data that is collected but never processed is dark data.
According to the International Data Corporation (IDC), 10% of digital data is structured while the remaining 90% is dark data.
How is Dark Data Generated?
Dark data is typically generated when companies don’t know what to do with all the data they have collected. According to Lucidworks, data goes dark when companies face the following problems:
- 39% of company data is not analysed
- 25% of companies can only access structured data sets
- 13% of data analysis tools cannot determine the meaning of the data collected
Dark data usually comes from the following information:
- Customer Information
- Log Files
- Previous Employee Information
- Raw Survey Data
- Financial Statements
- Email Correspondences
- Account Information
- Notes or Presentations
- Old Versions of Relevant Documents
Why Does it Matter?
IDC projects that organisations that analyse all relevant data and deliver actionable information will achieve an extra $430 billion in productivity gains over their less analytically oriented peers by 2020. Leaving this data unanalysed and unprocessed also exposes companies to the risk of a data breach. According to the 2018 Cost of a Data Breach Study, conducted by the Ponemon Institute with IBM, the average global cost of a data breach involving 2,500 to 100,000 lost or stolen records is $3.86 million, a 6.4 percent increase from the 2017 report. The costliest data breach in recent history cost Equifax $275 million!
How to Mine Dark Data?
There are many open-source tools that companies can use to extract dark data.
Snorkel: Developed at Stanford University, Snorkel speeds up dark data extraction by providing tools for creating the training datasets that learning algorithms need.
Dark Vision: This app extracts data from videos by analysing individual frames and audio with IBM Watson Visual Recognition and Natural Language Understanding.
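To give a feel for how a training dataset can be created programmatically, which is the idea behind Snorkel, here is a minimal sketch in plain Python. It deliberately does not use Snorkel’s own API. The rule functions, the keywords and the sample emails are all hypothetical, and a simple majority vote stands in for the more sophisticated way a real tool would combine the rules.

```python
from collections import Counter

# Label constants (hypothetical, chosen for this sketch).
COMPLAINT, NOT_COMPLAINT, ABSTAIN = 1, 0, -1


def lf_negative_words(text):
    """Vote COMPLAINT if the text contains a typical complaint keyword."""
    keywords = ("broken", "refund", "faulty", "disappointed")
    return COMPLAINT if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_positive_words(text):
    """Vote NOT_COMPLAINT if the text contains a typical praise keyword."""
    keywords = ("thank", "great", "love")
    return NOT_COMPLAINT if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_asks_for_manager(text):
    """Vote COMPLAINT if the customer asks for a manager."""
    return COMPLAINT if "manager" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_negative_words, lf_positive_words, lf_asks_for_manager]


def weak_label(text):
    """Combine the rule votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


# Hypothetical unlabelled emails, the kind of text that often stays dark.
emails = [
    "The blender arrived broken and I want a refund.",
    "Thank you, the delivery was great!",
    "Please confirm my appointment for Tuesday.",
]

# Each email is paired with a machine-generated label, ready for model training.
training_set = [(email, weak_label(email)) for email in emails]
print(training_set)
```

Rules like these are noisy, but they can label thousands of emails in seconds, which is what makes previously unusable text available to a learning algorithm.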
How Will Marketers Benefit from This Data?
Although extracting dark data takes effort, marketers should care about it. Here’s why:
A valuable source of information: The value of this data is enormous! It holds information that is not available in any other format. Extracting and annotating this data to enter it in relevant databases can help solve more significant problems for companies. For example, server log files are clues to website visitor behaviour, customer call detail records can indicate consumer sentiment, and mobile geolocation data can reveal traffic patterns.
Better-quality analytics: Dark data is an untapped resource that, when extracted, can improve the quality of analytics. Prompt analysis of this massive data pool can result in faster and better data-driven decision making.
Reduced costs and risk of a data breach: Extracting this data can lessen the risk and liability of storing and securing sensitive information. This can also help to remove unnecessary data, thereby reducing the recurring storage and curation costs.
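To make the server log example concrete, the sketch below shows one way to turn raw log lines into visitor behaviour. It assumes the logs follow the Common Log Format. The sample lines, the IP addresses and the `summarise` helper are invented for illustration, and a real pipeline would also need to handle bots, sessions and privacy rules.

```python
import re
from collections import Counter, defaultdict

# Common Log Format: host, identity, user, [timestamp], "request", status, size.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical sample data, including one malformed line.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2018:13:55:36 +0000] "GET /products/shoes HTTP/1.1" 200 2326
203.0.113.5 - - [10/Oct/2018:13:56:02 +0000] "GET /cart HTTP/1.1" 200 512
198.51.100.7 - - [10/Oct/2018:14:01:10 +0000] "GET /products/shoes HTTP/1.1" 200 2326
198.51.100.7 - - [10/Oct/2018:14:01:45 +0000] "GET /missing-page HTTP/1.1" 404 128
this line is not a valid log entry
"""


def summarise(log_text):
    """Return page-view counts and the ordered pages each visitor saw."""
    page_views = Counter()
    journeys = defaultdict(list)
    for line in log_text.splitlines():
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        if match["status"].startswith("2"):  # count successful requests only
            page_views[match["path"]] += 1
            journeys[match["host"]].append(match["path"])
    return page_views, dict(journeys)


page_views, journeys = summarise(SAMPLE_LOG)
print(page_views.most_common(1))  # the most viewed page
print(journeys)                   # pages per visitor, in order of request
```

Even this small summary answers marketing questions, such as which product pages attract the most traffic and which visitors went on to the cart.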
Data is the New Oil
Data mining is a significant source of revenue for giants like Google, Amazon, Apple, Facebook and Microsoft, who collectively racked up over USD 25 billion in net profit in the first quarter of 2017.
That number alone should prompt any company to start investing in dark data extraction tools. If you don’t extract this data, you are missing out on shifting trends and customer behaviours that could significantly affect your business.