The “dark Web” – the Wild West of underground Internet sites – has received a lot of attention and notoriety of late. But many people may not have heard of “dark data.” Practices regarding dark data have prompted a lot of discussion in data management circles.

Gartner originally coined the term. Condensed here, the definition of dark data is “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.”

Organizations often retain dark data for compliance purposes. “Storing and securing data typically incurs more expense (and sometimes greater risk) than value,” Gartner writes.

Storage Costs and Security Risks

Ben Austin, in a post for R1Soft, provides a list of possible outdated or unstructured data sources that may fall into the dark category:

• Customer Information
• Log Files
• Account Information
• Previous Employee Data
• Financial Statements
• Raw Survey Data
• Email Correspondences
• Notes or Presentations
• Old Versions of Relevant Documents

The challenges of dark data revolve around storage – more data storage means more overhead cost – and security risks. “Along with outdated and seemingly useless documents, dark data will likely also contain sensitive, proprietary information,” Austin writes.

Referring to the data breach that rocked Sony Pictures, Austin adds: “Just because employees at your organization don’t want to take their time to go through piles of old information doesn’t mean that hackers aren’t willing to mine that data for years-old embarrassments that your company had hiding in the basement.”

Data Missed Because of Technology

However, Austin also writes that there may be a lot of untapped potential inside a company’s mass of unused information.

Which leads us to another definition of dark data, via Matt Aslett of 451 Research: “Data that was previously ignored because of technology limitations.”

Timo Elliott of SAP writes on his personal Business Analytics blog about the “dark” structured data that organizations are missing. His post is titled “Are You Making the Most of Your Dark Data?”

Crunching Data at the Airport

Elliott writes that organizations may not realize they have useful unused data. His favorite example: A project at Copenhagen Airport, where useful information was gleaned by crunching the data in the log files of the airport’s wi-fi routers. Passengers’ smartphones “ping” routers while they walk through the terminals, offering data on passenger movements. The data could even answer commercial questions, such as “which is the most visited area of duty free?”
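To make the airport example concrete, here is a minimal Python sketch of how router log lines might be crunched to answer the duty-free question. The log format (`timestamp,router_id,device_id`), the `ROUTER_ZONES` mapping, and the function names are all assumptions made for illustration; the article does not describe how the Copenhagen project actually processed its logs.

```python
from collections import defaultdict

# Hypothetical mapping from router ID to terminal area (not from the article).
ROUTER_ZONES = {
    "r1": "duty_free_perfume",
    "r2": "duty_free_spirits",
    "r3": "gate_a",
}


def visits_per_zone(log_lines):
    """Count the distinct devices seen in each zone.

    Each line is assumed to look like 'timestamp,router_id,device_id'.
    """
    devices = defaultdict(set)
    for line in log_lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue  # skip malformed lines
        _, router_id, device_id = parts
        zone = ROUTER_ZONES.get(router_id)
        if zone is not None:
            # A set ignores repeated pings from the same device.
            devices[zone].add(device_id)
    return {zone: len(ids) for zone, ids in devices.items()}


def most_visited(log_lines):
    """Return the zone with the most distinct devices, or None if no data."""
    counts = visits_per_zone(log_lines)
    return max(counts, key=counts.get) if counts else None


# Made-up sample data for illustration only.
sample = [
    "2015-03-01T10:00:00,r1,aa:01",
    "2015-03-01T10:00:05,r1,aa:02",
    "2015-03-01T10:01:00,r2,aa:01",
    "2015-03-01T10:02:00,r1,aa:01",
    "2015-03-01T10:03:00,r3,aa:03",
]
print(most_visited(sample))
```

Counting distinct devices rather than raw pings keeps a single phone that lingers near one router from inflating that area's total.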

“The technology barriers have tumbled down, and it’s a good time to get out your bucket and flashlight and go take a look in your corporate cellar!” Elliott writes.

Ed Tittel recognizes that organizations willing to expend resources on dark data see its potential, and that some may be reluctant to part with unused data. However, he writes: “As with many potentially rewarding and intriguing information assets, organizations must also be aware that the dark data they possess – or perhaps more chillingly, the dark data about them, their customers and their operations that’s stored in the cloud, outside their immediate control and management – can pose risks to their continued business health and well-being.”

Those risks, Tittel writes, include exposure to legal and regulatory issues, loss of sensitive business information, reputational damage from a data breach, and the opportunity costs that arise when third parties exploit proprietary information.

Mitigating the Risks

Tittel and Austin describe ways organizations can mitigate the risks from dark data, including:

• Ongoing inventory and assessment. “Yesterday’s dark data may become a shining source of insight, thanks to new tools or analytic techniques,” Tittel writes.
• Encryption.
• Retention policies and safe disposal.
• Auditing dark data for security purposes.

Organizations should keep unused data whose potential value outweighs its risks and delete data whose risks outweigh its potential returns, Tittel writes.
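As a rough illustration of how that keep-or-delete rule could sit alongside an ongoing inventory, the Python sketch below triages a list of data assets. The `DataAsset` fields, the 0-10 scores, the one-year "dark" threshold and the action labels are hypothetical choices made for this example; neither Tittel nor Austin prescribes a scoring scheme.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DataAsset:
    name: str
    last_used: date
    value_score: int  # hypothetical 0-10 estimate of potential value
    risk_score: int   # hypothetical 0-10 estimate of exposure risk
    encrypted: bool


def triage(asset, today, dark_after_days=365):
    """Suggest an action for one inventoried asset.

    Assumption: data untouched for `dark_after_days` counts as dark,
    and a tie between value and risk is resolved in favor of keeping.
    """
    if (today - asset.last_used).days < dark_after_days:
        return "active"
    if asset.risk_score > asset.value_score:
        return "dispose"
    return "keep" if asset.encrypted else "keep_and_encrypt"


# Made-up inventory for illustration only.
today = date(2015, 6, 1)
inventory = [
    DataAsset("raw_survey_data", date(2012, 1, 1), 7, 3, False),
    DataAsset("previous_employee_records", date(2010, 5, 1), 1, 8, True),
    DataAsset("customer_accounts", date(2015, 5, 20), 9, 5, True),
]
for asset in inventory:
    print(asset.name, "->", triage(asset, today))
```

Running the triage on a schedule would match the "ongoing" part of the inventory advice, since scores can change as new tools make old data more useful.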

Austin offers a final thought: “Anytime you can find a new use for old data is a big win – it’s like finding $5 in that pair of pants you hadn’t worn in a month.”

Or, who knows? It could be worth even more.
