Dark Data

“Dark data” refers to data that an organization generates through normal activities and stores but rarely, if ever, uses afterward. The term was coined by Gartner to describe data that is largely hidden from the organization's view yet remains in storage and is often still processed by automated tasks such as backups.

Organizations that have no strategy for managing dark data can run into several problems as it accumulates:

  • Diminishing storage space, especially if backups are made using 1:1 duplicates rather than backup imaging
  • Increased latency or slower processing when unstructured dark data is read, but never actually used, during large-scale data jobs such as backups and analytics
  • Disorganization that only worsens over time if dark data is not given organizing attributes such as metadata descriptors
  • Risk of data theft; if dark data is not managed or encrypted alongside other sensitive data, hackers may find value in it where organizations do not
  • Huge potential opportunity costs, since dark data could actually yield useful insights to organizations if processed through analytics on a BI platform

Regarding the last of these challenges, organizations may not realize the untapped value that dark data can provide once analytics shines a light on it. All data has the potential to uncover trends, yield actionable correlations and generally inform business decisions.

Examples of possible value dark data can provide include:

  • Server log files: Could indicate visitor behaviors and add more context to web-based analytics
  • Financial statements: Could be used to track more granular spending trends, possibly revealing ways to cut costs
  • Geolocation data: Could describe consumer traffic patterns that aid in future business planning
  • Customer call records: Details could provide context for demographic data as well as consumer sentiment toward certain products and services
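
As a rough illustration of the server log case, the sketch below counts requests per URL path from archived log lines. It assumes the logs follow the Common Log Format; the sample lines and the names LOG_PATTERN and count_paths are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Assumed format: Common Log Format, e.g.
# 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def count_paths(lines):
    """Return a Counter of request paths; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[match.group("path")] += 1
    return counts


# Hypothetical sample lines standing in for an archived log file.
sample = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:57:12 +0000] "GET /contact HTTP/1.1" 404 512',
    'not a log line',
]

# Print paths from most to least requested.
for path, hits in count_paths(sample).most_common():
    print(path, hits)
```

Skipping lines that do not match, rather than raising an error, is a deliberate choice here: long-neglected log archives often contain truncated or malformed entries.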

BI tools can mine dark data to reveal insights like the ones described above. Probing all data files means that opportunities to discover value are not wasted. More variables can be added to existing measured data, giving a deeper view of tracked metrics and additional chances to discover meaningful and actionable correlations.
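
To make the idea of adding a variable to an existing metric concrete, here is a minimal sketch that joins two date-keyed series and computes their Pearson correlation. All of the data is invented for illustration, and the names pearson, daily_orders and daily_server_errors are assumptions; a real BI platform would handle the join and the statistics itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: a tracked metric and a variable recovered from dark data,
# both keyed by date.
daily_orders = {"2024-01-01": 120, "2024-01-02": 95, "2024-01-03": 60, "2024-01-04": 110}
daily_server_errors = {"2024-01-01": 3, "2024-01-02": 8, "2024-01-03": 21, "2024-01-05": 2}

# Join on the dates present in both sources.
shared_dates = sorted(daily_orders.keys() & daily_server_errors.keys())
orders = [daily_orders[d] for d in shared_dates]
errors = [daily_server_errors[d] for d in shared_dates]

r = pearson(orders, errors)
print(f"{len(shared_dates)} shared days, r = {r:.2f}")
```

A handful of data points like this shows only the mechanics; a correlation found this way is a prompt for further investigation, not evidence of cause.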

Organizations that add structure to their dark data through a thorough, continuous process of file management, auditing and consolidation dramatically reduce the risks it can pose. They also increase their capacity to extract value from their data, whether to improve performance, cut costs or realize other benefits.
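
One possible starting point for such an audit is simply measuring how much stored data has gone untouched. The sketch below walks a directory tree and totals the files that have not been modified within a given number of days. Using modification time as a proxy for "unused" is an assumption (access times are often unreliable or disabled), and the function names and one-year default are hypothetical.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days, now=None):
    """Yield (path, size_bytes, age_days) for files not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff_seconds = max_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_seconds = now - info.st_mtime
            if age_seconds > cutoff_seconds:
                yield path, info.st_size, age_seconds / SECONDS_PER_DAY


def summarize(root, max_age_days=365):
    """Return (file_count, total_bytes) of stale files under root."""
    count = 0
    total = 0
    for _path, size, _age in find_stale_files(root, max_age_days):
        count += 1
        total += size
    return count, total


# Demonstration on a throwaway directory with one old and one recent file.
with tempfile.TemporaryDirectory() as demo_root:
    old_path = os.path.join(demo_root, "old_report.csv")
    new_path = os.path.join(demo_root, "new_report.csv")
    for p in (old_path, new_path):
        with open(p, "w") as fh:
            fh.write("data")
    # Backdate the first file's timestamps by roughly two years.
    two_years_ago = time.time() - 2 * 365 * SECONDS_PER_DAY
    os.utime(old_path, (two_years_ago, two_years_ago))
    print(summarize(demo_root, max_age_days=365))
```

A report like this only identifies candidates; deciding whether to analyze, archive or delete them remains a policy question for the organization.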