
November 17, 2016

Dark Data: Issues and Mitigation

Dark data, despite the ominous and dramatic name, simply means data that is collected during typical business activities, such as past customer and transaction records, but is not used for any other purpose. Much of the internal and external data that companies collect is never even analyzed; in that case, the raw data “goes dark” and becomes dark data. For many companies, dark data accounts for the majority of the data kept on file – records that are retained, often to prove a past transaction, but that serve no other specific purpose. It commonly includes customer information, payment information, transaction logs, survey data, e-mails, old versions of documents, internal statistics, and anything else that has become irrelevant but is kept around for contingencies. Most of the data that businesses collect will become obsolete dark data at some point in its life.

One point of interest with dark data is the very reason it is often kept around in the first place: its potential for usefulness and profitability if properly analyzed and exploited. Much dark data involves customer demographics, purchasing histories, product data, the ways in which customers use products, and trends that may affect the market. All of this data can be instrumental in predicting future customer behavior. The more data you have, provided it is analyzed and used properly, the better your business can understand the behavior and needs of its customers, now and in the future.

How and why Dark Data is potentially ignored

Many companies do not have the staff or the capabilities to properly catalog, analyze, and mine dark data for these potential uses. Utilizing dark data often requires an initial investment in new technology and skilled data analysts, as well as a significant investment of time. Many companies are not confident that this investment in mining their own dark data will be worthwhile – but the more dark data builds up, the higher the likelihood that some of it will be useful. Companies often prioritize tasks with immediately obvious uses; however, dark data sometimes needs to be explored before its potential is revealed.

Another way in which the potential of dark data can go untapped is through isolated data processing policies within different departments of a company. For example, the way one department stores and analyzes data may have little co-ordination with the data processing policies of other departments. This lack of communication can result in data being lost or ignored in one department when it might be perfectly useful to another.

As touched on before, companies need technology that works to unite the data needs and interests of different departments, as well as comprehensive data policies that do the same. Only then can companies begin to reap the benefits of the untapped dark data.

Problems with dark data

Why is it worth your company’s time and money to properly manage, store, and utilize dark data? Not only is the potential payoff from properly applying certain types of dark data great, but the costs and security risks of bad data management are equally high.

Just because your company doesn’t make the investment of money, staff, and time to sort through mountains of old data doesn’t mean that hackers won’t be willing to sift through the data to find embarrassing or exploitable information from your company’s past transactions. Very often, this dark data will include credit card information and other transaction information – all of which is prime material for hackers. While this information is often heavily guarded when it is new and considered relevant, without the proper measures, it is often left vulnerable when buried among large amounts of old dark data.

In addition to the obvious risks of leaving such data exposed, certain types of data such as financial information and patient records are often subject to legal and financial regulations. Leaving this data available to hackers will not only allow them to exploit your company but can actually open your company to legal and financial liability.

Information concerning business practices, partnerships and competitive advantages can be exploited by competing companies – this is yet another way this data becomes valuable and worth mining for hackers. Moreover, the very occurrence of a data breach will affect your company’s reputation.

In addition to security concerns, dark data can quickly amass into large volumes that are simply expensive to store. This problem can grow quickly, which is yet another reason to sort through and categorize dark data, and then either utilize or delete it. In the meantime, the problem can be mitigated by proper backup practices: ones that do not reproduce irrelevant data, but instead recycle prior backups.

Mitigating the risks and headaches of dark data

Dark data can be managed in ways that will make it much easier to utilize when the opportunity arises. This will mitigate both the risks and the costs of storing huge amounts of raw data.

Remove Outdated Data:

Get rid of data that has become truly outdated. The goal here is to give structure to the body of data, so that it will be easier to either utilize or prune over time. Outdated data creates a burden and takes up extra space on a server. Proper disposal methods are also essential once data has been sorted through and found to be irrelevant. Once you decide to delete data, you want to be sure that it is gone. If data is found to be potentially damaging if exploited, your company will need to look into Department of Defense-approved methods of erasure and destruction.
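The sorting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` type, the `legal_hold` flag, and the three-year retention period are hypothetical and not taken from any particular policy or product. The sketch only decides which items are candidates for disposal; it does not perform secure erasure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """A hypothetical catalog entry describing one stored item."""
    name: str
    last_used: datetime
    legal_hold: bool = False  # assumed flag: must be kept regardless of age


def split_by_retention(records, now, retention_days):
    """Return (keep, dispose) lists based on a simple age cutoff."""
    cutoff = now - timedelta(days=retention_days)
    keep, dispose = [], []
    for record in records:
        # Keep anything under legal hold or used since the cutoff.
        if record.legal_hold or record.last_used >= cutoff:
            keep.append(record)
        else:
            dispose.append(record)
    return keep, dispose


now = datetime(2016, 11, 17)
records = [
    Record("2009_survey.csv", datetime(2009, 5, 1)),
    Record("2016_orders.csv", datetime(2016, 10, 30)),
    Record("2008_invoices.csv", datetime(2008, 1, 15), legal_hold=True),
]
keep, dispose = split_by_retention(records, now, retention_days=365 * 3)
print([r.name for r in dispose])  # ['2009_survey.csv']
```

Keeping the decision (what to dispose of) separate from the deletion itself makes it possible to review the list before anything is destroyed.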

Data Audit:

Audit data regularly over time so as not to let it build up into what we call dark data. If done properly, this will slow the build-up of new unsorted data. These audits represent the bulk of the work involved in properly dealing with dark data.
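As a rough illustration of what one pass of such an audit might produce, the sketch below totals stored bytes by category and age. The categories, sizes, dates, and age buckets are made-up examples, and `audit_summary` is a hypothetical helper rather than part of any auditing tool.

```python
from collections import defaultdict
from datetime import date


def audit_summary(items, today):
    """Total bytes per (category, age bucket) for a list of catalog rows.

    Each item is a (category, size_bytes, last_accessed) tuple.
    """
    totals = defaultdict(int)
    for category, size_bytes, last_accessed in items:
        age_days = (today - last_accessed).days
        if age_days <= 365:
            bucket = "under 1 year"
        elif age_days <= 3 * 365:
            bucket = "1-3 years"
        else:
            bucket = "over 3 years"
        totals[(category, bucket)] += size_bytes
    return dict(totals)


# Illustrative catalog rows; a real audit would read these from an inventory.
items = [
    ("transaction logs", 500, date(2016, 9, 1)),
    ("transaction logs", 1200, date(2012, 3, 10)),
    ("survey data", 300, date(2014, 6, 5)),
    ("survey data", 200, date(2014, 8, 20)),
]
summary = audit_summary(items, today=date(2016, 11, 17))
for key, total in sorted(summary.items()):
    print(key, total)
```

Running a summary like this on a schedule shows which categories are aging without being touched, which is where unsorted data tends to accumulate.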


While backups are a necessary measure for security, traditional backup methods can make the problem even worse, in that they repeatedly reproduce and store the mass of unsorted data, adding to potential storage and security problems. Using a higher-quality backup method will make sure the dark data is only reproduced once, and then recycled for later backups. This will not have a huge impact on security risks in and of itself, but it will prevent potentially massive storage costs.
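One common way to get this "store once, reuse later" behavior is content-based deduplication, which the article does not name and which is swapped in here as an assumption. The sketch below is a toy in-memory model, not a real backup tool: `DedupBackupStore` and its methods are hypothetical names, and a real system would write to disk or object storage.

```python
import hashlib


class DedupBackupStore:
    """In-memory stand-in for a backup target that stores each unique blob once."""

    def __init__(self):
        self.blobs = {}      # content hash -> bytes
        self.snapshots = []  # one {filename: content hash} mapping per backup run

    def backup(self, files):
        """Record a snapshot of `files` (name -> bytes); return how many new blobs were stored."""
        manifest = {}
        new_blobs = 0
        for name, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest not in self.blobs:
                # Unseen content: store it once.
                self.blobs[digest] = content
                new_blobs += 1
            # Seen or not, the snapshot only records a reference to the blob.
            manifest[name] = digest
        self.snapshots.append(manifest)
        return new_blobs

    def restore(self, snapshot_index):
        """Rebuild the name -> bytes mapping for one snapshot."""
        manifest = self.snapshots[snapshot_index]
        return {name: self.blobs[digest] for name, digest in manifest.items()}


store = DedupBackupStore()
monday = {"old_logs.txt": b"2011 transaction log", "report.txt": b"draft 1"}
tuesday = {"old_logs.txt": b"2011 transaction log", "report.txt": b"draft 2"}
first = store.backup(monday)    # both files are new
second = store.backup(tuesday)  # only the changed report is stored again
print(first, second, len(store.blobs))  # 2 1 3
```

The unchanged old log is stored once and referenced by both snapshots, so repeated backups of static dark data add almost nothing to storage.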

Data Encryption:

Another way to minimize the risks of storing large amounts of dark data should be an obvious one, but is still sometimes overlooked: encrypt your data! Use a reliable encryption method, and remember that this does not just apply to data stored on in-house servers. The risks are even higher for data stored in the cloud or offsite, so make sure this data is properly encrypted as well.


Dark data can easily be forgotten about and left alone. By definition, it fades into the background where it does not get in the way. However, this is where the danger lies – both in the vulnerabilities this presents, and in the opportunities and potential that are missed. Your company should not let the data become “out of sight, out of mind.” Luckily, this is not just a matter of lowering risks and keeping hackers out. Keeping track of this data with regular audits is also the way for your company to get the most out of it, in large part because data about past customer behavior is extremely useful in predicting future customer behavior. Properly utilizing the data will involve some investment in staff, time, and technology. However, since taking good care of data and auditing preexisting dark data is the key both to lowering risks and to increasing profit, making these investments should be an easy choice, one that will save your company both security headaches and missed opportunities in the long run.


Kunjal Panchal is a digital strategist and a social media geek. She is passionate about content marketing and strongly believes in the power of storytelling for marketing. A perfect day for her consists of reading her favorite author with a hot cuppa coffee. Find her on Twitter and LinkedIn.