GDPR Mandates: Dark Data, ‘Leaver Data,’ Integrated Archiving
GDPR compliance requires ongoing management of customer data. If your organization hasn’t been fully attentive to this, consider these five best practices for data migration, data management and long-term data retention:
1: Integrated Archiving and Backup
Your company may face higher risk if its archive and backup are not integrated so that only one instance of each piece of data is stored for both. Every extra copy is another place where personal data must be found, corrected or erased. Keep in mind that to be GDPR compliant, archiving systems must be able to:
- Respond to a discovery request for personal data on a data subject
- Rectify incorrect data
- Erase data under a “right to be forgotten” request
Seek an archiving tool that lets you offload less frequently used data to a secondary system, which reduces the data stored in production systems. Cloud-based archiving and backup solutions can be faster to implement, which can be important where compliance is concerned.
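To make the "only one instance" idea concrete, here is a minimal sketch of single-instance storage keyed by content hash. The `SingleInstanceStore` class and its method names are hypothetical and used for illustration only; they do not describe any particular archiving product.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical store that keeps one copy of each unique content blob."""

    def __init__(self):
        self.blobs = {}       # SHA-256 hex digest -> bytes, stored once
        self.references = {}  # (purpose, name) -> digest

    def put(self, purpose, name, content):
        """Register content for 'archive' or 'backup' without duplicating it."""
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # store the bytes only if new
        self.references[(purpose, name)] = digest
        return digest

    def erase(self, digest):
        """Remove the blob and every archive or backup reference to it."""
        self.blobs.pop(digest, None)
        stale = [key for key, value in self.references.items() if value == digest]
        for key in stale:
            del self.references[key]
        return len(stale)


store = SingleInstanceStore()
message = b"Dear Ms. Example, your order has shipped."
d1 = store.put("archive", "mail/0001.eml", message)
d2 = store.put("backup", "nightly/0001.eml", message)
stored_copies = len(store.blobs)   # one blob despite two references
removed_refs = store.erase(d1)     # one erase call clears both references
```

Because the archive and the backup point at the same stored blob, a single erase operation covers both, which is the property that matters for a "right to be forgotten" request.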
2: Dark Data
Unmanaged, unstructured data gobbles up pricey enterprise storage space to the tune of hundreds of terabytes. Many companies are unaware that they have a major problem with this type of “dark data.” It can comprise any number of unmanaged files, including PSTs and backups of employee desktops, email accounts of departed employees, eDiscovery result sets, system-generated files and more. Old PSTs are a particularly telling symptom of dark data. Many IT administrators simply have no idea how many of these files exist, much less how to manage them appropriately.
The best practice is also the most cost-effective one. Rather than attempting mass deletion and likely losing something valuable, or hiring consultants who charge hundreds of dollars an hour to rifle through large volumes of dark data, use a system designed to do this work for you at far lower cost. Seek an information management system that culls dark data quickly by consolidating unstructured data and fully indexing it, which simplifies search by keyword, custodian, last date accessed and so on.
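As a rough illustration of what a first-pass dark data inventory might look like, the sketch below walks a directory tree and summarizes files by extension, counting how many have not been accessed recently. The function name `inventory` and the 365-day threshold are assumptions made for this example. Last-access times are also not reliable on every system (some file systems are mounted so that they are rarely or never updated), so treat the "stale" count as a hint rather than a fact.

```python
import os
import time
from collections import defaultdict


def inventory(root, stale_after_days=365, now=None):
    """Walk root and summarize files by extension, flagging stale ones."""
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    summary = defaultdict(lambda: {"count": 0, "bytes": 0, "stale": 0})
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            # Group by lowercased extension so ".PST" and ".pst" count together
            ext = os.path.splitext(name)[1].lower() or "(none)"
            entry = summary[ext]
            entry["count"] += 1
            entry["bytes"] += info.st_size
            if info.st_atime < cutoff:
                entry["stale"] += 1
    return dict(summary)
```

Calling `inventory("/some/share")` on a path of your choosing returns a dictionary keyed by extension; a large `.pst` entry with a high stale count is the kind of symptom described above. A real information management system would go further and index content and custodians.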
3: Defensible Data Migration
When migrating data that is subject to an eDiscovery request, two things are needed to prove legal defensibility: chain of custody and data fidelity. Chain of custody proves that the data remained unchanged as it passed from one person to another. This comes into play if a file’s authenticity is questioned: the company must be able to produce a documented history of direct management and protection.
That’s only one piece of the puzzle. Data fidelity, the measure of how closely data matches its original state, is also critical. Data can be considered completely unaltered only when its fidelity is 100 percent, and the eDiscovery process requires that any potentially relevant data remain completely unaltered. If a migration solution provider says you only need to worry about chain of custody reporting to ensure legal defensibility, dig further. Such providers may alter or even intentionally delete file properties, increasing your risk of a spoliation charge.
Before migrating data that could be considered evidence, do these two things:
- Seek a written legal opinion that the planned data migration can go forward.
- Check with your migration services provider to ensure they support both chain of custody and data fidelity to protect you from increased legal risk.
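One common way to check content fidelity and record a custody step at the same time is to hash the file before and after the move and log the result. The sketch below is an illustration under stated assumptions, not a description of any vendor’s process: the function names `sha256_of` and `verify_transfer` and the fields of the log record are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_path, target_path, handler, custody_log):
    """Compare content hashes and append a custody record to the log."""
    source_hash = sha256_of(source_path)
    target_hash = sha256_of(target_path)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "handler": handler,  # who performed this step (assumed field)
        "source": source_path,
        "target": target_path,
        "source_sha256": source_hash,
        "target_sha256": target_hash,
        "content_match": source_hash == target_hash,
    }
    custody_log.append(record)
    return record["content_match"]
```

A matching hash shows only that the file’s bytes are identical. It says nothing about file properties such as timestamps or ownership, which are exactly what the article warns can be altered, so those would need their own comparison.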
4: Address Your “Leaver Data”
When someone leaves a company, it’s important to know how to properly manage and archive the departing employee’s data to avoid compliance risks and increased eDiscovery costs. Enterprises need a way to manage these files as the valuable company asset they are. If this data is disregarded and lost but later needed for legal reasons, finding or restoring it can be a nightmare, or even impossible, which is a serious problem given the time constraints of eDiscovery requests.
As a best practice when dealing with leaver data, follow these two steps:
- Develop an exit process that ensures the company knows where each employee’s data is and protects all of it before the employee leaves.
- Migrate each and every departing employee’s data to a central repository, such as a low-cost cloud archive, for long-term archiving and management.
5: Journaling Compliance
GDPR requires that email communications be available so that if a party asks to have its information deleted (the right to be forgotten), the data can be discovered and removed. Email journaling can help ensure this. However, it’s important to understand how journaling works. The journal mailbox isn’t actually an archive, which means journaled data still needs to be archived. Otherwise, you may run into trouble when the journal mailbox fills up and existing journaled email is overwritten.
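The discovery half of such a request can be sketched with Python’s standard `email` package: given raw journaled messages, find the ones whose address headers include the data subject. The function name `messages_involving` is hypothetical, and the sketch assumes messages are available as RFC 5322 text. It checks only the From, To and Cc headers, so a real search would also need to cover message bodies and attachments.

```python
from email import message_from_string
from email.utils import getaddresses


def messages_involving(raw_messages, subject_address):
    """Return indexes of messages whose From/To/Cc include the address."""
    wanted = subject_address.lower()
    hits = []
    for index, raw in enumerate(raw_messages):
        msg = message_from_string(raw)
        headers = (
            msg.get_all("From", [])
            + msg.get_all("To", [])
            + msg.get_all("Cc", [])
        )
        # getaddresses yields (display name, address) pairs
        addresses = {addr.lower() for _name, addr in getaddresses(headers)}
        if wanted in addresses:
            hits.append(index)
    return hits
```

The returned indexes identify which journaled messages would need review before anything is removed.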
Litigation preparedness is a big reason to use email journaling. During a lawsuit, the companies involved must place a litigation hold on all potentially relevant data. Opposing counsel may ask for responsive data within an open-ended date range, which means placing a litigation hold on all email: past, current, and new. By immediately journaling a target employee’s mailbox, you can ensure all affected email is captured and placed on hold. It’s prudent to automatically journal your C-level employees’ email and hold it for two or more years, since those employees presumably have a higher risk of being summoned in a lawsuit.
Many companies are moving their email to newer systems. Even so, many of these systems can’t provide live journaling, so the usual recommendation is to use an on-premises or third-party cloud archive as the journal mailbox. This isn’t the best solution, though, because it’s costly and may end up costing even more over time. Relying on a third-party cloud archive can also lead to vendor lock-in.
A better solution is to rely on a public cloud, which helps lower costs, provides security and scales on demand. It wasn’t always possible to journal from your email system to the cloud, because public clouds could not manage a journal’s complexities. New cloud journaling solutions have changed that. A live email stream can now move directly into the company’s own cloud, where it can be validated and managed, with the customer’s own encryption keys securing the data. Journaling to the company’s own cloud addresses both vendor lock-in and high costs while supporting compliance.
Bill Tolson is vice president at Archive360.