Shedding Light on Dark Data in the IoT Era
Gartner Group forecasts that nearly 26 billion devices, from sensors and medical diagnostic tools to thermostats and cars, will be connected to the Internet by 2020. In this era of the Internet of Things (IoT), the enormous amounts of information generated by these devices will create considerable challenges for organizations in every industry across the globe, not the least of which is managing “dark data.”
Dark Data is defined by Gartner as “information assets that organizations collect, process, and store in the course of their regular business activity, but generally fail to use for other purposes.”
Even today, in the early stages of IoT, most (if not all) enterprises are plagued by dark data. Gartner and IDC estimate a jaw-dropping 80 percent of company data is dark, resulting in data retrieval challenges and forcing many organizations to make business decisions based on incomplete or missing information.
On the upside, we are at a turning point in the evolution of enterprise content management (ECM) solutions for managing the machine data created, shared and stored as IoT proliferates. Leading ECM systems with metadata-driven platforms offer a way to manage dark data efficiently to extract business value from previously untapped information.
The concept of metadata isn’t new, but for too many companies it’s an afterthought, particularly where ECM is concerned.
The value in metadata-based ECM lies in managing information while also illuminating associations and relationships along the way. Metadata attributes or tags can be applied to classify data intelligently by customer, process, case or other categories. The idea is to manage information by what it is rather than where it is stored.
Consider this example: consumers search for music on their iPhones based on any attribute, including artist, title, genre, etc. No one needs to know (and rarely thinks about) where the actual music files are stored.
The same concept can hold true in the business world. For example, let’s say an employee is working on a customer contract for a new project and saves it on the company's network in a folder marked "contracts." If someone else on the team looks for that contract, logically enough, in the "customers" or "projects" folder, they won't find it, and thus it becomes "dark data" to them. On the other hand, enabling users to search by any of these attributes, without needing to know where the file is stored, can transform dark data into searchable, useful information. The result is a faster, more intuitive and more accurate way to find information, not to mention more intelligence to fuel big data initiatives, automation and data sharing.
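To make the contract example concrete, here is a minimal sketch of attribute-based lookup in Python. It is an illustration only, not how any particular ECM product works: the MetadataIndex class, the document IDs, the file paths and the attribute names (doc_type, customer, project) are all hypothetical.

```python
from collections import defaultdict


class MetadataIndex:
    """Toy in-memory index that finds documents by what they are,
    not by where they are stored. Hypothetical, for illustration only."""

    def __init__(self):
        self._paths = {}                # doc_id -> storage location
        self._index = defaultdict(set)  # (attribute, value) -> doc_ids

    def add(self, doc_id, path, **attributes):
        # Record the storage path once, then tag the document
        # under every attribute/value pair it carries.
        self._paths[doc_id] = path
        for name, value in attributes.items():
            self._index[(name, value)].add(doc_id)

    def find(self, **criteria):
        # Return the IDs of documents matching ALL given attributes.
        if not criteria:
            return set()
        matches = [self._index.get(pair, set()) for pair in criteria.items()]
        return set.intersection(*matches)

    def path_of(self, doc_id):
        return self._paths[doc_id]


index = MetadataIndex()
# The contract is physically saved under "contracts"...
index.add("doc-1", "/contracts/acme_redesign.docx",
          doc_type="contract", customer="Acme", project="Website redesign")
index.add("doc-2", "/proposals/acme_proposal.docx",
          doc_type="proposal", customer="Acme", project="Website redesign")

# ...but a colleague can reach it by customer or by project instead.
by_customer = index.find(customer="Acme")
by_project_and_type = index.find(project="Website redesign", doc_type="contract")
print(sorted(by_customer))
print(sorted(by_project_and_type))
```

In this sketch the folder path is just one more stored fact about the document. A colleague who thinks in terms of "customers" or "projects" reaches the same contract as the employee who filed it under "contracts."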
Using Context to Connect the Puzzle Pieces
When an ECM system is integrated with existing business systems, users benefit from the ability to see other business-critical content. For example, an employee searching for the latest version of a customer project proposal may also see that there are unresolved support-related issues for this customer, which may affect the project proposal. An integrated ECM approach provides users with a 360-degree view of data, revealing insights that were previously unknown.
Another key piece in the dark data puzzle is managing the oncoming surge of unstructured content. While the majority of data collected by IoT devices is structured and therefore relatively easy to analyze, unstructured content is not. With the growth of IoT, linking unstructured content repositories with structured data systems can essentially lift the dam and help prevent dark data from rising.
Mika Javanainen is senior director of product management at M-Files Corporation.