The Third Age of Data and the Unfolding Scale-out World
Welcome to the Third Age of Data, where an estimated 1 trillion sensors are embedded in a nearly limitless landscape of networked sources, from health monitoring devices to municipal water supplies.
Data’s first age was a transaction-oriented world of back-office databases and once-a-day batch processing. Its second age was human-centric -- the huge digital trails left by Internet and smartphone users. The Third Age of Data is all about information generated by machines. And this data’s volume, variety and velocity are like nothing the world has seen.
A recent PBS documentary, “The Human Face of Big Data,” included the finding that we now generate as much data every two days as the world generated from its inception through 2003. An oft-quoted industry truism holds that every day we create 2.5 quintillion bytes of data, and that 90 percent of the world’s data has been created in the last two years alone. The ever-rising tide of machine data will only accelerate these numbers.
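To make that figure concrete, here is a quick back-of-envelope conversion of the publicized number into more familiar units (a rough illustration, not a measurement):

# Convert the oft-quoted "2.5 quintillion bytes per day" into familiar units.
DAILY_BYTES = 2.5e18                          # 2.5 quintillion bytes
per_day_eb = DAILY_BYTES / 1e18               # decimal exabytes per day
per_second_tb = DAILY_BYTES / 86_400 / 1e12   # terabytes per second
print(f"{per_day_eb:.1f} EB/day, or roughly {per_second_tb:.0f} TB every second")
# -> 2.5 EB/day, or roughly 29 TB every second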
As in any new age, there will be winners and losers. In the Third Age of Data, enterprises need to scale out or die. As IDC chief analyst Frank Gens has said, scale “is the critical ingredient in the unfolding battle for digital success.”
Companies need to ask themselves: How are we going to manage this onslaught of data? Where is the data going to go? How are we going to process the raw data so that we can understand it and gain actionable insights? How do we feed those insights into the next generation of products and services we’re creating?
Only those who figure out how to ingest, process and harness this data for real-world insights will emerge as the winners.
What does the playing field look like? Here are a few examples.
The Connected Car
Every car is becoming a data generator on wheels, accessing systems within the vehicle as well as the driver’s cell phone and transmitting that data to other systems, which can range from the automaker (for monitoring the vehicle’s performance) to highway departments (for traffic monitoring).
It’s relatively early days for the connected car, but before long, an array of smart networked applications in cars will turn “dumb” cars into dinosaurs. Strategy&, the strategy consulting team at PwC, predicts that 90 percent of vehicles will have built-in connectivity platforms by 2020. According to research firm Analysys Mason, the number of connected cars will grow to more than 150 million this year and more than 800 million by 2023.
IoT
General Electric and Cisco Systems have projected that by the end of this decade, at least 1 trillion sensors will be deployed as part of the IoT, representing a $15 trillion market by 2020. Gartner estimates there will be 4.9 billion connected “things” this year, soaring to 25 billion by 2020.
Companies have no choice but to get the infrastructure right. When they do, it’s amazing what can be done with data -- world-changing stuff.
For example, the IT team at a major U.S. university’s independent health research center found itself with little visibility into its massive store of trending data on the global impact of over 400 different diseases in over 180 countries. Compiling and analyzing these global health statistics generated tens of millions of files in a single afternoon. Just keeping up with that massive data growth and understanding its ebb and flow was a gigantic challenge.
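A rough sketch shows why that growth rate alone strains conventional storage; the file count and time window below are illustrative assumptions, not the center’s actual figures:

# Illustrative estimate of the metadata load behind "tens of millions of
# files in a single afternoon." Both numbers below are assumptions.
files_created = 50_000_000          # assume ~50 million files
afternoon_seconds = 4 * 3600        # assume a 4-hour window
creates_per_second = files_created / afternoon_seconds
print(f"~{creates_per_second:,.0f} file creates per second, sustained")
# -> ~3,472 file creates per second, sustained

At that pace, periodically walking the directory tree to see where the growth is happening stops being practical, which is what pushes the analytics into the storage layer itself.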
By putting in place a scale-out storage system with real-time analytics, the center gained both the scalability and the operational visibility necessary to store its data efficiently and conduct its life-saving research.
Another example is a top U.S. telecommunications provider that gathers the log data from all of its network endpoints around the world. This immense volume of log data is ingested into a centralized storage tier, where it is then analyzed via a mix of tools -- including Splunk, Hadoop and some internally developed applications -- to gain actionable insights into the activity going on around the world in its network.
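In outline, such a pipeline lands each endpoint’s logs in a partitioned layout that downstream tools can scan selectively. The sketch below is a minimal illustration with hypothetical names and paths, not the provider’s actual implementation:

# Minimal sketch: endpoints ship log batches into a central storage tier,
# partitioned by endpoint and arrival hour so downstream jobs (Splunk,
# Hadoop, in-house tools) can scan only the slices they need.
# All names and paths are hypothetical.
import gzip
import time
from pathlib import Path

CENTRAL_TIER = Path("/mnt/central-logs")    # hypothetical central storage mount

def ingest(endpoint_id: str, log_lines: list[str]) -> Path:
    hour = time.strftime("%Y%m%d%H")
    dest = CENTRAL_TIER / endpoint_id / f"{hour}.log.gz"
    dest.parent.mkdir(parents=True, exist_ok=True)
    with gzip.open(dest, "at", encoding="utf-8") as f:
        f.writelines(line + "\n" for line in log_lines)
    return dest

ingest("edge-router-042", ["2015-06-01T00:00:00Z link up"])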
Or take Vaisala, which performs weather modeling and forecasting services for planning, assessment and deployment of renewable energy systems. Working with everything from in-the-field sensor data to the advanced weather and climate models of national and international weather services, the company helps clients project potential solar, hydro and wind power generation 10 minutes to 30 years into the future.
All of which creates an interesting data processing and storage challenge: efficiently managing a vast volume of tiny sensor measurements, combined with huge and massively complex forecasting models, to generate meaningful assessments on the ground. And above it. Vaisala’s simulations range in size from a cube of space across a rectangle of land all the way up to the clouds -- and then over decades of time.
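A back-of-envelope sizing shows how quickly those cubes of space and decades of time add up; every number below is an assumption for illustration, since Vaisala’s actual model resolutions aren’t stated here:

# Illustrative sizing of a single gridded weather simulation.
grid_cells = 500 * 500 * 50          # assume a 500x500 grid with 50 vertical levels
variables = 10                       # e.g., wind, temperature, pressure, humidity
bytes_per_value = 4                  # 32-bit floats
timesteps = 24 * 365 * 30            # hourly output over a 30-year span
total_tb = grid_cells * variables * bytes_per_value * timesteps / 1e12
print(f"~{total_tb:.0f} TB for a single 30-year run")
# -> ~131 TB for a single 30-year run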
It’s a mind-boggling data challenge, but one Vaisala can handle because it put in place a massively scalable system that can continuously ingest and store incoming raw data, and then analyze that data in near real-time.
As the Third Age of Data begins, one can only imagine the stunning advances that lie ahead. Handling all that data will require the right strategies, the right technologies and the right investments to bring this new age to fruition.
Jeff Cobb is Vice President of Product Management at data-aware scale-out storage company Qumulo.