Covering Scientific & Technical AI | Monday, December 2, 2024

Killing Cloud Latency with Hot Data at the Edge 

High latency is an annoyance for consumers, a serious problem for businesses that want to leverage advanced infrastructure services via the cloud and a roadblock to the rollout of advanced IoT-powered capabilities, such as autonomous vehicles.

Wariness about cloud latency is understandable. After all, public cloud storage means the data is likely in a facility located hundreds or thousands of miles away, and that distance guarantees significant latency.

The Edge Is the Latency Killer

The only realistic way to solve the latency problem is to harness the edge and move data closer to the end user or application. Less than 12 percent of enterprise data is "hot," meaning data that end users need to access immediately and regularly. While the bulk of the data an organization owns can be stored in the cloud, hot data needs to be stored locally.

There is also a middle ground: data that's neither hot nor cold, which users request about 5 percent of the time. To maintain performance in the case of a local cache miss, that "warm" data needs to be cached no more than about 120 miles away. At that distance, lag time will be about one to two milliseconds, which is in line with flash performance expectations.
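
The one-to-two-millisecond figure can be sanity-checked with a back-of-the-envelope propagation calculation. The sketch below is illustrative only: it assumes light travels through optical fiber at roughly the vacuum speed divided by a refractive index of 1.47, that the fiber runs in a straight line, and that switching and storage delays are ignored. None of those values come from the article.

```python
# Rough propagation delay over fiber for a given distance.
# Assumptions (not from the article): refractive index of about 1.47,
# a straight-line fiber route, and no switching or storage delay.

SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km per second
FIBER_REFRACTIVE_INDEX = 1.47       # assumed typical value for silica fiber
KM_PER_MILE = 1.609344


def one_way_delay_ms(distance_miles: float) -> float:
    """Propagation delay in one direction, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    fiber_speed_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return distance_km / fiber_speed_km_s * 1000.0


def round_trip_delay_ms(distance_miles: float) -> float:
    """Propagation delay for a request and its response, in milliseconds."""
    return 2.0 * one_way_delay_ms(distance_miles)


if __name__ == "__main__":
    for miles in (120, 1000):
        print(f"{miles} miles: one-way {one_way_delay_ms(miles):.2f} ms, "
              f"round trip {round_trip_delay_ms(miles):.2f} ms")
```

Under these assumptions, 120 miles works out to a little under one millisecond each way and a little under two milliseconds round trip, while a facility 1,000 miles away costs many times that before any processing begins.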

The trick, of course, is to create an intelligent network that automatically categorizes data as hot, warm or cold, and places each in the appropriate location to keep latency to a minimum. The master copy of all data (hot, warm and cold) resides in the public cloud, while copies of the hot and warm data sets are kept on-premises and at the metro edge, respectively. Hot data is stored on-premises in a flash-based edge device for peak performance. Warm data, which is data that has been accessed within a few weeks to a month, should be stored in a point of presence (PoP) on the metro edge that is no more than 120 miles from the end user. Cold and archived data can go to multiple locations in the public cloud.

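
As a rough illustration of that categorization step, the sketch below assigns a tier based on how recently data was last accessed. The names (`Tier`, `classify`) and the seven-day hot window are hypothetical. The article gives only the warm window ("a few weeks to a month"), and a real system would likely weigh access frequency as well as recency.

```python
from datetime import datetime, timedelta
from enum import Enum


class Tier(Enum):
    HOT = "on-premises flash"
    WARM = "metro edge PoP"
    COLD = "public cloud"


# Hypothetical thresholds. The article only says warm data has been accessed
# "within a few weeks to a month"; the hot cutoff is an assumption.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=30)


def classify(last_access: datetime, now: datetime) -> Tier:
    """Pick a storage tier from how recently the data was last accessed."""
    age = now - last_access
    if age <= HOT_WINDOW:
        return Tier.HOT
    if age <= WARM_WINDOW:
        return Tier.WARM
    return Tier.COLD


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    for days in (1, 20, 90):
        tier = classify(now - timedelta(days=days), now)
        print(f"last accessed {days} days ago -> {tier.name} ({tier.value})")
```

The master copy stays in the public cloud regardless of the result. The classification only decides whether an additional copy is kept on-premises or at the metro PoP.
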
The local, on-premises entry point leverages flash storage and a direct connection to the storage network to ensure immediate access by the end user to all data in the hot tier. For data in the warm tier, the metro PoP storage also leverages speedy flash memory technology and a relatively short distance for data to travel to keep latency low and performance high. To the end user, the experience is seamless and everything behaves like local flash storage. To the IT administrator, this design reduces the cost of on-premises storage systems, ensures performance and delivers flexibility, while eliminating the large on-premises footprint of days past.
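
The read path described here can be sketched as a simple fall-through lookup: try local flash first, then the metro PoP, then the master copy in the cloud. The `TieredStore` class below is a toy model rather than any vendor's implementation. Promoting data to the faster tiers on a miss is an assumption, and eviction is left out entirely.

```python
from typing import Dict, Optional, Tuple


class TieredStore:
    """Toy model of a read path: local flash -> metro PoP -> cloud master."""

    def __init__(self, cloud: Dict[str, bytes]) -> None:
        self.local: Dict[str, bytes] = {}   # hot tier, on-premises flash
        self.metro: Dict[str, bytes] = {}   # warm tier, metro edge PoP
        self.cloud = cloud                  # master copy of all data

    def read(self, key: str) -> Tuple[Optional[bytes], str]:
        """Return (value, tier that served it), promoting data on a miss."""
        if key in self.local:
            return self.local[key], "local"
        if key in self.metro:
            value = self.metro[key]
            self.local[key] = value         # assumed policy: promote to hot
            return value, "metro"
        if key in self.cloud:
            value = self.cloud[key]
            self.metro[key] = value         # assumed policy: fill both caches
            self.local[key] = value
            return value, "cloud"
        return None, "missing"


if __name__ == "__main__":
    store = TieredStore(cloud={"report.pdf": b"contents"})
    print(store.read("report.pdf")[1])  # first read comes from the cloud
    print(store.read("report.pdf")[1])  # repeat read is served locally
```

Only the first read of a cold object pays the long trip to the cloud. Subsequent reads are served from the tier closest to the user, which is what makes the experience feel like local flash.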

To further reduce the chance of high latency, don’t rely on the public internet for connectivity. Instead, tie all of these pieces of the storage puzzle together via a dedicated private network line. One might think that a private line would be prohibitively expensive compared with the public internet, but in fact, there is still so much dark fiber left over from the telecom overbuild of the late 1990s and early 2000s that prices are remarkably affordable. Because data moves along this dedicated line, it does not contend or mix with other parties' traffic, which has the added benefit of hardening security.

Even the big public cloud providers now recognize that enterprise IT cannot rely on the cloud alone for advanced services, such as next-gen IoT or primary storage. For example, AWS Outposts, a service announced in November 2018, enables customers to run AWS infrastructure on-premises with a direct connection to the AWS cloud. To accomplish this, AWS ships customers fully managed racks of AWS-designed compute and storage hardware, with a VMware-based option available.

Another service, AWS IoT SiteWise, part of a suite of IoT services, collects data from industrial sites through an on-premises gateway. AWS isn’t alone. Microsoft also announced new functionality for its own IoT service, Azure IoT Edge. Both AWS and Microsoft Azure realize the cloud alone will not provide the performance that enterprise IT and the growing number of advanced IoT applications will require.

It’s not a simple task to overcome the cloud’s inherent latency problem, but by combining intelligent caching with the edge, enterprise IT can do so and take full advantage of the cloud to simplify its storage infrastructure.

Laz Vekiarides is CTO at ClearSky Data.

 

AIwire