Leveraging 5G to Support Inference at the Edge
Sponsored Content by Silicon Mechanics
The hardware and infrastructure that enable AI, machine learning, and inference have improved consistently year after year. As high-performance processing, GPU acceleration, and data storage have advanced, the performance available in the data center has become powerful enough to support machine learning and training workloads. Still, data bottlenecks persist over WANs when teams look to implement AI workloads in real-world production environments.
With the advent of 5G wireless networks, deploying AI at the edge – and managing the movement of crucial data between the edge and the data center – is becoming more practical. When you consider the high-bandwidth and low-latency advantages of 5G paired with the improvements in processing power, high-speed storage, and embedded accelerators within edge devices, you see a pathway towards powerful, real-world inference applications.
Inference at the edge is a major opportunity for businesses to gain a competitive advantage and improve operational efficiency. Use cases for edge computing include autonomous vehicles, natural language processing, and computer vision.
While 5G is a critical advancement that makes these edge deployments possible, it also demands substantial changes to edge devices and data center infrastructure. Most inference deployments will need greater computing power, expanded storage, and improved connectivity to handle demanding workloads, larger volumes of data, and faster transmission of data to and from the data center.
Even with the improved speed and efficiency of 5G networking, businesses cannot rely on these networks to always operate at peak performance. As adoption grows and more edge devices are deployed, network strength, bandwidth, and load will vary. Businesses will therefore need localized, low-latency compute and storage resources at the edge to meet their goals. Processing data locally improves performance, limits the amount of data that must be transmitted to cloud or on-premises data centers for intensive compute tasks, and reduces the risk of exceeding network bandwidth.
Real-World Artificial Intelligence at the Edge
Autonomous vehicles, just-in-time maintenance, real-time image processing: these are some of the applications in which organizations hope to deploy AI technologies such as machine learning and deep learning at the edge.
Machine learning and deep learning rely on massive amounts of data that must be stored and processed. Leveraging these technologies at the edge requires a tiered processing system in which data is partially analyzed and processed at the edge, then uploaded to a data center for further processing and for training of algorithms and artificial neural networks (ANNs). Before 5G, WANs were not fast enough to support an effective multi-tiered processing system.
In a tiered system, edge devices can carry some of the burden of data processing. However, AI workflows will require support from more powerful compute resources to train algorithms, enable human oversight, and analyze data. To support this, the data center will require significant changes.
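To make this tiered flow concrete, here is a minimal sketch of one common pattern: the edge device runs a lightweight model locally, acts immediately on high-confidence results, and forwards only low-confidence samples to the data center for heavier processing and eventual retraining. The model, upload function, and threshold below are hypothetical placeholders for illustration, not a specific product or API.

import random
from dataclasses import dataclass

# Below this confidence, defer the sample to the data center tier.
CONFIDENCE_THRESHOLD = 0.80

@dataclass
class Sample:
    sensor_id: str
    payload: list  # raw reading; stand-in for an image frame or sensor vector

def edge_infer(sample):
    """Stand-in for a lightweight, quantized model running on the edge device."""
    label = random.choice(["ok", "anomaly"])
    confidence = random.random()
    return label, confidence

def upload_to_datacenter(sample, label, confidence):
    """Stand-in for sending a record over the 5G/WAN link for re-inference and retraining."""
    print(f"uploading {sample.sensor_id}: {label} ({confidence:.2f})")

def process_at_edge(samples):
    """Act locally on confident results; send only uncertain samples upstream."""
    for sample in samples:
        label, confidence = edge_infer(sample)
        if confidence >= CONFIDENCE_THRESHOLD:
            pass  # confident result: act locally, nothing crosses the WAN
        else:
            upload_to_datacenter(sample, label, confidence)

if __name__ == "__main__":
    batch = [Sample(f"cam-{i}", [random.random()] * 4) for i in range(10)]
    process_at_edge(batch)

The confidence threshold is the knob that trades edge autonomy against WAN traffic: raising it sends more data upstream for review and retraining, while lowering it keeps more decisions local and minimizes bandwidth use.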
If your organization has been developing ANNs in the cloud or on a local cluster, moving to production inference at the edge comes with one notable caveat: you must consider the effect of this shift on the networking capabilities of your data center environment, both from the edge to the data center and between compute and storage within the system. This can have a ripple effect on power and cooling, which must be accounted for in system design.
Another consideration is the potential need for flexibility in workflows within the data center. With hundreds or thousands of edge devices producing and consuming data, supporting various applications and workloads from a single data center environment is key.
Some situations may require innovative technologies such as composable infrastructure to make this flexibility cost-effective. Composable infrastructure abstracts hardware resources from their physical location and manages them via software over the network fabric, so you can apply those assets where they are needed at any given time.
The data center is not the only area that requires significant consideration. As you plan to deploy inference devices on the edge, compute, storage, acceleration, and connectivity capabilities will play a major part in your success.
At the edge, supporting 5G connectivity in industrial mobile computing devices means rethinking many core pieces of their design, including RF antennas, power requirements, new hardware and firmware, new safety and regulatory testing, and cybersecurity tools.
New edge infrastructures will also need advanced security solutions to protect against the inherent risks of expanding your environment to thousands of decentralized devices. This means finding tools that eliminate redundant copies of data and resource silos, encrypt data in flight and at rest, and account for the physical-access risks of unmonitored nodes and embedded systems deployed around the world.
Regardless of how we each approach adopting inference at the edge, it will inevitably become a central technology in the enterprise businesses of tomorrow.
It is critical for organizations considering edge computing to find technology partners that know how to work with 5G. A partner with experience deploying cutting-edge AI, HPC, and data analytics workloads is ideal because of its understanding of emerging technologies, high-speed networking, and complex data management systems.
To learn more about how to prepare your data center for deploying inference at the edge, speak with an expert at siliconmechanics.com/contact.