
OctoML Unveils Next Iteration Of ML Deployment Platform 

SEATTLE, Dec. 16, 2021 — OctoML today announced the latest release of its Machine Learning (ML) Deployment Platform to empower enterprises to scale their ML operations (MLOps). Launched at TVMcon 2021, the open source conference for ML acceleration, the new release enables enterprises to automate the optimization, performance benchmarking and deployment of production-ready ML models across the broadest array of clouds, hardware devices and ML acceleration engines.

The new platform supports the three leading clouds (AWS, Microsoft Azure, and Google Cloud Platform) and a wide choice of hardware options, including NVIDIA GPUs, Intel and AMD CPUs, and edge platforms such as NVIDIA Jetson and Arm Cortex-A.

“Enterprises today face significant challenges with scaling the deployment of their trained models. In fact, research shows that nearly two-thirds of models take over a month to deploy into production,” said Luis Ceze, CEO, OctoML. “This is because model performance tuning and optimization is largely done manually. Also, models, software platforms, and inference targets are rapidly evolving, requiring highly skilled resources on an ongoing basis. This latest iteration breaks these bottlenecks, making machine learning economically viable and enabling faster innovation.”

A number of OctoML customers are already using the new platform to power their ML model “factories” where trained ML models enter the platform and the output is a package containing that same model—accelerated across the users’ chosen deployment targets. OctoML customers are now able to—through either UI or API-driven workflows—complete dozens of accelerations a week. Customers are also able to leverage performance benchmarking insights to dramatically improve their time to market and reduce the cost of inference serving through model speed-up.
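The announcement does not include platform code, but the flow it describes (a trained model goes in, an accelerated, deployable package comes out) can be illustrated with the open-source Apache TVM stack that OctoML builds on. In the minimal sketch below, the model file name, input name and shape, and CPU target string are illustrative assumptions rather than details of OctoML's product API.

    # A minimal sketch using open-source Apache TVM; "resnet50.onnx", the
    # "data" input name, and the CPU target string are assumptions for
    # illustration, not OctoML product API details.
    import onnx
    import tvm
    from tvm import relay

    # Load a trained ONNX model without any format conversion.
    onnx_model = onnx.load("resnet50.onnx")
    shape_dict = {"data": (1, 3, 224, 224)}  # assumed input name and shape
    mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

    # Compile the model for a chosen deployment target (here, a generic x86 CPU).
    target = "llvm -mcpu=core-avx2"
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)

    # Export the compiled model as a self-contained, deployable artifact.
    lib.export_library("resnet50_x86.so")

In a production "factory" setting, the same compile-and-package step would be repeated per model and per deployment target, which is the work the platform's UI and API workflows automate.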

Benefits and features of the new OctoML platform include:

  • Expanded choice of deployment targets
    • Microsoft Azure target support provides choice across all three major clouds, including AWS and Google Cloud Platform.
    • AMD and Intel CPUs and NVIDIA GPUs are target options in each cloud.
    • Extensive edge support, including NVIDIA Jetson AGX Xavier and Jetson Xavier NX, along with Arm Cortex-A72 CPUs running 32- and 64-bit operating systems.
  • Pre-accelerated Model Zoo which includes:
    • A Computer Vision (Image Classification and Object Detection) set that includes ResNet, YOLO, MobileNet, Inception, and more.
    • A Natural Language Processing (NLP) set that includes BERT, GPT-2, and more.
  • Improved performance across the widest breadth of ML models
    • Expanded model format support that includes ONNX, TensorFlow Lite, and several TensorFlow model packaging formats—so users can upload their trained models without conversion.
    • Three new acceleration engines (ONNX Runtime, TensorFlow, and TensorFlow Lite) in addition to TVM, providing the optimal performance acceleration and insights for every model.
    • TVM performance speedups across the Model Zoo show a geometric mean of 2.2x on CPU and GPU compared to the TensorFlow baseline.
  • Streamlined approach to provide data-driven decisions
    • Enhanced performance benchmarking comparison workflows enable swift decision-making (see the benchmarking sketch after this list).
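For illustration, the kind of benchmarking comparison referenced above can be approximated with open-source tools by timing a TVM-compiled module against an ONNX Runtime baseline on the same input. The file names, the "data" input name, and the shape below are assumptions carried over from the earlier sketch; the platform itself runs such comparisons across clouds and hardware targets automatically.

    # A minimal latency-comparison sketch (TVM-compiled module vs. an ONNX
    # Runtime baseline). File names, the "data" input name, and the shape are
    # assumptions, not OctoML platform details.
    import time
    import numpy as np
    import onnxruntime as ort
    import tvm
    from tvm.contrib import graph_executor

    x = np.random.rand(1, 3, 224, 224).astype("float32")

    # Time the artifact exported in the earlier sketch with TVM's runtime.
    dev = tvm.cpu(0)
    lib = tvm.runtime.load_module("resnet50_x86.so")
    module = graph_executor.GraphModule(lib["default"](dev))
    module.set_input("data", x)
    tvm_ms = module.module.time_evaluator("run", dev, number=50)().mean * 1000

    # Time the original ONNX model with ONNX Runtime as the baseline.
    sess = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])
    start = time.perf_counter()
    for _ in range(50):
        sess.run(None, {"data": x})
    ort_ms = (time.perf_counter() - start) / 50 * 1000

    print(f"TVM: {tvm_ms:.2f} ms/inference, ONNX Runtime baseline: {ort_ms:.2f} ms/inference")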

“We’re pleased to support OctoML with the power of Microsoft Azure,” said John Montgomery, CVP, Azure AI, Microsoft. “The new platform release not only offers customers more automation, choice and performance in their ML journey, but also allows enterprises to take advantage of the security, flexibility and reliability Azure provides.”

Enhanced, High-Impact Free Trial Experience

A free trial of OctoML provides access to the Model Zoo, a collection of pre-accelerated, widely adopted computer vision and natural language processing models. These models offer state-of-the-art performance across each deployment target available within the platform. The result is an instant “out of the box” experience that, for the first time, gives customers clear visibility into performance-based insights across a myriad of hardware targets, including the three major cloud providers and leading-edge silicon.

The free trial also gives customers the opportunity to test and accelerate their in-house developed models across their choice of deployment targets.

To register for the free trial, please visit: https://octoml.ai/start-for-free.

About OctoML

OctoML is a machine learning deployment platform based in Seattle, Washington. OctoML aims to accelerate model performance while enabling seamless deployment of models across any hardware platform, cloud provider, or edge device. The company’s investors include Addition, Madrona Venture Group, and Amplify Partners. OctoML was founded by the creators of open-source Apache TVM: CEO Luis Ceze, CTO Tianqi Chen, CPO Jason Knight, Chief Architect Jared Roesch, and VP of Technology Partnerships Thierry Moreau. For more information, please visit https://octoml.ai.


Source: OctoML
