
FedML Launches FEDML Nexus AI, a Next-Gen Cloud Platform Offering Greater Access to GPU Resources 

SUNNYVALE, Calif., Oct. 24, 2023 -- Today, FedML, a rapidly growing startup in artificial intelligence (AI), officially announced the release of FEDML Nexus AI, a next-generation cloud service and platform for generative AI.

As large language models (LLMs) and other generative AI applications gain prominence and global GPU demand intensifies, a wave of new GPU providers and resellers has emerged.

FedML CEO Salman Avestimehr commented: “Developers need a way to quickly and easily find and provision the best GPU resources across multiple providers, minimize costs, and launch their AI jobs without worrying about tedious environment setup and management for complex generative AI workloads. They may even want to develop privately on their own infrastructure or in a hybrid manner. No such technology exists today. FEDML Nexus AI bridges this gap in the market and provides such capabilities to developers and enterprises.”

The multi-faceted capabilities of FEDML Nexus AI are as follows:

  • GPU Marketplace for AI Development: Addressing the current shortage of compute nodes and GPUs caused by skyrocketing demand for AI models in enterprise applications, FEDML Nexus AI offers a massive GPU marketplace with over 18,000 compute nodes. Beyond partnering with prominent data centers and GPU providers, the FedML GPU marketplace also lets individuals join easily via its "Share and Earn" interface.
  • Unified ML Job Scheduler and GPU Manager: With a simple "fedml launch your_job.yaml" command, developers can instantly launch AI jobs (training, deployment, federated learning) on the most cost-effective GPU resources, without tedious resource provisioning, environment setup, and management. FedML Launch supports any compute-intensive job for LLMs and generative AI, including large-scale distributed training, serverless or dedicated deployment endpoints, and large-scale similarity search in vector databases. It also enables cluster management and deployment of ML jobs across on-premises, private, and hybrid clouds.
  • Zero-code LLM Studio: As enterprises increasingly seek to create private, bespoke, and vertically tailored LLMs, FEDML Nexus AI Studio empowers any developer to train, fine-tune, and deploy generative AI models code-free. This Studio allows companies to seamlessly create specialized LLMs with their proprietary data in a secure and cost-effective manner.
  • Optimized MLOps and Compute Libraries for Diverse AI Jobs: Catering to advanced ML developers, FEDML Nexus AI provides powerful MLOps platforms for distributed model training, scalable model serving, and edge-based federated learning. FedML Train offers robust distributed model training with advanced resource optimization and observability. FedML Deploy provides MLOps for swift, auto-scaled model serving, with endpoints on decentralized cloud or on-premises. FedML Federate extends model training and serving to edge servers and smartphones, enhancing privacy compliance and optimizing costs. For developers looking for quick solutions, FEDML Nexus AI's Job Store houses pre-packaged compute libraries for diverse AI jobs, from training to serving to federated training.
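
As a rough illustration of the scheduling idea in the second bullet above, the sketch below picks the lowest-priced GPU offer that satisfies a job's requirements. It is not FedML's implementation: the class names, the pick_cheapest function, the provider names, and the prices are all hypothetical and chosen only to show the matching step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuOffer:
    provider: str          # hypothetical provider name
    gpu_type: str          # e.g. "A100"
    num_gpus: int          # GPUs included in this offer
    price_per_hour: float  # USD per hour for the whole offer (made-up values)


@dataclass(frozen=True)
class JobRequest:
    gpu_type: str
    min_gpus: int
    max_price_per_hour: float


def pick_cheapest(job: JobRequest, offers: list[GpuOffer]) -> Optional[GpuOffer]:
    """Return the lowest-priced offer that meets the job's needs, or None."""
    eligible = [
        o for o in offers
        if o.gpu_type == job.gpu_type
        and o.num_gpus >= job.min_gpus
        and o.price_per_hour <= job.max_price_per_hour
    ]
    # default=None covers the case where no offer qualifies.
    return min(eligible, key=lambda o: o.price_per_hour, default=None)


# Hypothetical marketplace listings from three providers.
offers = [
    GpuOffer("provider-a", "A100", 1, 2.10),
    GpuOffer("provider-b", "A100", 2, 1.75),
    GpuOffer("provider-c", "H100", 1, 3.40),
]
job = JobRequest(gpu_type="A100", min_gpus=1, max_price_per_hour=2.00)
best = pick_cheapest(job, offers)
print(best.provider if best else "no match")
```

A production scheduler would also need to weigh availability, data locality, and reliability; this sketch filters on GPU type, GPU count, and price only.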

“The AI community has made significant progress in developing new paradigms and innovative models for generative AI,” said Aiden Chaoyang He, co-founder and CTO of FedML. “However, challenges in cloud computing still hinder the productionization of these new AI models and research frameworks for native AI apps. These challenges include low availability of GPUs, high cloud costs, fragmented software stack for AI, and inefficiency in development and operations. Our latest Nexus AI is a significant step forward in addressing these challenges. It utilizes FedML Launch, an incredibly versatile cross-cloud scheduler, to coordinate various computing frameworks for training and deployment in a unified and user-friendly manner. This not only saves developers and enterprises the time and effort of finding GPU resources and dealing with fragmented multi-stage machine learning pipelines, but also simplifies complex multi-step workflows in cutting-edge directions such as LLM-based AI agents.”

He concluded: “Our unwavering vision is to provide superior AI infrastructure for the rapidly growing AI community. We are fortunate to have climbed this mountain starting with federated learning, where we have innovated the most complex building blocks, such as schedulers, orchestrators, and distributed systems for ML. These innovations have been further leveraged to meet the general demands of model training and deployment.”

Developers and enterprises eager to leverage the capabilities of FEDML Nexus AI can sign up now at https://nexus.fedml.ai.

FedML was co-founded by Avestimehr, a Dean’s Professor at USC and the inaugural director of the USC + Amazon Center on Secure & Trusted Machine Learning, and his former PhD student Dr. Chaoyang He, who has published several award-winning papers and has more than 10 years of R&D experience at Google, Amazon, Facebook, Tencent and Baidu. Over the past four years, Avestimehr and He have worked with nearly 40 collaborators to build FedML’s open-source library and commercial software, which combine federated learning tools with an industrial-grade MLOps platform and secure data marketplace.

About FedML

FedML is a leader in custom AI development, using distributed AI and federated learning to help companies build and train their own AI models. FedML’s enterprise software platform and open-source library empower developers to train, deploy and customize models across edge and cloud nodes at any scale. FedML’s distributed MLOps platform uniquely enables sharing of data, models, and compute resources in a way that preserves data privacy and security. The company hosts the top-ranked GitHub library for federated learning, and its software is used by more than 3,000 developers globally and 10 enterprise customers spanning multiple industries.


Source: FedML

AIwire