Together AI Introduces Instant GPU Clusters and Expanded NVIDIA Blackwell GPU Deployments
SAN JOSE, Calif., March 18, 2025 -- Together AI is unveiling new capabilities at NVIDIA GTC, including the deployment of NVIDIA Blackwell GPUs at scale and the preview release of Together Instant GPU Clusters. An NVIDIA Cloud Partner and Gold Sponsor of GTC, Together AI delivers scalable, high-performance AI infrastructure built for efficiency and speed to support the needs of AI pioneers and enterprises.
Together AI is rapidly deploying thousands of NVIDIA Blackwell GPUs to accelerate the next generation of AI workloads. Together GPU Clusters featuring NVIDIA HGX B200 are accelerated by the Together Kernel Collection to deliver 90% faster training than NVIDIA Hopper, achieving 15,200 tokens/second/node on a training run for a 70B-parameter large language model.
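As a rough illustration of what the throughput figure implies, the sketch below back-computes the Hopper baseline and estimates wall-clock training time. It assumes that "90% faster" means 1.9x the baseline throughput and that throughput scales linearly with node count. Both are simplifying assumptions rather than claims from Together AI, and the function names are hypothetical.

```python
def implied_baseline_throughput(new_tokens_per_sec: float, speedup_fraction: float) -> float:
    """Baseline throughput implied by 'X% faster', read as new = baseline * (1 + X)."""
    return new_tokens_per_sec / (1.0 + speedup_fraction)


def training_days(total_tokens: float, tokens_per_sec_per_node: float, nodes: int) -> float:
    """Wall-clock days to process total_tokens, assuming perfect linear scaling (an idealization)."""
    seconds = total_tokens / (tokens_per_sec_per_node * nodes)
    return seconds / 86_400  # seconds per day


b200_rate = 15_200.0  # tokens/second/node, the figure quoted above
hopper_rate = implied_baseline_throughput(b200_rate, 0.90)
print(round(hopper_rate))  # 8000 tokens/second/node under the 1.9x reading
```

Under these assumptions, the same token budget on the same number of nodes would take about 1.9 times as long on the implied Hopper baseline.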
Built on a custom TSMC 4NP process, NVIDIA Blackwell GPUs offer a significant leap in compute efficiency and throughput, with advanced FP8 Tensor Cores delivering up to 35x the performance of previous-generation accelerators.
Introducing Together Instant GPU Clusters – Now in Preview
Today, Together AI is launching the preview release of Together Instant GPU Clusters, offering up to 64 NVIDIA Hopper GPUs (80GB SXM) per deployment. These clusters are interconnected via NVIDIA Quantum-2 InfiniBand and NVIDIA NVLink, delivering ultra-low latency and high bandwidth for AI teams that need rapid access to high-performance compute resources.
Together Instant GPU Clusters accelerated by NVIDIA are fully self-service and can be spun up in minutes via the Together AI console. Whether for burst compute during peak demand, model validation before major investments, or large-scale AI training and inference, these clusters provide flexible, on-demand infrastructure. AI teams can skip lengthy sales processes and move straight to AI research, experimentation, and production workloads with NVIDIA-accelerated performance.
Together AI provides scalable, high-performance NVIDIA GPU clusters tailored to AI teams at every stage:
- Instant GPU Clusters (Up to 64 NVIDIA Hopper GPUs) – Self-service, high-speed clusters designed for rapid iteration and burst compute demands, available in minutes via the Together AI console.
- Dedicated GPU Clusters (64 – 1,000 NVIDIA GPUs) – Custom-configured, high-density compute infrastructure optimized for large-scale training and inference. These clusters leverage NVIDIA HGX architectures and are integrated with Together AI's advanced performance optimizations.
- Custom GPU Clusters (1,000 – 100,000+ NVIDIA GPUs) – Hyperscale deployments designed for enterprise AI supercomputing projects, including multi-region AI Factories with ultra-high-density GPU compute capacity.
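The three tiers above can be read as a simple lookup by GPU count. The sketch below is illustrative only: the function name is hypothetical, and because the published ranges share their boundary values (64 and 1,000), it assumes a boundary count falls into the smaller tier. Note also that the Instant tier is described as Hopper-only, which this sketch does not model.

```python
def cluster_tier(gpu_count: int) -> str:
    """Map a requested GPU count to the offering named in the list above.

    Assumption: boundary values (64 and 1,000) are assigned to the smaller tier,
    since the published ranges overlap at those points.
    """
    if gpu_count < 1:
        raise ValueError("gpu_count must be a positive integer")
    if gpu_count <= 64:
        return "Instant GPU Clusters"
    if gpu_count <= 1_000:
        return "Dedicated GPU Clusters"
    return "Custom GPU Clusters"


for n in (8, 64, 256, 5_000):
    print(n, "->", cluster_tier(n))
```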
Seamless Integration with NVIDIA AI Enterprise and NIM
Together AI is an NVIDIA Cloud Partner delivering AI compute optimized for NVIDIA AI. Customers can now deploy NVIDIA NIM microservices, part of NVIDIA AI Enterprise, directly from build.nvidia.com to Together AI, streamlining AI application deployment at scale.
NVIDIA NIM enables enterprises to deploy production-ready AI models with pre-optimized inference performance, supporting state-of-the-art models such as Nemotron-4 340B and NVIDIA NeMo Retriever. With Together AI's optimized infrastructure, customers benefit from reduced inference latency, higher throughput, and seamless integration with the latest AI frameworks included in NVIDIA AI Enterprise.
Together AI continues to push the boundaries of AI infrastructure, providing the flexibility, performance, and scalability needed to power the next generation of AI breakthroughs. Whether you're training cutting-edge AI models, scaling research, or deploying mission-critical AI systems, Together AI delivers the compute resources to fuel AI innovation.
Together AI is a Gold Sponsor at NVIDIA GTC 2025 and is exhibiting at Booth #1332.
About Together AI
Together AI empowers developers and enterprises to train, fine-tune, and run inference on generative AI models, delivering unparalleled performance, control, and cost-efficiency. The Together AI Platform supports a comprehensive range of top open source and custom models across multiple modalities, while offering flexible deployment options with the highest levels of privacy and security. Committed to advancing the frontier of AI through open collaboration, innovation, and transparency, Together AI ensures that powerful AI systems remain accessible and flexible while creating optimal outcomes for society. To start fine-tuning and running the world's best open source models, visit together.ai.
Source: Together AI