Cohere For AI Launches Aya 23, 8B and 35B Parameter Open Weights Release
May 24, 2024 -- Cohere For AI is excited to announce Aya 23, a new family of state-of-the-art multilingual generative large language research models (LLMs) covering 23 different languages. Cohere For AI is releasing both the 8-billion and 35-billion parameter Aya 23 models as open weights as part of its continued commitment to multilingual research.
The Aya 23 model family builds on Aya, an open science movement that brought together 3,000 collaborators from around the world to build the largest multilingual instruction fine-tuning dataset to date and a state-of-the-art massively multilingual model. Aya 101 covered 101 languages and focused on breadth; with Aya 23, Cohere For AI focuses on depth by pairing a highly performant pre-trained model with the recently released Aya dataset collection. The result is a powerful multilingual large language research model serving 23 languages, expanding state-of-the-art language modeling capabilities to nearly half of the world's population.
Aya 23, as well as the wider family of Aya models and datasets, contributes to a paradigm shift in how the ML community approaches multilingual AI research. As LLMs, and AI generally, have changed the global technological landscape, many communities across the world have been left unsupported due to the language limitations of existing models.
Most high-performing language models serve only a handful of languages. Aya 23 is part of Cohere's commitment to contributing state-of-the-art research demonstrating that more languages can be treated as first-class citizens, and to releasing models that support researchers who join this mission.
Cohere benchmarks Aya 23's performance against both massively multilingual open source models such as Aya 101 and widely used open weight instruction-tuned models. The 35B-parameter Aya 23 achieves the highest results across all benchmarks for the languages covered, while the 8B-parameter Aya 23 demonstrates best-in-class multilingual performance. Through this release, Aya demonstrates superior capabilities in complex tasks such as natural language understanding, summarization, and translation across a wide linguistic spectrum.
The 8B-parameter version of Aya 23 reflects the company's commitment to developing highly efficient and accessible multilingual research models for everyday developers. Given its smaller size, the model requires fewer computational resources, an important factor in closing the gap for AI researchers globally and democratizing access to cutting-edge technology. This release is part of Cohere's ongoing research into delivering efficiency at scale.
Aya 23 is now available to experiment with, explore, and build on for fundamental research and safety auditing. You can experience the model at https://huggingface.co/spaces/CohereForAI/aya-23.
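For researchers who want to run the open weights locally rather than through the hosted demo, a minimal sketch using the Hugging Face transformers library is shown below. It assumes the 8B checkpoint is published under the model ID CohereForAI/aya-23-8B (not stated in this announcement) and that a recent transformers release with Cohere-family model support is installed:

```python
# Minimal sketch: loading Aya 23 open weights with Hugging Face transformers.
# The model ID below is an assumption, not confirmed in this announcement.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/aya-23-8B"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" spreads weights across available devices (requires accelerate)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Aya 23 is instruction-tuned, so prompts should go through the chat template.
messages = [{"role": "user", "content": "Translate to Turkish: How are you today?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```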
Learn more about this model and the broader Aya initiative at https://cohere.com/research/aya.
The Cohere For AI team is also sharing a technical report on Aya 23 with a complete set of evaluation results on multiple multilingual NLP benchmarks and generation quality assessments.
Source: Cohere for AI