ACCESS Resources Expand Open Science Horizons in AI and Speech Model Innovation
Dec. 7, 2023 -- The general public often isn’t aware of major research advancements or scientific breakthroughs until researchers cross the finish line. To most people, that’s usually when something becomes newsworthy – when results are finally official.
But there’s a whole race of research to be run before those results break through the tape. The open-science philosophy is trying to make that race shorter and easier to navigate. By making their results and methods public, researchers are hoping others can build upon the foundational work of those who came before.
The academic community and scientific industry can still be competitive, however. Research teams from Carnegie Mellon University (CMU), Shanghai Jiao Tong University in China and the Honda Research Institute in Japan are setting an example for open-science believers with their work on pre-training speech models.
Principal Investigator Shinji Watanabe and his fellow researchers used Delta – the most performant GPU-computing resource in the ACCESS portfolio – to reproduce the development methodology of Whisper, an automatic speech recognition system trained on nearly 700,000 hours of multilingual and multitask supervised data collected from the web. Delta is operated and maintained at the National Center for Supercomputing Applications (NCSA), an ACCESS resource provider. ACCESS provides these resources at no cost to researchers who need cyberinfrastructure to help power their research.
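For readers unfamiliar with what such a system does in practice, here is a minimal sketch using OpenAI’s open-source whisper Python package, whose inference code and model weights are public even though the training pipeline is not; the checkpoint name and audio file below are illustrative placeholders, not part of the team’s OWSM work.

```python
# Minimal sketch: transcribing an audio file with OpenAI's open-source
# "whisper" package (pip install openai-whisper). The file name is a
# placeholder chosen for illustration.
import whisper

# Load one of the publicly released Whisper checkpoints ("base" is small
# enough to run on a laptop; larger checkpoints trade speed for accuracy).
model = whisper.load_model("base")

# Transcribe the audio; Whisper detects the language automatically and
# returns the recognized text along with segment-level timestamps.
result = model.transcribe("example_interview.wav")
print(result["text"])
```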
Whisper is developed and maintained by OpenAI, and the full scope of development for its models – from data collection to training – is not publicly available, making it difficult for researchers to further improve its performance and address training-related issues such as efficiency, robustness, fairness and bias. Programs like ACCESS can help give academic researchers the same advantages as large corporations – companies that have the kinds of resources needed to train on massive data sets, like the natural language processing models developed at OpenAI.
"While research on large-scale pre-training has exclusively been done by big tech companies, Delta is helping change this paradigm," said Watanabe. "Thanks to the generous resources and support provided by Delta, researchers from academia now have the capability to train state-of-the-art models at an industry scale. Notably, our open Whisper-style model (OWSM) stands out as the first large-scale speech model developed by the academic community."
“The sizing and composition of the computational and storage resources on Delta allow researchers in AI/ML to quickly train new models and make them available to the academic community,” said Greg Bauer, a senior technical program manager at NCSA.
“Open source with accessible data is an essential component of scientific research,” Watanabe said. “It connects researchers, contributes to the community and makes AI technologies transparent.”
Project Details
Resource Provider Institution(s): National Center for Supercomputing Applications (NCSA)
Affiliations: Carnegie Mellon University (CMU), Shanghai Jiao Tong University in China, Honda Research Institute in Japan
Funding Agency: NSF
Grant or Allocation Number(s): CIS230250
The science story featured here was enabled by the ACCESS program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.
Source: Andrew Helregel, NCSA/ACCESS