Informational Only

This challenge is no longer accepting new submissions.

Automatic Speech Recognition in Reverberant Environments (ASpIRE)

Build automatic speech recognition systems able to perform well across a variety of acoustic environments without matched training data

Office of the Director of National Intelligence - Intelligence Advanced Research Projects Activity

Total Cash Prizes Offered: $110,000
Type of Challenge: Scientific
Submission Start: 02/11/2015 12:00 AM ET
Submission End: 02/26/2015 12:00 AM ET

This challenge is externally hosted.

You can view the challenge details here: https://www.innocentive.com/ar/challenge/9933624

Description

Automatic speech recognition software that works in a variety of acoustic environments and recording scenarios is a holy grail of the speech research community. IARPA’s Automatic Speech recognition In Reverberant Environments (ASpIRE) Challenge is seeking that grail.

Who We Are: The Intelligence Advanced Research Projects Activity (IARPA) focuses on high-risk, high-payoff research. The ASpIRE challenge is a spin-off of our Babel program, which works to develop agile and robust technology that can be rapidly applied to any human language.

What We’re Doing: Previous work has shown that automatic speech recognition (ASR) performance degrades in room microphone conditions, especially when data used for training is mismatched with data used in testing. The ASpIRE challenge asks the public to develop approaches to mitigate the effects of these conditions and create software that can function in many acoustic environments and recording scenarios. Participants can address either the single-microphone or the multimicrophone scenario.

Where We’re Doing This: Around the world—anyone over age 18 is welcome to participate. We’re looking for solutions from anyone who thinks they might have a way of addressing this problem, including analysts, natural language processing (NLP) specialists, machine learning programmers, and even experts in disciplines we haven’t yet considered.

When We’re Doing This: Fifteen hours of data (divided into a development set and development-test set) have been posted on the challenge website. These data, which consist of multimicrophone recordings of conversational speech with transcriptions, are meant to be used for optimization, training selection, and tuning. At any time during this period, solvers may run their software against the data and revise their solutions.

During the evaluation period, participants will be given approximately 10 hours of new transcribed far-field microphone data from noisy, reverberant rooms. These data will be divided into single-microphone and multimicrophone conditions, and word error rate will be the objective measure of performance. To be eligible for an award, single-microphone submissions must be received before February 19, 2015, and multimicrophone submissions before February 27, 2015.

Why We’re Doing This: Challenges are widely recognized as a cost-efficient way to gather cross-disciplinary solutions to difficult problems. Challenges to stimulate breakthroughs in science and technology also support the White House’s Strategy for American Innovation, as well as government transparency and efficiency. By sponsoring full and open competition via challenges, IARPA is tackling the most challenging research questions of today—and changing the future of technology.

Why Participate? The ASpIRE Challenge gives experts the opportunity to contribute to technological breakthroughs that can make what has been impossible in the ASR community—software that works in a variety of acoustic environments and recording scenarios—a reality.

We offered 2 prizes: $30,000 for software that addresses the single-microphone condition and $20,000 for software that addresses the multi-microphone condition.

Winners

The team from the Center for Language and Speech Processing, Johns Hopkins University (Vijayaditya Peddinti, Guoguo Chen, Dr. Daniel Povey, Dr. Sanjeev Khudanpur)

The multi-institutional team from Raytheon BBN Technologies (Jeff Ma, Roger Hsiao, William Hartmann, Rich Schwartz, Stavros Tsakalidis), Brno University of Technology (Martin Karafiat, Lukas Burget, Igor Szoke, Frantisek Grezl), and Johns Hopkins University (Sri Harish Mallidi, Hynek Hermansky)

The team from the Institute for Infocomm Research, A*STAR, Singapore (Dr. Jonathan William Dennis and Dr. Tran Huy Dat)

Prizes

First Prize - Single Microphone Condition
Cash Prize Amount: $30,000

First Prize - Single Microphone
Cash Prize Amount: $30,000

First Prize - Single Microphone
Cash Prize Amount: $30,000

First Prize - Multiple Microphone
Cash Prize Amount: $20,000

Rules

ASpIRE Evaluation Plan 11-21-14

Judging Criteria

Objective criteria
Systems were evaluated for Word Error Rate on the evaluation data set.
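
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below illustrates that textbook definition only. It is not the challenge's scoring tool, and it assumes whitespace tokenization with no text normalization; the function name word_error_rate is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace and compared exactly,
    with no case folding or other normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution (was -> is) and one deletion (very) over 5 reference words.
print(word_error_rate("the room was very noisy", "the room is noisy"))  # 0.4
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.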

How To Enter

All submissions must be entered through Innocentive. Eligibility requirements and official entries are also handled through Innocentive.