Abstract
Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous studies have highlighted the efficacy of discrete units in various applications such as speech compression and restoration, speech recognition, and speech generation. To foster exploration in this domain, we introduce the Interspeech 2024 Challenge, which focuses on new speech processing benchmarks using discrete units. It encompasses three pivotal tasks, namely multilingual automatic speech recognition, text-to-speech, and singing voice synthesis, and aims to assess the potential applicability of discrete units in these tasks. This paper outlines the challenge designs and baseline descriptions. We also collate baseline and selected submission systems, along with preliminary findings, offering valuable contributions to future research in this evolving field.
| Original language | English |
|---|---|
| Pages (from-to) | 2559-2563 |
| Number of pages | 5 |
| Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
| DOIs | |
| State | Published - 2024 |
| Event | 25th Interspeech Conferece 2024 - Kos Island, Greece Duration: 1 Sep 2024 → 5 Sep 2024 |
Bibliographical note
Publisher Copyright:© 2024 International Speech Communication Association. All rights reserved.
Keywords
- discrete speech units
- singing voice synthesis
- speech recognition
- text-to-speech
Fingerprint
Dive into the research topics of 'The Interspeech 2024 Challenge on Speech Processing Using Discrete Units'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver