TY - JOUR
T1 - HEBDB
T2 - 25th Interspeech Conference 2024
AU - Turetzky, Arnon
AU - Tal, Or
AU - Segal-Feldman, Yael
AU - Dissen, Yehoshua
AU - Zeldes, Ella
AU - Roth, Amit
AU - Cohen, Eyal
AU - Shrem, Yosi
AU - Chernyak, Bronya R.
AU - Seleznova, Olga
AU - Keshet, Joseph
AU - Adi, Yossi
N1 - Publisher Copyright: © 2024 International Speech Communication Association. All rights reserved.
PY - 2024
Y1 - 2024
N2 - We present HEBDB, a weakly supervised dataset for spoken language processing in the Hebrew language. HEBDB offers roughly 2500 hours of natural and spontaneous speech recordings in the Hebrew language, consisting of a large variety of speakers and topics. We provide raw recordings together with a pre-processed, weakly supervised, and filtered version. The goal of HEBDB is to further enhance research and development of spoken language processing tools for the Hebrew language. Hence, we additionally provide two baseline systems for Automatic Speech Recognition (ASR): (i) a self-supervised model; and (ii) a fully supervised model. We present the performance of these two methods optimized on HEBDB and compare them to current multi-lingual ASR alternatives. Results suggest the proposed method reaches better results than the evaluated baselines considering similar model sizes. Dataset, code, and models are publicly available under https://pages.cs.huji.ac.il/adiyoss-lab/HebDB/.
KW - Automatic Speech Recognition
KW - Hebrew Speech Technologies
KW - Speech Benchmark
UR - http://www.scopus.com/inward/record.url?scp=85214787499&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2024-417
DO - 10.21437/Interspeech.2024-417
M3 - Conference article
AN - SCOPUS:85214787499
SN - 2308-457X
SP - 1360
EP - 1364
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Y2 - 1 September 2024 through 5 September 2024
ER -