Discriminative Kernel-based phoneme sequence recognition

Joseph Keshet*, Shai Shalev-Shwartz, Samy Bengio, Yoram Singer, Dan Chazan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

12 Scopus citations

Abstract

We describe a new method for phoneme sequence recognition given a speech utterance, which is not based on the HMM. In contrast to HMM-based approaches, our method uses a discriminative kernel-based training procedure in which the learning process is tailored to the goal of minimizing the Levenshtein distance between the predicted phoneme sequence and the correct sequence. The phoneme sequence predictor is devised by mapping the speech utterance along with a proposed phoneme sequence to a vector-space endowed with an inner-product that is realized by a Mercer kernel. Building on large margin techniques for predicting whole sequences, we are able to devise a learning algorithm which distills to separating the correct phoneme sequence from all other sequences. We describe an iterative algorithm for learning the phoneme sequence recognizer and further describe an efficient implementation of it. We present initial encouraging experimental results with the TIMIT and compare the proposed method to an HMM-based approach.

Original languageEnglish
Title of host publicationINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
PublisherInternational Speech Communication Association
Pages593-596
Number of pages4
ISBN (Print)9781604234497
StatePublished - 2006
EventINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP - Pittsburgh, PA, United States
Duration: 17 Sep 200621 Sep 2006

Publication series

NameProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume2
ISSN (Electronic)1990-9772

Conference

ConferenceINTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP
Country/TerritoryUnited States
CityPittsburgh, PA
Period17/09/0621/09/06

Keywords

  • Acoustic modeling
  • Phoneme recognition
  • Speech recognition
  • Support vector machines

Fingerprint

Dive into the research topics of 'Discriminative Kernel-based phoneme sequence recognition'. Together they form a unique fingerprint.

Cite this