Deep Audio Waveform Prior

Arnon Turetzky, Tzvi Michelson, Yossi Adi, Shmuel Peleg

Research output: Contribution to journal › Conference article › peer-review

2 Scopus citations

Abstract

Convolutional neural networks contain strong priors for generating natural-looking images [1]. These priors enable image denoising, super-resolution, and inpainting in an unsupervised manner. Previous attempts to demonstrate similar ideas in audio, namely deep audio priors, (i) use hand-picked architectures such as harmonic convolutions, (ii) only work with spectrogram input, and (iii) have been used mostly for eliminating Gaussian noise [2]. In this work we show that existing state-of-the-art (SOTA) architectures for audio source separation contain deep priors even when working with the raw waveform. Deep priors can be discovered by training a neural network to generate a single corrupted signal when given white noise as input. A network with relevant deep priors is likely to generate a cleaner version of the signal before converging on the corrupted signal. We demonstrate this restoration effect with several corruptions: background noise, reverberations, and a gap in the signal (audio inpainting).
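The fitting procedure described in the abstract can be illustrated with a toy sketch. This is not the architecture or code used in the paper. It assumes a deliberately simple linear stand-in: a single trainable circular convolution applied to a fixed white-noise input, followed by a fixed moving-average smoother that plays the role of the architectural prior. All names (fit_to_corrupted, smooth, circ_conv, circ_corr) and all settings (signal length, noise level, step count) are hypothetical choices made for illustration. The clean signal is passed in only to monitor the trajectory; it is never used for training.

```python
import math
import random


def circ_conv(x, k):
    """Circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[(i - j) % n] * k[j] for j in range(n)) for i in range(n)]


def circ_corr(x, r):
    """Adjoint of circ_conv with respect to the kernel k."""
    n = len(x)
    return [sum(x[(i - j) % n] * r[i] for i in range(n)) for j in range(n)]


def smooth(x, half_width=2):
    """Fixed symmetric circular moving average (toy stand-in for a prior)."""
    n = len(x)
    width = 2 * half_width + 1
    return [
        sum(x[(i + d) % n] for d in range(-half_width, half_width + 1)) / width
        for i in range(n)
    ]


def fit_to_corrupted(corrupted, clean, steps=200, seed=0):
    """Gradient descent that fits smooth(circ_conv(z, w)) to `corrupted`.

    Returns the per-step fitting loss and the per-step mean squared error
    against `clean` (used for monitoring only, never for the update).
    """
    rng = random.Random(seed)
    n = len(corrupted)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # fixed white-noise input
    w = [0.0] * n                                # trainable kernel
    # Conservative step size: (sum |z|)^2 upper-bounds the Lipschitz
    # constant of the gradient, so the fitting loss cannot increase.
    lr = 1.0 / sum(abs(v) for v in z) ** 2
    fit_losses, clean_errors = [], []
    for _ in range(steps):
        out = smooth(circ_conv(z, w))
        resid = [o - c for o, c in zip(out, corrupted)]
        fit_losses.append(0.5 * sum(r * r for r in resid))
        clean_errors.append(sum((o - c) ** 2 for o, c in zip(out, clean)) / n)
        # smooth is symmetric, so it is its own adjoint.
        grad = circ_corr(z, smooth(resid))
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return fit_losses, clean_errors


# Toy data: a low-frequency sinusoid corrupted by additive Gaussian noise.
n = 64
data_rng = random.Random(1)
clean = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]
corrupted = [c + data_rng.gauss(0.0, 0.3) for c in clean]

fit_losses, clean_errors = fit_to_corrupted(corrupted, clean)
# Step at which the generated signal is closest to the clean reference.
best_step = min(range(len(clean_errors)), key=clean_errors.__getitem__)
print("closest to clean at step", best_step, "of", len(clean_errors))
```

Comparing clean_errors along the trajectory with the error of the corrupted signal itself is one way to check whether intermediate outputs are cleaner than the fitting target. In a real restoration setting no clean reference is available, so the stopping point has to be chosen by other means.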

Original language: English
Pages (from-to): 2938-2942
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2022-September
State: Published - 2022
Event: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: 18 Sep 2022 to 22 Sep 2022

Bibliographical note

Publisher Copyright:
Copyright © 2022 ISCA.

Keywords

  • audio denoising
  • audio inpainting
  • deep priors
  • dereverberation
