Toward deeper understanding of neural networks: The power of initialization and a dual view on expressivity

Amit Daniely, Roy Frostig, Yoram Singer

Research output: Contribution to journal › Conference article › peer-review

170 Scopus citations

Abstract

We develop a general duality between neural networks and compositional kernel Hilbert spaces. We introduce the notion of a computation skeleton, an acyclic graph that succinctly describes both a family of neural networks and a kernel space. Random neural networks are generated from a skeleton through node replication followed by sampling from a normal distribution to assign weights. The kernel space consists of functions that arise by compositions, averaging, and non-linear transformations governed by the skeleton's graph topology and activation functions. We prove that random networks induce representations which approximate the kernel space. In particular, it follows that random weight initialization often yields a favorable starting point for optimization despite the worst-case intractability of training neural networks.
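The claim that random networks approximate the kernel space can be checked numerically in the simplest possible setting. The following is a minimal sketch, not the authors' construction: it assumes a single fully connected hidden layer, a ReLU activation rescaled to unit second moment under a standard normal input, and unit-norm inputs. Under those assumptions the limiting kernel has a well-known closed form (the degree-1 arc-cosine kernel). The function names, dimensions, and widths are hypothetical choices for illustration.

```python
import numpy as np


def relu_dual_kernel(rho):
    """Closed form of E[s(u) s(v)] for s(z) = sqrt(2) * max(0, z),
    where (u, v) are standard normals with correlation rho."""
    rho = np.clip(rho, -1.0, 1.0)  # guard against round-off outside [-1, 1]
    return (np.sqrt(1.0 - rho**2) + (np.pi - np.arccos(rho)) * rho) / np.pi


def random_relu_features(X, width, rng):
    """Push the rows of X through one random layer.

    Weights are i.i.d. N(0, 1); the sqrt(2 / width) factor combines the
    ReLU normalization with averaging over the `width` hidden units.
    """
    W = rng.standard_normal((X.shape[1], width))
    return np.sqrt(2.0 / width) * np.maximum(X @ W, 0.0)


def empirical_kernel_error(dim=10, width=20000, n_points=5, seed=0):
    """Largest entrywise gap between the random-feature Gram matrix
    and the closed-form kernel on random unit-norm inputs."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_points, dim))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # project onto the unit sphere
    Phi = random_relu_features(X, width, rng)
    empirical = Phi @ Phi.T              # inner products of random representations
    exact = relu_dual_kernel(X @ X.T)    # kernel evaluated on input correlations
    return float(np.max(np.abs(empirical - exact)))


if __name__ == "__main__":
    for width in (100, 1000, 10000):
        print(width, empirical_kernel_error(width=width))
```

Running the script prints the approximation gap at several widths; the gap should shrink as the layer gets wider, which is the single-layer analogue of the approximation result stated in the abstract.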

Original language: American English
Pages (from-to): 2261-2269
Number of pages: 9
Journal: Advances in Neural Information Processing Systems
State: Published - 2016
Externally published: Yes
Event: 30th Annual Conference on Neural Information Processing Systems, NIPS 2016 - Barcelona, Spain
Duration: 5 Dec 2016 to 10 Dec 2016

Bibliographical note

Publisher Copyright:
© 2016 NIPS Foundation - All Rights Reserved.
