TY - JOUR
T1 - Representations and generalization in artificial and brain neural networks
AU - Li, Qianyi
AU - Sorscher, Ben
AU - Sompolinsky, Haim
N1 - Publisher Copyright:
Copyright © 2024 the Author(s). Published by PNAS.
PY - 2024/7
Y1 - 2024/7
AB - Humans and animals excel at generalizing from limited data, a capability yet to be fully replicated in artificial intelligence. This perspective investigates generalization in biological and artificial deep neural networks (DNNs), in both in-distribution and out-of-distribution contexts. We introduce two hypotheses: First, the geometric properties of the neural manifolds associated with discrete cognitive entities, such as objects, words, and concepts, are powerful order parameters. They link the neural substrate to generalization capabilities and provide a unified methodology bridging gaps between neuroscience, machine learning, and cognitive science. We review recent progress in studying the geometry of neural manifolds, particularly in visual object recognition, and discuss theories connecting manifold dimension and radius to generalization capacity. Second, we suggest that the theory of learning in wide DNNs, especially in the thermodynamic limit, provides mechanistic insights into the learning processes that generate the desired neural representational geometries and generalization. This includes the role of weight norm regularization, network architecture, and hyperparameters. We explore recent advances in this theory and ongoing challenges. We also discuss the dynamics of learning and its relevance to the issue of representational drift in the brain.
KW - deep neural networks
KW - few-shot learning
KW - neural manifolds
KW - representational drift
KW - visual cortex
UR - http://www.scopus.com/inward/record.url?scp=85197002348&partnerID=8YFLogxK
DO - 10.1073/pnas.2311805121
M3 - Article
C2 - 38913896
AN - SCOPUS:85197002348
SN - 0027-8424
VL - 121
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 27
M1 - e2311805121
ER -