TY - JOUR
T1 - Machine learning in a data-limited regime
T2 - Augmenting experiments with synthetic data uncovers order in crumpled sheets
AU - Hoffmann, Jordan
AU - Bar-Sinai, Yohai
AU - Lee, Lisa M.
AU - Andrejevic, Jovana
AU - Mishra, Shruti
AU - Rubinstein, Shmuel M.
AU - Rycroft, Chris H.
N1 - Publisher Copyright: © 2019 The Authors.
PY - 2019
Y1 - 2019
N2 - Machine learning has gained widespread attention as a powerful tool to identify structure in complex, high-dimensional data. However, these techniques are ostensibly inapplicable for experimental systems where data are scarce or expensive to obtain. Here, we introduce a strategy to resolve this impasse by augmenting the experimental dataset with synthetically generated data of a much simpler sister system. Specifically, we study spontaneously emerging local order in crease networks of crumpled thin sheets, a paradigmatic example of spatial complexity, and show that machine learning techniques can be effective even in a data-limited regime. This is achieved by augmenting the scarce experimental dataset with inexhaustible amounts of simulated data of rigid flat-folded sheets, which are simple to simulate and share common statistical properties. This considerably improves the predictive power in a test problem of pattern completion and demonstrates the usefulness of machine learning in bench-top experiments where data are good but scarce.
AB - Machine learning has gained widespread attention as a powerful tool to identify structure in complex, high-dimensional data. However, these techniques are ostensibly inapplicable for experimental systems where data are scarce or expensive to obtain. Here, we introduce a strategy to resolve this impasse by augmenting the experimental dataset with synthetically generated data of a much simpler sister system. Specifically, we study spontaneously emerging local order in crease networks of crumpled thin sheets, a paradigmatic example of spatial complexity, and show that machine learning techniques can be effective even in a data-limited regime. This is achieved by augmenting the scarce experimental dataset with inexhaustible amounts of simulated data of rigid flat-folded sheets, which are simple to simulate and share common statistical properties. This considerably improves the predictive power in a test problem of pattern completion and demonstrates the usefulness of machine learning in bench-top experiments where data are good but scarce.
UR - http://www.scopus.com/inward/record.url?scp=85065425144&partnerID=8YFLogxK
U2 - 10.1126/sciadv.aau6792
DO - 10.1126/sciadv.aau6792
M3 - Article
C2 - 31032399
AN - SCOPUS:85065425144
SN - 2375-2548
VL - 5
JO - Science Advances
JF - Science Advances
IS - 4
M1 - eaau6792
ER -