TY - JOUR
T1 - The evolution of continuous learning of the structure of the environment
AU - Kolodny, Oren
AU - Edelman, Shimon
AU - Lotem, Arnon
PY - 2014
Y1 - 2014/3/6
N2 - Continuous, 'always on', learning of structure from a stream of data is studied mainly in the fields of machine learning or language acquisition, but its evolutionary roots may go back to the first organisms that were internally motivated to learn and represent their environment. Here, we study under what conditions such continuous learning (CL) may be more adaptive than simple reinforcement learning and examine how it could have evolved from the same basic associative elements. We use agent-based computer simulations to compare three learning strategies: simple reinforcement learning; reinforcement learning with chaining (RL-chain) and CL that applies the same associative mechanisms used by the other strategies, but also seeks statistical regularities in the relations among all items in the environment, regardless of the initial association with food. We show that a sufficiently structured environment favours the evolution of both RL-chain and CL and that CL outperforms the other strategies when food is relatively rare and the time for learning is limited. This advantage of internally motivated CL stems from its ability to capture statistical patterns in the environment even before they are associated with food, at which point they immediately become useful for planning.
AB - Continuous, 'always on', learning of structure from a stream of data is studied mainly in the fields of machine learning or language acquisition, but its evolutionary roots may go back to the first organisms that were internally motivated to learn and represent their environment. Here, we study under what conditions such continuous learning (CL) may be more adaptive than simple reinforcement learning and examine how it could have evolved from the same basic associative elements. We use agent-based computer simulations to compare three learning strategies: simple reinforcement learning; reinforcement learning with chaining (RL-chain) and CL that applies the same associative mechanisms used by the other strategies, but also seeks statistical regularities in the relations among all items in the environment, regardless of the initial association with food. We show that a sufficiently structured environment favours the evolution of both RL-chain and CL and that CL outperforms the other strategies when food is relatively rare and the time for learning is limited. This advantage of internally motivated CL stems from its ability to capture statistical patterns in the environment even before they are associated with food, at which point they immediately become useful for planning.
KW - Decision-making
KW - Evolution of cognition
KW - Foraging theory
KW - Representation
KW - Statistical learning
UR - http://www.scopus.com/inward/record.url?scp=84893096157&partnerID=8YFLogxK
U2 - 10.1098/rsif.2013.1091
DO - 10.1098/rsif.2013.1091
M3 - Article
C2 - 24402920
AN - SCOPUS:84893096157
SN - 1742-5689
VL - 11
JO - Journal of the Royal Society Interface
JF - Journal of the Royal Society Interface
IS - 92
M1 - 20131091
ER -