A Recurrent Memory Model Implementation of Reinforcement Learning
Publisher
Université d'Ottawa / University of Ottawa
Abstract
Reinforcement Learning (RL) provides a robust framework for understanding how humans and animals learn from and adapt to their environments through trial and error. It comprises two processes that work together: exploration and exploitation. This thesis presents an implementation of these two aspects of RL using recurrent neural networks. Several challenges are addressed, including one-to-many problems, nonlinearly separable problems, and the generation and representation of random behaviour. The first study is an implementation of exploitation that can cycle through previously learned behaviours and stabilize on the correct one for each given problem. This is achieved with contextual tags and a unit that represents environmental feedback. The model can solve nonlinearly separable and one-to-many problems. The second study is an implementation of exploration that can randomly select from the available options and adapt based on the feedback from those decisions. It tackles the question of how to generate and represent random behaviour, as well as how to represent and apply reward. The implementations and techniques detailed in this thesis could be applied to many similar models of cognitive behaviour. Further research could combine the two models and investigate the implementation of different exploration-exploitation trade-off strategies.
Keywords
Reinforcement Learning, Randomness, Recurrent associative memory, Exploration-Exploitation, Artificial neural networks, Biasing decisions, Cognitive psychology
