On Hierarchical Goal Based Reinforcement Learning

Denis, Nicholas

Files

Main: Denis_Nicholas_2019_thesis.pdf (4.64 MB)

Date

2019-08-27

Authors

Denis, Nicholas

Publisher

Université d'Ottawa / University of Ottawa

Abstract

Discrete-time sequential decision processes require that an agent select an action at each time step. As humans, we plan over long time horizons and use temporal abstraction by selecting temporally extended actions such as “make lunch” or “get a master's degree”, each of which is composed of more granular actions. This thesis concerns such hierarchical temporal abstractions, in the form of macro actions and options, as they apply to goal-based Markov decision processes (MDPs). It introduces two novel algorithms: one for discovering hierarchical macro actions in goal-based MDPs, and one that uses landmark options for transfer learning in multi-task goal-based reinforcement learning settings. Theoretical properties of the lifelong regret of an agent executing the latter algorithm are also discussed.
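To make the two kinds of temporal abstraction named in the abstract concrete, here is a minimal Python sketch. It is not the thesis's algorithm. It only illustrates, on a hypothetical 5x5 grid world, the difference between a macro action (a fixed sequence of primitive actions) and an option (a policy paired with a termination condition). The grid, the names `step`, `run_macro`, `run_option`, and `go_to`, and the greedy landmark-reaching policy are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

State = Tuple[int, int]

# Hypothetical environment: a 5x5 grid with four primitive actions.
SIZE = 5
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state: State, action: str) -> State:
    """Apply one primitive action, clipping at the grid boundary."""
    dx, dy = MOVES[action]
    x = min(max(state[0] + dx, 0), SIZE - 1)
    y = min(max(state[1] + dy, 0), SIZE - 1)
    return (x, y)


def run_macro(state: State, macro: List[str]) -> State:
    """A macro action: a fixed, open-loop sequence of primitive actions."""
    for action in macro:
        state = step(state, action)
    return state


def run_option(
    state: State,
    policy: Callable[[State], str],
    terminate: Callable[[State], bool],
    max_steps: int = 50,
) -> Tuple[State, int]:
    """An option: follow a closed-loop policy until its termination condition holds."""
    steps = 0
    while not terminate(state) and steps < max_steps:
        state = step(state, policy(state))
        steps += 1
    return state, steps


def go_to(landmark: State):
    """Illustrative option that walks greedily to a landmark state (an assumption)."""
    def policy(s: State) -> str:
        if s[0] < landmark[0]:
            return "right"
        if s[0] > landmark[0]:
            return "left"
        if s[1] < landmark[1]:
            return "down"
        return "up"

    def terminate(s: State) -> bool:
        return s == landmark

    return policy, terminate


if __name__ == "__main__":
    print(run_macro((0, 0), ["right", "right", "down"]))
    policy, terminate = go_to((3, 4))
    print(run_option((0, 0), policy, terminate))
```

The macro action ignores the states it passes through, whereas the option consults the current state at every step and stops only when its termination condition is met. In this sketch, that is the practical distinction between the two abstractions.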

Keywords

Markov decision process, Reinforcement learning, Options framework, Temporal abstraction, Macro actions

URI

http://hdl.handle.net/10393/39552
http://dx.doi.org/10.20381/ruor-23795

Collections

- Theses, 2011 -
