Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper, Journal article, Research, peer-review

Standard

Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks. / Stahlhut, Chris; Navarro-Guerrero, Nicolás; Weber, Cornelius; Wermter, Stefan.

In: Kognitive Systeme, Vol. 3, No. 2, 01.12.2015.

Author

Stahlhut, Chris ; Navarro-Guerrero, Nicolás ; Weber, Cornelius ; Wermter, Stefan. / Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks. In: Kognitive Systeme. 2015 ; Vol. 3, No. 2.

Bibtex

@article{06f6c0e5a0ee42a9aee8558e395fad5e,
title = "Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks",
abstract = "Giving interactive feedback, other than well done / badly done alone, can speed up reinforcement learning. However, the amount of feedback needed to improve the learning speed and performance has not been thoroughly investigated. To narrow this gap, we study the effects of one type of interaction: we allow the learner to ask a teacher whether the last performed action was good or not and if not, the learner can undo that action and choose another one; hence the learner avoids bad action sequences. This allows the interactive learner to reduce the overall number of steps necessary to reach its goal and learn faster than a non-interactive learner. Our results show that while interaction does not increase the learning speed in a simple task with 1 degree of freedom, it does speed up learning significantly in more complex tasks with 2 or 3 degrees of freedom.",
keywords = "ausrl",
author = "Chris Stahlhut and Nicol{\'a}s Navarro-Guerrero and Cornelius Weber and Stefan Wermter",
year = "2015",
month = "12",
day = "1",
doi = "10.17185/duepublico/40718",
language = "English",
volume = "3",
journal = "Kognitive Systeme",
number = "2"
}

RIS

TY  - JOUR
T1  - Interaction in Reinforcement Learning Reduces the Need for Finely Tuned Hyperparameters in Complex Tasks
AU  - Stahlhut, Chris
AU  - Navarro-Guerrero, Nicolás
AU  - Weber, Cornelius
AU  - Wermter, Stefan
PY  - 2015/12/1
Y1  - 2015/12/1
N2  - Giving interactive feedback, other than well done / badly done alone, can speed up reinforcement learning. However, the amount of feedback needed to improve the learning speed and performance has not been thoroughly investigated. To narrow this gap, we study the effects of one type of interaction: we allow the learner to ask a teacher whether the last performed action was good or not and if not, the learner can undo that action and choose another one; hence the learner avoids bad action sequences. This allows the interactive learner to reduce the overall number of steps necessary to reach its goal and learn faster than a non-interactive learner. Our results show that while interaction does not increase the learning speed in a simple task with 1 degree of freedom, it does speed up learning significantly in more complex tasks with 2 or 3 degrees of freedom.
AB  - Giving interactive feedback, other than well done / badly done alone, can speed up reinforcement learning. However, the amount of feedback needed to improve the learning speed and performance has not been thoroughly investigated. To narrow this gap, we study the effects of one type of interaction: we allow the learner to ask a teacher whether the last performed action was good or not and if not, the learner can undo that action and choose another one; hence the learner avoids bad action sequences. This allows the interactive learner to reduce the overall number of steps necessary to reach its goal and learn faster than a non-interactive learner. Our results show that while interaction does not increase the learning speed in a simple task with 1 degree of freedom, it does speed up learning significantly in more complex tasks with 2 or 3 degrees of freedom.
KW  - ausrl
U2  - 10.17185/duepublico/40718
DO  - 10.17185/duepublico/40718
M3  - Journal article
VL  - 3
JO  - Kognitive Systeme
JF  - Kognitive Systeme
IS  - 2
ER  -