A comparison of action selection methods for implicit policy method reinforcement learning in continuous action-space
Nichols, Barry D. (ORCID: https://orcid.org/0000-0002-6760-6037) (2016) A comparison of action selection methods for implicit policy method reinforcement learning in continuous action-space. 2016 International Joint Conference on Neural Networks (IJCNN). In: International Joint Conference on Neural Networks (IJCNN 2016), 24-29 Jul 2016, Vancouver, Canada. ISBN 9781509006205. ISSN 2161-4407. [Conference or Workshop Item] (doi:10.1109/IJCNN.2016.7727688)
Full text: PDF, final accepted version (with author's formatting)
Abstract
In this paper I investigate methods of applying reinforcement learning to continuous state- and action-space problems without a policy function. I compare the performance of four action selection methods: discretisation of the action-space, and three optimisation techniques that find the greedy action without discretisation, namely gradient descent, Nelder-Mead and Newton's method. The action selection methods are applied in conjunction with the SARSA algorithm, with a multilayer perceptron used to approximate the value function. The approaches are applied to two simulated continuous state- and action-space control problems: Cart-Pole and double Cart-Pole. The results are compared in terms of both action selection time and the number of trials required to train on the benchmark problems.
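As a rough illustration of the distinction the abstract draws, the sketch below contrasts greedy action selection over a discretised action-space with greedy action selection by gradient descent on the negated value function. It is not the paper's implementation: `q_value` is a hypothetical closed-form stand-in for the multilayer perceptron, the action bounds, grid size, learning rate and iteration count are all assumed values, the gradient is estimated by finite differences, and Nelder-Mead and Newton's method are not shown.

```python
import math


def q_value(state, action):
    """Hypothetical stand-in for a learned Q(s, a); it peaks at action = tanh(state)."""
    best = math.tanh(state)
    return -(action - best) ** 2


def greedy_discretised(state, low=-1.0, high=1.0, n_actions=11):
    """Evaluate Q on an evenly spaced grid of actions and return the best grid point."""
    step = (high - low) / (n_actions - 1)
    candidates = [low + i * step for i in range(n_actions)]
    return max(candidates, key=lambda a: q_value(state, a))


def greedy_gradient_descent(state, low=-1.0, high=1.0, a0=0.0,
                            lr=0.1, iters=200, eps=1e-5):
    """Minimise -Q(s, a) over a, using a central finite-difference gradient."""
    a = a0
    for _ in range(iters):
        # Gradient of -Q with respect to the action.
        grad = -(q_value(state, a + eps) - q_value(state, a - eps)) / (2 * eps)
        # Take a descent step, then clip to the assumed action bounds.
        a = min(high, max(low, a - lr * grad))
    return a


state = 0.5
a_grid = greedy_discretised(state)      # limited to the grid resolution
a_opt = greedy_gradient_descent(state)  # can land between grid points
```

On this toy function the grid search can only return the grid point nearest the maximiser, while the optimiser is free to settle between grid points, at the cost of many more value-function evaluations per action selection. That is the kind of trade-off between solution quality and action selection time that the abstract says is measured.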
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Research Areas: | A. > School of Science and Technology; A. > School of Science and Technology > Computer Science |
Item ID: | 19845 |
Notes on copyright: | Full text: © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Depositing User: | Barry Nichols |
Date Deposited: | 19 May 2016 10:14 |
Last Modified: | 29 Nov 2022 21:29 |
URI: | https://eprints.mdx.ac.uk/id/eprint/19845 |