Efficient path planning for autonomous vehicles in cluttered environments is a challenging sequential decision-making problem under uncertainty. In this context, this paper formulates autonomous urban navigation of Unmanned Aerial Vehicles (UAVs) as a partially observable stochastic shortest path (PO-SSP) planning problem. To solve this planning problem, the POMCP-GO algorithm is used, a goal-oriented variant of POMCP, one of the fastest state-of-the-art online solvers for partially observable environments, based on Monte Carlo planning. This algorithm relies on the Upper Confidence Bounds (UCB1) algorithm as its action-selection strategy. UCB1 depends on an exploration constant that is typically tuned empirically. Its best value varies significantly between planning problems, so an exhaustive search is required to find the most suitable value. Applied to a complex path planning problem, this exhaustive search may be extremely time-consuming; moreover, it is unsuitable for real applications where online planning is required. This paper therefore explores the use of an adaptive exploration coefficient for action selection during planning. A Monte Carlo value backup approximation is also applied, which is empirically shown to accelerate the convergence of the policy value. Simulation results show that using the adaptive exploration coefficient within a user-defined interval achieves better convergence and success rates than most hand-tuned fixed coefficients in that interval, although it never matches the results of the best fixed coefficient. A compromise must therefore be made between the desired quality of the results and the time one is willing to spend on the exhaustive search for the best coefficient value before planning.
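For background, the UCB1 rule referenced above selects, at a history node, the action maximizing the value estimate plus an exploration bonus scaled by the exploration coefficient c. The abstract does not specify the paper's adaptive scheme for c, so the sketch below shows only the standard rule with a fixed coefficient; the function and variable names are illustrative, not taken from the paper.

```python
import math

def ucb1_select(counts, values, c):
    """Pick the action maximizing Q(h, a) + c * sqrt(ln N(h) / N(h, a)).

    counts[a] -- visit count N(h, a) of action a at the current history node
    values[a] -- current value estimate Q(h, a)
    c         -- exploration coefficient (the constant the paper adapts online)
    """
    total = sum(counts.values())  # N(h): total visits to the node
    best_action, best_score = None, -math.inf
    for action in counts:
        if counts[action] == 0:
            return action  # untried actions are explored first
        bonus = c * math.sqrt(math.log(total) / counts[action])
        score = values[action] + bonus
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

With c = 0 the rule is purely greedy; larger c favors rarely tried actions, which is why the best value of c is problem-dependent and, as the paper argues, expensive to find by exhaustive search.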