Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs - DATAMOVE - Data Movement for High-Performance Computing
Conference paper, Year: 2023

Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs

Alena Shilova
  • Role: Author
  • PersonId : 1126323
Thomas Delliaux
  • Role: Author
  • PersonId : 1348770
Philippe Preux
Bruno Raffin

Abstract

Despite recent advances in Reinforcement Learning (RL), Markov Decision Processes are not always the best choice for modeling complex dynamical systems that require interactions at high frequency. Because it can work with arbitrary time intervals, Continuous-Time Reinforcement Learning (CTRL) is better suited to such problems. Instead of the Bellman equation, which operates in discrete time, it is the Hamilton-Jacobi-Bellman (HJB) equation that describes the evolution of the value function in CTRL. Although the value function is a solution of the HJB equation, it may not be the only one. To distinguish the value function from other solutions, it is important to look for viscosity solutions of the HJB equation. Viscosity solutions constitute a special class of solutions that possess uniqueness and stability properties. In this paper, we bring together the formalism of viscosity solutions and practical methods for finding them. We also propose a novel way of training neural networks to obtain viscosity solutions. Finally, we compare those methods with discrete-time RL algorithms to emphasize the benefits of considering the continuous-time setting. This paper aims to provide the necessary theoretical basis for working with CTRL and to set out a few possible directions for future research.
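
For readers unfamiliar with the HJB equation, the sketch below shows one common form of it, for a deterministic, infinite-horizon, discounted control problem. The symbols (dynamics f, reward r, discount rate \rho) are assumptions introduced here for illustration, and the paper's own formulation may differ. The last display is a standard textbook example of non-uniqueness (the one-dimensional eikonal equation), not an example taken from the paper.

```latex
% Assumed setting: state dynamics \dot{x}(t) = f(x(t), u(t)),
% reward r(x, u), discount rate \rho > 0.
\[
  V(x) \;=\; \sup_{u(\cdot)} \int_0^{\infty} e^{-\rho t}\,
             r\bigl(x(t), u(t)\bigr)\,\mathrm{d}t ,
  \qquad x(0) = x .
\]

% HJB equation satisfied by V wherever V is differentiable:
\[
  \rho\, V(x) \;=\; \max_{u}\,
     \Bigl[\, r(x, u) + \nabla V(x)^{\top} f(x, u) \,\Bigr] .
\]

% Non-uniqueness, illustrated on the eikonal equation:
\[
  |V'(x)| = 1 \ \text{ on } (-1, 1), \qquad V(-1) = V(1) = 0 .
\]
% Every zigzag function with slopes \pm 1 that vanishes at x = \pm 1
% satisfies the equation almost everywhere. Under the usual sign
% convention, the unique viscosity solution is V(x) = 1 - |x|.
```

The eikonal example shows why almost-everywhere solutions are not enough: infinitely many functions satisfy the equation, and the viscosity criterion selects exactly one of them.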
Main file
118_revisiting_continuous_time_rei.pdf (952.13 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04705270 , version 1 (22-09-2024)

Identifiers

  • HAL Id : hal-04705270 , version 1

Cite

Alena Shilova, Thomas Delliaux, Philippe Preux, Bruno Raffin. Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs. EWRL 2023 Workshop, Sep 2023, Brussels, Belgium. ⟨hal-04705270⟩
