Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs

Alena Shilova; Thomas Delliaux; Philippe Preux; Bruno Raffin

Conference paper, Year: 2023


Alena Shilova

Role: Author
PersonId : 1126323

Scool

Data Aware Large Scale Computing

Thomas Delliaux

Role: Author
PersonId : 1348770

Scool

Philippe Preux

Role: Author
PersonId : 5488
IdHAL : preux-philippe
ORCID : 0000-0002-2067-2838
IdRef : 059896353

Scool

Bruno Raffin

Role: Author
PersonId : 4842
IdHAL : bruno-raffin
ORCID : 0000-0002-7980-4946
IdRef : 091616999

Data Aware Large Scale Computing

Abstract

Despite recent advances in Reinforcement Learning (RL), Markov Decision Processes are not always the best choice for modeling complex dynamical systems that require interactions at high frequency. Because it can work with arbitrary time intervals, Continuous-Time Reinforcement Learning (CTRL) is better suited to such problems. Instead of the Bellman equation, which operates in discrete time, it is the Hamilton-Jacobi-Bellman (HJB) equation that describes the evolution of the value function in CTRL. Even though the value function is a solution of the HJB equation, it may not be the unique solution. To distinguish the value function from other solutions, it is important to look for viscosity solutions of the HJB equation. Viscosity solutions constitute a special class of solutions that possess uniqueness and stability properties. In this paper, we bring together the formalism of viscosity solutions and practical methods for finding them. We also propose a novel way of training neural networks to obtain viscosity solutions. Finally, we compare these methods with discrete-time RL algorithms to emphasize the benefits of considering the continuous-time setting. This paper aims to provide the necessary theoretical basis for working with CTRL and to set out a few possible directions for future research.
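
For orientation, the contrast between the Bellman and HJB equations mentioned in the abstract can be written in one common textbook form. The notation below is an assumption made for illustration and is not taken from the paper: a deterministic discrete transition map F, continuous dynamics f, a reward r (per step in discrete time, a rate in continuous time), a discount factor \gamma in (0, 1), and a value function V.

```latex
\begin{align*}
% Discrete time: state x, action a, next state F(x, a)
\text{Bellman:}\quad
  & V(x) = \max_{a}\,\bigl[\, r(x,a) + \gamma\, V\bigl(F(x,a)\bigr) \bigr] \\[4pt]
% Continuous time: \dot{x}(t) = f(x(t), a(t)),
% V(x) = \sup_{a(\cdot)} \int_0^\infty \gamma^{t}\, r(x(t), a(t))\, dt
\text{HJB:}\quad
  & V(x)\,\ln\gamma + \sup_{a}\,\bigl[\, r(x,a) + \nabla V(x)^{\top} f(x,a) \bigr] = 0
\end{align*}
```

A standard one-dimensional illustration of why uniqueness can fail is the eikonal equation |u'(x)| = 1 on (-1, 1) with u(-1) = u(1) = 0: every sawtooth function with slopes ±1 that vanishes at the endpoints satisfies it almost everywhere, yet only u(x) = 1 - |x| is the viscosity solution. This example is a generic one and is not claimed to appear in the paper.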

Domains

Computer Science [cs] / Artificial Intelligence [cs.AI]

Main file

118_revisiting_continuous_time_rei.pdf (952.13 KB)

Origin: Files produced by the author(s)

Contributor: Bruno Raffin

https://hal.science/hal-04705270

Submitted on: Sunday, September 22, 2024, 21:06:43

Last modified on: Wednesday, September 25, 2024, 03:04:47

Dates and versions

hal-04705270, version 1 (22-09-2024)

Identifiers

HAL Id: hal-04705270, version 1

Cite

Alena Shilova, Thomas Delliaux, Philippe Preux, Bruno Raffin. Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs. EWRL 2023 Workshop, Sep 2023, Brussels, Belgium. ⟨hal-04705270⟩


Collections

UGA CNRS INRIA LIG LIG_SRCPR CRISTAL INRIA2 LIG-SRCPR-DATAMOVE UNIV-LILLE CRISTAL-SCOOL LIG_SIDCH

31 views

5 downloads
