Bridging Simulation-based Search and Model-based Reinforcement Learning with Entropy Regularization

Reinforcement learning (RL) provides a unified framework for sequential decision-making problems, in which a computer agent interacts with an environment while learning to make decisions that maximize its long-term reward. This makes RL a suitable choice for many real-world applications, including finance. RL applications in finance have driven innovation in areas such as loan approval, investment management, and, most importantly, risk measurement. The objective of this project is to advance RL approaches by combining them with powerful simulation-based search algorithms. In particular, we will investigate how simulation-based search can improve the sample efficiency of RL. The performance of our proposed algorithms will first be evaluated on well-known test domains, such as board and video games. We will then apply our algorithms to finance applications.

Faculty Supervisor:

Martin Müller; Dale Schuurmans

Student:

Partner:

Royal Bank of Canada (Borealis)

Discipline:

Computer science

Sector:

Technology; Information and Communications Technology

University:

University of Alberta

Program: