Bellman is a package for model-based reinforcement learning (MBRL) in Python, using TensorFlow and building on top of the model-free reinforcement learning package TensorFlow Agents (TF-Agents).
Bellman provides a framework for flexible composition of model-based reinforcement learning algorithms. It offers two major classes of algorithms: decision-time planning and background planning. Within each class, any kind of supervised learning method can easily be used to learn certain components of the environment. Bellman was designed with modularity in mind: important components can be flexibly combined, such as the type of decision-time planning method (e.g. the cross-entropy method or random shooting) and the type of state-transition model (e.g. a probabilistic neural network or an ensemble of neural networks). Bellman also provides implementations of several popular state-of-the-art MBRL algorithms, such as PETS, MBPO and METRPO. The online documentation contains more details.
Bellman requires Python 3.7 or later and uses TensorFlow 2.4+ for running computations, which allows fast execution on GPUs.
Bellman was originally created by (in alphabetical order) Vincent Adam, Jordi Grau-Moya, Felix Leibfried, John McLeod, Hrvoje Stojic, and Peter Vrancx, at Secondmind Labs.
It is now actively maintained by (in alphabetical order) Felix Leibfried, John McLeod, Hrvoje Stojic and Peter Vrancx.
Bellman is an open source project, distributed under the Apache License 2.0. If you have relevant skills and are interested in contributing, please do contact us. We have a public Bellman Slack workspace. Please use this invite link if you'd like to join, whether to ask short informal questions or to be involved in the discussion and future development of Bellman.
We are very grateful to our Secondmind Labs colleagues, maintainers of GPflow and Trieste in particular, for their help with creating contributing guidelines, instructions for users and open-sourcing in general.
Modular design allows flexible composition of agents. Fancy Gaussian processes instead of neural nets for transition dynamics learning? No problem!
Implementation of state-of-the-art model-based reinforcement learning agents, such as PETS, MBPO or METRPO.
Set up experiments with standard loops quickly and easily, allowing for standardized and systematic comparison.
For the latest (stable) release from PyPI you can use pip to install the toolbox: `pip install bellman`. You can also use pip to install the toolbox from the latest source on GitHub: check out the `develop` branch of the Bellman GitHub repository and, from the repository root, run `pip install -e .` to install the toolbox in editable mode.
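For quick reference, the two installation routes above might look like this in a shell (the repository URL below is an assumption; substitute the actual Bellman GitHub location):

```bash
# Latest stable release from PyPI
pip install bellman

# Latest source from GitHub, installed in editable mode
# (repository URL assumed; use the actual Bellman GitHub repository)
git clone https://github.com/Bellman-devs/bellman.git
cd bellman
git checkout develop
pip install -e .
```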
If you wish to contribute, please use Poetry to manage dependencies in a local virtual environment. The Poetry configuration file specifies all the development dependencies (testing, linting, typing, docs, etc.) and makes it much easier to contribute. To install Poetry, follow the instructions in the Poetry documentation.
To install this project in editable mode, run `poetry install` from the root directory of the `bellman` repository. This command creates a virtual environment for the project in a hidden `.venv` directory under the root directory, which you can activate with `poetry shell`. You must also re-run `poetry install` to pick up updated dependencies whenever the `pyproject.toml` file changes, for example after a `git pull`.
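As a sketch, a typical contributor workflow with Poetry might look like this (assuming Poetry is already installed and you are in the repository root):

```bash
# Create the project virtual environment (a hidden .venv directory) and
# install the toolbox in editable mode along with the development dependencies
poetry install

# Activate the virtual environment
poetry shell

# Re-run whenever pyproject.toml changes, e.g. after a git pull
poetry install
```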
To cite Bellman, please reference our arXiv paper, in which we review the framework and describe its design. The BibTeX reference is given below:
@article{bellman2021,
author = {McLeod, John and Stojic, Hrvoje and Adam, Vincent and Kim, Dongho and Grau-Moya, Jordi and Vrancx, Peter and Leibfried, Felix},
title = {Bellman: A Toolbox for Model-based Reinforcement Learning in TensorFlow},
year = {2021},
journal = {arXiv:2103.14407},
url = {https://arxiv.org/abs/2103.14407}
}