bellman.training

This package provides helper wrappers for training RL agents that have trainable components. The TF-Agents Agent class has a train method, which in the model-free setting updates the policy parameters. In the model-based setting, there may be more than one trainable “component”, for example a transition model and a parameterised policy.

The wrappers provide a consistent interface for training the components of RL agents on a defined schedule. An agent specifies the names of the components that will be trained (implemented in the toolbox as a set of enumerations). An AgentTrainer object defines the schedule on which each of those components is trained, as well as which “real” data (collected from the real environment) is used to train each component.
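
The idea of per-component training schedules can be illustrated with a minimal, self-contained sketch. It does not use the toolbox's own classes: the names Component, IntervalTrainer and components_due are hypothetical and chosen for illustration only, and the schedule is assumed to be a fixed interval counted in real environment steps. The selection of training data for each component is left out.

```python
from enum import Enum
from typing import Dict, List


class Component(Enum):
    """Hypothetical names for the trainable components of an agent."""

    TRANSITION_MODEL = "transition_model"
    POLICY = "policy"


class IntervalTrainer:
    """Toy stand-in for a trainer (not the toolbox's AgentTrainer).

    Assumption: each component is trained once every `interval` real
    environment steps.
    """

    def __init__(self, intervals: Dict[Component, int]) -> None:
        self._intervals = intervals

    def components_due(self, step: int) -> List[Component]:
        """Return the components whose schedule fires at this step."""
        return [
            component
            for component, interval in self._intervals.items()
            if step > 0 and step % interval == 0
        ]


# Train the transition model every 4 steps and the policy every 2 steps.
trainer = IntervalTrainer({Component.TRANSITION_MODEL: 4, Component.POLICY: 2})
schedule = {step: trainer.components_due(step) for step in range(1, 9)}
```

Keeping the component names in an enumeration means a schedule can only refer to components that the agent has declared, so a misspelt name fails when the schedule is constructed rather than being silently ignored during training.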

bellman.training.utils

Utilities to support training of agents.