TRL

TRL - Transformer Reinforcement Learning

The TRL library lets you train transformer language models with reinforcement learning. It is integrated with 🤗 transformers.

TRL supports decoder models such as GPT-2, BLOOM, and GPT-Neo, all of which can be optimized using Proximal Policy Optimization (PPO). Installation instructions are in the installation guide, and the Quickstart section introduces the library. A more in-depth example shows how to tune GPT-2 to produce positive movie reviews.
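For readers new to PPO, the sketch below illustrates the clipped surrogate objective at the heart of the algorithm. It is a minimal, dependency-free illustration and not TRL's implementation: the function name `ppo_clipped_loss`, its parameters, and the default `clip_range` of 0.2 are assumptions chosen for this example. A full PPO setup typically also includes a value-function loss and a KL penalty, which are omitted here.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_range=0.2):
    """Mean PPO clipped surrogate loss over a batch of sampled tokens.

    Hypothetical helper for illustration only; not part of the TRL API.
    """
    losses = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the policy
        # that generated the sample.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_range, 1 + clip_range].
        clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
        # PPO maximizes min(ratio * A, clipped_ratio * A); negate to get a loss.
        losses.append(-min(ratio * adv, clipped_ratio * adv))
    return sum(losses) / len(losses)


# The policy doubled the probability of a token with positive advantage;
# clipping caps the credit it receives for that change.
loss = ppo_clipped_loss([math.log(2.0)], [0.0], [1.0])
```

The clipping removes the incentive to move the policy far from the one that produced the samples in a single update, which is one reason PPO is a common choice for fine-tuning language models against a reward signal.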