Tau LLM Unity ML Agents Project
Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize the performance of the agents through various configurations and training runs.
Project Overview
This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.
Key Features
- Reinforcement Learning: Utilizes the PPO algorithm for training agents.
- Unity ML-Agents: Integrates with Unity ML-Agents for a seamless training experience.
- Custom Reward Functions: Implements gradient-based reward functions for nuanced feedback.
- Memory Networks: Incorporates memory networks to handle temporal dependencies.
- TensorBoard Integration: Monitors training progress and performance using TensorBoard.
Configuration
Below is the configuration used for training the agents:
behaviors:
  TauAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 4096
      learning_rate: 0.00003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 10
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 4
      vis_encode_type: simple
      memory:
        memory_size: 256
        sequence_length: 256
        num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.995
        strength: 0.1
        network_settings:
          normalize: true
          hidden_units: 256
          num_layers: 4
        learning_rate: 0.00003
    keep_checkpoints: 10
    checkpoint_interval: 100000
    threaded: true
    max_steps: 3000000
    time_horizon: 256
    summary_freq: 10000
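To make these numbers more concrete, the minimal sketch below derives a few quantities from the configuration. It assumes typical PPO behaviour (each time the buffer fills, the trainer makes num_epoch passes over it in minibatches of batch_size) and assumes that the linear schedule decays the learning rate in a straight line to zero at max_steps. The function names are illustrative and are not part of this repository.

```python
# Values copied from the configuration above.
config = {
    "batch_size": 256,
    "buffer_size": 4096,
    "learning_rate": 3e-5,
    "num_epoch": 10,
    "max_steps": 3_000_000,
    "checkpoint_interval": 100_000,
    "keep_checkpoints": 10,
}


def gradient_steps_per_update(cfg):
    """Minibatch gradient steps per filled buffer (assumed PPO behaviour)."""
    minibatches = cfg["buffer_size"] // cfg["batch_size"]
    return minibatches * cfg["num_epoch"]


def linear_learning_rate(cfg, step):
    """Learning rate at `step`, assuming a straight-line decay to zero at max_steps."""
    remaining = max(0.0, 1.0 - step / cfg["max_steps"])
    return cfg["learning_rate"] * remaining


def checkpoints_written(cfg):
    """Checkpoints saved over a full run, before older ones are pruned."""
    return cfg["max_steps"] // cfg["checkpoint_interval"]


if __name__ == "__main__":
    print("gradient steps per buffer:", gradient_steps_per_update(config))
    print("learning rate at 1.5M steps:", linear_learning_rate(config, 1_500_000))
    print("checkpoints written:", checkpoints_written(config))
    print("checkpoints kept:", config["keep_checkpoints"])
```

Under these assumptions, a full run writes more checkpoints than keep_checkpoints retains, so only the most recent ones remain on disk at the end of training.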
Model Naming Convention
The models in this repository follow the naming convention Tau_<series>_<max_steps>. This makes it easy to identify the series and the number of training steps for each model.
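As a rough illustration of the convention, the sketch below builds and parses such names. The helper names are hypothetical, and two details are assumptions rather than facts from this repository: that the series is a short alphanumeric label (the example uses "A0", borrowed from the run ID in the training command) and that max_steps is written as a plain integer.

```python
import re

# Assumed pattern: alphanumeric series label, max_steps as a plain integer.
MODEL_NAME = re.compile(r"^Tau_(?P<series>[A-Za-z0-9]+)_(?P<max_steps>\d+)$")


def make_model_name(series, max_steps):
    """Format a model name following Tau_<series>_<max_steps>."""
    return f"Tau_{series}_{max_steps}"


def parse_model_name(name):
    """Split a model name into (series, max_steps); raise ValueError if it does not match."""
    match = MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a Tau model name: {name!r}")
    return match["series"], int(match["max_steps"])


if __name__ == "__main__":
    name = make_model_name("A0", 3_000_000)
    print(name)
    print(parse_model_name(name))
```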
Getting Started
Prerequisites
- Unity 6
- Unity ML-Agents Toolkit
- Python 3.10.11
- PyTorch
- Transformers
Installation
Clone the repository:
git clone https://github.com/p3nGu1nZz/Tau.git
cd tau\MLAgentsProject
Install the required Python packages:
pip install -r requirements.txt
Open the Unity project:
- Launch Unity Hub and open the project folder.
Training the Agent
To start training the agent, run the following command:
mlagents-learn .\config\tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env .\Build --torch-device cuda --timeout-wait 300 --force
Note: The preferred workflow is to create a fresh build of the Unity project in the Build directory, which the above command references through --env.
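If you launch training from a script, the command can be assembled as an argument list. The sketch below is one possible wrapper, not part of this repository: the function name and its resume parameter are assumptions, while the flags are the ones used above plus mlagents-learn's --resume, which continues an existing run instead of overwriting it as --force does.

```python
def build_train_command(config_path, run_id, env_path, resume=False):
    """Return the mlagents-learn argument list for a fresh or resumed run."""
    command = [
        "mlagents-learn",
        config_path,
        f"--run-id={run_id}",
        "--env", env_path,
        "--torch-device", "cuda",
        "--timeout-wait", "300",
    ]
    # --force overwrites existing results for this run ID; --resume continues them.
    command.append("--resume" if resume else "--force")
    return command


if __name__ == "__main__":
    cmd = build_train_command(
        r".\config\tau_agent_ppo_c.yaml", "tau_agent_ppo_A0", r".\Build"
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to start training from Python.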
Monitoring Training
You can monitor the training progress using TensorBoard:
tensorboard --logdir results
Results
The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:
Citation
If you use this project in your research, please cite it as follows:
@misc{Tau,
  author = {K. Rawson},
  title = {Tau LLM Unity ML Agents Project},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Unity ML-Agents Toolkit
- TensorFlow and PyTorch communities
- Hugging Face for hosting the model repository