---
license: mit
---

# Tau LLM Unity ML Agents Project

Welcome to the Tau LLM Unity ML Agents Project repository! This project focuses on training reinforcement learning agents using Unity ML-Agents and the PPO algorithm. Our goal is to optimize agent performance through various configurations and training runs.

## Project Overview

This repository contains the code and configurations for training agents in a Unity environment using the Proximal Policy Optimization (PPO) algorithm. The agents are designed to learn and adapt to their environment, improving their performance over time.

### Key Features

- **Reinforcement Learning**: Utilizes the PPO algorithm for training agents.
- **Unity ML-Agents**: Integrates with Unity ML-Agents for a seamless training experience.
- **Custom Reward Functions**: Implements gradient-based reward functions for nuanced feedback (the shaping idea is sketched in the appendix below).
- **Memory Networks**: Incorporates memory networks to handle temporal dependencies.
- **TensorBoard Integration**: Monitors training progress and performance using TensorBoard.

## Configuration

Below is the configuration used for training the agents:

```yaml
behaviors:
  TauAgent:
    trainer_type: ppo
    hyperparameters:
      batch_size: 256
      buffer_size: 4096
      learning_rate: 0.00003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 10
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 256
      num_layers: 4
      vis_encode_type: simple
      memory:
        memory_size: 256
        sequence_length: 256
        num_layers: 4
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.995
        strength: 0.1
        network_settings:
          normalize: true
          hidden_units: 256
          num_layers: 4
        learning_rate: 0.00003
    keep_checkpoints: 10
    checkpoint_interval: 100000
    threaded: true
    max_steps: 3000000
    time_horizon: 256
    summary_freq: 10000
```

A back-of-the-envelope breakdown of how these hyperparameters interact is worked through in the appendix below.

## Model Naming Convention

The models in this repository follow the naming convention `Tau_<series>_<steps>`. This makes it easy to identify the series and the number of training steps for each model.

## Getting Started

### Prerequisites

- Unity 6
- Unity ML-Agents Toolkit
- Python 3.10.11
- PyTorch
- Transformers

### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/p3nGu1nZz/Tau.git
   cd Tau\MLAgentsProject
   ```

2. Install the required Python packages:

   ```bash
   pip install -r requirements.txt
   ```

3. Open the Unity project:

   - Launch Unity Hub and open the project folder.

### Training the Agent

To start training the agent, run the following command:

```bash
mlagents-learn .\config\tau_agent_ppo_c.yaml --run-id=tau_agent_ppo_A0 --env .\Build --torch-device cuda --timeout-wait 300 --force
```

Note: the preferred workflow is to create a new build in the `Build` directory, which is the path referenced by the `--env` flag above. The appendix below also includes a low-level Python sketch for sanity-checking a build outside of `mlagents-learn`.

### Monitoring Training

You can monitor the training progress using TensorBoard:

```bash
tensorboard --logdir results
```

The appendix below shows how to read the same metrics programmatically.

## Results

The training results, including the average reward and cumulative reward, can be visualized using TensorBoard. The graphs below show the performance of the agent over time:

![Average Reward](path/to/average_reward.png)
![Cumulative Reward](path/to/cumulative_reward.png)

## Citation

If you use this project in your research, please cite it as follows:

```bibtex
@misc{Tau,
  author       = {K. Rawson},
  title        = {Tau LLM Unity ML Agents Project},
  year         = {2024},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/p3nGu1nZz/Tau}},
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
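## Appendix: Example Sketches

### Hyperparameter Arithmetic

To make the PPO configuration above more concrete, the following sketch derives how often the policy is updated from the values in the YAML. The numbers are read directly from the config; the derivation is standard PPO bookkeeping, not project-specific code.

```python
# Values taken from the training configuration above.
batch_size = 256
buffer_size = 4096
num_epoch = 10
max_steps = 3_000_000

# Each time the experience buffer fills, it is split into minibatches
# and iterated over num_epoch times.
minibatches_per_buffer = buffer_size // batch_size               # 16
gradient_steps_per_buffer = minibatches_per_buffer * num_epoch   # 160

# Over the full run, the buffer fills roughly this many times.
buffer_fills = max_steps // buffer_size                          # ~732

print(f"minibatches per buffer fill:     {minibatches_per_buffer}")
print(f"gradient steps per buffer fill:  {gradient_steps_per_buffer}")
print(f"approx. buffer fills over run:   {buffer_fills}")
print(f"approx. total gradient steps:    {buffer_fills * gradient_steps_per_buffer}")
```

So one full training run performs on the order of 117,000 gradient steps, which is the scale the linear learning-rate schedule is annealed over.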
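### Gradient-Based Reward Shaping

The "Custom Reward Functions" feature refers to gradient-based rewards. The actual reward logic lives in the Unity C# agent; the Python sketch below only illustrates the shaping idea, and the function name and scale factor are hypothetical.

```python
def gradient_reward(prev_distance: float, curr_distance: float, scale: float = 0.1) -> float:
    """Reward proportional to progress toward the target.

    Positive when the agent moves closer, negative when it moves away,
    giving a smooth gradient of feedback instead of a single sparse
    reward at the goal.
    """
    return scale * (prev_distance - curr_distance)

# Toy usage: an agent stepping from 5.0 to 4.2 units away from the target,
# then backtracking to 4.6 units.
print(gradient_reward(5.0, 4.2))  # ~0.08, small positive reward for progress
print(gradient_reward(4.2, 4.6))  # ~-0.04, small penalty for backtracking
```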
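### Driving a Build with the Low-Level Python API

ML-Agents also ships a low-level Python API (`mlagents_envs`) that can drive the same build directly, which is handy for sanity-checking observations and actions outside of `mlagents-learn`. This is a minimal sketch; the executable path is an assumption and should point at your actual build.

```python
from mlagents_envs.environment import UnityEnvironment

# NOTE: the executable path is a placeholder; adjust it to your build.
env = UnityEnvironment(file_name="Build/Tau", seed=1)
env.reset()

behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]
print(f"Behavior: {behavior_name}")

for episode in range(3):
    env.reset()
    decision_steps, terminal_steps = env.get_steps(behavior_name)
    while len(terminal_steps) == 0:
        # Random actions, just to exercise the observation/action loop.
        action = spec.action_spec.random_action(len(decision_steps))
        env.set_actions(behavior_name, action)
        env.step()
        decision_steps, terminal_steps = env.get_steps(behavior_name)

env.close()
```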
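### Reading Training Metrics Programmatically

Besides the TensorBoard UI, the event files under `results/` can be read with TensorBoard's event accumulator, for example to extract the cumulative reward curve for offline analysis. The run directory and scalar tag below follow ML-Agents' usual layout (`results/<run-id>/<behavior-name>` and `Environment/Cumulative Reward`) but should be verified against your own results folder.

```python
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Assumed path; check your own results directory for the exact layout.
run_dir = "results/tau_agent_ppo_A0/TauAgent"

acc = EventAccumulator(run_dir)
acc.Reload()

# List every scalar series that was logged for this run.
print("Available scalar tags:", acc.Tags()["scalars"])

# Print the cumulative reward curve, one entry per summary_freq steps.
for event in acc.Scalars("Environment/Cumulative Reward"):
    print(f"step={event.step:>8}  reward={event.value:.3f}")
```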
## Acknowledgments

- Unity ML-Agents Toolkit
- TensorFlow and PyTorch communities
- Hugging Face for hosting the model repository