A2C Agent playing PandaReachDense-v3
This is a trained model of an A2C agent playing PandaReachDense-v3 using the stable-baselines3 library.
Usage (with Stable-baselines3)
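from stable_baselines3 import A2C
from huggingface_sb3 import load_from_hub
# load_from_hub downloads the checkpoint and returns its local file path
checkpoint = load_from_hub(repo_id='Francesco-A/a2c-PandaReachDense-v3',
                           filename='a2c-PandaReachDense-v3.zip')
# Load the trained agent from the downloaded checkpoint
model = A2C.load(checkpoint)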
Training details (last output)
Metric | Value |
---|---|
rollout/ep_len_mean | 4.05 |
rollout/ep_rew_mean | -0.317 |
time/fps | 378 |
time/iterations | 50000 |
time/time_elapsed | 2641 |
time/total_timesteps | 1000000 |
train/entropy_loss | 1.25 |
train/explained_variance | 0.975 |
train/learning_rate | 0.0007 |
train/n_updates | 49999 |
train/policy_loss | -0.0935 |
train/std | 0.185 |
train/value_loss | 0.0306 |
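
As a quick sanity check, the time/* rows in the table can be related to each other with simple arithmetic. The sketch below assumes that time/time_elapsed is in seconds and that time/fps is truncated to an integer. How the per-iteration step count splits into n_steps and the number of parallel environments is not stated in this card (with the A2C default of n_steps=5 it would correspond to 4 environments, but that is an assumption).

```python
# Values copied from the training log table.
total_timesteps = 1_000_000
iterations = 50_000
time_elapsed_s = 2641  # assumed to be seconds

# Environment steps collected per iteration (n_steps * n_envs in SB3 terms).
steps_per_iteration = total_timesteps // iterations

# Throughput truncated to an integer, as the logged time/fps appears to be.
fps = int(total_timesteps / time_elapsed_s)

print(steps_per_iteration, fps)
```

Under those assumptions the derived values agree with the logged time/fps row.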
Evaluation results
- mean_reward on PandaReachDense-v3 (self-reported): -0.20 +/- 0.09
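
A "mean +/- std" figure like the mean_reward above is typically the mean and standard deviation of the total reward over a set of evaluation episodes. The minimal sketch below shows that summary using the population standard deviation, which is what NumPy's default `np.std` computes. The helper name `summarize_returns` and the episode returns are hypothetical, invented purely for illustration. They are not this model's evaluation data, and the card does not state how many episodes were used.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, std) of per-episode returns, using the population std."""
    return fmean(episode_returns), pstdev(episode_returns)


# Hypothetical per-episode returns, invented purely for illustration.
example_returns = [-0.4, -0.1, -0.3, -0.2]
mean_reward, std_reward = summarize_returns(example_returns)
print(f"mean_reward = {mean_reward:.2f} +/- {std_reward:.2f}")
```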