---
library_name: stable-baselines3
license: apache-2.0
language:
- en
pipeline_tag: reinforcement-learning
tags:
- reinforcement-learning
- ppo
- game-agent
- web
- game
- CosmicVoyage
---

This model is a reinforcement learning agent trained to play the web-based game Cosmic Voyager autonomously. It is trained with the Proximal Policy Optimization (PPO) algorithm and learns strategies that maximize in-game performance.

Training Configuration:

- Algorithm: Proximal Policy Optimization (PPO)
- Policy: Convolutional Neural Network (CnnPolicy)
- Learning Rate: 5e-5
- Batch Size: 256
- Number of Steps per Update (n_steps): 2048
- Number of Epochs: 20
- Maximum Gradient Norm (max_grad_norm): 0.75
- Discount Factor (gamma): 0.95
- GAE Lambda (gae_lambda): 0.95
- Clip Range: 0.1
- Entropy Coefficient (ent_coef): 0.02
- Target KL Divergence (target_kl): 0.025
- Total Timesteps: 3,000,000
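
As a rough sketch, the settings above map onto keyword arguments of the Stable-Baselines3 `PPO` class as shown below. The names `PPO_KWARGS` and `update_schedule` are illustrative, and the single-environment default (`n_envs=1`) is an assumption, since the number of parallel environments is not stated here.

```python
import math

# Hyperparameters from the list above, keyed by the argument names
# that stable_baselines3.PPO accepts.
PPO_KWARGS = {
    "learning_rate": 5e-5,
    "n_steps": 2048,
    "batch_size": 256,
    "n_epochs": 20,
    "gamma": 0.95,
    "gae_lambda": 0.95,
    "clip_range": 0.1,
    "ent_coef": 0.02,
    "max_grad_norm": 0.75,
    "target_kl": 0.025,
}
TOTAL_TIMESTEPS = 3_000_000


def update_schedule(kwargs, total_timesteps, n_envs=1):
    """Derive the rollout and update counts implied by the settings.

    n_envs=1 is an assumption; the actual value is not documented.
    """
    rollout_size = kwargs["n_steps"] * n_envs
    minibatches = rollout_size // kwargs["batch_size"]
    return {
        "rollout_size": rollout_size,
        "minibatches_per_epoch": minibatches,
        # Upper bound: target_kl can end an update before all epochs run.
        "max_gradient_steps_per_update": minibatches * kwargs["n_epochs"],
        "policy_updates": math.ceil(total_timesteps / rollout_size),
    }


schedule = update_schedule(PPO_KWARGS, TOTAL_TIMESTEPS)
print(schedule)

# With an environment `env` available, training could then look like:
#   from stable_baselines3 import PPO
#   model = PPO("CnnPolicy", env, **PPO_KWARGS)
#   model.learn(total_timesteps=TOTAL_TIMESTEPS)
```

Under the single-environment assumption, each update collects 2,048 steps, splits them into 8 minibatches, and performs at most 160 gradient steps. 3,000,000 timesteps then correspond to 1,465 policy updates.
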

Policy Architecture:

- Feature Extractor Dimensions: 1024
- Network Architecture:
  - Policy Network (pi): [1024, 512, 256]
  - Value Function Network (vf): [1024, 512, 256]
- Activation Function: LeakyReLU
- Image Normalization: Disabled
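
The architecture settings above could be expressed as a `policy_kwargs` dictionary for Stable-Baselines3, roughly as sketched below. `POLICY_SPEC`, `to_policy_kwargs`, and `head_parameter_count` are illustrative names. The sketch assumes the default `CnnPolicy` feature extractor (which accepts `features_dim`) and the dictionary form of `net_arch` used by recent Stable-Baselines3 releases; the feature extractor actually used is not stated.

```python
# Architecture settings from the list above, kept as plain data.
POLICY_SPEC = {
    "features_dim": 1024,
    "pi": [1024, 512, 256],
    "vf": [1024, 512, 256],
    "activation": "LeakyReLU",
    "normalize_images": False,
}


def to_policy_kwargs(spec):
    """Translate POLICY_SPEC into a policy_kwargs dict for SB3's PPO.

    Assumes the default CnnPolicy feature extractor. Requires PyTorch.
    """
    from torch import nn  # imported lazily so the spec is usable without torch

    return {
        "features_extractor_kwargs": {"features_dim": spec["features_dim"]},
        "net_arch": {"pi": list(spec["pi"]), "vf": list(spec["vf"])},
        "activation_fn": getattr(nn, spec["activation"]),
        "normalize_images": spec["normalize_images"],
    }


def head_parameter_count(input_dim, hidden_sizes):
    """Count weights and biases of the fully connected layers in one head."""
    total = 0
    for size in hidden_sizes:
        total += input_dim * size + size
        input_dim = size
    return total


pi_params = head_parameter_count(POLICY_SPEC["features_dim"], POLICY_SPEC["pi"])
print(pi_params)

# Passing the result to PPO could look like:
#   model = PPO("CnnPolicy", env, policy_kwargs=to_policy_kwargs(POLICY_SPEC))
```

With a 1024-dimensional feature vector as input, each head's hidden layers hold 1,705,728 parameters, not counting the final action or value output layer.
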

Environment Configuration:

- Observation Dimensions: Adjusted to fit the game's requirements
- Frame Stacking: Implemented to provide temporal context
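
To illustrate what frame stacking does, here is a minimal, library-free sketch. The class name `FrameStacker` and the frame counts are hypothetical; the number of stacked frames and the wrapper actually used are not specified here. Stable-Baselines3 offers `VecFrameStack` for the same purpose on vectorized environments.

```python
from collections import deque


class FrameStacker:
    """Keep the most recent n_frames observations as one stacked observation."""

    def __init__(self, n_frames=4):  # 4 is a hypothetical default
        self.frames = deque(maxlen=n_frames)

    def reset(self, first_frame):
        # Repeat the first frame so the stack has full length from the
        # first step of an episode (one common convention).
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


stacker = FrameStacker(n_frames=3)
stacked = stacker.reset("f0")
stacked = stacker.step("f1")
stacked = stacker.step("f2")
stacked = stacker.step("f3")
print(stacked)  # ['f1', 'f2', 'f3']
```

Because the agent sees several consecutive frames at once, it can infer motion such as direction and speed that a single frame does not reveal.
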

Usage:

This model is designed to be integrated into the Cosmic Voyager game, enabling autonomous gameplay. For integration details and deployment instructions, please refer to the accompanying documentation.
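
As a generic sketch only, loading a saved agent and letting it play one episode could look like the following. The function names, the `path` argument, and the assumption that the game is wrapped in an environment following the Gymnasium API are all illustrative and not taken from the accompanying documentation.

```python
def load_agent(path):
    """Load a saved PPO agent. `path` is a placeholder for the checkpoint file."""
    from stable_baselines3 import PPO  # requires stable-baselines3

    return PPO.load(path)


def run_episode(model, env, max_steps=10_000):
    """Play one episode with deterministic actions and return the total reward.

    Assumes `env` follows the Gymnasium API:
    reset() -> (obs, info)
    step(action) -> (obs, reward, terminated, truncated, info)
    """
    obs, _info = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # predict() returns the action and the recurrent state (unused here).
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward
```

The environment passed to `run_episode` needs to produce observations with the same shape the agent was trained on, including any frame stacking.
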

Training Monitoring:

Training progress and metrics were tracked using Weights & Biases under the project 'Cosmic Voyager RL' by the entity 'andiB1293'.

Disclaimer:

This model is tailored specifically for the Cosmic Voyager game environment. Performance in different settings or games may vary. Users are advised to test the model thoroughly in their specific use cases.