|
[2024-10-23 06:06:45,610][02423] Saving configuration to /content/train_dir/default_experiment/config.json... |
|
[2024-10-23 06:06:45,615][02423] Rollout worker 0 uses device cpu |
|
[2024-10-23 06:06:45,617][02423] Rollout worker 1 uses device cpu |
|
[2024-10-23 06:06:45,618][02423] Rollout worker 2 uses device cpu |
|
[2024-10-23 06:06:45,620][02423] Rollout worker 3 uses device cpu |
|
[2024-10-23 06:06:45,621][02423] Rollout worker 4 uses device cpu |
|
[2024-10-23 06:06:45,622][02423] Rollout worker 5 uses device cpu |
|
[2024-10-23 06:06:45,623][02423] Rollout worker 6 uses device cpu |
|
[2024-10-23 06:06:45,624][02423] Rollout worker 7 uses device cpu |
|
[2024-10-23 06:06:45,780][02423] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-10-23 06:06:45,782][02423] InferenceWorker_p0-w0: min num requests: 2 |
|
[2024-10-23 06:06:45,816][02423] Starting all processes... |
|
[2024-10-23 06:06:45,817][02423] Starting process learner_proc0 |
|
[2024-10-23 06:06:47,863][02423] Starting all processes... |
|
[2024-10-23 06:06:47,873][02423] Starting process inference_proc0-0 |
|
[2024-10-23 06:06:47,874][02423] Starting process rollout_proc0 |
|
[2024-10-23 06:06:47,875][02423] Starting process rollout_proc1 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc2 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc3 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc4 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc5 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc6 |
|
[2024-10-23 06:06:47,876][02423] Starting process rollout_proc7 |
|
[2024-10-23 06:07:04,164][04577] Worker 2 uses CPU cores [0] |
|
[2024-10-23 06:07:04,167][04562] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-10-23 06:07:04,168][04562] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2024-10-23 06:07:04,227][04562] Num visible devices: 1 |
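
The learner process above pins itself to physical GPU 0 through CUDA_VISIBLE_DEVICES before initializing CUDA. A minimal sketch of that pattern, assuming PyTorch and that the variable is set before torch first touches the GPU (illustration only, not the Sample Factory code):

import os

# Restrict this process to physical GPU 0; inside the process it is visible as cuda:0.
# Must be set before torch initializes CUDA.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

print(torch.cuda.device_count())  # expected to report 1 visible device, as in the log
device = torch.device("cuda:0")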
|
[2024-10-23 06:07:04,264][04562] Starting seed is not provided |
|
[2024-10-23 06:07:04,265][04562] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-10-23 06:07:04,266][04562] Initializing actor-critic model on device cuda:0 |
|
[2024-10-23 06:07:04,268][04562] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-10-23 06:07:04,271][04562] RunningMeanStd input shape: (1,) |
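
The two RunningMeanStd instances above normalize the (3, 72, 128) image observations and the scalar returns. A minimal sketch of such a running mean/std tracker, assuming the standard parallel-variance update (an illustration, not the actual RunningMeanStdInPlace implementation):

import torch

class RunningMeanStd:
    def __init__(self, shape, eps=1e-5):
        self.mean = torch.zeros(shape)
        self.var = torch.ones(shape)
        self.count = eps

    def update(self, batch):  # batch: (N, *shape)
        b_mean = batch.mean(dim=0)
        b_var = batch.var(dim=0, unbiased=False)
        b_count = batch.shape[0]
        delta = b_mean - self.mean
        total = self.count + b_count
        # parallel mean/variance combination (Chan et al.)
        self.mean = self.mean + delta * b_count / total
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta.pow(2) * self.count * b_count / total) / total
        self.count = total

    def normalize(self, x):
        return (x - self.mean) / torch.sqrt(self.var + 1e-8)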
|
[2024-10-23 06:07:04,331][04579] Worker 3 uses CPU cores [1] |
|
[2024-10-23 06:07:04,351][04562] ConvEncoder: input_channels=3 |
|
[2024-10-23 06:07:04,390][04575] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-10-23 06:07:04,393][04575] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2024-10-23 06:07:04,477][04575] Num visible devices: 1 |
|
[2024-10-23 06:07:04,553][04578] Worker 1 uses CPU cores [1] |
|
[2024-10-23 06:07:04,627][04576] Worker 0 uses CPU cores [0] |
|
[2024-10-23 06:07:04,640][04581] Worker 5 uses CPU cores [1] |
|
[2024-10-23 06:07:04,642][04582] Worker 6 uses CPU cores [0] |
|
[2024-10-23 06:07:04,642][04580] Worker 4 uses CPU cores [0] |
|
[2024-10-23 06:07:04,701][04587] Worker 7 uses CPU cores [1] |
|
[2024-10-23 06:07:04,747][04562] Conv encoder output size: 512 |
|
[2024-10-23 06:07:04,748][04562] Policy head output size: 512 |
|
[2024-10-23 06:07:04,807][04562] Created Actor Critic model with architecture:
[2024-10-23 06:07:04,807][04562] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
|
[2024-10-23 06:07:05,109][04562] Using optimizer <class 'torch.optim.adam.Adam'> |
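
Putting the printed architecture together: a minimal PyTorch sketch of a shared-weights actor-critic with a conv encoder, GRU core, and linear value/action heads. The 512-dim core and the 5 actions come from the log; the conv filter sizes and the Adam learning rate are assumptions, not values read from the experiment config:

import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    def __init__(self, num_actions=5, hidden=512):
        super().__init__()
        # Conv encoder over (3, 72, 128) observations with ELU activations, as in the log.
        self.conv_head = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():
            conv_out = self.conv_head(torch.zeros(1, 3, 72, 128)).flatten(1).shape[1]
        self.mlp = nn.Sequential(nn.Linear(conv_out, hidden), nn.ELU())  # "Conv encoder output size: 512"
        self.core = nn.GRU(hidden, hidden)                 # recurrent core, GRU(512, 512)
        self.critic_linear = nn.Linear(hidden, 1)          # value head
        self.action_head = nn.Linear(hidden, num_actions)  # action logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)  # (seq=1, batch, hidden)
        x = x.squeeze(0)
        return self.action_head(x), self.critic_linear(x), rnn_state

device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = SharedActorCritic().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr=1e-4 is an assumption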
|
[2024-10-23 06:07:05,781][02423] Heartbeat connected on InferenceWorker_p0-w0 |
|
[2024-10-23 06:07:05,793][02423] Heartbeat connected on RolloutWorker_w1 |
|
[2024-10-23 06:07:05,796][02423] Heartbeat connected on RolloutWorker_w0 |
|
[2024-10-23 06:07:05,805][02423] Heartbeat connected on RolloutWorker_w3 |
|
[2024-10-23 06:07:05,807][02423] Heartbeat connected on RolloutWorker_w2 |
|
[2024-10-23 06:07:05,809][02423] Heartbeat connected on RolloutWorker_w4 |
|
[2024-10-23 06:07:05,814][02423] Heartbeat connected on RolloutWorker_w5 |
|
[2024-10-23 06:07:05,816][02423] Heartbeat connected on RolloutWorker_w6 |
|
[2024-10-23 06:07:05,826][02423] Heartbeat connected on RolloutWorker_w7 |
|
[2024-10-23 06:07:05,828][02423] Heartbeat connected on Batcher_0 |
|
[2024-10-23 06:07:06,234][04562] No checkpoints found |
|
[2024-10-23 06:07:06,234][04562] Did not load from checkpoint, starting from scratch! |
|
[2024-10-23 06:07:06,234][04562] Initialized policy 0 weights for model version 0 |
|
[2024-10-23 06:07:06,239][04562] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2024-10-23 06:07:06,247][04562] LearnerWorker_p0 finished initialization! |
|
[2024-10-23 06:07:06,248][02423] Heartbeat connected on LearnerWorker_p0 |
|
[2024-10-23 06:07:06,340][04575] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-10-23 06:07:06,342][04575] RunningMeanStd input shape: (1,) |
|
[2024-10-23 06:07:06,355][04575] ConvEncoder: input_channels=3 |
|
[2024-10-23 06:07:06,468][04575] Conv encoder output size: 512 |
|
[2024-10-23 06:07:06,468][04575] Policy head output size: 512 |
|
[2024-10-23 06:07:06,530][02423] Inference worker 0-0 is ready! |
|
[2024-10-23 06:07:06,532][02423] All inference workers are ready! Signal rollout workers to start! |
|
[2024-10-23 06:07:06,736][04581] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,742][04579] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,747][04576] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,743][04580] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,740][04578] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,750][04582] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,744][04587] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:06,751][04577] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:07:07,776][04581] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:07,774][04587] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:08,129][04580] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:08,136][04576] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:08,140][04577] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:08,739][04587] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:08,845][04579] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:09,431][04581] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:09,580][04576] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:09,587][04580] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:09,596][04577] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:09,957][02423] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2024-10-23 06:07:10,707][04587] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:11,189][04578] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:11,343][04579] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:11,909][04577] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:11,913][04580] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:11,915][04576] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:12,748][04581] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:12,856][04587] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:13,511][04578] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:14,732][04580] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:14,748][04576] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:14,786][04579] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:14,877][04581] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:14,957][02423] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2024-10-23 06:07:15,398][04577] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:15,966][04578] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:16,104][04579] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:16,207][04582] Decorrelating experience for 0 frames... |
|
[2024-10-23 06:07:18,652][04562] Signal inference workers to stop experience collection... |
|
[2024-10-23 06:07:18,672][04575] InferenceWorker_p0-w0: stopping experience collection |
|
[2024-10-23 06:07:18,717][04582] Decorrelating experience for 32 frames... |
|
[2024-10-23 06:07:18,899][04578] Decorrelating experience for 96 frames... |
|
[2024-10-23 06:07:19,165][04582] Decorrelating experience for 64 frames... |
|
[2024-10-23 06:07:19,527][04582] Decorrelating experience for 96 frames... |
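
The "Decorrelating experience" messages above show each rollout worker warming up in 32-frame increments (0, 32, 64, 96 frames) so that workers do not start their episodes in lockstep. A hedged sketch of that idea, assuming the classic Gym step API (illustration only, not the worker code):

def decorrelate_worker(env, num_frames):
    """Advance the environment num_frames steps with random actions so this
    worker's rollouts start out of phase with the other workers'."""
    obs = env.reset()
    for _ in range(num_frames):
        obs, reward, done, info = env.step(env.action_space.sample())
        if done:
            obs = env.reset()
    return obs

# Each worker would receive a different warm-up length, e.g.:
# for worker_idx, env in enumerate(envs):
#     decorrelate_worker(env, num_frames=32 * (worker_idx % 4))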
|
[2024-10-23 06:07:19,957][02423] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 218.8. Samples: 2188. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2024-10-23 06:07:19,964][02423] Avg episode reward: [(0, '2.375')] |
|
[2024-10-23 06:07:22,212][04562] Signal inference workers to resume experience collection... |
|
[2024-10-23 06:07:22,214][04575] InferenceWorker_p0-w0: resuming experience collection |
|
[2024-10-23 06:07:24,957][02423] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 16384. Throughput: 0: 309.5. Samples: 4642. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:07:24,973][02423] Avg episode reward: [(0, '3.430')] |
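
Each status line above reports frames-per-second over 10-, 60-, and 300-second windows plus total frames and throughput. A plausible sketch of how such windowed FPS figures can be derived from (wall-clock time, total frames) samples; the actual reporting code may differ:

import time
from collections import deque

class FpsTracker:
    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.history = deque()  # (timestamp, total_env_frames) samples

    def record(self, total_frames):
        now = time.time()
        self.history.append((now, total_frames))
        # keep only samples needed for the longest window
        while now - self.history[0][0] > max(self.windows):
            self.history.popleft()

    def fps(self):
        now, frames = self.history[-1]
        out = {}
        for w in self.windows:
            # oldest sample that still falls inside this window
            t0, f0 = next((t, f) for t, f in self.history if now - t <= w)
            out[w] = (frames - f0) / (now - t0) if now > t0 else float("nan")
        return out

# The very first report has only one sample, so every window is nan, as in the log.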
|
[2024-10-23 06:07:29,957][02423] Fps is (10 sec: 2867.2, 60 sec: 1433.6, 300 sec: 1433.6). Total num frames: 28672. Throughput: 0: 340.8. Samples: 6816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:07:29,962][02423] Avg episode reward: [(0, '3.746')] |
|
[2024-10-23 06:07:33,599][04575] Updated weights for policy 0, policy_version 10 (0.0164) |
|
[2024-10-23 06:07:34,957][02423] Fps is (10 sec: 2867.2, 60 sec: 1802.2, 300 sec: 1802.2). Total num frames: 45056. Throughput: 0: 434.3. Samples: 10858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:07:34,964][02423] Avg episode reward: [(0, '4.349')] |
|
[2024-10-23 06:07:39,957][02423] Fps is (10 sec: 4096.0, 60 sec: 2321.1, 300 sec: 2321.1). Total num frames: 69632. Throughput: 0: 594.6. Samples: 17838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:07:39,960][02423] Avg episode reward: [(0, '4.530')] |
|
[2024-10-23 06:07:42,485][04575] Updated weights for policy 0, policy_version 20 (0.0028) |
|
[2024-10-23 06:07:44,962][02423] Fps is (10 sec: 4094.1, 60 sec: 2457.3, 300 sec: 2457.3). Total num frames: 86016. Throughput: 0: 606.7. Samples: 21238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:07:44,965][02423] Avg episode reward: [(0, '4.472')] |
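
The "Policy #0 lag" figures in these reports measure how stale the consumed rollouts are: the learner's current policy version minus the version that generated each trajectory, aggregated over a recent batch (-1.0 appears before any trajectories exist). A small illustrative helper, inferred from the log rather than taken from the source:

def policy_lag_stats(current_version, rollout_versions):
    """min/avg/max of (current policy version - version used to collect each rollout)."""
    if not rollout_versions:
        return {"min": -1.0, "avg": -1.0, "max": -1.0}
    lags = [current_version - v for v in rollout_versions]
    return {"min": min(lags), "avg": sum(lags) / len(lags), "max": max(lags)}

# policy_lag_stats(20, [20, 19, 20, 18]) -> {'min': 0, 'avg': 0.75, 'max': 2}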
|
[2024-10-23 06:07:49,958][02423] Fps is (10 sec: 3276.7, 60 sec: 2560.0, 300 sec: 2560.0). Total num frames: 102400. Throughput: 0: 639.5. Samples: 25582. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:07:49,965][02423] Avg episode reward: [(0, '4.383')] |
|
[2024-10-23 06:07:49,975][04562] Saving new best policy, reward=4.383! |
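
Whenever the average episode reward beats the best value seen so far, the learner writes a separate "best policy" snapshot, as the message above shows. A minimal sketch of that bookkeeping, assuming a simple greater-than comparison (not the actual learner code):

class BestPolicyTracker:
    def __init__(self):
        self.best_reward = float("-inf")

    def maybe_save(self, avg_reward, save_fn):
        """save_fn() should snapshot the current policy weights."""
        if avg_reward > self.best_reward:
            self.best_reward = avg_reward
            save_fn()
            print(f"Saving new best policy, reward={avg_reward:.3f}!")

# tracker = BestPolicyTracker()
# tracker.maybe_save(4.383, save_fn=lambda: None)  # would produce the line above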
|
[2024-10-23 06:07:53,856][04575] Updated weights for policy 0, policy_version 30 (0.0031) |
|
[2024-10-23 06:07:54,957][02423] Fps is (10 sec: 4097.9, 60 sec: 2821.7, 300 sec: 2821.7). Total num frames: 126976. Throughput: 0: 710.8. Samples: 31988. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:07:54,959][02423] Avg episode reward: [(0, '4.476')] |
|
[2024-10-23 06:07:54,968][04562] Saving new best policy, reward=4.476! |
|
[2024-10-23 06:07:59,957][02423] Fps is (10 sec: 4505.8, 60 sec: 2949.1, 300 sec: 2949.1). Total num frames: 147456. Throughput: 0: 787.2. Samples: 35422. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:07:59,964][02423] Avg episode reward: [(0, '4.353')] |
|
[2024-10-23 06:08:04,957][02423] Fps is (10 sec: 3276.8, 60 sec: 2904.4, 300 sec: 2904.4). Total num frames: 159744. Throughput: 0: 850.5. Samples: 40462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:08:04,960][02423] Avg episode reward: [(0, '4.379')] |
|
[2024-10-23 06:08:05,210][04575] Updated weights for policy 0, policy_version 40 (0.0026) |
|
[2024-10-23 06:08:09,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3003.7, 300 sec: 3003.7). Total num frames: 180224. Throughput: 0: 916.0. Samples: 45860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:08:09,965][02423] Avg episode reward: [(0, '4.402')] |
|
[2024-10-23 06:08:14,612][04575] Updated weights for policy 0, policy_version 50 (0.0026) |
|
[2024-10-23 06:08:14,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3150.8). Total num frames: 204800. Throughput: 0: 944.6. Samples: 49322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:08:14,960][02423] Avg episode reward: [(0, '4.275')] |
|
[2024-10-23 06:08:19,959][02423] Fps is (10 sec: 4095.4, 60 sec: 3686.3, 300 sec: 3159.7). Total num frames: 221184. Throughput: 0: 990.7. Samples: 55442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:08:19,961][02423] Avg episode reward: [(0, '4.313')] |
|
[2024-10-23 06:08:24,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3167.6). Total num frames: 237568. Throughput: 0: 936.8. Samples: 59996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:08:24,960][02423] Avg episode reward: [(0, '4.489')] |
|
[2024-10-23 06:08:24,963][04562] Saving new best policy, reward=4.489! |
|
[2024-10-23 06:08:26,287][04575] Updated weights for policy 0, policy_version 60 (0.0022) |
|
[2024-10-23 06:08:29,960][02423] Fps is (10 sec: 4095.5, 60 sec: 3891.0, 300 sec: 3276.7). Total num frames: 262144. Throughput: 0: 937.0. Samples: 63400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:08:29,963][02423] Avg episode reward: [(0, '4.617')] |
|
[2024-10-23 06:08:29,976][04562] Saving new best policy, reward=4.617! |
|
[2024-10-23 06:08:34,957][02423] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3325.0). Total num frames: 282624. Throughput: 0: 991.9. Samples: 70218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:08:34,961][02423] Avg episode reward: [(0, '4.538')] |
|
[2024-10-23 06:08:36,232][04575] Updated weights for policy 0, policy_version 70 (0.0039) |
|
[2024-10-23 06:08:39,960][02423] Fps is (10 sec: 3276.9, 60 sec: 3754.5, 300 sec: 3276.7). Total num frames: 294912. Throughput: 0: 943.4. Samples: 74444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:08:39,965][02423] Avg episode reward: [(0, '4.467')] |
|
[2024-10-23 06:08:39,976][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth... |
|
[2024-10-23 06:08:44,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3823.2, 300 sec: 3319.9). Total num frames: 315392. Throughput: 0: 928.5. Samples: 77204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:08:44,962][02423] Avg episode reward: [(0, '4.366')] |
|
[2024-10-23 06:08:47,117][04575] Updated weights for policy 0, policy_version 80 (0.0031) |
|
[2024-10-23 06:08:49,957][02423] Fps is (10 sec: 4506.7, 60 sec: 3959.5, 300 sec: 3399.7). Total num frames: 339968. Throughput: 0: 971.4. Samples: 84176. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:08:49,964][02423] Avg episode reward: [(0, '4.452')] |
|
[2024-10-23 06:08:54,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3354.8). Total num frames: 352256. Throughput: 0: 965.0. Samples: 89284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:08:54,960][02423] Avg episode reward: [(0, '4.337')] |
|
[2024-10-23 06:08:58,907][04575] Updated weights for policy 0, policy_version 90 (0.0027) |
|
[2024-10-23 06:08:59,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3388.5). Total num frames: 372736. Throughput: 0: 934.7. Samples: 91382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:08:59,963][02423] Avg episode reward: [(0, '4.379')] |
|
[2024-10-23 06:09:04,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3419.3). Total num frames: 393216. Throughput: 0: 946.6. Samples: 98038. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-10-23 06:09:04,964][02423] Avg episode reward: [(0, '4.598')] |
|
[2024-10-23 06:09:08,030][04575] Updated weights for policy 0, policy_version 100 (0.0035) |
|
[2024-10-23 06:09:09,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3447.5). Total num frames: 413696. Throughput: 0: 977.6. Samples: 103986. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:09:09,960][02423] Avg episode reward: [(0, '4.564')] |
|
[2024-10-23 06:09:14,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3407.9). Total num frames: 425984. Throughput: 0: 947.7. Samples: 106044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:09:14,961][02423] Avg episode reward: [(0, '4.445')] |
|
[2024-10-23 06:09:19,443][04575] Updated weights for policy 0, policy_version 110 (0.0015) |
|
[2024-10-23 06:09:19,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3465.8). Total num frames: 450560. Throughput: 0: 931.3. Samples: 112128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:09:19,960][02423] Avg episode reward: [(0, '4.553')] |
|
[2024-10-23 06:09:24,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3489.2). Total num frames: 471040. Throughput: 0: 990.6. Samples: 119020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:09:24,963][02423] Avg episode reward: [(0, '4.602')] |
|
[2024-10-23 06:09:29,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3481.6). Total num frames: 487424. Throughput: 0: 977.1. Samples: 121172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:09:29,964][02423] Avg episode reward: [(0, '4.359')] |
|
[2024-10-23 06:09:30,791][04575] Updated weights for policy 0, policy_version 120 (0.0039) |
|
[2024-10-23 06:09:34,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3502.8). Total num frames: 507904. Throughput: 0: 938.1. Samples: 126390. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:09:34,961][02423] Avg episode reward: [(0, '4.406')] |
|
[2024-10-23 06:09:39,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3522.6). Total num frames: 528384. Throughput: 0: 973.6. Samples: 133096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:09:39,964][02423] Avg episode reward: [(0, '4.635')] |
|
[2024-10-23 06:09:39,974][04562] Saving new best policy, reward=4.635! |
|
[2024-10-23 06:09:40,314][04575] Updated weights for policy 0, policy_version 130 (0.0032) |
|
[2024-10-23 06:09:44,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3514.6). Total num frames: 544768. Throughput: 0: 987.9. Samples: 135836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:09:44,962][02423] Avg episode reward: [(0, '4.504')] |
|
[2024-10-23 06:09:49,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3507.2). Total num frames: 561152. Throughput: 0: 936.0. Samples: 140158. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:09:49,963][02423] Avg episode reward: [(0, '4.508')] |
|
[2024-10-23 06:09:51,966][04575] Updated weights for policy 0, policy_version 140 (0.0024) |
|
[2024-10-23 06:09:54,958][02423] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3549.9). Total num frames: 585728. Throughput: 0: 957.0. Samples: 147052. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:09:54,962][02423] Avg episode reward: [(0, '4.371')] |
|
[2024-10-23 06:09:59,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3565.9). Total num frames: 606208. Throughput: 0: 989.0. Samples: 150548. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:09:59,961][02423] Avg episode reward: [(0, '4.442')] |
|
[2024-10-23 06:10:02,493][04575] Updated weights for policy 0, policy_version 150 (0.0030) |
|
[2024-10-23 06:10:04,958][02423] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3534.2). Total num frames: 618496. Throughput: 0: 951.1. Samples: 154930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:10:04,963][02423] Avg episode reward: [(0, '4.456')] |
|
[2024-10-23 06:10:09,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3572.6). Total num frames: 643072. Throughput: 0: 934.8. Samples: 161084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:10:09,964][02423] Avg episode reward: [(0, '4.399')] |
|
[2024-10-23 06:10:12,446][04575] Updated weights for policy 0, policy_version 160 (0.0034) |
|
[2024-10-23 06:10:14,957][02423] Fps is (10 sec: 4505.9, 60 sec: 3959.5, 300 sec: 3586.8). Total num frames: 663552. Throughput: 0: 962.0. Samples: 164462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:10:14,960][02423] Avg episode reward: [(0, '4.539')] |
|
[2024-10-23 06:10:19,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3578.6). Total num frames: 679936. Throughput: 0: 964.4. Samples: 169788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:10:19,964][02423] Avg episode reward: [(0, '4.486')] |
|
[2024-10-23 06:10:24,186][04575] Updated weights for policy 0, policy_version 170 (0.0042) |
|
[2024-10-23 06:10:24,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3570.9). Total num frames: 696320. Throughput: 0: 932.6. Samples: 175064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:10:24,960][02423] Avg episode reward: [(0, '4.367')] |
|
[2024-10-23 06:10:29,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3604.5). Total num frames: 720896. Throughput: 0: 948.0. Samples: 178498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:10:29,963][02423] Avg episode reward: [(0, '4.524')] |
|
[2024-10-23 06:10:33,587][04575] Updated weights for policy 0, policy_version 180 (0.0025) |
|
[2024-10-23 06:10:34,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3596.5). Total num frames: 737280. Throughput: 0: 994.6. Samples: 184916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:10:34,960][02423] Avg episode reward: [(0, '4.792')] |
|
[2024-10-23 06:10:34,980][04562] Saving new best policy, reward=4.792! |
|
[2024-10-23 06:10:39,958][02423] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3588.9). Total num frames: 753664. Throughput: 0: 932.7. Samples: 189022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:10:39,960][02423] Avg episode reward: [(0, '4.678')] |
|
[2024-10-23 06:10:39,972][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000184_753664.pth... |
|
[2024-10-23 06:10:44,810][04575] Updated weights for policy 0, policy_version 190 (0.0025) |
|
[2024-10-23 06:10:44,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3619.7). Total num frames: 778240. Throughput: 0: 930.8. Samples: 192432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:10:44,960][02423] Avg episode reward: [(0, '4.679')] |
|
[2024-10-23 06:10:49,962][02423] Fps is (10 sec: 4503.6, 60 sec: 3959.2, 300 sec: 3630.5). Total num frames: 798720. Throughput: 0: 984.0. Samples: 199214. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:10:49,965][02423] Avg episode reward: [(0, '4.644')] |
|
[2024-10-23 06:10:54,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3604.5). Total num frames: 811008. Throughput: 0: 951.2. Samples: 203886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:10:54,962][02423] Avg episode reward: [(0, '4.515')] |
|
[2024-10-23 06:10:56,532][04575] Updated weights for policy 0, policy_version 200 (0.0030) |
|
[2024-10-23 06:10:59,957][02423] Fps is (10 sec: 3278.3, 60 sec: 3754.7, 300 sec: 3615.2). Total num frames: 831488. Throughput: 0: 932.3. Samples: 206414. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:10:59,964][02423] Avg episode reward: [(0, '4.510')] |
|
[2024-10-23 06:11:04,958][02423] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3642.8). Total num frames: 856064. Throughput: 0: 968.7. Samples: 213380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:11:04,964][02423] Avg episode reward: [(0, '4.488')] |
|
[2024-10-23 06:11:05,603][04575] Updated weights for policy 0, policy_version 210 (0.0033) |
|
[2024-10-23 06:11:09,958][02423] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3635.2). Total num frames: 872448. Throughput: 0: 971.7. Samples: 218790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:11:09,960][02423] Avg episode reward: [(0, '4.416')] |
|
[2024-10-23 06:11:14,957][02423] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3627.9). Total num frames: 888832. Throughput: 0: 943.8. Samples: 220968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:11:14,960][02423] Avg episode reward: [(0, '4.526')] |
|
[2024-10-23 06:11:17,201][04575] Updated weights for policy 0, policy_version 220 (0.0019) |
|
[2024-10-23 06:11:19,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3653.6). Total num frames: 913408. Throughput: 0: 942.0. Samples: 227304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:11:19,960][02423] Avg episode reward: [(0, '4.336')] |
|
[2024-10-23 06:11:24,958][02423] Fps is (10 sec: 4505.3, 60 sec: 3959.4, 300 sec: 3662.3). Total num frames: 933888. Throughput: 0: 996.3. Samples: 233858. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:11:24,960][02423] Avg episode reward: [(0, '4.488')] |
|
[2024-10-23 06:11:27,608][04575] Updated weights for policy 0, policy_version 230 (0.0027) |
|
[2024-10-23 06:11:29,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3639.1). Total num frames: 946176. Throughput: 0: 965.2. Samples: 235864. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:11:29,967][02423] Avg episode reward: [(0, '4.571')] |
|
[2024-10-23 06:11:34,957][02423] Fps is (10 sec: 2457.8, 60 sec: 3686.4, 300 sec: 3616.8). Total num frames: 958464. Throughput: 0: 894.3. Samples: 239452. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:11:34,963][02423] Avg episode reward: [(0, '4.495')] |
|
[2024-10-23 06:11:39,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3625.7). Total num frames: 978944. Throughput: 0: 920.5. Samples: 245310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:11:39,964][02423] Avg episode reward: [(0, '4.461')] |
|
[2024-10-23 06:11:40,534][04575] Updated weights for policy 0, policy_version 240 (0.0027) |
|
[2024-10-23 06:11:44,958][02423] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3619.4). Total num frames: 995328. Throughput: 0: 936.1. Samples: 248538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:11:44,963][02423] Avg episode reward: [(0, '4.646')] |
|
[2024-10-23 06:11:49,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3550.1, 300 sec: 3613.3). Total num frames: 1011712. Throughput: 0: 875.3. Samples: 252770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:11:49,964][02423] Avg episode reward: [(0, '4.719')] |
|
[2024-10-23 06:11:52,000][04575] Updated weights for policy 0, policy_version 250 (0.0034) |
|
[2024-10-23 06:11:54,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3636.1). Total num frames: 1036288. Throughput: 0: 908.7. Samples: 259682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:11:54,962][02423] Avg episode reward: [(0, '4.736')] |
|
[2024-10-23 06:11:59,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3644.0). Total num frames: 1056768. Throughput: 0: 936.8. Samples: 263126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:11:59,962][02423] Avg episode reward: [(0, '4.619')] |
|
[2024-10-23 06:12:02,103][04575] Updated weights for policy 0, policy_version 260 (0.0032) |
|
[2024-10-23 06:12:04,958][02423] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 1069056. Throughput: 0: 902.3. Samples: 267906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:12:04,965][02423] Avg episode reward: [(0, '4.774')] |
|
[2024-10-23 06:12:09,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 1093632. Throughput: 0: 885.1. Samples: 273686. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:12:09,965][02423] Avg episode reward: [(0, '4.624')] |
|
[2024-10-23 06:12:12,583][04575] Updated weights for policy 0, policy_version 270 (0.0024) |
|
[2024-10-23 06:12:14,957][02423] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1114112. Throughput: 0: 916.8. Samples: 277118. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:12:14,959][02423] Avg episode reward: [(0, '4.649')] |
|
[2024-10-23 06:12:19,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3776.6). Total num frames: 1130496. Throughput: 0: 965.2. Samples: 282886. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:12:19,967][02423] Avg episode reward: [(0, '4.727')] |
|
[2024-10-23 06:12:24,189][04575] Updated weights for policy 0, policy_version 280 (0.0028) |
|
[2024-10-23 06:12:24,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3804.4). Total num frames: 1150976. Throughput: 0: 945.5. Samples: 287856. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:12:24,959][02423] Avg episode reward: [(0, '4.868')] |
|
[2024-10-23 06:12:24,966][04562] Saving new best policy, reward=4.868! |
|
[2024-10-23 06:12:29,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 1171456. Throughput: 0: 949.6. Samples: 291270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:12:29,960][02423] Avg episode reward: [(0, '4.944')] |
|
[2024-10-23 06:12:29,972][04562] Saving new best policy, reward=4.944! |
|
[2024-10-23 06:12:33,267][04575] Updated weights for policy 0, policy_version 290 (0.0039) |
|
[2024-10-23 06:12:34,958][02423] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1191936. Throughput: 0: 994.2. Samples: 297508. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:12:34,961][02423] Avg episode reward: [(0, '4.741')] |
|
[2024-10-23 06:12:39,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 1204224. Throughput: 0: 934.9. Samples: 301754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:12:39,960][02423] Avg episode reward: [(0, '4.770')] |
|
[2024-10-23 06:12:39,973][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000294_1204224.pth... |
|
[2024-10-23 06:12:40,117][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth |
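
Checkpoint files in this log are named checkpoint_<policy_version>_<env_frames>.pth, and once a newer one is written the oldest is removed so only the most recent few remain. A hedged sketch of that save-and-rotate pattern (the keep-count and the naming code are inferred from the log, not Sample Factory's implementation):

from pathlib import Path
import torch

def save_and_rotate(checkpoint_dir, state_dict, policy_version, env_frames, keep_last=2):
    ckpt_dir = Path(checkpoint_dir)
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    name = f"checkpoint_{policy_version:09d}_{env_frames}.pth"
    torch.save(state_dict, ckpt_dir / name)           # e.g. checkpoint_000000294_1204224.pth
    # zero-padded versions sort lexicographically, so the oldest files come first
    checkpoints = sorted(ckpt_dir.glob("checkpoint_*.pth"))
    for old in checkpoints[:-keep_last]:
        old.unlink()                                   # e.g. removing checkpoint_000000072_294912.pth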
|
[2024-10-23 06:12:44,957][02423] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 1224704. Throughput: 0: 931.2. Samples: 305030. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:12:44,960][02423] Avg episode reward: [(0, '4.792')] |
|
[2024-10-23 06:12:44,954][04575] Updated weights for policy 0, policy_version 300 (0.0045) |
|
[2024-10-23 06:12:49,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 1249280. Throughput: 0: 978.5. Samples: 311940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:12:49,965][02423] Avg episode reward: [(0, '4.793')] |
|
[2024-10-23 06:12:54,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1265664. Throughput: 0: 954.8. Samples: 316652. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:12:54,960][02423] Avg episode reward: [(0, '4.941')] |
|
[2024-10-23 06:12:56,379][04575] Updated weights for policy 0, policy_version 310 (0.0018) |
|
[2024-10-23 06:12:59,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1282048. Throughput: 0: 934.5. Samples: 319172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:12:59,960][02423] Avg episode reward: [(0, '5.188')] |
|
[2024-10-23 06:12:59,969][04562] Saving new best policy, reward=5.188! |
|
[2024-10-23 06:13:04,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1306624. Throughput: 0: 957.2. Samples: 325962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:13:04,961][02423] Avg episode reward: [(0, '5.267')] |
|
[2024-10-23 06:13:04,967][04562] Saving new best policy, reward=5.267! |
|
[2024-10-23 06:13:05,565][04575] Updated weights for policy 0, policy_version 320 (0.0028) |
|
[2024-10-23 06:13:09,959][02423] Fps is (10 sec: 4095.2, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 1323008. Throughput: 0: 970.2. Samples: 331516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:13:09,964][02423] Avg episode reward: [(0, '5.165')] |
|
[2024-10-23 06:13:14,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 1339392. Throughput: 0: 939.2. Samples: 333532. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:13:14,965][02423] Avg episode reward: [(0, '5.008')] |
|
[2024-10-23 06:13:17,175][04575] Updated weights for policy 0, policy_version 330 (0.0026) |
|
[2024-10-23 06:13:19,957][02423] Fps is (10 sec: 4096.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1363968. Throughput: 0: 946.4. Samples: 340096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:13:19,963][02423] Avg episode reward: [(0, '4.939')] |
|
[2024-10-23 06:13:24,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3804.5). Total num frames: 1384448. Throughput: 0: 997.6. Samples: 346644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:13:24,959][02423] Avg episode reward: [(0, '5.283')] |
|
[2024-10-23 06:13:24,965][04562] Saving new best policy, reward=5.283! |
|
[2024-10-23 06:13:27,661][04575] Updated weights for policy 0, policy_version 340 (0.0016) |
|
[2024-10-23 06:13:29,961][02423] Fps is (10 sec: 3275.7, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 1396736. Throughput: 0: 970.1. Samples: 348686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:13:29,967][02423] Avg episode reward: [(0, '5.272')] |
|
[2024-10-23 06:13:34,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 1421312. Throughput: 0: 938.1. Samples: 354156. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:13:34,964][02423] Avg episode reward: [(0, '5.292')] |
|
[2024-10-23 06:13:34,966][04562] Saving new best policy, reward=5.292! |
|
[2024-10-23 06:13:37,812][04575] Updated weights for policy 0, policy_version 350 (0.0028) |
|
[2024-10-23 06:13:39,957][02423] Fps is (10 sec: 4507.1, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1441792. Throughput: 0: 984.1. Samples: 360938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:13:39,962][02423] Avg episode reward: [(0, '5.545')] |
|
[2024-10-23 06:13:39,974][04562] Saving new best policy, reward=5.545! |
|
[2024-10-23 06:13:44,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 1454080. Throughput: 0: 983.0. Samples: 363408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:13:44,961][02423] Avg episode reward: [(0, '5.731')] |
|
[2024-10-23 06:13:45,057][04562] Saving new best policy, reward=5.731! |
|
[2024-10-23 06:13:49,760][04575] Updated weights for policy 0, policy_version 360 (0.0018) |
|
[2024-10-23 06:13:49,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1474560. Throughput: 0: 932.4. Samples: 367922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:13:49,960][02423] Avg episode reward: [(0, '5.587')] |
|
[2024-10-23 06:13:54,958][02423] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1499136. Throughput: 0: 964.1. Samples: 374900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:13:54,960][02423] Avg episode reward: [(0, '5.555')] |
|
[2024-10-23 06:13:59,084][04575] Updated weights for policy 0, policy_version 370 (0.0021) |
|
[2024-10-23 06:13:59,961][02423] Fps is (10 sec: 4094.5, 60 sec: 3891.0, 300 sec: 3804.4). Total num frames: 1515520. Throughput: 0: 998.3. Samples: 378460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:13:59,966][02423] Avg episode reward: [(0, '5.953')] |
|
[2024-10-23 06:13:59,980][04562] Saving new best policy, reward=5.953! |
|
[2024-10-23 06:14:04,957][02423] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1531904. Throughput: 0: 944.4. Samples: 382594. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:14:04,959][02423] Avg episode reward: [(0, '6.249')] |
|
[2024-10-23 06:14:04,963][04562] Saving new best policy, reward=6.249! |
|
[2024-10-23 06:14:09,958][02423] Fps is (10 sec: 3687.6, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 1552384. Throughput: 0: 940.3. Samples: 388958. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:14:09,968][02423] Avg episode reward: [(0, '6.573')] |
|
[2024-10-23 06:14:09,983][04562] Saving new best policy, reward=6.573! |
|
[2024-10-23 06:14:10,309][04575] Updated weights for policy 0, policy_version 380 (0.0039) |
|
[2024-10-23 06:14:14,960][02423] Fps is (10 sec: 4504.4, 60 sec: 3959.3, 300 sec: 3818.3). Total num frames: 1576960. Throughput: 0: 971.6. Samples: 392406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:14:14,965][02423] Avg episode reward: [(0, '6.454')] |
|
[2024-10-23 06:14:19,960][02423] Fps is (10 sec: 3685.4, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 1589248. Throughput: 0: 961.9. Samples: 397446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:14:19,965][02423] Avg episode reward: [(0, '6.630')] |
|
[2024-10-23 06:14:19,980][04562] Saving new best policy, reward=6.630! |
|
[2024-10-23 06:14:22,007][04575] Updated weights for policy 0, policy_version 390 (0.0027) |
|
[2024-10-23 06:14:24,957][02423] Fps is (10 sec: 3277.7, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1609728. Throughput: 0: 933.8. Samples: 402958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:14:24,963][02423] Avg episode reward: [(0, '6.699')] |
|
[2024-10-23 06:14:24,966][04562] Saving new best policy, reward=6.699! |
|
[2024-10-23 06:14:29,957][02423] Fps is (10 sec: 4097.3, 60 sec: 3891.4, 300 sec: 3804.4). Total num frames: 1630208. Throughput: 0: 956.3. Samples: 406442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:14:29,960][02423] Avg episode reward: [(0, '7.707')] |
|
[2024-10-23 06:14:29,997][04562] Saving new best policy, reward=7.707! |
|
[2024-10-23 06:14:30,919][04575] Updated weights for policy 0, policy_version 400 (0.0051) |
|
[2024-10-23 06:14:34,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1646592. Throughput: 0: 989.9. Samples: 412468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:14:34,967][02423] Avg episode reward: [(0, '7.656')] |
|
[2024-10-23 06:14:39,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 1662976. Throughput: 0: 923.8. Samples: 416470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:14:39,960][02423] Avg episode reward: [(0, '8.041')] |
|
[2024-10-23 06:14:39,967][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000406_1662976.pth... |
|
[2024-10-23 06:14:40,088][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000184_753664.pth |
|
[2024-10-23 06:14:40,109][04562] Saving new best policy, reward=8.041! |
|
[2024-10-23 06:14:43,041][04575] Updated weights for policy 0, policy_version 410 (0.0029) |
|
[2024-10-23 06:14:44,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1687552. Throughput: 0: 918.7. Samples: 419800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:14:44,960][02423] Avg episode reward: [(0, '8.625')] |
|
[2024-10-23 06:14:44,965][04562] Saving new best policy, reward=8.625! |
|
[2024-10-23 06:14:49,962][02423] Fps is (10 sec: 4503.6, 60 sec: 3890.9, 300 sec: 3804.4). Total num frames: 1708032. Throughput: 0: 977.8. Samples: 426598. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:14:49,964][02423] Avg episode reward: [(0, '8.602')] |
|
[2024-10-23 06:14:54,013][04575] Updated weights for policy 0, policy_version 420 (0.0031) |
|
[2024-10-23 06:14:54,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.6). Total num frames: 1720320. Throughput: 0: 932.9. Samples: 430938. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:14:54,960][02423] Avg episode reward: [(0, '9.111')] |
|
[2024-10-23 06:14:54,962][04562] Saving new best policy, reward=9.111! |
|
[2024-10-23 06:14:59,957][02423] Fps is (10 sec: 3688.1, 60 sec: 3823.2, 300 sec: 3818.3). Total num frames: 1744896. Throughput: 0: 920.3. Samples: 433816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:14:59,960][02423] Avg episode reward: [(0, '8.381')] |
|
[2024-10-23 06:15:03,521][04575] Updated weights for policy 0, policy_version 430 (0.0013) |
|
[2024-10-23 06:15:04,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1765376. Throughput: 0: 963.5. Samples: 440800. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:15:04,960][02423] Avg episode reward: [(0, '9.526')] |
|
[2024-10-23 06:15:04,965][04562] Saving new best policy, reward=9.526! |
|
[2024-10-23 06:15:09,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1777664. Throughput: 0: 953.6. Samples: 445872. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:15:09,960][02423] Avg episode reward: [(0, '9.934')] |
|
[2024-10-23 06:15:09,980][04562] Saving new best policy, reward=9.934! |
|
[2024-10-23 06:15:14,958][02423] Fps is (10 sec: 3276.7, 60 sec: 3686.5, 300 sec: 3790.5). Total num frames: 1798144. Throughput: 0: 923.2. Samples: 447986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:15:14,961][02423] Avg episode reward: [(0, '10.156')] |
|
[2024-10-23 06:15:14,965][04562] Saving new best policy, reward=10.156! |
|
[2024-10-23 06:15:15,324][04575] Updated weights for policy 0, policy_version 440 (0.0045) |
|
[2024-10-23 06:15:19,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.4, 300 sec: 3818.3). Total num frames: 1822720. Throughput: 0: 942.3. Samples: 454872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:15:19,962][02423] Avg episode reward: [(0, '9.628')] |
|
[2024-10-23 06:15:24,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1839104. Throughput: 0: 989.9. Samples: 461014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:15:24,962][02423] Avg episode reward: [(0, '9.721')] |
|
[2024-10-23 06:15:25,027][04575] Updated weights for policy 0, policy_version 450 (0.0032) |
|
[2024-10-23 06:15:29,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1855488. Throughput: 0: 961.4. Samples: 463062. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2024-10-23 06:15:29,961][02423] Avg episode reward: [(0, '9.887')] |
|
[2024-10-23 06:15:34,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1880064. Throughput: 0: 946.5. Samples: 469186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:15:34,960][02423] Avg episode reward: [(0, '10.844')] |
|
[2024-10-23 06:15:34,964][04562] Saving new best policy, reward=10.844! |
|
[2024-10-23 06:15:35,696][04575] Updated weights for policy 0, policy_version 460 (0.0024) |
|
[2024-10-23 06:15:39,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 1900544. Throughput: 0: 1005.2. Samples: 476172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:15:39,963][02423] Avg episode reward: [(0, '11.371')] |
|
[2024-10-23 06:15:39,988][04562] Saving new best policy, reward=11.371! |
|
[2024-10-23 06:15:44,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 1916928. Throughput: 0: 984.9. Samples: 478138. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:15:44,960][02423] Avg episode reward: [(0, '11.664')] |
|
[2024-10-23 06:15:44,963][04562] Saving new best policy, reward=11.664! |
|
[2024-10-23 06:15:47,289][04575] Updated weights for policy 0, policy_version 470 (0.0024) |
|
[2024-10-23 06:15:49,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3755.0, 300 sec: 3804.4). Total num frames: 1933312. Throughput: 0: 942.8. Samples: 483228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:15:49,965][02423] Avg episode reward: [(0, '12.544')] |
|
[2024-10-23 06:15:49,977][04562] Saving new best policy, reward=12.544! |
|
[2024-10-23 06:15:54,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1957888. Throughput: 0: 987.5. Samples: 490310. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:15:54,959][02423] Avg episode reward: [(0, '12.760')] |
|
[2024-10-23 06:15:54,963][04562] Saving new best policy, reward=12.760! |
|
[2024-10-23 06:15:56,106][04575] Updated weights for policy 0, policy_version 480 (0.0028) |
|
[2024-10-23 06:15:59,958][02423] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1974272. Throughput: 0: 1004.2. Samples: 493174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:15:59,959][02423] Avg episode reward: [(0, '13.223')] |
|
[2024-10-23 06:15:59,971][04562] Saving new best policy, reward=13.223! |
|
[2024-10-23 06:16:04,962][02423] Fps is (10 sec: 3275.3, 60 sec: 3754.4, 300 sec: 3790.5). Total num frames: 1990656. Throughput: 0: 946.0. Samples: 497446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:16:04,964][02423] Avg episode reward: [(0, '13.716')] |
|
[2024-10-23 06:16:04,971][04562] Saving new best policy, reward=13.716! |
|
[2024-10-23 06:16:07,701][04575] Updated weights for policy 0, policy_version 490 (0.0050) |
|
[2024-10-23 06:16:09,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 2015232. Throughput: 0: 965.1. Samples: 504442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:16:09,960][02423] Avg episode reward: [(0, '12.883')] |
|
[2024-10-23 06:16:14,958][02423] Fps is (10 sec: 4507.2, 60 sec: 3959.4, 300 sec: 3804.4). Total num frames: 2035712. Throughput: 0: 993.4. Samples: 507766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:16:14,962][02423] Avg episode reward: [(0, '13.525')] |
|
[2024-10-23 06:16:18,622][04575] Updated weights for policy 0, policy_version 500 (0.0015) |
|
[2024-10-23 06:16:19,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 2048000. Throughput: 0: 962.2. Samples: 512484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:16:19,960][02423] Avg episode reward: [(0, '13.707')] |
|
[2024-10-23 06:16:24,957][02423] Fps is (10 sec: 3686.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2072576. Throughput: 0: 944.9. Samples: 518692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:16:24,963][02423] Avg episode reward: [(0, '14.630')] |
|
[2024-10-23 06:16:24,968][04562] Saving new best policy, reward=14.630! |
|
[2024-10-23 06:16:28,093][04575] Updated weights for policy 0, policy_version 510 (0.0015) |
|
[2024-10-23 06:16:29,957][02423] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 2097152. Throughput: 0: 978.7. Samples: 522178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:16:29,962][02423] Avg episode reward: [(0, '15.240')] |
|
[2024-10-23 06:16:29,974][04562] Saving new best policy, reward=15.240! |
|
[2024-10-23 06:16:34,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2109440. Throughput: 0: 987.3. Samples: 527658. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:16:34,965][02423] Avg episode reward: [(0, '14.847')] |
|
[2024-10-23 06:16:39,958][02423] Fps is (10 sec: 2457.3, 60 sec: 3686.3, 300 sec: 3818.3). Total num frames: 2121728. Throughput: 0: 916.1. Samples: 531536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:16:39,961][02423] Avg episode reward: [(0, '14.974')] |
|
[2024-10-23 06:16:39,977][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000518_2121728.pth... |
|
[2024-10-23 06:16:40,201][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000294_1204224.pth |
|
[2024-10-23 06:16:42,025][04575] Updated weights for policy 0, policy_version 520 (0.0025) |
|
[2024-10-23 06:16:44,958][02423] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 2138112. Throughput: 0: 895.5. Samples: 533470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:16:44,965][02423] Avg episode reward: [(0, '14.368')] |
|
[2024-10-23 06:16:49,957][02423] Fps is (10 sec: 3686.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 2158592. Throughput: 0: 940.9. Samples: 539780. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:16:49,963][02423] Avg episode reward: [(0, '13.964')] |
|
[2024-10-23 06:16:53,646][04575] Updated weights for policy 0, policy_version 530 (0.0022) |
|
[2024-10-23 06:16:54,957][02423] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 2174976. Throughput: 0: 879.0. Samples: 543998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:16:54,962][02423] Avg episode reward: [(0, '13.505')] |
|
[2024-10-23 06:16:59,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2199552. Throughput: 0: 884.0. Samples: 547544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:16:59,962][02423] Avg episode reward: [(0, '13.093')] |
|
[2024-10-23 06:17:02,679][04575] Updated weights for policy 0, policy_version 540 (0.0021) |
|
[2024-10-23 06:17:04,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3823.2, 300 sec: 3818.3). Total num frames: 2220032. Throughput: 0: 934.4. Samples: 554534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:17:04,963][02423] Avg episode reward: [(0, '14.717')] |
|
[2024-10-23 06:17:09,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 2232320. Throughput: 0: 902.0. Samples: 559280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:17:09,964][02423] Avg episode reward: [(0, '15.766')] |
|
[2024-10-23 06:17:09,977][04562] Saving new best policy, reward=15.766! |
|
[2024-10-23 06:17:14,029][04575] Updated weights for policy 0, policy_version 550 (0.0018) |
|
[2024-10-23 06:17:14,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3804.4). Total num frames: 2252800. Throughput: 0: 881.2. Samples: 561834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:17:14,964][02423] Avg episode reward: [(0, '17.158')] |
|
[2024-10-23 06:17:14,973][04562] Saving new best policy, reward=17.158! |
|
[2024-10-23 06:17:19,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2277376. Throughput: 0: 912.0. Samples: 568696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:17:19,964][02423] Avg episode reward: [(0, '18.543')] |
|
[2024-10-23 06:17:19,973][04562] Saving new best policy, reward=18.543! |
|
[2024-10-23 06:17:24,325][04575] Updated weights for policy 0, policy_version 560 (0.0015) |
|
[2024-10-23 06:17:24,964][02423] Fps is (10 sec: 4093.4, 60 sec: 3686.0, 300 sec: 3804.3). Total num frames: 2293760. Throughput: 0: 946.6. Samples: 574140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:17:24,968][02423] Avg episode reward: [(0, '18.514')] |
|
[2024-10-23 06:17:29,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 2310144. Throughput: 0: 951.1. Samples: 576270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:17:29,964][02423] Avg episode reward: [(0, '17.935')] |
|
[2024-10-23 06:17:34,606][04575] Updated weights for policy 0, policy_version 570 (0.0022) |
|
[2024-10-23 06:17:34,957][02423] Fps is (10 sec: 4098.6, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2334720. Throughput: 0: 961.8. Samples: 583060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:17:34,959][02423] Avg episode reward: [(0, '17.982')] |
|
[2024-10-23 06:17:39,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3832.2). Total num frames: 2355200. Throughput: 0: 1010.3. Samples: 589462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:17:39,963][02423] Avg episode reward: [(0, '18.187')] |
|
[2024-10-23 06:17:44,963][02423] Fps is (10 sec: 3275.0, 60 sec: 3822.6, 300 sec: 3790.5). Total num frames: 2367488. Throughput: 0: 977.4. Samples: 591534. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:17:44,966][02423] Avg episode reward: [(0, '18.160')] |
|
[2024-10-23 06:17:46,318][04575] Updated weights for policy 0, policy_version 580 (0.0020) |
|
[2024-10-23 06:17:49,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2392064. Throughput: 0: 949.6. Samples: 597268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:17:49,960][02423] Avg episode reward: [(0, '18.055')] |
|
[2024-10-23 06:17:54,921][04575] Updated weights for policy 0, policy_version 590 (0.0041) |
|
[2024-10-23 06:17:54,957][02423] Fps is (10 sec: 4918.0, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 2416640. Throughput: 0: 1000.1. Samples: 604286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:17:54,960][02423] Avg episode reward: [(0, '17.147')] |
|
[2024-10-23 06:17:59,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2428928. Throughput: 0: 998.4. Samples: 606764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:17:59,960][02423] Avg episode reward: [(0, '16.713')] |
|
[2024-10-23 06:18:04,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2449408. Throughput: 0: 956.6. Samples: 611744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:18:04,960][02423] Avg episode reward: [(0, '18.607')] |
|
[2024-10-23 06:18:04,966][04562] Saving new best policy, reward=18.607! |
|
[2024-10-23 06:18:06,363][04575] Updated weights for policy 0, policy_version 600 (0.0019) |
|
[2024-10-23 06:18:09,958][02423] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 2473984. Throughput: 0: 989.6. Samples: 618668. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:18:09,960][02423] Avg episode reward: [(0, '19.402')] |
|
[2024-10-23 06:18:09,968][04562] Saving new best policy, reward=19.402! |
|
[2024-10-23 06:18:14,960][02423] Fps is (10 sec: 4094.9, 60 sec: 3959.3, 300 sec: 3818.3). Total num frames: 2490368. Throughput: 0: 1014.7. Samples: 621936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:18:14,964][02423] Avg episode reward: [(0, '20.212')] |
|
[2024-10-23 06:18:14,966][04562] Saving new best policy, reward=20.212! |
|
[2024-10-23 06:18:17,432][04575] Updated weights for policy 0, policy_version 610 (0.0013) |
|
[2024-10-23 06:18:19,957][02423] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2506752. Throughput: 0: 955.1. Samples: 626040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:18:19,961][02423] Avg episode reward: [(0, '20.109')] |
|
[2024-10-23 06:18:24,957][02423] Fps is (10 sec: 3687.4, 60 sec: 3891.6, 300 sec: 3832.2). Total num frames: 2527232. Throughput: 0: 964.6. Samples: 632870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:18:24,961][02423] Avg episode reward: [(0, '21.515')] |
|
[2024-10-23 06:18:24,975][04562] Saving new best policy, reward=21.515! |
|
[2024-10-23 06:18:26,812][04575] Updated weights for policy 0, policy_version 620 (0.0018) |
|
[2024-10-23 06:18:29,958][02423] Fps is (10 sec: 4505.4, 60 sec: 4027.7, 300 sec: 3832.2). Total num frames: 2551808. Throughput: 0: 994.1. Samples: 636262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:18:29,962][02423] Avg episode reward: [(0, '20.593')] |
|
[2024-10-23 06:18:34,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2564096. Throughput: 0: 976.8. Samples: 641224. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:18:34,963][02423] Avg episode reward: [(0, '19.391')] |
|
[2024-10-23 06:18:38,096][04575] Updated weights for policy 0, policy_version 630 (0.0017) |
|
[2024-10-23 06:18:39,957][02423] Fps is (10 sec: 3686.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2588672. Throughput: 0: 957.6. Samples: 647376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:18:39,963][02423] Avg episode reward: [(0, '20.103')] |
|
[2024-10-23 06:18:39,977][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000632_2588672.pth... |
|
[2024-10-23 06:18:40,097][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000406_1662976.pth |
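
The two lines above show the learner's checkpoint rotation: each time a new checkpoint_p0/checkpoint_*.pth is written (here version 632 at 2,588,672 frames), the oldest checkpoint on disk is deleted (version 406). The same save/remove pair recurs later in the log. A minimal keep-last-N sketch of that pattern, using only the standard library plus torch for serialization — illustrative only, not Sample Factory's actual implementation — looks like this:

```python
# Illustrative keep-last-N checkpoint rotation (assumption: this mirrors the
# save/remove pattern visible in the log above, not Sample Factory's real code).
from pathlib import Path

import torch


def save_with_rotation(state_dict, ckpt_dir: Path, policy_version: int,
                       env_frames: int, keep_last: int = 2) -> Path:
    """Save a new checkpoint, then delete the oldest ones beyond `keep_last`."""
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    path = ckpt_dir / f"checkpoint_{policy_version:09d}_{env_frames}.pth"
    torch.save(state_dict, path)

    # Zero-padded version numbers make lexicographic order match chronological order.
    checkpoints = sorted(ckpt_dir.glob("checkpoint_*.pth"))
    for old in checkpoints[:-keep_last]:
        old.unlink()  # e.g. checkpoint_000000406_1662976.pth above
    return path
```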
|
[2024-10-23 06:18:44,957][02423] Fps is (10 sec: 4505.6, 60 sec: 4028.1, 300 sec: 3846.1). Total num frames: 2609152. Throughput: 0: 978.8. Samples: 650810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:18:44,965][02423] Avg episode reward: [(0, '18.968')] |
|
[2024-10-23 06:18:48,053][04575] Updated weights for policy 0, policy_version 640 (0.0032) |
|
[2024-10-23 06:18:49,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2625536. Throughput: 0: 987.7. Samples: 656192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:18:49,961][02423] Avg episode reward: [(0, '19.192')] |
|
[2024-10-23 06:18:54,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.4). Total num frames: 2641920. Throughput: 0: 954.9. Samples: 661640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:18:54,961][02423] Avg episode reward: [(0, '19.160')] |
|
[2024-10-23 06:18:58,591][04575] Updated weights for policy 0, policy_version 650 (0.0016) |
|
[2024-10-23 06:18:59,958][02423] Fps is (10 sec: 4095.9, 60 sec: 3959.4, 300 sec: 3846.1). Total num frames: 2666496. Throughput: 0: 957.3. Samples: 665014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:18:59,967][02423] Avg episode reward: [(0, '17.891')] |
|
[2024-10-23 06:19:04,958][02423] Fps is (10 sec: 4505.4, 60 sec: 3959.4, 300 sec: 3846.1). Total num frames: 2686976. Throughput: 0: 1010.6. Samples: 671518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:19:04,960][02423] Avg episode reward: [(0, '17.094')] |
|
[2024-10-23 06:19:09,785][04575] Updated weights for policy 0, policy_version 660 (0.0014) |
|
[2024-10-23 06:19:09,958][02423] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2703360. Throughput: 0: 959.9. Samples: 676068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:19:09,961][02423] Avg episode reward: [(0, '17.181')] |
|
[2024-10-23 06:19:14,957][02423] Fps is (10 sec: 3686.5, 60 sec: 3891.4, 300 sec: 3846.1). Total num frames: 2723840. Throughput: 0: 962.5. Samples: 679574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:19:14,960][02423] Avg episode reward: [(0, '17.352')] |
|
[2024-10-23 06:19:18,639][04575] Updated weights for policy 0, policy_version 670 (0.0051) |
|
[2024-10-23 06:19:19,957][02423] Fps is (10 sec: 4505.8, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 2748416. Throughput: 0: 1006.4. Samples: 686514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:19:19,964][02423] Avg episode reward: [(0, '18.003')] |
|
[2024-10-23 06:19:24,958][02423] Fps is (10 sec: 3686.1, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 2760704. Throughput: 0: 971.7. Samples: 691104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:19:24,965][02423] Avg episode reward: [(0, '17.405')] |
|
[2024-10-23 06:19:29,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 2781184. Throughput: 0: 956.3. Samples: 693842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:19:29,959][02423] Avg episode reward: [(0, '17.341')] |
|
[2024-10-23 06:19:30,208][04575] Updated weights for policy 0, policy_version 680 (0.0040) |
|
[2024-10-23 06:19:34,957][02423] Fps is (10 sec: 4506.0, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 2805760. Throughput: 0: 995.2. Samples: 700974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:19:34,960][02423] Avg episode reward: [(0, '18.399')] |
|
[2024-10-23 06:19:39,961][02423] Fps is (10 sec: 4094.5, 60 sec: 3891.0, 300 sec: 3846.0). Total num frames: 2822144. Throughput: 0: 993.8. Samples: 706364. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:19:39,963][02423] Avg episode reward: [(0, '19.398')] |
|
[2024-10-23 06:19:40,584][04575] Updated weights for policy 0, policy_version 690 (0.0041) |
|
[2024-10-23 06:19:44,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2838528. Throughput: 0: 967.7. Samples: 708562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:19:44,963][02423] Avg episode reward: [(0, '21.047')] |
|
[2024-10-23 06:19:49,957][02423] Fps is (10 sec: 4097.4, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2863104. Throughput: 0: 970.8. Samples: 715204. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:19:49,964][02423] Avg episode reward: [(0, '21.869')] |
|
[2024-10-23 06:19:49,976][04562] Saving new best policy, reward=21.869! |
|
[2024-10-23 06:19:50,412][04575] Updated weights for policy 0, policy_version 700 (0.0014) |
|
[2024-10-23 06:19:54,957][02423] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 2883584. Throughput: 0: 1009.9. Samples: 721514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:19:54,961][02423] Avg episode reward: [(0, '22.561')] |
|
[2024-10-23 06:19:54,966][04562] Saving new best policy, reward=22.561! |
|
[2024-10-23 06:19:59,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 2895872. Throughput: 0: 977.5. Samples: 723560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:19:59,960][02423] Avg episode reward: [(0, '22.237')] |
|
[2024-10-23 06:20:02,015][04575] Updated weights for policy 0, policy_version 710 (0.0024) |
|
[2024-10-23 06:20:04,958][02423] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2920448. Throughput: 0: 952.0. Samples: 729356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:20:04,963][02423] Avg episode reward: [(0, '20.936')] |
|
[2024-10-23 06:20:09,957][02423] Fps is (10 sec: 4915.2, 60 sec: 4027.8, 300 sec: 3887.7). Total num frames: 2945024. Throughput: 0: 1008.2. Samples: 736472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:20:09,962][02423] Avg episode reward: [(0, '19.727')] |
|
[2024-10-23 06:20:11,286][04575] Updated weights for policy 0, policy_version 720 (0.0024) |
|
[2024-10-23 06:20:14,957][02423] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2957312. Throughput: 0: 997.1. Samples: 738710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:20:14,965][02423] Avg episode reward: [(0, '18.845')] |
|
[2024-10-23 06:20:19,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2977792. Throughput: 0: 952.7. Samples: 743846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:20:19,964][02423] Avg episode reward: [(0, '19.081')] |
|
[2024-10-23 06:20:22,090][04575] Updated weights for policy 0, policy_version 730 (0.0027) |
|
[2024-10-23 06:20:24,957][02423] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3887.7). Total num frames: 3002368. Throughput: 0: 991.0. Samples: 750956. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2024-10-23 06:20:24,964][02423] Avg episode reward: [(0, '18.993')] |
|
[2024-10-23 06:20:29,958][02423] Fps is (10 sec: 4095.7, 60 sec: 3959.4, 300 sec: 3859.9). Total num frames: 3018752. Throughput: 0: 1010.9. Samples: 754052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:20:29,962][02423] Avg episode reward: [(0, '19.165')] |
|
[2024-10-23 06:20:33,454][04575] Updated weights for policy 0, policy_version 740 (0.0021) |
|
[2024-10-23 06:20:34,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3035136. Throughput: 0: 958.3. Samples: 758328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:20:34,963][02423] Avg episode reward: [(0, '19.191')] |
|
[2024-10-23 06:20:39,957][02423] Fps is (10 sec: 4096.2, 60 sec: 3959.7, 300 sec: 3873.8). Total num frames: 3059712. Throughput: 0: 975.6. Samples: 765416. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:20:39,963][02423] Avg episode reward: [(0, '20.332')] |
|
[2024-10-23 06:20:39,974][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000747_3059712.pth... |
|
[2024-10-23 06:20:40,095][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000518_2121728.pth |
|
[2024-10-23 06:20:42,359][04575] Updated weights for policy 0, policy_version 750 (0.0022) |
|
[2024-10-23 06:20:44,957][02423] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 3080192. Throughput: 0: 1005.7. Samples: 768818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:20:44,960][02423] Avg episode reward: [(0, '19.225')] |
|
[2024-10-23 06:20:49,958][02423] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3092480. Throughput: 0: 979.9. Samples: 773454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:20:49,960][02423] Avg episode reward: [(0, '19.567')] |
|
[2024-10-23 06:20:53,878][04575] Updated weights for policy 0, policy_version 760 (0.0042) |
|
[2024-10-23 06:20:54,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3117056. Throughput: 0: 957.6. Samples: 779566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:20:54,964][02423] Avg episode reward: [(0, '18.793')] |
|
[2024-10-23 06:20:59,960][02423] Fps is (10 sec: 4504.8, 60 sec: 4027.6, 300 sec: 3887.8). Total num frames: 3137536. Throughput: 0: 985.2. Samples: 783048. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-10-23 06:20:59,961][02423] Avg episode reward: [(0, '19.629')] |
|
[2024-10-23 06:21:04,132][04575] Updated weights for policy 0, policy_version 770 (0.0021) |
|
[2024-10-23 06:21:04,959][02423] Fps is (10 sec: 3685.9, 60 sec: 3891.1, 300 sec: 3859.9). Total num frames: 3153920. Throughput: 0: 996.0. Samples: 788668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:21:04,964][02423] Avg episode reward: [(0, '19.107')] |
|
[2024-10-23 06:21:09,958][02423] Fps is (10 sec: 3687.1, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3174400. Throughput: 0: 955.6. Samples: 793960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:21:09,960][02423] Avg episode reward: [(0, '20.237')] |
|
[2024-10-23 06:21:14,012][04575] Updated weights for policy 0, policy_version 780 (0.0014) |
|
[2024-10-23 06:21:14,957][02423] Fps is (10 sec: 4506.3, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 3198976. Throughput: 0: 964.0. Samples: 797432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:21:14,962][02423] Avg episode reward: [(0, '21.457')] |
|
[2024-10-23 06:21:19,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 3215360. Throughput: 0: 1012.7. Samples: 803898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:21:19,961][02423] Avg episode reward: [(0, '21.602')] |
|
[2024-10-23 06:21:24,958][02423] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3231744. Throughput: 0: 955.5. Samples: 808414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:21:24,961][02423] Avg episode reward: [(0, '20.754')] |
|
[2024-10-23 06:21:25,542][04575] Updated weights for policy 0, policy_version 790 (0.0023) |
|
[2024-10-23 06:21:29,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3256320. Throughput: 0: 958.0. Samples: 811930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:21:29,959][02423] Avg episode reward: [(0, '20.025')] |
|
[2024-10-23 06:21:34,293][04575] Updated weights for policy 0, policy_version 800 (0.0029) |
|
[2024-10-23 06:21:34,957][02423] Fps is (10 sec: 4505.9, 60 sec: 4027.7, 300 sec: 3915.5). Total num frames: 3276800. Throughput: 0: 1010.1. Samples: 818906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:21:34,963][02423] Avg episode reward: [(0, '19.187')] |
|
[2024-10-23 06:21:39,969][02423] Fps is (10 sec: 3682.0, 60 sec: 3890.4, 300 sec: 3915.3). Total num frames: 3293184. Throughput: 0: 979.7. Samples: 823664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:21:39,979][02423] Avg episode reward: [(0, '19.669')] |
|
[2024-10-23 06:21:44,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 3309568. Throughput: 0: 960.0. Samples: 826244. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:21:44,961][02423] Avg episode reward: [(0, '19.864')] |
|
[2024-10-23 06:21:46,686][04575] Updated weights for policy 0, policy_version 810 (0.0028) |
|
[2024-10-23 06:21:49,963][02423] Fps is (10 sec: 3278.9, 60 sec: 3890.9, 300 sec: 3901.5). Total num frames: 3325952. Throughput: 0: 945.2. Samples: 831206. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2024-10-23 06:21:49,965][02423] Avg episode reward: [(0, '19.906')] |
|
[2024-10-23 06:21:54,958][02423] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3860.0). Total num frames: 3338240. Throughput: 0: 926.4. Samples: 835646. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2024-10-23 06:21:54,964][02423] Avg episode reward: [(0, '20.415')] |
|
[2024-10-23 06:21:59,801][04575] Updated weights for policy 0, policy_version 820 (0.0051) |
|
[2024-10-23 06:21:59,957][02423] Fps is (10 sec: 3278.6, 60 sec: 3686.5, 300 sec: 3860.0). Total num frames: 3358720. Throughput: 0: 897.4. Samples: 837814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:21:59,965][02423] Avg episode reward: [(0, '19.662')] |
|
[2024-10-23 06:22:04,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3754.8, 300 sec: 3887.7). Total num frames: 3379200. Throughput: 0: 900.3. Samples: 844410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:22:04,963][02423] Avg episode reward: [(0, '19.811')] |
|
[2024-10-23 06:22:09,161][04575] Updated weights for policy 0, policy_version 830 (0.0042) |
|
[2024-10-23 06:22:09,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 3399680. Throughput: 0: 944.5. Samples: 850918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:22:09,965][02423] Avg episode reward: [(0, '19.361')] |
|
[2024-10-23 06:22:14,958][02423] Fps is (10 sec: 3686.1, 60 sec: 3618.1, 300 sec: 3859.9). Total num frames: 3416064. Throughput: 0: 912.4. Samples: 852990. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2024-10-23 06:22:14,966][02423] Avg episode reward: [(0, '19.862')] |
|
[2024-10-23 06:22:19,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3873.9). Total num frames: 3436544. Throughput: 0: 885.6. Samples: 858758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:22:19,961][02423] Avg episode reward: [(0, '20.360')] |
|
[2024-10-23 06:22:20,241][04575] Updated weights for policy 0, policy_version 840 (0.0031) |
|
[2024-10-23 06:22:24,958][02423] Fps is (10 sec: 4505.8, 60 sec: 3823.0, 300 sec: 3901.6). Total num frames: 3461120. Throughput: 0: 936.4. Samples: 865792. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:22:24,962][02423] Avg episode reward: [(0, '21.160')] |
|
[2024-10-23 06:22:29,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3860.0). Total num frames: 3473408. Throughput: 0: 933.2. Samples: 868236. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:22:29,962][02423] Avg episode reward: [(0, '21.745')] |
|
[2024-10-23 06:22:31,585][04575] Updated weights for policy 0, policy_version 850 (0.0034) |
|
[2024-10-23 06:22:34,957][02423] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3860.0). Total num frames: 3493888. Throughput: 0: 934.0. Samples: 873230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:22:34,962][02423] Avg episode reward: [(0, '22.391')] |
|
[2024-10-23 06:22:39,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3755.4, 300 sec: 3901.7). Total num frames: 3518464. Throughput: 0: 993.5. Samples: 880352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:22:39,965][02423] Avg episode reward: [(0, '24.164')] |
|
[2024-10-23 06:22:39,974][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000859_3518464.pth... |
|
[2024-10-23 06:22:40,097][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000632_2588672.pth |
|
[2024-10-23 06:22:40,107][04562] Saving new best policy, reward=24.164! |
|
[2024-10-23 06:22:40,463][04575] Updated weights for policy 0, policy_version 860 (0.0014) |
|
[2024-10-23 06:22:44,958][02423] Fps is (10 sec: 4095.8, 60 sec: 3754.6, 300 sec: 3873.8). Total num frames: 3534848. Throughput: 0: 1013.9. Samples: 883438. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:22:44,961][02423] Avg episode reward: [(0, '24.117')] |
|
[2024-10-23 06:22:49,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3755.0, 300 sec: 3846.1). Total num frames: 3551232. Throughput: 0: 959.6. Samples: 887592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:22:49,963][02423] Avg episode reward: [(0, '24.696')] |
|
[2024-10-23 06:22:49,973][04562] Saving new best policy, reward=24.696! |
|
[2024-10-23 06:22:52,069][04575] Updated weights for policy 0, policy_version 870 (0.0031) |
|
[2024-10-23 06:22:54,957][02423] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3575808. Throughput: 0: 963.3. Samples: 894268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:22:54,965][02423] Avg episode reward: [(0, '25.200')] |
|
[2024-10-23 06:22:54,968][04562] Saving new best policy, reward=25.200! |
|
[2024-10-23 06:22:59,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3596288. Throughput: 0: 991.8. Samples: 897622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:22:59,964][02423] Avg episode reward: [(0, '24.841')] |
|
[2024-10-23 06:23:02,516][04575] Updated weights for policy 0, policy_version 880 (0.0022) |
|
[2024-10-23 06:23:04,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3608576. Throughput: 0: 972.4. Samples: 902514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:23:04,959][02423] Avg episode reward: [(0, '24.853')] |
|
[2024-10-23 06:23:09,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.9). Total num frames: 3633152. Throughput: 0: 951.4. Samples: 908606. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:23:09,966][02423] Avg episode reward: [(0, '23.153')] |
|
[2024-10-23 06:23:12,514][04575] Updated weights for policy 0, policy_version 890 (0.0031) |
|
[2024-10-23 06:23:14,957][02423] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3653632. Throughput: 0: 973.6. Samples: 912050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:23:14,959][02423] Avg episode reward: [(0, '22.122')] |
|
[2024-10-23 06:23:19,959][02423] Fps is (10 sec: 3685.8, 60 sec: 3891.1, 300 sec: 3873.8). Total num frames: 3670016. Throughput: 0: 989.7. Samples: 917766. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:23:19,964][02423] Avg episode reward: [(0, '21.723')] |
|
[2024-10-23 06:23:23,920][04575] Updated weights for policy 0, policy_version 900 (0.0025) |
|
[2024-10-23 06:23:24,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3690496. Throughput: 0: 948.1. Samples: 923018. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:23:24,960][02423] Avg episode reward: [(0, '21.784')] |
|
[2024-10-23 06:23:29,957][02423] Fps is (10 sec: 4506.3, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 3715072. Throughput: 0: 957.1. Samples: 926508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:23:29,960][02423] Avg episode reward: [(0, '20.803')] |
|
[2024-10-23 06:23:32,490][04575] Updated weights for policy 0, policy_version 910 (0.0035) |
|
[2024-10-23 06:23:34,959][02423] Fps is (10 sec: 4095.2, 60 sec: 3959.3, 300 sec: 3873.8). Total num frames: 3731456. Throughput: 0: 1017.5. Samples: 933380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:23:34,962][02423] Avg episode reward: [(0, '21.325')] |
|
[2024-10-23 06:23:39,957][02423] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3747840. Throughput: 0: 965.7. Samples: 937724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2024-10-23 06:23:39,960][02423] Avg episode reward: [(0, '21.196')] |
|
[2024-10-23 06:23:43,759][04575] Updated weights for policy 0, policy_version 920 (0.0033) |
|
[2024-10-23 06:23:44,957][02423] Fps is (10 sec: 4096.8, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3772416. Throughput: 0: 970.9. Samples: 941314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:23:44,962][02423] Avg episode reward: [(0, '19.197')] |
|
[2024-10-23 06:23:49,963][02423] Fps is (10 sec: 4503.1, 60 sec: 4027.4, 300 sec: 3901.5). Total num frames: 3792896. Throughput: 0: 1013.0. Samples: 948104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:23:49,965][02423] Avg episode reward: [(0, '18.367')] |
|
[2024-10-23 06:23:54,948][04575] Updated weights for policy 0, policy_version 930 (0.0016) |
|
[2024-10-23 06:23:54,962][02423] Fps is (10 sec: 3684.6, 60 sec: 3890.9, 300 sec: 3873.8). Total num frames: 3809280. Throughput: 0: 978.2. Samples: 952630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:23:54,968][02423] Avg episode reward: [(0, '19.493')] |
|
[2024-10-23 06:23:59,957][02423] Fps is (10 sec: 3688.5, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3829760. Throughput: 0: 964.0. Samples: 955430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:23:59,964][02423] Avg episode reward: [(0, '19.746')] |
|
[2024-10-23 06:24:04,093][04575] Updated weights for policy 0, policy_version 940 (0.0024) |
|
[2024-10-23 06:24:04,957][02423] Fps is (10 sec: 4098.0, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 3850240. Throughput: 0: 992.7. Samples: 962434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:24:04,959][02423] Avg episode reward: [(0, '21.250')] |
|
[2024-10-23 06:24:09,957][02423] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 3866624. Throughput: 0: 996.1. Samples: 967842. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2024-10-23 06:24:09,961][02423] Avg episode reward: [(0, '22.391')] |
|
[2024-10-23 06:24:14,958][02423] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 3887104. Throughput: 0: 967.0. Samples: 970022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:24:14,960][02423] Avg episode reward: [(0, '24.176')] |
|
[2024-10-23 06:24:15,521][04575] Updated weights for policy 0, policy_version 950 (0.0029) |
|
[2024-10-23 06:24:19,959][02423] Fps is (10 sec: 4095.2, 60 sec: 3959.4, 300 sec: 3887.7). Total num frames: 3907584. Throughput: 0: 964.4. Samples: 976780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:24:19,965][02423] Avg episode reward: [(0, '23.984')] |
|
[2024-10-23 06:24:24,957][02423] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 3928064. Throughput: 0: 1009.5. Samples: 983152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:24:24,960][02423] Avg episode reward: [(0, '24.360')] |
|
[2024-10-23 06:24:25,369][04575] Updated weights for policy 0, policy_version 960 (0.0019) |
|
[2024-10-23 06:24:29,957][02423] Fps is (10 sec: 3687.1, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3944448. Throughput: 0: 974.4. Samples: 985162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2024-10-23 06:24:29,959][02423] Avg episode reward: [(0, '24.393')] |
|
[2024-10-23 06:24:34,957][02423] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3887.8). Total num frames: 3969024. Throughput: 0: 959.5. Samples: 991276. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2024-10-23 06:24:34,965][02423] Avg episode reward: [(0, '23.557')] |
|
[2024-10-23 06:24:35,869][04575] Updated weights for policy 0, policy_version 970 (0.0023) |
|
[2024-10-23 06:24:39,957][02423] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 3989504. Throughput: 0: 1013.7. Samples: 998242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2024-10-23 06:24:39,963][02423] Avg episode reward: [(0, '23.407')] |
|
[2024-10-23 06:24:39,976][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000974_3989504.pth... |
|
[2024-10-23 06:24:40,165][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000747_3059712.pth |
|
[2024-10-23 06:24:44,839][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2024-10-23 06:24:44,840][04562] Stopping Batcher_0... |
|
[2024-10-23 06:24:44,847][04562] Loop batcher_evt_loop terminating... |
|
[2024-10-23 06:24:44,858][02423] Component Batcher_0 stopped! |
|
[2024-10-23 06:24:44,989][04575] Weights refcount: 2 0 |
|
[2024-10-23 06:24:44,998][04575] Stopping InferenceWorker_p0-w0... |
|
[2024-10-23 06:24:44,998][02423] Component InferenceWorker_p0-w0 stopped! |
|
[2024-10-23 06:24:44,999][04575] Loop inference_proc0-0_evt_loop terminating... |
|
[2024-10-23 06:24:45,116][04562] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000859_3518464.pth |
|
[2024-10-23 06:24:45,130][04562] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2024-10-23 06:24:45,327][04562] Stopping LearnerWorker_p0... |
|
[2024-10-23 06:24:45,327][04562] Loop learner_proc0_evt_loop terminating... |
|
[2024-10-23 06:24:45,331][02423] Component LearnerWorker_p0 stopped! |
|
[2024-10-23 06:24:45,397][04580] Stopping RolloutWorker_w4... |
|
[2024-10-23 06:24:45,397][02423] Component RolloutWorker_w4 stopped! |
|
[2024-10-23 06:24:45,401][04580] Loop rollout_proc4_evt_loop terminating... |
|
[2024-10-23 06:24:45,423][02423] Component RolloutWorker_w2 stopped! |
|
[2024-10-23 06:24:45,425][04577] Stopping RolloutWorker_w2... |
|
[2024-10-23 06:24:45,427][04577] Loop rollout_proc2_evt_loop terminating... |
|
[2024-10-23 06:24:45,469][02423] Component RolloutWorker_w0 stopped! |
|
[2024-10-23 06:24:45,470][04579] Stopping RolloutWorker_w3... |
|
[2024-10-23 06:24:45,475][04579] Loop rollout_proc3_evt_loop terminating... |
|
[2024-10-23 06:24:45,471][02423] Component RolloutWorker_w3 stopped! |
|
[2024-10-23 06:24:45,477][04576] Stopping RolloutWorker_w0... |
|
[2024-10-23 06:24:45,478][04576] Loop rollout_proc0_evt_loop terminating... |
|
[2024-10-23 06:24:45,509][04578] Stopping RolloutWorker_w1... |
|
[2024-10-23 06:24:45,509][04578] Loop rollout_proc1_evt_loop terminating... |
|
[2024-10-23 06:24:45,509][02423] Component RolloutWorker_w1 stopped! |
|
[2024-10-23 06:24:45,514][04587] Stopping RolloutWorker_w7... |
|
[2024-10-23 06:24:45,518][04581] Stopping RolloutWorker_w5... |
|
[2024-10-23 06:24:45,518][04581] Loop rollout_proc5_evt_loop terminating... |
|
[2024-10-23 06:24:45,518][02423] Component RolloutWorker_w7 stopped! |
|
[2024-10-23 06:24:45,521][02423] Component RolloutWorker_w5 stopped! |
|
[2024-10-23 06:24:45,524][04587] Loop rollout_proc7_evt_loop terminating... |
|
[2024-10-23 06:24:45,628][02423] Component RolloutWorker_w6 stopped! |
|
[2024-10-23 06:24:45,632][02423] Waiting for process learner_proc0 to stop... |
|
[2024-10-23 06:24:45,634][04582] Stopping RolloutWorker_w6... |
|
[2024-10-23 06:24:45,635][04582] Loop rollout_proc6_evt_loop terminating... |
|
[2024-10-23 06:24:47,045][02423] Waiting for process inference_proc0-0 to join... |
|
[2024-10-23 06:24:47,051][02423] Waiting for process rollout_proc0 to join... |
|
[2024-10-23 06:24:49,122][02423] Waiting for process rollout_proc1 to join... |
|
[2024-10-23 06:24:49,124][02423] Waiting for process rollout_proc2 to join... |
|
[2024-10-23 06:24:49,129][02423] Waiting for process rollout_proc3 to join... |
|
[2024-10-23 06:24:49,134][02423] Waiting for process rollout_proc4 to join... |
|
[2024-10-23 06:24:49,138][02423] Waiting for process rollout_proc5 to join... |
|
[2024-10-23 06:24:49,140][02423] Waiting for process rollout_proc6 to join... |
|
[2024-10-23 06:24:49,144][02423] Waiting for process rollout_proc7 to join... |
|
[2024-10-23 06:24:49,147][02423] Batcher 0 profile tree view: |
|
batching: 27.3929, releasing_batches: 0.0267 |
|
[2024-10-23 06:24:49,149][02423] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0101 |
|
wait_policy_total: 405.7055 |
|
update_model: 8.9965 |
|
weight_update: 0.0032 |
|
one_step: 0.0027 |
|
handle_policy_step: 596.9414 |
|
deserialize: 15.0249, stack: 3.1059, obs_to_device_normalize: 121.6974, forward: 316.2653, send_messages: 29.0857 |
|
prepare_outputs: 82.3907 |
|
to_cpu: 47.0287 |
|
[2024-10-23 06:24:49,150][02423] Learner 0 profile tree view: |
|
misc: 0.0059, prepare_batch: 13.5154 |
|
train: 73.5838 |
|
epoch_init: 0.0090, minibatch_init: 0.0096, losses_postprocess: 0.6684, kl_divergence: 0.6118, after_optimizer: 33.5494 |
|
calculate_losses: 26.0542 |
|
losses_init: 0.0036, forward_head: 1.1607, bptt_initial: 17.4307, tail: 1.0999, advantages_returns: 0.2699, losses: 3.8101 |
|
bptt: 1.9995 |
|
bptt_forward_core: 1.9155 |
|
update: 11.9769 |
|
clip: 0.9129 |
|
[2024-10-23 06:24:49,152][02423] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.3280, enqueue_policy_requests: 96.6741, env_step: 828.3052, overhead: 12.7697, complete_rollouts: 7.2172 |
|
save_policy_outputs: 20.7797 |
|
split_output_tensors: 8.5742 |
|
[2024-10-23 06:24:49,153][02423] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.3792, enqueue_policy_requests: 100.0160, env_step: 818.1125, overhead: 13.3557, complete_rollouts: 7.2359 |
|
save_policy_outputs: 20.7195 |
|
split_output_tensors: 8.1594 |
|
[2024-10-23 06:24:49,156][02423] Loop Runner_EvtLoop terminating... |
|
[2024-10-23 06:24:49,157][02423] Runner profile tree view: |
|
main_loop: 1083.3421 |
|
[2024-10-23 06:24:49,159][02423] Collected {0: 4005888}, FPS: 3697.7 |
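
The final summary line is consistent with the Runner profile just above it: roughly 4,005,888 environment frames were collected over a main_loop wall-clock time of 1083.3421 s, which works out to the reported throughput. A one-line check of that arithmetic:

```python
# Sanity-check the reported training throughput from the log's own numbers.
total_frames = 4_005_888          # "Collected {0: 4005888}"
main_loop_seconds = 1083.3421     # "main_loop: 1083.3421" in the Runner profile
print(total_frames / main_loop_seconds)  # ~3697.7, matching "FPS: 3697.7"
```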
|
[2024-10-23 06:25:04,821][02423] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2024-10-23 06:25:04,823][02423] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2024-10-23 06:25:04,825][02423] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2024-10-23 06:25:04,828][02423] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2024-10-23 06:25:04,829][02423] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-10-23 06:25:04,831][02423] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2024-10-23 06:25:04,832][02423] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-10-23 06:25:04,833][02423] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2024-10-23 06:25:04,834][02423] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2024-10-23 06:25:04,835][02423] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2024-10-23 06:25:04,836][02423] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2024-10-23 06:25:04,838][02423] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2024-10-23 06:25:04,839][02423] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2024-10-23 06:25:04,840][02423] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2024-10-23 06:25:04,841][02423] Using frameskip 1 and render_action_repeat=4 for evaluation |
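
The block above shows how the evaluation run reuses the training setup: the saved config.json is loaded, 'num_workers' is overridden from the command line, and evaluation-only arguments that were not in the saved file (no_render, save_video, max_num_episodes, and so on) are added on top. A minimal load-then-override sketch using only the standard library — the helper name is hypothetical and the keys are taken from the messages above, so this is illustrative rather than Sample Factory's actual loader:

```python
# Minimal sketch of "load saved config, then apply command-line overrides",
# mirroring the log messages above (illustrative; not the library's real code).
import json
from pathlib import Path


def load_eval_config(config_path: str, overrides: dict) -> dict:
    cfg = json.loads(Path(config_path).read_text())
    for key, value in overrides.items():
        if key in cfg:
            print(f"Overriding arg '{key}' with value {value} passed from command line")
        else:
            print(f"Adding new argument '{key}'={value} that is not in the saved config file!")
        cfg[key] = value
    return cfg


cfg = load_eval_config(
    "/content/train_dir/default_experiment/config.json",
    {
        "num_workers": 1,
        "no_render": True,
        "save_video": True,
        "max_num_episodes": 10,
        "eval_deterministic": False,
    },
)
```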
|
[2024-10-23 06:25:04,881][02423] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2024-10-23 06:25:04,885][02423] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-10-23 06:25:04,887][02423] RunningMeanStd input shape: (1,) |
|
[2024-10-23 06:25:04,904][02423] ConvEncoder: input_channels=3 |
|
[2024-10-23 06:25:05,004][02423] Conv encoder output size: 512 |
|
[2024-10-23 06:25:05,006][02423] Policy head output size: 512 |
|
[2024-10-23 06:25:05,179][02423] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2024-10-23 06:25:06,004][02423] Num frames 100... |
|
[2024-10-23 06:25:06,131][02423] Num frames 200... |
|
[2024-10-23 06:25:06,263][02423] Num frames 300... |
|
[2024-10-23 06:25:06,389][02423] Num frames 400... |
|
[2024-10-23 06:25:06,515][02423] Num frames 500... |
|
[2024-10-23 06:25:06,638][02423] Num frames 600... |
|
[2024-10-23 06:25:06,760][02423] Num frames 700... |
|
[2024-10-23 06:25:06,896][02423] Num frames 800... |
|
[2024-10-23 06:25:07,020][02423] Num frames 900... |
|
[2024-10-23 06:25:07,106][02423] Avg episode rewards: #0: 21.240, true rewards: #0: 9.240 |
|
[2024-10-23 06:25:07,107][02423] Avg episode reward: 21.240, avg true_objective: 9.240 |
|
[2024-10-23 06:25:07,206][02423] Num frames 1000... |
|
[2024-10-23 06:25:07,348][02423] Num frames 1100... |
|
[2024-10-23 06:25:07,527][02423] Num frames 1200... |
|
[2024-10-23 06:25:07,691][02423] Num frames 1300... |
|
[2024-10-23 06:25:07,875][02423] Num frames 1400... |
|
[2024-10-23 06:25:08,045][02423] Num frames 1500... |
|
[2024-10-23 06:25:08,218][02423] Num frames 1600... |
|
[2024-10-23 06:25:08,389][02423] Num frames 1700... |
|
[2024-10-23 06:25:08,566][02423] Num frames 1800... |
|
[2024-10-23 06:25:08,739][02423] Num frames 1900... |
|
[2024-10-23 06:25:08,917][02423] Num frames 2000... |
|
[2024-10-23 06:25:09,092][02423] Num frames 2100... |
|
[2024-10-23 06:25:09,272][02423] Num frames 2200... |
|
[2024-10-23 06:25:09,449][02423] Num frames 2300... |
|
[2024-10-23 06:25:09,669][02423] Avg episode rewards: #0: 27.980, true rewards: #0: 11.980 |
|
[2024-10-23 06:25:09,672][02423] Avg episode reward: 27.980, avg true_objective: 11.980 |
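
The "Avg episode rewards" lines appear to be running means over the episodes evaluated so far, so individual episode returns can be recovered from consecutive averages: after episode 1 the average is 21.240 and after episode 2 it is 27.980, which implies episode 2 alone scored about 34.72 (and about 14.72 on the true objective). A small check of that arithmetic:

```python
# Recover the second episode's return from the running averages printed above.
avg_after_ep1, avg_after_ep2 = 21.240, 27.980
print(2 * avg_after_ep2 - avg_after_ep1)  # 34.72

true_after_ep1, true_after_ep2 = 9.240, 11.980
print(2 * true_after_ep2 - true_after_ep1)  # 14.72
```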
|
[2024-10-23 06:25:09,684][02423] Num frames 2400... |
|
[2024-10-23 06:25:09,804][02423] Num frames 2500... |
|
[2024-10-23 06:25:09,924][02423] Num frames 2600... |
|
[2024-10-23 06:25:10,052][02423] Num frames 2700... |
|
[2024-10-23 06:25:10,175][02423] Num frames 2800... |
|
[2024-10-23 06:25:10,298][02423] Num frames 2900... |
|
[2024-10-23 06:25:10,422][02423] Num frames 3000... |
|
[2024-10-23 06:25:10,554][02423] Num frames 3100... |
|
[2024-10-23 06:25:10,676][02423] Num frames 3200... |
|
[2024-10-23 06:25:10,796][02423] Num frames 3300... |
|
[2024-10-23 06:25:10,922][02423] Num frames 3400... |
|
[2024-10-23 06:25:11,051][02423] Num frames 3500... |
|
[2024-10-23 06:25:11,173][02423] Num frames 3600... |
|
[2024-10-23 06:25:11,299][02423] Num frames 3700... |
|
[2024-10-23 06:25:11,427][02423] Num frames 3800... |
|
[2024-10-23 06:25:11,523][02423] Avg episode rewards: #0: 30.753, true rewards: #0: 12.753 |
|
[2024-10-23 06:25:11,524][02423] Avg episode reward: 30.753, avg true_objective: 12.753 |
|
[2024-10-23 06:25:11,618][02423] Num frames 3900... |
|
[2024-10-23 06:25:11,736][02423] Num frames 4000... |
|
[2024-10-23 06:25:11,855][02423] Num frames 4100... |
|
[2024-10-23 06:25:11,974][02423] Num frames 4200... |
|
[2024-10-23 06:25:12,102][02423] Num frames 4300... |
|
[2024-10-23 06:25:12,224][02423] Num frames 4400... |
|
[2024-10-23 06:25:12,349][02423] Num frames 4500... |
|
[2024-10-23 06:25:12,522][02423] Avg episode rewards: #0: 26.235, true rewards: #0: 11.485 |
|
[2024-10-23 06:25:12,524][02423] Avg episode reward: 26.235, avg true_objective: 11.485 |
|
[2024-10-23 06:25:12,534][02423] Num frames 4600... |
|
[2024-10-23 06:25:12,657][02423] Num frames 4700... |
|
[2024-10-23 06:25:12,777][02423] Num frames 4800... |
|
[2024-10-23 06:25:12,900][02423] Num frames 4900... |
|
[2024-10-23 06:25:13,030][02423] Num frames 5000... |
|
[2024-10-23 06:25:13,154][02423] Num frames 5100... |
|
[2024-10-23 06:25:13,274][02423] Num frames 5200... |
|
[2024-10-23 06:25:13,395][02423] Num frames 5300... |
|
[2024-10-23 06:25:13,532][02423] Num frames 5400... |
|
[2024-10-23 06:25:13,655][02423] Num frames 5500... |
|
[2024-10-23 06:25:13,776][02423] Num frames 5600... |
|
[2024-10-23 06:25:13,896][02423] Num frames 5700... |
|
[2024-10-23 06:25:14,044][02423] Avg episode rewards: #0: 25.956, true rewards: #0: 11.556 |
|
[2024-10-23 06:25:14,045][02423] Avg episode reward: 25.956, avg true_objective: 11.556 |
|
[2024-10-23 06:25:14,075][02423] Num frames 5800... |
|
[2024-10-23 06:25:14,196][02423] Num frames 5900... |
|
[2024-10-23 06:25:14,322][02423] Num frames 6000... |
|
[2024-10-23 06:25:14,443][02423] Num frames 6100... |
|
[2024-10-23 06:25:14,570][02423] Num frames 6200... |
|
[2024-10-23 06:25:14,700][02423] Num frames 6300... |
|
[2024-10-23 06:25:14,821][02423] Num frames 6400... |
|
[2024-10-23 06:25:14,942][02423] Num frames 6500... |
|
[2024-10-23 06:25:15,070][02423] Num frames 6600... |
|
[2024-10-23 06:25:15,197][02423] Num frames 6700... |
|
[2024-10-23 06:25:15,321][02423] Num frames 6800... |
|
[2024-10-23 06:25:15,446][02423] Num frames 6900... |
|
[2024-10-23 06:25:15,588][02423] Num frames 7000... |
|
[2024-10-23 06:25:15,712][02423] Num frames 7100... |
|
[2024-10-23 06:25:15,834][02423] Num frames 7200... |
|
[2024-10-23 06:25:15,956][02423] Num frames 7300... |
|
[2024-10-23 06:25:16,088][02423] Num frames 7400... |
|
[2024-10-23 06:25:16,217][02423] Num frames 7500... |
|
[2024-10-23 06:25:16,340][02423] Num frames 7600... |
|
[2024-10-23 06:25:16,485][02423] Num frames 7700... |
|
[2024-10-23 06:25:16,607][02423] Num frames 7800... |
|
[2024-10-23 06:25:16,735][02423] Avg episode rewards: #0: 30.430, true rewards: #0: 13.097 |
|
[2024-10-23 06:25:16,737][02423] Avg episode reward: 30.430, avg true_objective: 13.097 |
|
[2024-10-23 06:25:16,793][02423] Num frames 7900... |
|
[2024-10-23 06:25:16,917][02423] Num frames 8000... |
|
[2024-10-23 06:25:17,040][02423] Num frames 8100... |
|
[2024-10-23 06:25:17,173][02423] Num frames 8200... |
|
[2024-10-23 06:25:17,295][02423] Num frames 8300... |
|
[2024-10-23 06:25:17,419][02423] Num frames 8400... |
|
[2024-10-23 06:25:17,548][02423] Num frames 8500... |
|
[2024-10-23 06:25:17,666][02423] Num frames 8600... |
|
[2024-10-23 06:25:17,782][02423] Num frames 8700... |
|
[2024-10-23 06:25:17,903][02423] Num frames 8800... |
|
[2024-10-23 06:25:18,022][02423] Num frames 8900... |
|
[2024-10-23 06:25:18,148][02423] Num frames 9000... |
|
[2024-10-23 06:25:18,218][02423] Avg episode rewards: #0: 29.157, true rewards: #0: 12.871 |
|
[2024-10-23 06:25:18,220][02423] Avg episode reward: 29.157, avg true_objective: 12.871 |
|
[2024-10-23 06:25:18,334][02423] Num frames 9100... |
|
[2024-10-23 06:25:18,459][02423] Num frames 9200... |
|
[2024-10-23 06:25:18,584][02423] Num frames 9300... |
|
[2024-10-23 06:25:18,704][02423] Num frames 9400... |
|
[2024-10-23 06:25:18,846][02423] Num frames 9500... |
|
[2024-10-23 06:25:18,977][02423] Num frames 9600... |
|
[2024-10-23 06:25:19,098][02423] Num frames 9700... |
|
[2024-10-23 06:25:19,229][02423] Num frames 9800... |
|
[2024-10-23 06:25:19,353][02423] Num frames 9900... |
|
[2024-10-23 06:25:19,417][02423] Avg episode rewards: #0: 27.507, true rewards: #0: 12.382 |
|
[2024-10-23 06:25:19,418][02423] Avg episode reward: 27.507, avg true_objective: 12.382 |
|
[2024-10-23 06:25:19,540][02423] Num frames 10000... |
|
[2024-10-23 06:25:19,660][02423] Num frames 10100... |
|
[2024-10-23 06:25:19,826][02423] Num frames 10200... |
|
[2024-10-23 06:25:19,995][02423] Num frames 10300... |
|
[2024-10-23 06:25:20,158][02423] Num frames 10400... |
|
[2024-10-23 06:25:20,338][02423] Num frames 10500... |
|
[2024-10-23 06:25:20,531][02423] Avg episode rewards: #0: 25.642, true rewards: #0: 11.753 |
|
[2024-10-23 06:25:20,533][02423] Avg episode reward: 25.642, avg true_objective: 11.753 |
|
[2024-10-23 06:25:20,577][02423] Num frames 10600... |
|
[2024-10-23 06:25:20,737][02423] Num frames 10700... |
|
[2024-10-23 06:25:20,904][02423] Num frames 10800... |
|
[2024-10-23 06:25:21,080][02423] Num frames 10900... |
|
[2024-10-23 06:25:21,268][02423] Num frames 11000... |
|
[2024-10-23 06:25:21,440][02423] Num frames 11100... |
|
[2024-10-23 06:25:21,622][02423] Num frames 11200... |
|
[2024-10-23 06:25:21,765][02423] Avg episode rewards: #0: 24.450, true rewards: #0: 11.250 |
|
[2024-10-23 06:25:21,767][02423] Avg episode reward: 24.450, avg true_objective: 11.250 |
|
[2024-10-23 06:26:30,662][02423] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
[2024-10-23 06:31:08,734][02423] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2024-10-23 06:31:08,736][02423] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2024-10-23 06:31:08,738][02423] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2024-10-23 06:31:08,740][02423] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2024-10-23 06:31:08,741][02423] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2024-10-23 06:31:08,743][02423] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2024-10-23 06:31:08,745][02423] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2024-10-23 06:31:08,747][02423] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2024-10-23 06:31:08,747][02423] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2024-10-23 06:31:08,749][02423] Adding new argument 'hf_repository'='bcyeung/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2024-10-23 06:31:08,750][02423] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2024-10-23 06:31:08,751][02423] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2024-10-23 06:31:08,752][02423] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2024-10-23 06:31:08,753][02423] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2024-10-23 06:31:08,754][02423] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2024-10-23 06:31:08,782][02423] RunningMeanStd input shape: (3, 72, 128) |
|
[2024-10-23 06:31:08,783][02423] RunningMeanStd input shape: (1,) |
|
[2024-10-23 06:31:08,796][02423] ConvEncoder: input_channels=3 |
|
[2024-10-23 06:31:08,832][02423] Conv encoder output size: 512 |
|
[2024-10-23 06:31:08,833][02423] Policy head output size: 512 |
|
[2024-10-23 06:31:08,853][02423] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2024-10-23 06:31:09,281][02423] Num frames 100... |
|
[2024-10-23 06:31:09,403][02423] Num frames 200... |
|
[2024-10-23 06:31:09,547][02423] Num frames 300... |
|
[2024-10-23 06:31:09,852][02423] Num frames 400... |
|
[2024-10-23 06:31:10,039][02423] Num frames 500... |
|
[2024-10-23 06:31:10,226][02423] Num frames 600... |
|
[2024-10-23 06:31:10,487][02423] Num frames 700... |
|
[2024-10-23 06:31:10,752][02423] Avg episode rewards: #0: 14.740, true rewards: #0: 7.740 |
|
[2024-10-23 06:31:10,757][02423] Avg episode reward: 14.740, avg true_objective: 7.740 |
|
[2024-10-23 06:31:10,851][02423] Num frames 800... |
|
[2024-10-23 06:31:11,125][02423] Num frames 900... |
|
[2024-10-23 06:31:11,359][02423] Num frames 1000... |
|
[2024-10-23 06:31:11,648][02423] Num frames 1100... |
|
[2024-10-23 06:31:11,856][02423] Num frames 1200... |
|
[2024-10-23 06:31:12,158][02423] Num frames 1300... |
|
[2024-10-23 06:31:12,448][02423] Num frames 1400... |
|
[2024-10-23 06:31:12,679][02423] Num frames 1500... |
|
[2024-10-23 06:31:12,953][02423] Num frames 1600... |
|
[2024-10-23 06:31:13,304][02423] Num frames 1700... |
|
[2024-10-23 06:31:13,737][02423] Avg episode rewards: #0: 18.990, true rewards: #0: 8.990 |
|
[2024-10-23 06:31:13,742][02423] Avg episode reward: 18.990, avg true_objective: 8.990 |
|
[2024-10-23 06:31:13,748][02423] Num frames 1800... |
|
[2024-10-23 06:31:13,912][02423] Num frames 1900... |
|
[2024-10-23 06:31:14,080][02423] Num frames 2000... |
|
[2024-10-23 06:31:14,258][02423] Num frames 2100... |
|
[2024-10-23 06:31:14,426][02423] Num frames 2200... |
|
[2024-10-23 06:31:14,606][02423] Num frames 2300... |
|
[2024-10-23 06:31:14,782][02423] Num frames 2400... |
|
[2024-10-23 06:31:14,952][02423] Num frames 2500... |
|
[2024-10-23 06:31:15,126][02423] Num frames 2600... |
|
[2024-10-23 06:31:15,308][02423] Num frames 2700... |
|
[2024-10-23 06:31:15,487][02423] Num frames 2800... |
|
[2024-10-23 06:31:15,664][02423] Num frames 2900... |
|
[2024-10-23 06:31:15,796][02423] Num frames 3000... |
|
[2024-10-23 06:31:15,917][02423] Num frames 3100... |
|
[2024-10-23 06:31:16,006][02423] Avg episode rewards: #0: 24.083, true rewards: #0: 10.417 |
|
[2024-10-23 06:31:16,008][02423] Avg episode reward: 24.083, avg true_objective: 10.417 |
|
[2024-10-23 06:31:16,100][02423] Num frames 3200... |
|
[2024-10-23 06:31:16,220][02423] Num frames 3300... |
|
[2024-10-23 06:31:16,352][02423] Num frames 3400... |
|
[2024-10-23 06:31:16,486][02423] Num frames 3500... |
|
[2024-10-23 06:31:16,611][02423] Num frames 3600... |
|
[2024-10-23 06:31:16,736][02423] Num frames 3700... |
|
[2024-10-23 06:31:16,860][02423] Num frames 3800... |
|
[2024-10-23 06:31:16,985][02423] Num frames 3900... |
|
[2024-10-23 06:31:17,107][02423] Num frames 4000... |
|
[2024-10-23 06:31:17,273][02423] Avg episode rewards: #0: 22.963, true rewards: #0: 10.212 |
|
[2024-10-23 06:31:17,276][02423] Avg episode reward: 22.963, avg true_objective: 10.212 |
|
[2024-10-23 06:31:17,298][02423] Num frames 4100... |
|
[2024-10-23 06:31:17,428][02423] Num frames 4200... |
|
[2024-10-23 06:31:17,564][02423] Num frames 4300... |
|
[2024-10-23 06:31:17,687][02423] Num frames 4400... |
|
[2024-10-23 06:31:17,819][02423] Num frames 4500... |
|
[2024-10-23 06:31:17,939][02423] Num frames 4600... |
|
[2024-10-23 06:31:18,058][02423] Num frames 4700... |
|
[2024-10-23 06:31:18,181][02423] Num frames 4800... |
|
[2024-10-23 06:31:18,304][02423] Num frames 4900... |
|
[2024-10-23 06:31:18,383][02423] Avg episode rewards: #0: 21.234, true rewards: #0: 9.834 |
|
[2024-10-23 06:31:18,385][02423] Avg episode reward: 21.234, avg true_objective: 9.834 |
|
[2024-10-23 06:31:18,497][02423] Num frames 5000... |
|
[2024-10-23 06:31:18,621][02423] Num frames 5100... |
|
[2024-10-23 06:31:18,741][02423] Num frames 5200... |
|
[2024-10-23 06:31:18,866][02423] Num frames 5300... |
|
[2024-10-23 06:31:18,984][02423] Num frames 5400... |
|
[2024-10-23 06:31:19,105][02423] Num frames 5500... |
|
[2024-10-23 06:31:19,203][02423] Avg episode rewards: #0: 19.558, true rewards: #0: 9.225 |
|
[2024-10-23 06:31:19,205][02423] Avg episode reward: 19.558, avg true_objective: 9.225 |
|
[2024-10-23 06:31:19,289][02423] Num frames 5600... |
|
[2024-10-23 06:31:19,427][02423] Num frames 5700... |
|
[2024-10-23 06:31:19,561][02423] Num frames 5800... |
|
[2024-10-23 06:31:19,687][02423] Num frames 5900... |
|
[2024-10-23 06:31:19,811][02423] Num frames 6000... |
|
[2024-10-23 06:31:19,934][02423] Num frames 6100... |
|
[2024-10-23 06:31:20,054][02423] Num frames 6200... |
|
[2024-10-23 06:31:20,174][02423] Num frames 6300... |
|
[2024-10-23 06:31:20,297][02423] Num frames 6400... |
|
[2024-10-23 06:31:20,437][02423] Num frames 6500... |
|
[2024-10-23 06:31:20,532][02423] Avg episode rewards: #0: 20.039, true rewards: #0: 9.324 |
|
[2024-10-23 06:31:20,533][02423] Avg episode reward: 20.039, avg true_objective: 9.324 |
|
[2024-10-23 06:31:20,626][02423] Num frames 6600... |
|
[2024-10-23 06:31:20,746][02423] Num frames 6700... |
|
[2024-10-23 06:31:20,869][02423] Num frames 6800... |
|
[2024-10-23 06:31:20,993][02423] Num frames 6900... |
|
[2024-10-23 06:31:21,112][02423] Num frames 7000... |
|
[2024-10-23 06:31:21,261][02423] Avg episode rewards: #0: 18.719, true rewards: #0: 8.844 |
|
[2024-10-23 06:31:21,262][02423] Avg episode reward: 18.719, avg true_objective: 8.844 |
|
[2024-10-23 06:31:21,296][02423] Num frames 7100... |
|
[2024-10-23 06:31:21,419][02423] Num frames 7200... |
|
[2024-10-23 06:31:21,562][02423] Num frames 7300... |
|
[2024-10-23 06:31:21,684][02423] Num frames 7400... |
|
[2024-10-23 06:31:21,805][02423] Num frames 7500... |
|
[2024-10-23 06:31:21,926][02423] Num frames 7600... |
|
[2024-10-23 06:31:22,047][02423] Num frames 7700... |
|
[2024-10-23 06:31:22,172][02423] Num frames 7800... |
|
[2024-10-23 06:31:22,295][02423] Num frames 7900... |
|
[2024-10-23 06:31:22,436][02423] Avg episode rewards: #0: 18.852, true rewards: #0: 8.852 |
|
[2024-10-23 06:31:22,438][02423] Avg episode reward: 18.852, avg true_objective: 8.852 |
|
[2024-10-23 06:31:22,502][02423] Num frames 8000... |
|
[2024-10-23 06:31:22,630][02423] Num frames 8100... |
|
[2024-10-23 06:31:22,755][02423] Num frames 8200... |
|
[2024-10-23 06:31:22,877][02423] Num frames 8300... |
|
[2024-10-23 06:31:23,001][02423] Num frames 8400... |
|
[2024-10-23 06:31:23,120][02423] Num frames 8500... |
|
[2024-10-23 06:31:23,240][02423] Num frames 8600... |
|
[2024-10-23 06:31:23,369][02423] Num frames 8700... |
|
[2024-10-23 06:31:23,499][02423] Num frames 8800... |
|
[2024-10-23 06:31:23,633][02423] Num frames 8900... |
|
[2024-10-23 06:31:23,756][02423] Num frames 9000... |
|
[2024-10-23 06:31:23,883][02423] Num frames 9100... |
|
[2024-10-23 06:31:24,004][02423] Num frames 9200... |
|
[2024-10-23 06:31:24,124][02423] Num frames 9300... |
|
[2024-10-23 06:31:24,245][02423] Num frames 9400... |
|
[2024-10-23 06:31:24,347][02423] Avg episode rewards: #0: 20.039, true rewards: #0: 9.439 |
|
[2024-10-23 06:31:24,349][02423] Avg episode reward: 20.039, avg true_objective: 9.439 |
|
[2024-10-23 06:32:24,767][02423] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
|