Stoub · Upload folder using huggingface_hub · commit 6d29fae (verified) · 106 kB
[2025-01-15 14:26:06,217][00319] Saving configuration to /content/train_dir/default_experiment/config.json...
[2025-01-15 14:26:06,224][00319] Rollout worker 0 uses device cpu
[2025-01-15 14:26:06,227][00319] Rollout worker 1 uses device cpu
[2025-01-15 14:26:06,229][00319] Rollout worker 2 uses device cpu
[2025-01-15 14:26:06,231][00319] Rollout worker 3 uses device cpu
[2025-01-15 14:26:06,233][00319] Rollout worker 4 uses device cpu
[2025-01-15 14:26:06,235][00319] Rollout worker 5 uses device cpu
[2025-01-15 14:26:06,237][00319] Rollout worker 6 uses device cpu
[2025-01-15 14:26:06,240][00319] Rollout worker 7 uses device cpu
[2025-01-15 14:26:06,423][00319] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-01-15 14:26:06,426][00319] InferenceWorker_p0-w0: min num requests: 2
[2025-01-15 14:26:06,470][00319] Starting all processes...
[2025-01-15 14:26:06,473][00319] Starting process learner_proc0
[2025-01-15 14:26:06,535][00319] Starting all processes...
[2025-01-15 14:26:06,592][00319] Starting process inference_proc0-0
[2025-01-15 14:26:06,593][00319] Starting process rollout_proc0
[2025-01-15 14:26:06,595][00319] Starting process rollout_proc1
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc2
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc3
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc4
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc5
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc6
[2025-01-15 14:26:06,598][00319] Starting process rollout_proc7
[2025-01-15 14:26:24,019][01020] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-01-15 14:26:24,019][01020] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2025-01-15 14:26:24,073][01020] Num visible devices: 1
[2025-01-15 14:26:24,134][01020] Starting seed is not provided
[2025-01-15 14:26:24,134][01020] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-01-15 14:26:24,135][01020] Initializing actor-critic model on device cuda:0
[2025-01-15 14:26:24,137][01020] RunningMeanStd input shape: (3, 72, 128)
[2025-01-15 14:26:24,141][01020] RunningMeanStd input shape: (1,)
[2025-01-15 14:26:24,203][01020] ConvEncoder: input_channels=3
[2025-01-15 14:26:24,479][01040] Worker 4 uses CPU cores [0]
[2025-01-15 14:26:24,740][01037] Worker 0 uses CPU cores [0]
[2025-01-15 14:26:24,747][01042] Worker 6 uses CPU cores [0]
[2025-01-15 14:26:24,789][01038] Worker 3 uses CPU cores [1]
[2025-01-15 14:26:24,792][01036] Worker 1 uses CPU cores [1]
[2025-01-15 14:26:24,818][01043] Worker 7 uses CPU cores [1]
[2025-01-15 14:26:24,827][01041] Worker 5 uses CPU cores [1]
[2025-01-15 14:26:24,853][01039] Worker 2 uses CPU cores [0]
[2025-01-15 14:26:24,864][01035] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-01-15 14:26:24,864][01035] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2025-01-15 14:26:24,884][01035] Num visible devices: 1
[2025-01-15 14:26:24,910][01020] Conv encoder output size: 512
[2025-01-15 14:26:24,911][01020] Policy head output size: 512
[2025-01-15 14:26:24,978][01020] Created Actor Critic model with architecture:
[2025-01-15 14:26:24,978][01020] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
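The `Conv encoder output size: 512` reported above follows from plain shape arithmetic on the conv stack. A minimal sketch of that arithmetic, assuming Sample Factory's default `convnet_simple` filters (32 channels, 8×8 kernel, stride 4; 64, 4×4, stride 2; 128, 3×3, stride 2) applied to the resized (3, 72, 128) observation:

```python
# Spatial size after an unpadded Conv2d layer: floor((n - kernel) / stride) + 1.
def conv2d_out(hw, kernel, stride):
    h, w = hw
    return ((h - kernel) // stride + 1, (w - kernel) // stride + 1)

hw = (72, 128)  # resized Doom frame, channels-first input (3, 72, 128)
layers = [(32, 8, 4), (64, 4, 2), (128, 3, 2)]  # (out_channels, kernel, stride)
for channels, kernel, stride in layers:
    hw = conv2d_out(hw, kernel, stride)

flattened = 128 * hw[0] * hw[1]  # flattened conv output fed to mlp_layers
print(hw, flattened)
```

The final spatial map is 128 × 3 × 6 = 2304 features, and the `mlp_layers` Linear then maps 2304 → 512, matching the encoder and policy-head sizes in the log.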
[2025-01-15 14:26:25,304][01020] Using optimizer <class 'torch.optim.adam.Adam'>
[2025-01-15 14:26:26,413][00319] Heartbeat connected on Batcher_0
[2025-01-15 14:26:26,424][00319] Heartbeat connected on InferenceWorker_p0-w0
[2025-01-15 14:26:26,435][00319] Heartbeat connected on RolloutWorker_w0
[2025-01-15 14:26:26,440][00319] Heartbeat connected on RolloutWorker_w1
[2025-01-15 14:26:26,450][00319] Heartbeat connected on RolloutWorker_w3
[2025-01-15 14:26:26,452][00319] Heartbeat connected on RolloutWorker_w2
[2025-01-15 14:26:26,458][00319] Heartbeat connected on RolloutWorker_w4
[2025-01-15 14:26:26,462][00319] Heartbeat connected on RolloutWorker_w5
[2025-01-15 14:26:26,468][00319] Heartbeat connected on RolloutWorker_w6
[2025-01-15 14:26:26,472][00319] Heartbeat connected on RolloutWorker_w7
[2025-01-15 14:26:28,661][01020] No checkpoints found
[2025-01-15 14:26:28,661][01020] Did not load from checkpoint, starting from scratch!
[2025-01-15 14:26:28,662][01020] Initialized policy 0 weights for model version 0
[2025-01-15 14:26:28,666][01020] LearnerWorker_p0 finished initialization!
[2025-01-15 14:26:28,668][01020] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-01-15 14:26:28,668][00319] Heartbeat connected on LearnerWorker_p0
[2025-01-15 14:26:28,763][01035] RunningMeanStd input shape: (3, 72, 128)
[2025-01-15 14:26:28,765][01035] RunningMeanStd input shape: (1,)
[2025-01-15 14:26:28,778][01035] ConvEncoder: input_channels=3
[2025-01-15 14:26:28,880][01035] Conv encoder output size: 512
[2025-01-15 14:26:28,880][01035] Policy head output size: 512
[2025-01-15 14:26:28,933][00319] Inference worker 0-0 is ready!
[2025-01-15 14:26:28,935][00319] All inference workers are ready! Signal rollout workers to start!
[2025-01-15 14:26:29,150][01040] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,151][01037] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,152][01042] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,153][01039] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,165][01036] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,166][01041] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,167][01038] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,168][01043] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:26:29,871][01041] Decorrelating experience for 0 frames...
[2025-01-15 14:26:30,392][01039] Decorrelating experience for 0 frames...
[2025-01-15 14:26:30,390][01040] Decorrelating experience for 0 frames...
[2025-01-15 14:26:30,396][01037] Decorrelating experience for 0 frames...
[2025-01-15 14:26:30,547][00319] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-01-15 14:26:30,767][01041] Decorrelating experience for 32 frames...
[2025-01-15 14:26:30,805][01036] Decorrelating experience for 0 frames...
[2025-01-15 14:26:31,126][01040] Decorrelating experience for 32 frames...
[2025-01-15 14:26:31,136][01039] Decorrelating experience for 32 frames...
[2025-01-15 14:26:31,625][01038] Decorrelating experience for 0 frames...
[2025-01-15 14:26:31,849][01036] Decorrelating experience for 32 frames...
[2025-01-15 14:26:32,152][01041] Decorrelating experience for 64 frames...
[2025-01-15 14:26:32,452][01037] Decorrelating experience for 32 frames...
[2025-01-15 14:26:32,451][01042] Decorrelating experience for 0 frames...
[2025-01-15 14:26:32,836][01040] Decorrelating experience for 64 frames...
[2025-01-15 14:26:33,256][01038] Decorrelating experience for 32 frames...
[2025-01-15 14:26:33,351][01039] Decorrelating experience for 64 frames...
[2025-01-15 14:26:33,589][01042] Decorrelating experience for 32 frames...
[2025-01-15 14:26:33,678][01036] Decorrelating experience for 64 frames...
[2025-01-15 14:26:33,800][01041] Decorrelating experience for 96 frames...
[2025-01-15 14:26:34,020][01043] Decorrelating experience for 0 frames...
[2025-01-15 14:26:34,828][01037] Decorrelating experience for 64 frames...
[2025-01-15 14:26:35,192][01039] Decorrelating experience for 96 frames...
[2025-01-15 14:26:35,548][00319] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-01-15 14:26:35,587][01040] Decorrelating experience for 96 frames...
[2025-01-15 14:26:35,740][01043] Decorrelating experience for 32 frames...
[2025-01-15 14:26:35,877][01036] Decorrelating experience for 96 frames...
[2025-01-15 14:26:38,325][01042] Decorrelating experience for 64 frames...
[2025-01-15 14:26:38,883][01043] Decorrelating experience for 64 frames...
[2025-01-15 14:26:39,232][01037] Decorrelating experience for 96 frames...
[2025-01-15 14:26:40,547][00319] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 79.0. Samples: 790. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-01-15 14:26:40,554][00319] Avg episode reward: [(0, '2.184')]
[2025-01-15 14:26:42,864][01038] Decorrelating experience for 64 frames...
[2025-01-15 14:26:43,626][01020] Signal inference workers to stop experience collection...
[2025-01-15 14:26:43,636][01035] InferenceWorker_p0-w0: stopping experience collection
[2025-01-15 14:26:44,020][01042] Decorrelating experience for 96 frames...
[2025-01-15 14:26:44,208][01038] Decorrelating experience for 96 frames...
[2025-01-15 14:26:44,567][01043] Decorrelating experience for 96 frames...
[2025-01-15 14:26:45,547][00319] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 193.7. Samples: 2906. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-01-15 14:26:45,550][00319] Avg episode reward: [(0, '3.321')]
[2025-01-15 14:26:45,918][01020] Signal inference workers to resume experience collection...
[2025-01-15 14:26:45,920][01035] InferenceWorker_p0-w0: resuming experience collection
[2025-01-15 14:26:50,548][00319] Fps is (10 sec: 2867.1, 60 sec: 1433.6, 300 sec: 1433.6). Total num frames: 28672. Throughput: 0: 360.9. Samples: 7218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:26:50,550][00319] Avg episode reward: [(0, '3.774')]
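The `Fps is (10 sec: …, 60 sec: …, 300 sec: …)` lines report throughput over sliding windows: frames collected within the last N seconds divided by the elapsed span, with `nan` before any history exists (as in the very first report). A simplified sketch of that bookkeeping; the real Sample Factory reporter differs in details:

```python
from collections import deque

class FpsWindows:
    """Track (timestamp, total_frames) samples and report windowed FPS."""

    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # (t_seconds, total_frames)

    def record(self, t, total_frames):
        self.samples.append((t, total_frames))
        # keep only as much history as the largest window can need
        while self.samples and t - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self, window):
        t_now, frames_now = self.samples[-1]
        # oldest sample still inside the requested window
        past = [(t, f) for t, f in self.samples if t_now - t <= window]
        t_old, frames_old = past[0]
        if t_now == t_old:
            return float("nan")  # no history yet, like the first log line
        return (frames_now - frames_old) / (t_now - t_old)

meter = FpsWindows()
for i in range(13):  # one sample every 5 s at a steady 4096 frames/s
    meter.record(t=i * 5.0, total_frames=i * 20480)
```

With a steady stream, every window converges to the same rate; the log's windows diverge because collection speed varies as workers warm up.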
[2025-01-15 14:26:54,601][01035] Updated weights for policy 0, policy_version 10 (0.0030)
[2025-01-15 14:26:55,547][00319] Fps is (10 sec: 4096.0, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 40960. Throughput: 0: 394.2. Samples: 9856. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:26:55,551][00319] Avg episode reward: [(0, '4.134')]
[2025-01-15 14:27:00,548][00319] Fps is (10 sec: 2867.2, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 57344. Throughput: 0: 464.5. Samples: 13936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:27:00,554][00319] Avg episode reward: [(0, '4.416')]
[2025-01-15 14:27:05,547][00319] Fps is (10 sec: 3686.4, 60 sec: 2223.5, 300 sec: 2223.5). Total num frames: 77824. Throughput: 0: 579.4. Samples: 20280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:05,551][00319] Avg episode reward: [(0, '4.450')]
[2025-01-15 14:27:05,734][01035] Updated weights for policy 0, policy_version 20 (0.0018)
[2025-01-15 14:27:10,548][00319] Fps is (10 sec: 4505.6, 60 sec: 2560.0, 300 sec: 2560.0). Total num frames: 102400. Throughput: 0: 595.1. Samples: 23806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:10,557][00319] Avg episode reward: [(0, '4.476')]
[2025-01-15 14:27:10,568][01020] Saving new best policy, reward=4.476!
[2025-01-15 14:27:15,548][00319] Fps is (10 sec: 4095.9, 60 sec: 2639.6, 300 sec: 2639.6). Total num frames: 118784. Throughput: 0: 650.4. Samples: 29270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:15,552][00319] Avg episode reward: [(0, '4.329')]
[2025-01-15 14:27:16,927][01035] Updated weights for policy 0, policy_version 30 (0.0037)
[2025-01-15 14:27:20,548][00319] Fps is (10 sec: 3276.8, 60 sec: 2703.3, 300 sec: 2703.3). Total num frames: 135168. Throughput: 0: 757.5. Samples: 34088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:27:20,555][00319] Avg episode reward: [(0, '4.299')]
[2025-01-15 14:27:25,547][00319] Fps is (10 sec: 4096.1, 60 sec: 2904.4, 300 sec: 2904.4). Total num frames: 159744. Throughput: 0: 819.8. Samples: 37682. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-01-15 14:27:25,553][00319] Avg episode reward: [(0, '4.481')]
[2025-01-15 14:27:25,559][01020] Saving new best policy, reward=4.481!
[2025-01-15 14:27:26,405][01035] Updated weights for policy 0, policy_version 40 (0.0027)
[2025-01-15 14:27:30,547][00319] Fps is (10 sec: 4505.7, 60 sec: 3003.7, 300 sec: 3003.7). Total num frames: 180224. Throughput: 0: 927.0. Samples: 44620. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:27:30,552][00319] Avg episode reward: [(0, '4.494')]
[2025-01-15 14:27:30,565][01020] Saving new best policy, reward=4.494!
[2025-01-15 14:27:35,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2961.7). Total num frames: 192512. Throughput: 0: 925.6. Samples: 48868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:35,551][00319] Avg episode reward: [(0, '4.568')]
[2025-01-15 14:27:35,555][01020] Saving new best policy, reward=4.568!
[2025-01-15 14:27:38,142][01035] Updated weights for policy 0, policy_version 50 (0.0023)
[2025-01-15 14:27:40,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3042.7). Total num frames: 212992. Throughput: 0: 927.6. Samples: 51596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:40,550][00319] Avg episode reward: [(0, '4.362')]
[2025-01-15 14:27:45,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3167.6). Total num frames: 237568. Throughput: 0: 998.9. Samples: 58886. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:27:45,550][00319] Avg episode reward: [(0, '4.199')]
[2025-01-15 14:27:46,719][01035] Updated weights for policy 0, policy_version 60 (0.0015)
[2025-01-15 14:27:50,549][00319] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3174.3). Total num frames: 253952. Throughput: 0: 985.3. Samples: 64618. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:27:50,554][00319] Avg episode reward: [(0, '4.307')]
[2025-01-15 14:27:55,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3180.4). Total num frames: 270336. Throughput: 0: 954.7. Samples: 66766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:27:55,550][00319] Avg episode reward: [(0, '4.514')]
[2025-01-15 14:27:58,428][01035] Updated weights for policy 0, policy_version 70 (0.0041)
[2025-01-15 14:28:00,547][00319] Fps is (10 sec: 4096.6, 60 sec: 3959.5, 300 sec: 3276.8). Total num frames: 294912. Throughput: 0: 971.3. Samples: 72980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:28:00,550][00319] Avg episode reward: [(0, '4.586')]
[2025-01-15 14:28:00,557][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth...
[2025-01-15 14:28:00,678][01020] Saving new best policy, reward=4.586!
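The checkpoint filename above encodes two fields: `checkpoint_000000072_294912.pth` is policy version 72 after 294,912 environment frames. A small sketch for recovering those fields from a filename, assuming the format is just two underscore-separated integers:

```python
import re

def parse_checkpoint_name(filename):
    """Extract (policy_version, env_frames) from a checkpoint filename
    like 'checkpoint_000000072_294912.pth'."""
    m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", filename)
    if m is None:
        raise ValueError(f"not a checkpoint filename: {filename!r}")
    return int(m.group(1)), int(m.group(2))

print(parse_checkpoint_name("checkpoint_000000072_294912.pth"))  # (72, 294912)
```

The zero-padded version field also makes lexicographic filename order match training order, which is why a plain sort recovers the newest checkpoint.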
[2025-01-15 14:28:05,550][00319] Fps is (10 sec: 4913.9, 60 sec: 4027.6, 300 sec: 3362.9). Total num frames: 319488. Throughput: 0: 1022.5. Samples: 80102. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:28:05,555][00319] Avg episode reward: [(0, '4.393')]
[2025-01-15 14:28:08,234][01035] Updated weights for policy 0, policy_version 80 (0.0025)
[2025-01-15 14:28:10,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3317.8). Total num frames: 331776. Throughput: 0: 994.3. Samples: 82426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:28:10,553][00319] Avg episode reward: [(0, '4.356')]
[2025-01-15 14:28:15,547][00319] Fps is (10 sec: 3277.6, 60 sec: 3891.2, 300 sec: 3354.8). Total num frames: 352256. Throughput: 0: 944.1. Samples: 87104. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:28:15,550][00319] Avg episode reward: [(0, '4.479')]
[2025-01-15 14:28:18,782][01035] Updated weights for policy 0, policy_version 90 (0.0025)
[2025-01-15 14:28:20,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3425.7). Total num frames: 376832. Throughput: 0: 1009.6. Samples: 94302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:28:20,551][00319] Avg episode reward: [(0, '4.553')]
[2025-01-15 14:28:25,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3454.9). Total num frames: 397312. Throughput: 0: 1029.2. Samples: 97910. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:28:25,553][00319] Avg episode reward: [(0, '4.770')]
[2025-01-15 14:28:25,558][01020] Saving new best policy, reward=4.770!
[2025-01-15 14:28:29,767][01035] Updated weights for policy 0, policy_version 100 (0.0017)
[2025-01-15 14:28:30,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3413.3). Total num frames: 409600. Throughput: 0: 965.1. Samples: 102316. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:28:30,554][00319] Avg episode reward: [(0, '4.890')]
[2025-01-15 14:28:30,561][01020] Saving new best policy, reward=4.890!
[2025-01-15 14:28:35,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3440.6). Total num frames: 430080. Throughput: 0: 963.4. Samples: 107968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:28:35,552][00319] Avg episode reward: [(0, '4.588')]
[2025-01-15 14:28:39,408][01035] Updated weights for policy 0, policy_version 110 (0.0027)
[2025-01-15 14:28:40,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3497.4). Total num frames: 454656. Throughput: 0: 993.2. Samples: 111460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:28:40,551][00319] Avg episode reward: [(0, '4.622')]
[2025-01-15 14:28:45,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3489.2). Total num frames: 471040. Throughput: 0: 994.3. Samples: 117724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:28:45,552][00319] Avg episode reward: [(0, '4.585')]
[2025-01-15 14:28:50,548][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.3, 300 sec: 3481.6). Total num frames: 487424. Throughput: 0: 932.1. Samples: 122042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:28:50,557][00319] Avg episode reward: [(0, '4.683')]
[2025-01-15 14:28:51,406][01035] Updated weights for policy 0, policy_version 120 (0.0049)
[2025-01-15 14:28:55,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3531.0). Total num frames: 512000. Throughput: 0: 954.5. Samples: 125378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:28:55,551][00319] Avg episode reward: [(0, '4.827')]
[2025-01-15 14:28:59,843][01035] Updated weights for policy 0, policy_version 130 (0.0015)
[2025-01-15 14:29:00,549][00319] Fps is (10 sec: 4504.7, 60 sec: 3959.3, 300 sec: 3549.8). Total num frames: 532480. Throughput: 0: 1008.1. Samples: 132472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:00,554][00319] Avg episode reward: [(0, '4.696')]
[2025-01-15 14:29:05,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3541.1). Total num frames: 548864. Throughput: 0: 960.7. Samples: 137532. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:05,554][00319] Avg episode reward: [(0, '5.027')]
[2025-01-15 14:29:05,556][01020] Saving new best policy, reward=5.027!
[2025-01-15 14:29:10,547][00319] Fps is (10 sec: 3277.5, 60 sec: 3891.2, 300 sec: 3532.8). Total num frames: 565248. Throughput: 0: 928.0. Samples: 139670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:10,555][00319] Avg episode reward: [(0, '5.043')]
[2025-01-15 14:29:10,563][01020] Saving new best policy, reward=5.043!
[2025-01-15 14:29:11,729][01035] Updated weights for policy 0, policy_version 140 (0.0024)
[2025-01-15 14:29:15,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3959.5, 300 sec: 3574.7). Total num frames: 589824. Throughput: 0: 983.0. Samples: 146552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:15,554][00319] Avg episode reward: [(0, '4.671')]
[2025-01-15 14:29:20,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3590.0). Total num frames: 610304. Throughput: 0: 1004.7. Samples: 153180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:29:20,551][00319] Avg episode reward: [(0, '4.508')]
[2025-01-15 14:29:21,160][01035] Updated weights for policy 0, policy_version 150 (0.0020)
[2025-01-15 14:29:25,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3581.1). Total num frames: 626688. Throughput: 0: 974.9. Samples: 155330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:29:25,550][00319] Avg episode reward: [(0, '4.704')]
[2025-01-15 14:29:30,548][00319] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3595.4). Total num frames: 647168. Throughput: 0: 953.7. Samples: 160640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:29:30,555][00319] Avg episode reward: [(0, '4.693')]
[2025-01-15 14:29:32,036][01035] Updated weights for policy 0, policy_version 160 (0.0030)
[2025-01-15 14:29:35,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3631.0). Total num frames: 671744. Throughput: 0: 1020.3. Samples: 167956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:29:35,556][00319] Avg episode reward: [(0, '4.748')]
[2025-01-15 14:29:40,547][00319] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3621.7). Total num frames: 688128. Throughput: 0: 1017.2. Samples: 171152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:40,554][00319] Avg episode reward: [(0, '4.542')]
[2025-01-15 14:29:42,708][01035] Updated weights for policy 0, policy_version 170 (0.0024)
[2025-01-15 14:29:45,548][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3612.9). Total num frames: 704512. Throughput: 0: 956.0. Samples: 175490. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:29:45,550][00319] Avg episode reward: [(0, '4.606')]
[2025-01-15 14:29:50,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3625.0). Total num frames: 724992. Throughput: 0: 992.0. Samples: 182174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:29:50,552][00319] Avg episode reward: [(0, '4.762')]
[2025-01-15 14:29:52,239][01035] Updated weights for policy 0, policy_version 180 (0.0040)
[2025-01-15 14:29:55,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3656.4). Total num frames: 749568. Throughput: 0: 1024.5. Samples: 185772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:29:55,559][00319] Avg episode reward: [(0, '5.036')]
[2025-01-15 14:30:00,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3891.3, 300 sec: 3647.4). Total num frames: 765952. Throughput: 0: 990.3. Samples: 191116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:30:00,556][00319] Avg episode reward: [(0, '5.109')]
[2025-01-15 14:30:00,572][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000187_765952.pth...
[2025-01-15 14:30:00,754][01020] Saving new best policy, reward=5.109!
[2025-01-15 14:30:03,943][01035] Updated weights for policy 0, policy_version 190 (0.0026)
[2025-01-15 14:30:05,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3638.8). Total num frames: 782336. Throughput: 0: 955.6. Samples: 196184. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:30:05,553][00319] Avg episode reward: [(0, '5.293')]
[2025-01-15 14:30:05,556][01020] Saving new best policy, reward=5.293!
[2025-01-15 14:30:10,548][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3667.8). Total num frames: 806912. Throughput: 0: 984.7. Samples: 199640. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:30:10,553][00319] Avg episode reward: [(0, '5.671')]
[2025-01-15 14:30:10,563][01020] Saving new best policy, reward=5.671!
[2025-01-15 14:30:12,835][01035] Updated weights for policy 0, policy_version 200 (0.0020)
[2025-01-15 14:30:15,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3677.3). Total num frames: 827392. Throughput: 0: 1019.8. Samples: 206532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:30:15,554][00319] Avg episode reward: [(0, '5.620')]
[2025-01-15 14:30:20,547][00319] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3650.8). Total num frames: 839680. Throughput: 0: 950.5. Samples: 210730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:30:20,552][00319] Avg episode reward: [(0, '5.741')]
[2025-01-15 14:30:20,563][01020] Saving new best policy, reward=5.741!
[2025-01-15 14:30:24,619][01035] Updated weights for policy 0, policy_version 210 (0.0027)
[2025-01-15 14:30:25,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3677.7). Total num frames: 864256. Throughput: 0: 941.4. Samples: 213514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:30:25,550][00319] Avg episode reward: [(0, '5.798')]
[2025-01-15 14:30:25,555][01020] Saving new best policy, reward=5.798!
[2025-01-15 14:30:30,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3686.4). Total num frames: 884736. Throughput: 0: 1001.0. Samples: 220534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:30:30,557][00319] Avg episode reward: [(0, '5.834')]
[2025-01-15 14:30:30,564][01020] Saving new best policy, reward=5.834!
[2025-01-15 14:30:34,405][01035] Updated weights for policy 0, policy_version 220 (0.0024)
[2025-01-15 14:30:35,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3678.0). Total num frames: 901120. Throughput: 0: 976.8. Samples: 226132. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:30:35,551][00319] Avg episode reward: [(0, '6.158')]
[2025-01-15 14:30:35,558][01020] Saving new best policy, reward=6.158!
[2025-01-15 14:30:40,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3670.0). Total num frames: 917504. Throughput: 0: 943.6. Samples: 228232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:30:40,553][00319] Avg episode reward: [(0, '6.248')]
[2025-01-15 14:30:40,563][01020] Saving new best policy, reward=6.248!
[2025-01-15 14:30:45,486][01035] Updated weights for policy 0, policy_version 230 (0.0031)
[2025-01-15 14:30:45,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3694.4). Total num frames: 942080. Throughput: 0: 958.9. Samples: 234268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:30:45,550][00319] Avg episode reward: [(0, '5.973')]
[2025-01-15 14:30:50,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3702.2). Total num frames: 962560. Throughput: 0: 1011.0. Samples: 241678. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:30:50,554][00319] Avg episode reward: [(0, '5.890')]
[2025-01-15 14:30:55,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3694.1). Total num frames: 978944. Throughput: 0: 985.8. Samples: 244000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-01-15 14:30:55,555][00319] Avg episode reward: [(0, '5.898')]
[2025-01-15 14:30:56,223][01035] Updated weights for policy 0, policy_version 240 (0.0040)
[2025-01-15 14:31:00,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3686.4). Total num frames: 995328. Throughput: 0: 933.4. Samples: 248534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:31:00,550][00319] Avg episode reward: [(0, '6.219')]
[2025-01-15 14:31:05,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3708.7). Total num frames: 1019904. Throughput: 0: 1003.6. Samples: 255894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:31:05,551][00319] Avg episode reward: [(0, '6.216')]
[2025-01-15 14:31:05,611][01035] Updated weights for policy 0, policy_version 250 (0.0025)
[2025-01-15 14:31:10,548][00319] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3730.3). Total num frames: 1044480. Throughput: 0: 1021.5. Samples: 259480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:31:10,550][00319] Avg episode reward: [(0, '6.386')]
[2025-01-15 14:31:10,557][01020] Saving new best policy, reward=6.386!
[2025-01-15 14:31:15,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3708.0). Total num frames: 1056768. Throughput: 0: 969.1. Samples: 264142. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:31:15,553][00319] Avg episode reward: [(0, '6.958')]
[2025-01-15 14:31:15,561][01020] Saving new best policy, reward=6.958!
[2025-01-15 14:31:17,485][01035] Updated weights for policy 0, policy_version 260 (0.0015)
[2025-01-15 14:31:20,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3714.6). Total num frames: 1077248. Throughput: 0: 972.9. Samples: 269912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:31:20,557][00319] Avg episode reward: [(0, '6.633')]
[2025-01-15 14:31:25,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 1101824. Throughput: 0: 1005.9. Samples: 273496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:31:25,556][00319] Avg episode reward: [(0, '6.221')]
[2025-01-15 14:31:25,819][01035] Updated weights for policy 0, policy_version 270 (0.0025)
[2025-01-15 14:31:30,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1118208. Throughput: 0: 1009.5. Samples: 279694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:31:30,553][00319] Avg episode reward: [(0, '6.188')]
[2025-01-15 14:31:35,549][00319] Fps is (10 sec: 3276.3, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 1134592. Throughput: 0: 941.7. Samples: 284058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:31:35,557][00319] Avg episode reward: [(0, '6.280')]
[2025-01-15 14:31:37,844][01035] Updated weights for policy 0, policy_version 280 (0.0017)
[2025-01-15 14:31:40,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 1159168. Throughput: 0: 963.2. Samples: 287342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:31:40,553][00319] Avg episode reward: [(0, '6.465')]
[2025-01-15 14:31:45,547][00319] Fps is (10 sec: 4916.0, 60 sec: 4027.7, 300 sec: 3915.5). Total num frames: 1183744. Throughput: 0: 1024.0. Samples: 294614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:31:45,553][00319] Avg episode reward: [(0, '6.662')]
[2025-01-15 14:31:46,695][01035] Updated weights for policy 0, policy_version 290 (0.0027)
[2025-01-15 14:31:50,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 1196032. Throughput: 0: 977.8. Samples: 299896. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:31:50,554][00319] Avg episode reward: [(0, '6.912')]
[2025-01-15 14:31:55,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 1216512. Throughput: 0: 947.2. Samples: 302106. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:31:55,551][00319] Avg episode reward: [(0, '7.495')]
[2025-01-15 14:31:55,557][01020] Saving new best policy, reward=7.495!
[2025-01-15 14:31:58,008][01035] Updated weights for policy 0, policy_version 300 (0.0018)
[2025-01-15 14:32:00,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 1236992. Throughput: 0: 995.4. Samples: 308936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:00,553][00319] Avg episode reward: [(0, '8.346')]
[2025-01-15 14:32:00,567][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000303_1241088.pth...
[2025-01-15 14:32:00,693][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth
[2025-01-15 14:32:00,703][01020] Saving new best policy, reward=8.346!
[2025-01-15 14:32:05,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 1261568. Throughput: 0: 1013.4. Samples: 315514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:05,553][00319] Avg episode reward: [(0, '8.951')]
[2025-01-15 14:32:05,556][01020] Saving new best policy, reward=8.951!
[2025-01-15 14:32:08,563][01035] Updated weights for policy 0, policy_version 310 (0.0021)
[2025-01-15 14:32:10,553][00319] Fps is (10 sec: 3684.3, 60 sec: 3822.6, 300 sec: 3915.4). Total num frames: 1273856. Throughput: 0: 979.3. Samples: 317572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:32:10,556][00319] Avg episode reward: [(0, '8.760')]
[2025-01-15 14:32:15,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 1294336. Throughput: 0: 961.0. Samples: 322938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:15,550][00319] Avg episode reward: [(0, '9.610')]
[2025-01-15 14:32:15,556][01020] Saving new best policy, reward=9.610!
[2025-01-15 14:32:18,373][01035] Updated weights for policy 0, policy_version 320 (0.0026)
[2025-01-15 14:32:20,547][00319] Fps is (10 sec: 4508.1, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 1318912. Throughput: 0: 1027.5. Samples: 330296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:32:20,556][00319] Avg episode reward: [(0, '9.042')]
[2025-01-15 14:32:25,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 1335296. Throughput: 0: 1023.2. Samples: 333384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:25,551][00319] Avg episode reward: [(0, '9.842')]
[2025-01-15 14:32:25,555][01020] Saving new best policy, reward=9.842!
[2025-01-15 14:32:30,096][01035] Updated weights for policy 0, policy_version 330 (0.0030)
[2025-01-15 14:32:30,550][00319] Fps is (10 sec: 3276.0, 60 sec: 3891.0, 300 sec: 3929.3). Total num frames: 1351680. Throughput: 0: 958.2. Samples: 337734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:30,553][00319] Avg episode reward: [(0, '10.339')]
[2025-01-15 14:32:30,565][01020] Saving new best policy, reward=10.339!
[2025-01-15 14:32:35,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 1376256. Throughput: 0: 986.1. Samples: 344270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:35,554][00319] Avg episode reward: [(0, '11.226')]
[2025-01-15 14:32:35,559][01020] Saving new best policy, reward=11.226!
[2025-01-15 14:32:38,740][01035] Updated weights for policy 0, policy_version 340 (0.0022)
[2025-01-15 14:32:40,548][00319] Fps is (10 sec: 4506.6, 60 sec: 3959.4, 300 sec: 3929.4). Total num frames: 1396736. Throughput: 0: 1014.4. Samples: 347754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:32:40,555][00319] Avg episode reward: [(0, '12.752')]
[2025-01-15 14:32:40,571][01020] Saving new best policy, reward=12.752!
[2025-01-15 14:32:45,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 1413120. Throughput: 0: 985.0. Samples: 353262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:45,551][00319] Avg episode reward: [(0, '12.777')]
[2025-01-15 14:32:45,561][01020] Saving new best policy, reward=12.777!
[2025-01-15 14:32:50,470][01035] Updated weights for policy 0, policy_version 350 (0.0018)
[2025-01-15 14:32:50,548][00319] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3943.3). Total num frames: 1433600. Throughput: 0: 950.4. Samples: 358284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:32:50,552][00319] Avg episode reward: [(0, '12.580')]
[2025-01-15 14:32:55,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 1458176. Throughput: 0: 987.2. Samples: 361992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:32:55,551][00319] Avg episode reward: [(0, '12.216')]
[2025-01-15 14:32:58,901][01035] Updated weights for policy 0, policy_version 360 (0.0013)
[2025-01-15 14:33:00,551][00319] Fps is (10 sec: 4504.2, 60 sec: 4027.5, 300 sec: 3929.4). Total num frames: 1478656. Throughput: 0: 1026.8. Samples: 369148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:33:00,555][00319] Avg episode reward: [(0, '12.019')]
[2025-01-15 14:33:05,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 1490944. Throughput: 0: 962.4. Samples: 373606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:33:05,551][00319] Avg episode reward: [(0, '12.435')]
[2025-01-15 14:33:10,449][01035] Updated weights for policy 0, policy_version 370 (0.0045)
[2025-01-15 14:33:10,547][00319] Fps is (10 sec: 3687.7, 60 sec: 4028.1, 300 sec: 3943.3). Total num frames: 1515520. Throughput: 0: 955.7. Samples: 376390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:33:10,555][00319] Avg episode reward: [(0, '12.868')]
[2025-01-15 14:33:10,563][01020] Saving new best policy, reward=12.868!
[2025-01-15 14:33:15,547][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 1536000. Throughput: 0: 1018.9. Samples: 383580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:33:15,556][00319] Avg episode reward: [(0, '12.594')]
[2025-01-15 14:33:20,366][01035] Updated weights for policy 0, policy_version 380 (0.0013)
[2025-01-15 14:33:20,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 1556480. Throughput: 0: 1006.0. Samples: 389542. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:33:20,550][00319] Avg episode reward: [(0, '13.119')]
[2025-01-15 14:33:20,569][01020] Saving new best policy, reward=13.119!
[2025-01-15 14:33:25,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 1568768. Throughput: 0: 973.0. Samples: 391538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:33:25,550][00319] Avg episode reward: [(0, '13.344')]
[2025-01-15 14:33:25,556][01020] Saving new best policy, reward=13.344!
[2025-01-15 14:33:30,547][00319] Fps is (10 sec: 3686.4, 60 sec: 4027.9, 300 sec: 3943.3). Total num frames: 1593344. Throughput: 0: 986.9. Samples: 397672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:33:30,556][00319] Avg episode reward: [(0, '12.872')]
[2025-01-15 14:33:30,977][01035] Updated weights for policy 0, policy_version 390 (0.0015)
[2025-01-15 14:33:35,555][00319] Fps is (10 sec: 4911.5, 60 sec: 4027.2, 300 sec: 3943.2). Total num frames: 1617920. Throughput: 0: 1034.9. Samples: 404860. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:33:35,558][00319] Avg episode reward: [(0, '13.084')]
[2025-01-15 14:33:40,548][00319] Fps is (10 sec: 3686.1, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 1630208. Throughput: 0: 1003.2. Samples: 407136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:33:40,551][00319] Avg episode reward: [(0, '14.340')]
[2025-01-15 14:33:40,562][01020] Saving new best policy, reward=14.340!
[2025-01-15 14:33:42,302][01035] Updated weights for policy 0, policy_version 400 (0.0034)
[2025-01-15 14:33:45,547][00319] Fps is (10 sec: 3279.3, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1650688. Throughput: 0: 945.2. Samples: 411680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:33:45,551][00319] Avg episode reward: [(0, '13.281')]
[2025-01-15 14:33:50,547][00319] Fps is (10 sec: 4505.9, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 1675264. Throughput: 0: 1008.5. Samples: 418988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:33:50,555][00319] Avg episode reward: [(0, '14.859')]
[2025-01-15 14:33:50,566][01020] Saving new best policy, reward=14.859!
[2025-01-15 14:33:51,287][01035] Updated weights for policy 0, policy_version 410 (0.0018)
[2025-01-15 14:33:55,555][00319] Fps is (10 sec: 4502.2, 60 sec: 3959.0, 300 sec: 3943.2). Total num frames: 1695744. Throughput: 0: 1026.5. Samples: 422590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:33:55,565][00319] Avg episode reward: [(0, '14.835')]
[2025-01-15 14:34:00,550][00319] Fps is (10 sec: 3276.0, 60 sec: 3823.0, 300 sec: 3929.3). Total num frames: 1708032. Throughput: 0: 970.7. Samples: 427262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:34:00,563][00319] Avg episode reward: [(0, '14.581')]
[2025-01-15 14:34:00,642][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000418_1712128.pth...
[2025-01-15 14:34:00,842][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000187_765952.pth
[2025-01-15 14:34:02,854][01035] Updated weights for policy 0, policy_version 420 (0.0031)
[2025-01-15 14:34:05,547][00319] Fps is (10 sec: 3689.2, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1732608. Throughput: 0: 968.9. Samples: 433142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:34:05,556][00319] Avg episode reward: [(0, '16.013')]
[2025-01-15 14:34:05,559][01020] Saving new best policy, reward=16.013!
[2025-01-15 14:34:10,548][00319] Fps is (10 sec: 4506.6, 60 sec: 3959.4, 300 sec: 3943.3). Total num frames: 1753088. Throughput: 0: 1002.1. Samples: 436634. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:34:10,554][00319] Avg episode reward: [(0, '15.782')]
[2025-01-15 14:34:11,489][01035] Updated weights for policy 0, policy_version 430 (0.0030)
[2025-01-15 14:34:15,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1773568. Throughput: 0: 1002.6. Samples: 442790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:34:15,554][00319] Avg episode reward: [(0, '15.737')]
[2025-01-15 14:34:20,547][00319] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 1785856. Throughput: 0: 942.4. Samples: 447262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:34:20,550][00319] Avg episode reward: [(0, '15.924')]
[2025-01-15 14:34:23,102][01035] Updated weights for policy 0, policy_version 440 (0.0026)
[2025-01-15 14:34:25,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1814528. Throughput: 0: 973.4. Samples: 450938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:34:25,550][00319] Avg episode reward: [(0, '15.102')]
[2025-01-15 14:34:30,547][00319] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 1835008. Throughput: 0: 1035.9. Samples: 458294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:34:30,550][00319] Avg episode reward: [(0, '15.279')]
[2025-01-15 14:34:32,574][01035] Updated weights for policy 0, policy_version 450 (0.0026)
[2025-01-15 14:34:35,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.7, 300 sec: 3943.3). Total num frames: 1851392. Throughput: 0: 980.0. Samples: 463088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:34:35,553][00319] Avg episode reward: [(0, '15.952')]
[2025-01-15 14:34:40,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1867776. Throughput: 0: 949.3. Samples: 465300. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:34:40,551][00319] Avg episode reward: [(0, '16.919')]
[2025-01-15 14:34:40,559][01020] Saving new best policy, reward=16.919!
[2025-01-15 14:34:43,354][01035] Updated weights for policy 0, policy_version 460 (0.0020)
[2025-01-15 14:34:45,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1892352. Throughput: 0: 1001.7. Samples: 472338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:34:45,551][00319] Avg episode reward: [(0, '18.907')]
[2025-01-15 14:34:45,561][01020] Saving new best policy, reward=18.907!
[2025-01-15 14:34:50,547][00319] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1912832. Throughput: 0: 1014.4. Samples: 478792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:34:50,556][00319] Avg episode reward: [(0, '20.573')]
[2025-01-15 14:34:50,570][01020] Saving new best policy, reward=20.573!
[2025-01-15 14:34:54,224][01035] Updated weights for policy 0, policy_version 470 (0.0044)
[2025-01-15 14:34:55,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.7, 300 sec: 3943.3). Total num frames: 1929216. Throughput: 0: 982.5. Samples: 480848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:34:55,563][00319] Avg episode reward: [(0, '21.089')]
[2025-01-15 14:34:55,576][01020] Saving new best policy, reward=21.089!
[2025-01-15 14:35:00,548][00319] Fps is (10 sec: 3686.4, 60 sec: 4027.9, 300 sec: 3957.2). Total num frames: 1949696. Throughput: 0: 968.0. Samples: 486352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:35:00,551][00319] Avg episode reward: [(0, '20.789')]
[2025-01-15 14:35:03,710][01035] Updated weights for policy 0, policy_version 480 (0.0032)
[2025-01-15 14:35:05,548][00319] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 1974272. Throughput: 0: 1030.3. Samples: 493624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:35:05,553][00319] Avg episode reward: [(0, '21.276')]
[2025-01-15 14:35:05,558][01020] Saving new best policy, reward=21.276!
[2025-01-15 14:35:10,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 1990656. Throughput: 0: 1015.5. Samples: 496634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:35:10,553][00319] Avg episode reward: [(0, '19.427')]
[2025-01-15 14:35:15,337][01035] Updated weights for policy 0, policy_version 490 (0.0022)
[2025-01-15 14:35:15,547][00319] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2007040. Throughput: 0: 949.6. Samples: 501026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:15,554][00319] Avg episode reward: [(0, '19.458')]
[2025-01-15 14:35:20,548][00319] Fps is (10 sec: 4095.9, 60 sec: 4096.0, 300 sec: 3957.1). Total num frames: 2031616. Throughput: 0: 997.9. Samples: 507992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:35:20,553][00319] Avg episode reward: [(0, '18.957')]
[2025-01-15 14:35:23,836][01035] Updated weights for policy 0, policy_version 500 (0.0028)
[2025-01-15 14:35:25,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2052096. Throughput: 0: 1029.1. Samples: 511608. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-01-15 14:35:25,551][00319] Avg episode reward: [(0, '19.348')]
[2025-01-15 14:35:30,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2068480. Throughput: 0: 989.3. Samples: 516856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:30,553][00319] Avg episode reward: [(0, '19.320')]
[2025-01-15 14:35:35,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2084864. Throughput: 0: 956.1. Samples: 521818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:35:35,556][00319] Avg episode reward: [(0, '19.336')]
[2025-01-15 14:35:35,859][01035] Updated weights for policy 0, policy_version 510 (0.0025)
[2025-01-15 14:35:40,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3957.2). Total num frames: 2109440. Throughput: 0: 991.7. Samples: 525476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:40,556][00319] Avg episode reward: [(0, '18.906')]
[2025-01-15 14:35:44,464][01035] Updated weights for policy 0, policy_version 520 (0.0032)
[2025-01-15 14:35:45,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2129920. Throughput: 0: 1030.4. Samples: 532718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:45,559][00319] Avg episode reward: [(0, '18.916')]
[2025-01-15 14:35:50,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2146304. Throughput: 0: 967.8. Samples: 537176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:50,554][00319] Avg episode reward: [(0, '19.563')]
[2025-01-15 14:35:55,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 2166784. Throughput: 0: 964.9. Samples: 540054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:35:55,554][00319] Avg episode reward: [(0, '19.285')]
[2025-01-15 14:35:55,721][01035] Updated weights for policy 0, policy_version 530 (0.0018)
[2025-01-15 14:36:00,548][00319] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2191360. Throughput: 0: 1030.1. Samples: 547382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:36:00,555][00319] Avg episode reward: [(0, '20.429')]
[2025-01-15 14:36:00,636][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000536_2195456.pth...
[2025-01-15 14:36:00,766][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000303_1241088.pth
[2025-01-15 14:36:05,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 2207744. Throughput: 0: 1002.0. Samples: 553080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:36:05,553][00319] Avg episode reward: [(0, '21.605')]
[2025-01-15 14:36:05,563][01020] Saving new best policy, reward=21.605!
[2025-01-15 14:36:05,572][01035] Updated weights for policy 0, policy_version 540 (0.0014)
[2025-01-15 14:36:10,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2224128. Throughput: 0: 969.5. Samples: 555236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:36:10,555][00319] Avg episode reward: [(0, '20.959')]
[2025-01-15 14:36:15,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 2248704. Throughput: 0: 992.2. Samples: 561504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:36:15,555][00319] Avg episode reward: [(0, '19.412')]
[2025-01-15 14:36:15,862][01035] Updated weights for policy 0, policy_version 550 (0.0019)
[2025-01-15 14:36:20,547][00319] Fps is (10 sec: 4915.2, 60 sec: 4027.8, 300 sec: 3971.0). Total num frames: 2273280. Throughput: 0: 1043.3. Samples: 568768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:36:20,550][00319] Avg episode reward: [(0, '18.201')]
[2025-01-15 14:36:25,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 2285568. Throughput: 0: 1011.8. Samples: 571006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:36:25,550][00319] Avg episode reward: [(0, '18.842')]
[2025-01-15 14:36:27,073][01035] Updated weights for policy 0, policy_version 560 (0.0031)
[2025-01-15 14:36:30,548][00319] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3971.1). Total num frames: 2306048. Throughput: 0: 955.9. Samples: 575732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:36:30,554][00319] Avg episode reward: [(0, '17.511')]
[2025-01-15 14:36:35,547][00319] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 2326528. Throughput: 0: 998.1. Samples: 582092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:36:35,553][00319] Avg episode reward: [(0, '16.576')]
[2025-01-15 14:36:36,810][01035] Updated weights for policy 0, policy_version 570 (0.0032)
[2025-01-15 14:36:40,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 2347008. Throughput: 0: 1012.3. Samples: 585606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:36:40,557][00319] Avg episode reward: [(0, '18.491')]
[2025-01-15 14:36:45,551][00319] Fps is (10 sec: 3275.8, 60 sec: 3822.7, 300 sec: 3943.2). Total num frames: 2359296. Throughput: 0: 944.3. Samples: 589880. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:36:45,553][00319] Avg episode reward: [(0, '21.387')]
[2025-01-15 14:36:49,122][01035] Updated weights for policy 0, policy_version 580 (0.0029)
[2025-01-15 14:36:50,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 2379776. Throughput: 0: 945.2. Samples: 595616. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:36:50,550][00319] Avg episode reward: [(0, '20.458')]
[2025-01-15 14:36:55,547][00319] Fps is (10 sec: 4507.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 2404352. Throughput: 0: 977.0. Samples: 599200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:36:55,553][00319] Avg episode reward: [(0, '21.244')]
[2025-01-15 14:36:57,942][01035] Updated weights for policy 0, policy_version 590 (0.0017)
[2025-01-15 14:37:00,548][00319] Fps is (10 sec: 4095.7, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 2420736. Throughput: 0: 978.3. Samples: 605530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:37:00,553][00319] Avg episode reward: [(0, '21.671')]
[2025-01-15 14:37:00,643][01020] Saving new best policy, reward=21.671!
[2025-01-15 14:37:05,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 2437120. Throughput: 0: 908.0. Samples: 609626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:37:05,555][00319] Avg episode reward: [(0, '20.868')]
[2025-01-15 14:37:09,892][01035] Updated weights for policy 0, policy_version 600 (0.0033)
[2025-01-15 14:37:10,547][00319] Fps is (10 sec: 3686.6, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 2457600. Throughput: 0: 927.9. Samples: 612762. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:37:10,551][00319] Avg episode reward: [(0, '18.477')]
[2025-01-15 14:37:15,548][00319] Fps is (10 sec: 4505.3, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 2482176. Throughput: 0: 972.7. Samples: 619504. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-01-15 14:37:15,550][00319] Avg episode reward: [(0, '19.138')]
[2025-01-15 14:37:20,548][00319] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3929.4). Total num frames: 2494464. Throughput: 0: 938.5. Samples: 624326. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:37:20,553][00319] Avg episode reward: [(0, '19.477')]
[2025-01-15 14:37:21,366][01035] Updated weights for policy 0, policy_version 610 (0.0028)
[2025-01-15 14:37:25,547][00319] Fps is (10 sec: 2867.4, 60 sec: 3754.7, 300 sec: 3929.4). Total num frames: 2510848. Throughput: 0: 906.4. Samples: 626392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:37:25,549][00319] Avg episode reward: [(0, '19.629')]
[2025-01-15 14:37:30,551][00319] Fps is (10 sec: 4094.6, 60 sec: 3822.7, 300 sec: 3929.3). Total num frames: 2535424. Throughput: 0: 958.2. Samples: 632998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:37:30,555][00319] Avg episode reward: [(0, '20.951')]
[2025-01-15 14:37:31,152][01035] Updated weights for policy 0, policy_version 620 (0.0019)
[2025-01-15 14:37:35,552][00319] Fps is (10 sec: 4503.5, 60 sec: 3822.6, 300 sec: 3929.3). Total num frames: 2555904. Throughput: 0: 975.2. Samples: 639504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:37:35,559][00319] Avg episode reward: [(0, '20.654')]
[2025-01-15 14:37:40,548][00319] Fps is (10 sec: 3277.8, 60 sec: 3686.4, 300 sec: 3915.5). Total num frames: 2568192. Throughput: 0: 942.3. Samples: 641604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:37:40,553][00319] Avg episode reward: [(0, '20.775')]
[2025-01-15 14:37:42,948][01035] Updated weights for policy 0, policy_version 630 (0.0016)
[2025-01-15 14:37:45,547][00319] Fps is (10 sec: 3688.1, 60 sec: 3891.4, 300 sec: 3929.4). Total num frames: 2592768. Throughput: 0: 918.3. Samples: 646852. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:37:45,550][00319] Avg episode reward: [(0, '20.850')]
[2025-01-15 14:37:50,547][00319] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 2617344. Throughput: 0: 990.8. Samples: 654214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:37:50,555][00319] Avg episode reward: [(0, '20.632')]
[2025-01-15 14:37:51,394][01035] Updated weights for policy 0, policy_version 640 (0.0028)
[2025-01-15 14:37:55,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 2633728. Throughput: 0: 994.6. Samples: 657518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:37:55,552][00319] Avg episode reward: [(0, '20.768')]
[2025-01-15 14:38:00,548][00319] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3915.5). Total num frames: 2646016. Throughput: 0: 941.2. Samples: 661858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:38:00,552][00319] Avg episode reward: [(0, '22.166')]
[2025-01-15 14:38:00,637][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000647_2650112.pth...
[2025-01-15 14:38:00,767][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000418_1712128.pth
[2025-01-15 14:38:00,787][01020] Saving new best policy, reward=22.166!
[2025-01-15 14:38:03,217][01035] Updated weights for policy 0, policy_version 650 (0.0018)
[2025-01-15 14:38:05,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 2670592. Throughput: 0: 976.7. Samples: 668278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:38:05,551][00319] Avg episode reward: [(0, '23.143')]
[2025-01-15 14:38:05,555][01020] Saving new best policy, reward=23.143!
[2025-01-15 14:38:10,548][00319] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 2695168. Throughput: 0: 1005.0. Samples: 671616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:38:10,551][00319] Avg episode reward: [(0, '24.000')]
[2025-01-15 14:38:10,559][01020] Saving new best policy, reward=24.000!
[2025-01-15 14:38:13,131][01035] Updated weights for policy 0, policy_version 660 (0.0021)
[2025-01-15 14:38:15,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 2707456. Throughput: 0: 978.1. Samples: 677008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:38:15,557][00319] Avg episode reward: [(0, '22.705')]
[2025-01-15 14:38:20,548][00319] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 2723840. Throughput: 0: 942.0. Samples: 681892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:38:20,553][00319] Avg episode reward: [(0, '21.615')]
[2025-01-15 14:38:24,391][01035] Updated weights for policy 0, policy_version 670 (0.0024)
[2025-01-15 14:38:25,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 2748416. Throughput: 0: 963.0. Samples: 684940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:38:25,555][00319] Avg episode reward: [(0, '19.633')]
[2025-01-15 14:38:30,547][00319] Fps is (10 sec: 4505.7, 60 sec: 3891.4, 300 sec: 3901.7). Total num frames: 2768896. Throughput: 0: 1001.2. Samples: 691906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:38:30,553][00319] Avg episode reward: [(0, '20.233')]
[2025-01-15 14:38:35,552][00319] Fps is (10 sec: 3275.3, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 2781184. Throughput: 0: 931.7. Samples: 696146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:38:35,555][00319] Avg episode reward: [(0, '20.413')]
[2025-01-15 14:38:35,935][01035] Updated weights for policy 0, policy_version 680 (0.0017)
[2025-01-15 14:38:40,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 2801664. Throughput: 0: 910.6. Samples: 698494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:38:40,550][00319] Avg episode reward: [(0, '19.494')]
[2025-01-15 14:38:45,160][01035] Updated weights for policy 0, policy_version 690 (0.0019)
[2025-01-15 14:38:45,547][00319] Fps is (10 sec: 4507.7, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 2826240. Throughput: 0: 972.4. Samples: 705616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:38:45,551][00319] Avg episode reward: [(0, '19.842')]
[2025-01-15 14:38:50,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3887.8). Total num frames: 2842624. Throughput: 0: 960.6. Samples: 711504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:38:50,555][00319] Avg episode reward: [(0, '20.014')]
[2025-01-15 14:38:55,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 2859008. Throughput: 0: 931.6. Samples: 713540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:38:55,552][00319] Avg episode reward: [(0, '18.594')]
[2025-01-15 14:38:57,555][01035] Updated weights for policy 0, policy_version 700 (0.0027)
[2025-01-15 14:39:00,548][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2879488. Throughput: 0: 931.4. Samples: 718922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:39:00,553][00319] Avg episode reward: [(0, '19.298')]
[2025-01-15 14:39:05,547][00319] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 2899968. Throughput: 0: 968.0. Samples: 725450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:39:05,554][00319] Avg episode reward: [(0, '19.563')]
[2025-01-15 14:39:07,667][01035] Updated weights for policy 0, policy_version 710 (0.0013)
[2025-01-15 14:39:10,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3860.0). Total num frames: 2912256. Throughput: 0: 952.4. Samples: 727800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:39:10,555][00319] Avg episode reward: [(0, '20.562')]
[2025-01-15 14:39:15,547][00319] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3873.8). Total num frames: 2928640. Throughput: 0: 885.6. Samples: 731758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:39:15,555][00319] Avg episode reward: [(0, '19.793')]
[2025-01-15 14:39:19,660][01035] Updated weights for policy 0, policy_version 720 (0.0018)
[2025-01-15 14:39:20,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2949120. Throughput: 0: 934.1. Samples: 738178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:39:20,552][00319] Avg episode reward: [(0, '20.366')]
[2025-01-15 14:39:25,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 2973696. Throughput: 0: 956.6. Samples: 741542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:39:25,554][00319] Avg episode reward: [(0, '18.769')]
[2025-01-15 14:39:30,548][00319] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3846.1). Total num frames: 2985984. Throughput: 0: 905.1. Samples: 746348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:39:30,555][00319] Avg episode reward: [(0, '19.704')]
[2025-01-15 14:39:31,415][01035] Updated weights for policy 0, policy_version 730 (0.0028)
[2025-01-15 14:39:35,547][00319] Fps is (10 sec: 2867.2, 60 sec: 3686.7, 300 sec: 3846.1). Total num frames: 3002368. Throughput: 0: 889.4. Samples: 751526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:39:35,550][00319] Avg episode reward: [(0, '21.175')]
[2025-01-15 14:39:40,548][00319] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3026944. Throughput: 0: 920.5. Samples: 754964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:39:40,551][00319] Avg episode reward: [(0, '22.870')]
[2025-01-15 14:39:40,834][01035] Updated weights for policy 0, policy_version 740 (0.0028)
[2025-01-15 14:39:45,550][00319] Fps is (10 sec: 4504.7, 60 sec: 3686.3, 300 sec: 3846.0). Total num frames: 3047424. Throughput: 0: 946.5. Samples: 761518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:39:45,554][00319] Avg episode reward: [(0, '23.876')]
[2025-01-15 14:39:50,551][00319] Fps is (10 sec: 3275.7, 60 sec: 3617.9, 300 sec: 3832.1). Total num frames: 3059712. Throughput: 0: 893.0. Samples: 765636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:39:50,554][00319] Avg episode reward: [(0, '25.103')]
[2025-01-15 14:39:50,567][01020] Saving new best policy, reward=25.103!
[2025-01-15 14:39:53,081][01035] Updated weights for policy 0, policy_version 750 (0.0039)
[2025-01-15 14:39:55,547][00319] Fps is (10 sec: 3277.5, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 3080192. Throughput: 0: 904.4. Samples: 768498. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:39:55,550][00319] Avg episode reward: [(0, '25.903')]
[2025-01-15 14:39:55,562][01020] Saving new best policy, reward=25.903!
[2025-01-15 14:40:00,548][00319] Fps is (10 sec: 4507.2, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3104768. Throughput: 0: 972.9. Samples: 775540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:40:00,559][00319] Avg episode reward: [(0, '23.974')]
[2025-01-15 14:40:00,566][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000758_3104768.pth...
[2025-01-15 14:40:00,692][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000536_2195456.pth
[2025-01-15 14:40:02,084][01035] Updated weights for policy 0, policy_version 760 (0.0040)
[2025-01-15 14:40:05,548][00319] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 3121152. Throughput: 0: 945.7. Samples: 780736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:40:05,558][00319] Avg episode reward: [(0, '23.494')]
[2025-01-15 14:40:10,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3137536. Throughput: 0: 917.0. Samples: 782808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:40:10,551][00319] Avg episode reward: [(0, '23.539')]
[2025-01-15 14:40:13,840][01035] Updated weights for policy 0, policy_version 770 (0.0033)
[2025-01-15 14:40:15,547][00319] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3162112. Throughput: 0: 951.4. Samples: 789162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:40:15,550][00319] Avg episode reward: [(0, '22.622')]
[2025-01-15 14:40:20,553][00319] Fps is (10 sec: 4503.1, 60 sec: 3890.8, 300 sec: 3832.1). Total num frames: 3182592. Throughput: 0: 991.6. Samples: 796154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:40:20,556][00319] Avg episode reward: [(0, '22.293')]
[2025-01-15 14:40:24,365][01035] Updated weights for policy 0, policy_version 780 (0.0023)
[2025-01-15 14:40:25,550][00319] Fps is (10 sec: 3275.9, 60 sec: 3686.2, 300 sec: 3818.3). Total num frames: 3194880. Throughput: 0: 963.3. Samples: 798314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:40:25,556][00319] Avg episode reward: [(0, '23.078')]
[2025-01-15 14:40:30,547][00319] Fps is (10 sec: 3278.6, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 3215360. Throughput: 0: 923.2. Samples: 803062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:40:30,550][00319] Avg episode reward: [(0, '23.770')]
[2025-01-15 14:40:34,350][01035] Updated weights for policy 0, policy_version 790 (0.0018)
[2025-01-15 14:40:35,547][00319] Fps is (10 sec: 4506.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 3239936. Throughput: 0: 993.2. Samples: 810326. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:40:35,551][00319] Avg episode reward: [(0, '22.298')]
[2025-01-15 14:40:40,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3256320. Throughput: 0: 1004.1. Samples: 813684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:40:40,553][00319] Avg episode reward: [(0, '22.417')]
[2025-01-15 14:40:45,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3818.3). Total num frames: 3272704. Throughput: 0: 938.6. Samples: 817778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:40:45,553][00319] Avg episode reward: [(0, '22.694')]
[2025-01-15 14:40:46,645][01035] Updated weights for policy 0, policy_version 800 (0.0029)
[2025-01-15 14:40:50,548][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.4, 300 sec: 3818.3). Total num frames: 3293184. Throughput: 0: 953.7. Samples: 823654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:40:50,552][00319] Avg episode reward: [(0, '22.257')]
[2025-01-15 14:40:55,366][01035] Updated weights for policy 0, policy_version 810 (0.0022)
[2025-01-15 14:40:55,548][00319] Fps is (10 sec: 4505.4, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 3317760. Throughput: 0: 984.4. Samples: 827106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:40:55,555][00319] Avg episode reward: [(0, '21.285')]
[2025-01-15 14:41:00,547][00319] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3334144. Throughput: 0: 975.1. Samples: 833040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:41:00,553][00319] Avg episode reward: [(0, '22.298')]
[2025-01-15 14:41:05,547][00319] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3346432. Throughput: 0: 915.7. Samples: 837354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:41:05,556][00319] Avg episode reward: [(0, '20.653')]
[2025-01-15 14:41:07,302][01035] Updated weights for policy 0, policy_version 820 (0.0015)
[2025-01-15 14:41:10,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3371008. Throughput: 0: 948.5. Samples: 840994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:41:10,555][00319] Avg episode reward: [(0, '22.123')]
[2025-01-15 14:41:15,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3391488. Throughput: 0: 995.3. Samples: 847850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:41:15,556][00319] Avg episode reward: [(0, '23.431')]
[2025-01-15 14:41:17,101][01035] Updated weights for policy 0, policy_version 830 (0.0024)
[2025-01-15 14:41:20,548][00319] Fps is (10 sec: 3686.2, 60 sec: 3755.0, 300 sec: 3804.4). Total num frames: 3407872. Throughput: 0: 936.0. Samples: 852448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:41:20,551][00319] Avg episode reward: [(0, '24.039')]
[2025-01-15 14:41:25,547][00319] Fps is (10 sec: 3686.4, 60 sec: 3891.4, 300 sec: 3804.4). Total num frames: 3428352. Throughput: 0: 908.2. Samples: 854554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:41:25,555][00319] Avg episode reward: [(0, '24.554')]
[2025-01-15 14:41:28,224][01035] Updated weights for policy 0, policy_version 840 (0.0020)
[2025-01-15 14:41:30,547][00319] Fps is (10 sec: 4096.2, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3448832. Throughput: 0: 971.8. Samples: 861510. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:41:30,553][00319] Avg episode reward: [(0, '23.488')]
[2025-01-15 14:41:35,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3469312. Throughput: 0: 975.3. Samples: 867542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:41:35,554][00319] Avg episode reward: [(0, '22.519')]
[2025-01-15 14:41:39,718][01035] Updated weights for policy 0, policy_version 850 (0.0016)
[2025-01-15 14:41:40,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.5). Total num frames: 3481600. Throughput: 0: 943.5. Samples: 869564. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:41:40,554][00319] Avg episode reward: [(0, '21.903')]
[2025-01-15 14:41:45,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3502080. Throughput: 0: 933.9. Samples: 875066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:41:45,555][00319] Avg episode reward: [(0, '21.283')]
[2025-01-15 14:41:49,060][01035] Updated weights for policy 0, policy_version 860 (0.0028)
[2025-01-15 14:41:50,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3526656. Throughput: 0: 995.2. Samples: 882138. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:41:50,555][00319] Avg episode reward: [(0, '23.079')]
[2025-01-15 14:41:55,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3543040. Throughput: 0: 974.9. Samples: 884866. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:41:55,553][00319] Avg episode reward: [(0, '24.458')]
[2025-01-15 14:42:00,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3804.4). Total num frames: 3559424. Throughput: 0: 915.4. Samples: 889044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:42:00,551][00319] Avg episode reward: [(0, '26.574')]
[2025-01-15 14:42:00,564][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000869_3559424.pth...
[2025-01-15 14:42:00,710][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000647_2650112.pth
[2025-01-15 14:42:00,731][01020] Saving new best policy, reward=26.574!
[2025-01-15 14:42:01,472][01035] Updated weights for policy 0, policy_version 870 (0.0023)
[2025-01-15 14:42:05,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3579904. Throughput: 0: 957.1. Samples: 895516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:42:05,554][00319] Avg episode reward: [(0, '25.814')]
[2025-01-15 14:42:10,548][00319] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 3600384. Throughput: 0: 985.6. Samples: 898906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:10,557][00319] Avg episode reward: [(0, '25.924')]
[2025-01-15 14:42:10,831][01035] Updated weights for policy 0, policy_version 880 (0.0019)
[2025-01-15 14:42:15,548][00319] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3804.4). Total num frames: 3616768. Throughput: 0: 941.1. Samples: 903862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:15,556][00319] Avg episode reward: [(0, '25.199')]
[2025-01-15 14:42:20,547][00319] Fps is (10 sec: 3277.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3633152. Throughput: 0: 919.6. Samples: 908926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:20,554][00319] Avg episode reward: [(0, '24.154')]
[2025-01-15 14:42:22,464][01035] Updated weights for policy 0, policy_version 890 (0.0031)
[2025-01-15 14:42:25,547][00319] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 3657728. Throughput: 0: 953.7. Samples: 912482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:25,553][00319] Avg episode reward: [(0, '23.135')]
[2025-01-15 14:42:30,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 3678208. Throughput: 0: 985.4. Samples: 919410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:42:30,554][00319] Avg episode reward: [(0, '23.108')]
[2025-01-15 14:42:32,741][01035] Updated weights for policy 0, policy_version 900 (0.0018)
[2025-01-15 14:42:35,549][00319] Fps is (10 sec: 3276.4, 60 sec: 3686.3, 300 sec: 3804.4). Total num frames: 3690496. Throughput: 0: 924.6. Samples: 923744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-01-15 14:42:35,557][00319] Avg episode reward: [(0, '23.057')]
[2025-01-15 14:42:40,548][00319] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3715072. Throughput: 0: 926.1. Samples: 926542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:40,551][00319] Avg episode reward: [(0, '23.003')]
[2025-01-15 14:42:43,051][01035] Updated weights for policy 0, policy_version 910 (0.0023)
[2025-01-15 14:42:45,547][00319] Fps is (10 sec: 4506.1, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3735552. Throughput: 0: 985.5. Samples: 933392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:42:45,550][00319] Avg episode reward: [(0, '22.486')]
[2025-01-15 14:42:50,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3751936. Throughput: 0: 966.8. Samples: 939022. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:42:50,554][00319] Avg episode reward: [(0, '22.578')]
[2025-01-15 14:42:54,983][01035] Updated weights for policy 0, policy_version 920 (0.0034)
[2025-01-15 14:42:55,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3768320. Throughput: 0: 938.0. Samples: 941114. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-01-15 14:42:55,556][00319] Avg episode reward: [(0, '22.550')]
[2025-01-15 14:43:00,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3792896. Throughput: 0: 967.5. Samples: 947400. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:43:00,552][00319] Avg episode reward: [(0, '25.572')]
[2025-01-15 14:43:03,697][01035] Updated weights for policy 0, policy_version 930 (0.0025)
[2025-01-15 14:43:05,547][00319] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3813376. Throughput: 0: 1009.6. Samples: 954356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-01-15 14:43:05,553][00319] Avg episode reward: [(0, '26.734')]
[2025-01-15 14:43:05,555][01020] Saving new best policy, reward=26.734!
[2025-01-15 14:43:10,551][00319] Fps is (10 sec: 3685.0, 60 sec: 3822.7, 300 sec: 3804.4). Total num frames: 3829760. Throughput: 0: 978.8. Samples: 956532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:43:10,554][00319] Avg episode reward: [(0, '27.464')]
[2025-01-15 14:43:10,570][01020] Saving new best policy, reward=27.464!
[2025-01-15 14:43:15,547][00319] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 3846144. Throughput: 0: 928.2. Samples: 961178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-01-15 14:43:15,550][00319] Avg episode reward: [(0, '25.662')]
[2025-01-15 14:43:15,705][01035] Updated weights for policy 0, policy_version 940 (0.0016)
[2025-01-15 14:43:20,548][00319] Fps is (10 sec: 4097.4, 60 sec: 3959.4, 300 sec: 3804.4). Total num frames: 3870720. Throughput: 0: 991.5. Samples: 968362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:43:20,555][00319] Avg episode reward: [(0, '25.439')]
[2025-01-15 14:43:24,576][01035] Updated weights for policy 0, policy_version 950 (0.0025)
[2025-01-15 14:43:25,549][00319] Fps is (10 sec: 4504.7, 60 sec: 3891.1, 300 sec: 3804.4). Total num frames: 3891200. Throughput: 0: 1010.3. Samples: 972006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:43:25,556][00319] Avg episode reward: [(0, '23.513')]
[2025-01-15 14:43:30,547][00319] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 3907584. Throughput: 0: 958.8. Samples: 976538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:43:30,555][00319] Avg episode reward: [(0, '23.840')]
[2025-01-15 14:43:35,547][00319] Fps is (10 sec: 3687.1, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 3928064. Throughput: 0: 967.2. Samples: 982548. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-01-15 14:43:35,550][00319] Avg episode reward: [(0, '22.529')]
[2025-01-15 14:43:35,822][01035] Updated weights for policy 0, policy_version 960 (0.0015)
[2025-01-15 14:43:40,548][00319] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 3952640. Throughput: 0: 998.3. Samples: 986036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-01-15 14:43:40,556][00319] Avg episode reward: [(0, '24.425')]
[2025-01-15 14:43:45,547][00319] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3969024. Throughput: 0: 988.6. Samples: 991888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-01-15 14:43:45,558][00319] Avg episode reward: [(0, '23.863')]
[2025-01-15 14:43:46,458][01035] Updated weights for policy 0, policy_version 970 (0.0015)
[2025-01-15 14:43:50,548][00319] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3985408. Throughput: 0: 932.0. Samples: 996298. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-01-15 14:43:50,556][00319] Avg episode reward: [(0, '24.449')]
[2025-01-15 14:43:54,765][00319] Component Batcher_0 stopped!
[2025-01-15 14:43:54,779][01020] Stopping Batcher_0...
[2025-01-15 14:43:54,779][01020] Loop batcher_evt_loop terminating...
[2025-01-15 14:43:54,780][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-01-15 14:43:54,821][01035] Weights refcount: 2 0
[2025-01-15 14:43:54,823][01035] Stopping InferenceWorker_p0-w0...
[2025-01-15 14:43:54,824][01035] Loop inference_proc0-0_evt_loop terminating...
[2025-01-15 14:43:54,823][00319] Component InferenceWorker_p0-w0 stopped!
[2025-01-15 14:43:54,897][01020] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000758_3104768.pth
[2025-01-15 14:43:54,915][01020] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-01-15 14:43:55,078][00319] Component LearnerWorker_p0 stopped!
[2025-01-15 14:43:55,087][01020] Stopping LearnerWorker_p0...
[2025-01-15 14:43:55,088][01020] Loop learner_proc0_evt_loop terminating...
[2025-01-15 14:43:55,156][00319] Component RolloutWorker_w6 stopped!
[2025-01-15 14:43:55,160][01042] Stopping RolloutWorker_w6...
[2025-01-15 14:43:55,170][00319] Component RolloutWorker_w2 stopped!
[2025-01-15 14:43:55,166][01042] Loop rollout_proc6_evt_loop terminating...
[2025-01-15 14:43:55,169][01039] Stopping RolloutWorker_w2...
[2025-01-15 14:43:55,174][01039] Loop rollout_proc2_evt_loop terminating...
[2025-01-15 14:43:55,189][00319] Component RolloutWorker_w4 stopped!
[2025-01-15 14:43:55,191][01040] Stopping RolloutWorker_w4...
[2025-01-15 14:43:55,195][01040] Loop rollout_proc4_evt_loop terminating...
[2025-01-15 14:43:55,203][00319] Component RolloutWorker_w1 stopped!
[2025-01-15 14:43:55,210][01036] Stopping RolloutWorker_w1...
[2025-01-15 14:43:55,210][01036] Loop rollout_proc1_evt_loop terminating...
[2025-01-15 14:43:55,218][01037] Stopping RolloutWorker_w0...
[2025-01-15 14:43:55,218][00319] Component RolloutWorker_w0 stopped!
[2025-01-15 14:43:55,225][01037] Loop rollout_proc0_evt_loop terminating...
[2025-01-15 14:43:55,226][00319] Component RolloutWorker_w3 stopped!
[2025-01-15 14:43:55,232][01038] Stopping RolloutWorker_w3...
[2025-01-15 14:43:55,235][01038] Loop rollout_proc3_evt_loop terminating...
[2025-01-15 14:43:55,238][00319] Component RolloutWorker_w7 stopped!
[2025-01-15 14:43:55,243][01043] Stopping RolloutWorker_w7...
[2025-01-15 14:43:55,249][00319] Component RolloutWorker_w5 stopped!
[2025-01-15 14:43:55,255][00319] Waiting for process learner_proc0 to stop...
[2025-01-15 14:43:55,256][01041] Stopping RolloutWorker_w5...
[2025-01-15 14:43:55,245][01043] Loop rollout_proc7_evt_loop terminating...
[2025-01-15 14:43:55,264][01041] Loop rollout_proc5_evt_loop terminating...
[2025-01-15 14:43:56,735][00319] Waiting for process inference_proc0-0 to join...
[2025-01-15 14:43:56,748][00319] Waiting for process rollout_proc0 to join...
[2025-01-15 14:43:58,697][00319] Waiting for process rollout_proc1 to join...
[2025-01-15 14:43:58,750][00319] Waiting for process rollout_proc2 to join...
[2025-01-15 14:43:58,756][00319] Waiting for process rollout_proc3 to join...
[2025-01-15 14:43:58,760][00319] Waiting for process rollout_proc4 to join...
[2025-01-15 14:43:58,764][00319] Waiting for process rollout_proc5 to join...
[2025-01-15 14:43:58,768][00319] Waiting for process rollout_proc6 to join...
[2025-01-15 14:43:58,772][00319] Waiting for process rollout_proc7 to join...
[2025-01-15 14:43:58,776][00319] Batcher 0 profile tree view:
batching: 27.4826, releasing_batches: 0.0303
[2025-01-15 14:43:58,778][00319] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 418.7503
update_model: 8.5261
weight_update: 0.0030
one_step: 0.0081
handle_policy_step: 574.0382
deserialize: 14.6473, stack: 3.3419, obs_to_device_normalize: 122.9754, forward: 288.1993, send_messages: 28.4491
prepare_outputs: 87.3286
to_cpu: 52.8217
[2025-01-15 14:43:58,781][00319] Learner 0 profile tree view:
misc: 0.0047, prepare_batch: 15.0550
train: 74.3916
epoch_init: 0.0057, minibatch_init: 0.0066, losses_postprocess: 0.6715, kl_divergence: 0.7288, after_optimizer: 33.4940
calculate_losses: 26.7837
losses_init: 0.0049, forward_head: 1.2731, bptt_initial: 18.0487, tail: 1.0836, advantages_returns: 0.2323, losses: 3.8355
bptt: 1.9727
bptt_forward_core: 1.8768
update: 12.0016
clip: 0.8933
[2025-01-15 14:43:58,783][00319] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3361, enqueue_policy_requests: 104.0313, env_step: 807.3174, overhead: 13.6173, complete_rollouts: 7.4427
save_policy_outputs: 21.5908
split_output_tensors: 8.4833
[2025-01-15 14:43:58,785][00319] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3348, enqueue_policy_requests: 103.1740, env_step: 807.6032, overhead: 13.3103, complete_rollouts: 6.8850
save_policy_outputs: 21.6953
split_output_tensors: 8.5687
[2025-01-15 14:43:58,787][00319] Loop Runner_EvtLoop terminating...
[2025-01-15 14:43:58,790][00319] Runner profile tree view:
main_loop: 1072.3200
[2025-01-15 14:43:58,792][00319] Collected {0: 4005888}, FPS: 3735.7
[2025-01-15 14:44:29,403][00319] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2025-01-15 14:44:29,406][00319] Overriding arg 'num_workers' with value 1 passed from command line
[2025-01-15 14:44:29,409][00319] Adding new argument 'no_render'=True that is not in the saved config file!
[2025-01-15 14:44:29,411][00319] Adding new argument 'save_video'=True that is not in the saved config file!
[2025-01-15 14:44:29,413][00319] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2025-01-15 14:44:29,415][00319] Adding new argument 'video_name'=None that is not in the saved config file!
[2025-01-15 14:44:29,419][00319] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2025-01-15 14:44:29,420][00319] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2025-01-15 14:44:29,423][00319] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2025-01-15 14:44:29,426][00319] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2025-01-15 14:44:29,427][00319] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2025-01-15 14:44:29,428][00319] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2025-01-15 14:44:29,429][00319] Adding new argument 'train_script'=None that is not in the saved config file!
[2025-01-15 14:44:29,431][00319] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2025-01-15 14:44:29,432][00319] Using frameskip 1 and render_action_repeat=4 for evaluation
[2025-01-15 14:44:29,463][00319] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-01-15 14:44:29,468][00319] RunningMeanStd input shape: (3, 72, 128)
[2025-01-15 14:44:29,471][00319] RunningMeanStd input shape: (1,)
[2025-01-15 14:44:29,486][00319] ConvEncoder: input_channels=3
[2025-01-15 14:44:29,590][00319] Conv encoder output size: 512
[2025-01-15 14:44:29,592][00319] Policy head output size: 512
[2025-01-15 14:44:29,779][00319] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-01-15 14:44:30,593][00319] Num frames 100...
[2025-01-15 14:44:30,721][00319] Num frames 200...
[2025-01-15 14:44:30,841][00319] Num frames 300...
[2025-01-15 14:44:30,962][00319] Num frames 400...
[2025-01-15 14:44:31,084][00319] Num frames 500...
[2025-01-15 14:44:31,202][00319] Num frames 600...
[2025-01-15 14:44:31,331][00319] Num frames 700...
[2025-01-15 14:44:31,447][00319] Num frames 800...
[2025-01-15 14:44:31,565][00319] Num frames 900...
[2025-01-15 14:44:31,683][00319] Num frames 1000...
[2025-01-15 14:44:31,799][00319] Num frames 1100...
[2025-01-15 14:44:31,923][00319] Num frames 1200...
[2025-01-15 14:44:32,043][00319] Num frames 1300...
[2025-01-15 14:44:32,170][00319] Num frames 1400...
[2025-01-15 14:44:32,296][00319] Num frames 1500...
[2025-01-15 14:44:32,362][00319] Avg episode rewards: #0: 33.040, true rewards: #0: 15.040
[2025-01-15 14:44:32,365][00319] Avg episode reward: 33.040, avg true_objective: 15.040
[2025-01-15 14:44:32,478][00319] Num frames 1600...
[2025-01-15 14:44:32,597][00319] Num frames 1700...
[2025-01-15 14:44:32,718][00319] Num frames 1800...
[2025-01-15 14:44:32,843][00319] Num frames 1900...
[2025-01-15 14:44:32,963][00319] Num frames 2000...
[2025-01-15 14:44:33,086][00319] Num frames 2100...
[2025-01-15 14:44:33,211][00319] Num frames 2200...
[2025-01-15 14:44:33,342][00319] Num frames 2300...
[2025-01-15 14:44:33,404][00319] Avg episode rewards: #0: 26.020, true rewards: #0: 11.520
[2025-01-15 14:44:33,406][00319] Avg episode reward: 26.020, avg true_objective: 11.520
[2025-01-15 14:44:33,522][00319] Num frames 2400...
[2025-01-15 14:44:33,646][00319] Num frames 2500...
[2025-01-15 14:44:33,773][00319] Num frames 2600...
[2025-01-15 14:44:33,892][00319] Num frames 2700...
[2025-01-15 14:44:34,010][00319] Num frames 2800...
[2025-01-15 14:44:34,137][00319] Num frames 2900...
[2025-01-15 14:44:34,257][00319] Num frames 3000...
[2025-01-15 14:44:34,385][00319] Num frames 3100...
[2025-01-15 14:44:34,505][00319] Num frames 3200...
[2025-01-15 14:44:34,630][00319] Num frames 3300...
[2025-01-15 14:44:34,752][00319] Num frames 3400...
[2025-01-15 14:44:34,837][00319] Avg episode rewards: #0: 25.077, true rewards: #0: 11.410
[2025-01-15 14:44:34,839][00319] Avg episode reward: 25.077, avg true_objective: 11.410
[2025-01-15 14:44:34,932][00319] Num frames 3500...
[2025-01-15 14:44:35,052][00319] Num frames 3600...
[2025-01-15 14:44:35,180][00319] Num frames 3700...
[2025-01-15 14:44:35,300][00319] Num frames 3800...
[2025-01-15 14:44:35,432][00319] Num frames 3900...
[2025-01-15 14:44:35,554][00319] Num frames 4000...
[2025-01-15 14:44:35,676][00319] Num frames 4100...
[2025-01-15 14:44:35,799][00319] Num frames 4200...
[2025-01-15 14:44:35,920][00319] Num frames 4300...
[2025-01-15 14:44:36,042][00319] Num frames 4400...
[2025-01-15 14:44:36,202][00319] Num frames 4500...
[2025-01-15 14:44:36,374][00319] Num frames 4600...
[2025-01-15 14:44:36,545][00319] Num frames 4700...
[2025-01-15 14:44:36,710][00319] Num frames 4800...
[2025-01-15 14:44:36,873][00319] Num frames 4900...
[2025-01-15 14:44:37,042][00319] Num frames 5000...
[2025-01-15 14:44:37,208][00319] Num frames 5100...
[2025-01-15 14:44:37,371][00319] Num frames 5200...
[2025-01-15 14:44:37,555][00319] Num frames 5300...
[2025-01-15 14:44:37,727][00319] Num frames 5400...
[2025-01-15 14:44:37,888][00319] Num frames 5500...
[2025-01-15 14:44:37,986][00319] Avg episode rewards: #0: 32.807, true rewards: #0: 13.807
[2025-01-15 14:44:37,989][00319] Avg episode reward: 32.807, avg true_objective: 13.807
[2025-01-15 14:44:38,130][00319] Num frames 5600...
[2025-01-15 14:44:38,302][00319] Num frames 5700...
[2025-01-15 14:44:38,469][00319] Num frames 5800...
[2025-01-15 14:44:38,645][00319] Num frames 5900...
[2025-01-15 14:44:38,823][00319] Num frames 6000...
[2025-01-15 14:44:39,020][00319] Num frames 6100...
[2025-01-15 14:44:39,193][00319] Num frames 6200...
[2025-01-15 14:44:39,362][00319] Num frames 6300...
[2025-01-15 14:44:39,486][00319] Num frames 6400...
[2025-01-15 14:44:39,607][00319] Avg episode rewards: #0: 30.702, true rewards: #0: 12.902
[2025-01-15 14:44:39,609][00319] Avg episode reward: 30.702, avg true_objective: 12.902
[2025-01-15 14:44:39,670][00319] Num frames 6500...
[2025-01-15 14:44:39,788][00319] Num frames 6600...
[2025-01-15 14:44:39,906][00319] Num frames 6700...
[2025-01-15 14:44:40,041][00319] Num frames 6800...
[2025-01-15 14:44:40,168][00319] Num frames 6900...
[2025-01-15 14:44:40,265][00319] Avg episode rewards: #0: 27.218, true rewards: #0: 11.552
[2025-01-15 14:44:40,267][00319] Avg episode reward: 27.218, avg true_objective: 11.552
[2025-01-15 14:44:40,349][00319] Num frames 7000...
[2025-01-15 14:44:40,470][00319] Num frames 7100...
[2025-01-15 14:44:40,601][00319] Num frames 7200...
[2025-01-15 14:44:40,719][00319] Num frames 7300...
[2025-01-15 14:44:40,836][00319] Num frames 7400...
[2025-01-15 14:44:40,953][00319] Num frames 7500...
[2025-01-15 14:44:41,077][00319] Num frames 7600...
[2025-01-15 14:44:41,130][00319] Avg episode rewards: #0: 25.143, true rewards: #0: 10.857
[2025-01-15 14:44:41,132][00319] Avg episode reward: 25.143, avg true_objective: 10.857
[2025-01-15 14:44:41,251][00319] Num frames 7700...
[2025-01-15 14:44:41,370][00319] Num frames 7800...
[2025-01-15 14:44:41,490][00319] Num frames 7900...
[2025-01-15 14:44:41,617][00319] Num frames 8000...
[2025-01-15 14:44:41,731][00319] Avg episode rewards: #0: 23.060, true rewards: #0: 10.060
[2025-01-15 14:44:41,734][00319] Avg episode reward: 23.060, avg true_objective: 10.060
[2025-01-15 14:44:41,797][00319] Num frames 8100...
[2025-01-15 14:44:41,914][00319] Num frames 8200...
[2025-01-15 14:44:42,035][00319] Num frames 8300...
[2025-01-15 14:44:42,192][00319] Num frames 8400...
[2025-01-15 14:44:42,308][00319] Num frames 8500...
[2025-01-15 14:44:42,429][00319] Num frames 8600...
[2025-01-15 14:44:42,548][00319] Num frames 8700...
[2025-01-15 14:44:42,632][00319] Avg episode rewards: #0: 21.803, true rewards: #0: 9.692
[2025-01-15 14:44:42,635][00319] Avg episode reward: 21.803, avg true_objective: 9.692
[2025-01-15 14:44:42,735][00319] Num frames 8800...
[2025-01-15 14:44:42,854][00319] Num frames 8900...
[2025-01-15 14:44:42,974][00319] Num frames 9000...
[2025-01-15 14:44:43,098][00319] Num frames 9100...
[2025-01-15 14:44:43,222][00319] Num frames 9200...
[2025-01-15 14:44:43,341][00319] Num frames 9300...
[2025-01-15 14:44:43,460][00319] Num frames 9400...
[2025-01-15 14:44:43,583][00319] Num frames 9500...
[2025-01-15 14:44:43,715][00319] Num frames 9600...
[2025-01-15 14:44:43,836][00319] Num frames 9700...
[2025-01-15 14:44:43,959][00319] Num frames 9800...
[2025-01-15 14:44:44,083][00319] Num frames 9900...
[2025-01-15 14:44:44,208][00319] Num frames 10000...
[2025-01-15 14:44:44,329][00319] Num frames 10100...
[2025-01-15 14:44:44,454][00319] Num frames 10200...
[2025-01-15 14:44:44,575][00319] Num frames 10300...
[2025-01-15 14:44:44,706][00319] Num frames 10400...
[2025-01-15 14:44:44,829][00319] Num frames 10500...
[2025-01-15 14:44:44,959][00319] Num frames 10600...
[2025-01-15 14:44:45,085][00319] Num frames 10700...
[2025-01-15 14:44:45,208][00319] Num frames 10800...
[2025-01-15 14:44:45,294][00319] Avg episode rewards: #0: 25.223, true rewards: #0: 10.823
[2025-01-15 14:44:45,296][00319] Avg episode reward: 25.223, avg true_objective: 10.823
[2025-01-15 14:45:52,412][00319] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2025-01-15 14:46:49,681][00319] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2025-01-15 14:46:49,684][00319] Overriding arg 'num_workers' with value 1 passed from command line
[2025-01-15 14:46:49,686][00319] Adding new argument 'no_render'=True that is not in the saved config file!
[2025-01-15 14:46:49,689][00319] Adding new argument 'save_video'=True that is not in the saved config file!
[2025-01-15 14:46:49,691][00319] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2025-01-15 14:46:49,696][00319] Adding new argument 'video_name'=None that is not in the saved config file!
[2025-01-15 14:46:49,698][00319] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2025-01-15 14:46:49,700][00319] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2025-01-15 14:46:49,705][00319] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2025-01-15 14:46:49,707][00319] Adding new argument 'hf_repository'='Stoub/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2025-01-15 14:46:49,709][00319] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2025-01-15 14:46:49,710][00319] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2025-01-15 14:46:49,711][00319] Adding new argument 'train_script'=None that is not in the saved config file!
[2025-01-15 14:46:49,714][00319] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2025-01-15 14:46:49,715][00319] Using frameskip 1 and render_action_repeat=4 for evaluation
[2025-01-15 14:46:49,758][00319] RunningMeanStd input shape: (3, 72, 128)
[2025-01-15 14:46:49,762][00319] RunningMeanStd input shape: (1,)
[2025-01-15 14:46:49,781][00319] ConvEncoder: input_channels=3
[2025-01-15 14:46:49,843][00319] Conv encoder output size: 512
[2025-01-15 14:46:49,854][00319] Policy head output size: 512
[2025-01-15 14:46:49,883][00319] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-01-15 14:46:50,562][00319] Num frames 100...
[2025-01-15 14:46:50,737][00319] Num frames 200...
[2025-01-15 14:46:50,911][00319] Num frames 300...
[2025-01-15 14:46:51,085][00319] Num frames 400...
[2025-01-15 14:46:51,254][00319] Num frames 500...
[2025-01-15 14:46:51,422][00319] Num frames 600...
[2025-01-15 14:46:51,588][00319] Num frames 700...
[2025-01-15 14:46:51,765][00319] Num frames 800...
[2025-01-15 14:46:51,937][00319] Num frames 900...
[2025-01-15 14:46:52,088][00319] Num frames 1000...
[2025-01-15 14:46:52,212][00319] Avg episode rewards: #0: 24.560, true rewards: #0: 10.560
[2025-01-15 14:46:52,214][00319] Avg episode reward: 24.560, avg true_objective: 10.560
[2025-01-15 14:46:52,270][00319] Num frames 1100...
[2025-01-15 14:46:52,394][00319] Num frames 1200...
[2025-01-15 14:46:52,522][00319] Num frames 1300...
[2025-01-15 14:46:52,641][00319] Num frames 1400...
[2025-01-15 14:46:52,766][00319] Num frames 1500...
[2025-01-15 14:46:52,892][00319] Num frames 1600...
[2025-01-15 14:46:53,014][00319] Num frames 1700...
[2025-01-15 14:46:53,140][00319] Num frames 1800...
[2025-01-15 14:46:53,266][00319] Num frames 1900...
[2025-01-15 14:46:53,381][00319] Num frames 2000...
[2025-01-15 14:46:53,508][00319] Num frames 2100...
[2025-01-15 14:46:53,625][00319] Num frames 2200...
[2025-01-15 14:46:53,744][00319] Num frames 2300...
[2025-01-15 14:46:53,878][00319] Avg episode rewards: #0: 27.840, true rewards: #0: 11.840
[2025-01-15 14:46:53,880][00319] Avg episode reward: 27.840, avg true_objective: 11.840
[2025-01-15 14:46:53,923][00319] Num frames 2400...
[2025-01-15 14:46:54,043][00319] Num frames 2500...
[2025-01-15 14:46:54,173][00319] Num frames 2600...
[2025-01-15 14:46:54,290][00319] Num frames 2700...
[2025-01-15 14:46:54,407][00319] Num frames 2800...
[2025-01-15 14:46:54,535][00319] Num frames 2900...
[2025-01-15 14:46:54,654][00319] Num frames 3000...
[2025-01-15 14:46:54,772][00319] Num frames 3100...
[2025-01-15 14:46:54,889][00319] Num frames 3200...
[2025-01-15 14:46:55,018][00319] Num frames 3300...
[2025-01-15 14:46:55,147][00319] Num frames 3400...
[2025-01-15 14:46:55,268][00319] Num frames 3500...
[2025-01-15 14:46:55,389][00319] Num frames 3600...
[2025-01-15 14:46:55,511][00319] Num frames 3700...
[2025-01-15 14:46:55,642][00319] Num frames 3800...
[2025-01-15 14:46:55,761][00319] Num frames 3900...
[2025-01-15 14:46:55,878][00319] Num frames 4000...
[2025-01-15 14:46:56,003][00319] Num frames 4100...
[2025-01-15 14:46:56,131][00319] Num frames 4200...
[2025-01-15 14:46:56,252][00319] Num frames 4300...
[2025-01-15 14:46:56,373][00319] Num frames 4400...
[2025-01-15 14:46:56,516][00319] Avg episode rewards: #0: 37.893, true rewards: #0: 14.893
[2025-01-15 14:46:56,518][00319] Avg episode reward: 37.893, avg true_objective: 14.893
[2025-01-15 14:46:56,561][00319] Num frames 4500...
[2025-01-15 14:46:56,683][00319] Num frames 4600...
[2025-01-15 14:46:56,801][00319] Num frames 4700...
[2025-01-15 14:46:56,924][00319] Num frames 4800...
[2025-01-15 14:46:57,045][00319] Num frames 4900...
[2025-01-15 14:46:57,174][00319] Num frames 5000...
[2025-01-15 14:46:57,294][00319] Num frames 5100...
[2025-01-15 14:46:57,418][00319] Num frames 5200...
[2025-01-15 14:46:57,538][00319] Num frames 5300...
[2025-01-15 14:46:57,668][00319] Num frames 5400...
[2025-01-15 14:46:57,794][00319] Num frames 5500...
[2025-01-15 14:46:57,916][00319] Num frames 5600...
[2025-01-15 14:46:58,039][00319] Num frames 5700...
[2025-01-15 14:46:58,164][00319] Num frames 5800...
[2025-01-15 14:46:58,286][00319] Num frames 5900...
[2025-01-15 14:46:58,406][00319] Num frames 6000...
[2025-01-15 14:46:58,522][00319] Num frames 6100...
[2025-01-15 14:46:58,652][00319] Num frames 6200...
[2025-01-15 14:46:58,774][00319] Num frames 6300...
[2025-01-15 14:46:58,894][00319] Num frames 6400...
[2025-01-15 14:46:59,020][00319] Num frames 6500...
[2025-01-15 14:46:59,162][00319] Avg episode rewards: #0: 41.420, true rewards: #0: 16.420
[2025-01-15 14:46:59,164][00319] Avg episode reward: 41.420, avg true_objective: 16.420
[2025-01-15 14:46:59,207][00319] Num frames 6600...
[2025-01-15 14:46:59,322][00319] Num frames 6700...
[2025-01-15 14:46:59,441][00319] Num frames 6800...
[2025-01-15 14:46:59,560][00319] Num frames 6900...
[2025-01-15 14:46:59,689][00319] Num frames 7000...
[2025-01-15 14:46:59,808][00319] Num frames 7100...
[2025-01-15 14:46:59,935][00319] Num frames 7200...
[2025-01-15 14:47:00,057][00319] Num frames 7300...
[2025-01-15 14:47:00,185][00319] Num frames 7400...
[2025-01-15 14:47:00,304][00319] Num frames 7500...
[2025-01-15 14:47:00,421][00319] Num frames 7600...
[2025-01-15 14:47:00,541][00319] Num frames 7700...
[2025-01-15 14:47:00,622][00319] Avg episode rewards: #0: 38.840, true rewards: #0: 15.440
[2025-01-15 14:47:00,624][00319] Avg episode reward: 38.840, avg true_objective: 15.440
[2025-01-15 14:47:00,732][00319] Num frames 7800...
[2025-01-15 14:47:00,854][00319] Num frames 7900...
[2025-01-15 14:47:00,981][00319] Num frames 8000...
[2025-01-15 14:47:01,110][00319] Num frames 8100...
[2025-01-15 14:47:01,232][00319] Num frames 8200...
[2025-01-15 14:47:01,352][00319] Num frames 8300...
[2025-01-15 14:47:01,471][00319] Num frames 8400...
[2025-01-15 14:47:01,568][00319] Avg episode rewards: #0: 35.055, true rewards: #0: 14.055
[2025-01-15 14:47:01,570][00319] Avg episode reward: 35.055, avg true_objective: 14.055
[2025-01-15 14:47:01,650][00319] Num frames 8500...
[2025-01-15 14:47:01,778][00319] Num frames 8600...
[2025-01-15 14:47:01,899][00319] Num frames 8700...
[2025-01-15 14:47:02,032][00319] Num frames 8800...
[2025-01-15 14:47:02,203][00319] Num frames 8900...
[2025-01-15 14:47:02,369][00319] Num frames 9000...
[2025-01-15 14:47:02,535][00319] Num frames 9100...
[2025-01-15 14:47:02,694][00319] Num frames 9200...
[2025-01-15 14:47:02,864][00319] Num frames 9300...
[2025-01-15 14:47:03,027][00319] Avg episode rewards: #0: 33.087, true rewards: #0: 13.373
[2025-01-15 14:47:03,030][00319] Avg episode reward: 33.087, avg true_objective: 13.373
[2025-01-15 14:47:03,106][00319] Num frames 9400...
[2025-01-15 14:47:03,265][00319] Num frames 9500...
[2025-01-15 14:47:03,429][00319] Num frames 9600...
[2025-01-15 14:47:03,590][00319] Num frames 9700...
[2025-01-15 14:47:03,757][00319] Num frames 9800...
[2025-01-15 14:47:03,928][00319] Num frames 9900...
[2025-01-15 14:47:04,101][00319] Num frames 10000...
[2025-01-15 14:47:04,275][00319] Num frames 10100...
[2025-01-15 14:47:04,439][00319] Num frames 10200...
[2025-01-15 14:47:04,612][00319] Num frames 10300...
[2025-01-15 14:47:04,767][00319] Avg episode rewards: #0: 31.566, true rewards: #0: 12.941
[2025-01-15 14:47:04,769][00319] Avg episode reward: 31.566, avg true_objective: 12.941
[2025-01-15 14:47:04,865][00319] Num frames 10400...
[2025-01-15 14:47:05,043][00319] Num frames 10500...
[2025-01-15 14:47:05,223][00319] Num frames 10600...
[2025-01-15 14:47:05,391][00319] Num frames 10700...
[2025-01-15 14:47:05,567][00319] Num frames 10800...
[2025-01-15 14:47:05,711][00319] Num frames 10900...
[2025-01-15 14:47:05,831][00319] Num frames 11000...
[2025-01-15 14:47:05,960][00319] Num frames 11100...
[2025-01-15 14:47:06,084][00319] Num frames 11200...
[2025-01-15 14:47:06,208][00319] Num frames 11300...
[2025-01-15 14:47:06,326][00319] Num frames 11400...
[2025-01-15 14:47:06,447][00319] Num frames 11500...
[2025-01-15 14:47:06,511][00319] Avg episode rewards: #0: 31.228, true rewards: #0: 12.783
[2025-01-15 14:47:06,513][00319] Avg episode reward: 31.228, avg true_objective: 12.783
[2025-01-15 14:47:06,624][00319] Num frames 11600...
[2025-01-15 14:47:06,747][00319] Num frames 11700...
[2025-01-15 14:47:06,863][00319] Num frames 11800...
[2025-01-15 14:47:06,994][00319] Num frames 11900...
[2025-01-15 14:47:07,121][00319] Num frames 12000...
[2025-01-15 14:47:07,244][00319] Num frames 12100...
[2025-01-15 14:47:07,365][00319] Num frames 12200...
[2025-01-15 14:47:07,484][00319] Num frames 12300...
[2025-01-15 14:47:07,606][00319] Num frames 12400...
[2025-01-15 14:47:07,729][00319] Num frames 12500...
[2025-01-15 14:47:07,849][00319] Num frames 12600...
[2025-01-15 14:47:07,980][00319] Num frames 12700...
[2025-01-15 14:47:08,109][00319] Num frames 12800...
[2025-01-15 14:47:08,229][00319] Num frames 12900...
[2025-01-15 14:47:08,347][00319] Num frames 13000...
[2025-01-15 14:47:08,415][00319] Avg episode rewards: #0: 31.209, true rewards: #0: 13.009
[2025-01-15 14:47:08,417][00319] Avg episode reward: 31.209, avg true_objective: 13.009
[2025-01-15 14:48:28,537][00319] Replay video saved to /content/train_dir/default_experiment/replay.mp4!