[2024-12-03 15:58:23,503][01348] Saving configuration to /content/train_dir/default_experiment/config.json...
[2024-12-03 15:58:23,507][01348] Rollout worker 0 uses device cpu
[2024-12-03 15:58:23,509][01348] Rollout worker 1 uses device cpu
[2024-12-03 15:58:23,511][01348] Rollout worker 2 uses device cpu
[2024-12-03 15:58:23,513][01348] Rollout worker 3 uses device cpu
[2024-12-03 15:58:23,515][01348] Rollout worker 4 uses device cpu
[2024-12-03 15:58:23,516][01348] Rollout worker 5 uses device cpu
[2024-12-03 15:58:23,517][01348] Rollout worker 6 uses device cpu
[2024-12-03 15:58:23,523][01348] Rollout worker 7 uses device cpu
[2024-12-03 15:58:23,720][01348] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-03 15:58:23,722][01348] InferenceWorker_p0-w0: min num requests: 2
[2024-12-03 15:58:23,769][01348] Starting all processes...
[2024-12-03 15:58:23,771][01348] Starting process learner_proc0
[2024-12-03 15:58:23,864][01348] Starting all processes...
[2024-12-03 15:58:23,972][01348] Starting process inference_proc0-0
[2024-12-03 15:58:23,972][01348] Starting process rollout_proc0
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc1
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc2
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc3
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc4
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc5
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc6
[2024-12-03 15:58:23,973][01348] Starting process rollout_proc7
[2024-12-03 15:58:42,746][03366] Worker 3 uses CPU cores [1]
[2024-12-03 15:58:42,929][03345] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-03 15:58:42,929][03345] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-12-03 15:58:42,995][03345] Num visible devices: 1
[2024-12-03 15:58:43,033][03345] Starting seed is not provided
[2024-12-03 15:58:43,033][03345] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-03 15:58:43,033][03345] Initializing actor-critic model on device cuda:0
[2024-12-03 15:58:43,034][03345] RunningMeanStd input shape: (3, 72, 128)
[2024-12-03 15:58:43,036][03345] RunningMeanStd input shape: (1,)
[2024-12-03 15:58:43,052][03369] Worker 6 uses CPU cores [0]
[2024-12-03 15:58:43,080][03345] ConvEncoder: input_channels=3
[2024-12-03 15:58:43,176][03365] Worker 0 uses CPU cores [0]
[2024-12-03 15:58:43,419][03362] Worker 1 uses CPU cores [1]
[2024-12-03 15:58:43,431][03367] Worker 4 uses CPU cores [0]
[2024-12-03 15:58:43,446][03358] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-03 15:58:43,447][03358] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-12-03 15:58:43,476][03370] Worker 7 uses CPU cores [1]
[2024-12-03 15:58:43,488][03364] Worker 2 uses CPU cores [0]
[2024-12-03 15:58:43,502][03358] Num visible devices: 1
[2024-12-03 15:58:43,547][03368] Worker 5 uses CPU cores [1]
[2024-12-03 15:58:43,575][03345] Conv encoder output size: 512
[2024-12-03 15:58:43,576][03345] Policy head output size: 512
[2024-12-03 15:58:43,626][03345] Created Actor Critic model with architecture:
[2024-12-03 15:58:43,627][03345] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2024-12-03 15:58:43,706][01348] Heartbeat connected on Batcher_0
[2024-12-03 15:58:43,721][01348] Heartbeat connected on InferenceWorker_p0-w0
[2024-12-03 15:58:43,735][01348] Heartbeat connected on RolloutWorker_w1
[2024-12-03 15:58:43,741][01348] Heartbeat connected on RolloutWorker_w0
[2024-12-03 15:58:43,745][01348] Heartbeat connected on RolloutWorker_w2
[2024-12-03 15:58:43,749][01348] Heartbeat connected on RolloutWorker_w3
[2024-12-03 15:58:43,754][01348] Heartbeat connected on RolloutWorker_w4
[2024-12-03 15:58:43,758][01348] Heartbeat connected on RolloutWorker_w5
[2024-12-03 15:58:43,763][01348] Heartbeat connected on RolloutWorker_w6
[2024-12-03 15:58:43,768][01348] Heartbeat connected on RolloutWorker_w7
[2024-12-03 15:58:43,914][03345] Using optimizer <class 'torch.optim.adam.Adam'>
[2024-12-03 15:58:47,235][03345] No checkpoints found
[2024-12-03 15:58:47,235][03345] Did not load from checkpoint, starting from scratch!
[2024-12-03 15:58:47,236][03345] Initialized policy 0 weights for model version 0
[2024-12-03 15:58:47,239][03345] LearnerWorker_p0 finished initialization!
[2024-12-03 15:58:47,240][03345] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-12-03 15:58:47,240][01348] Heartbeat connected on LearnerWorker_p0
[2024-12-03 15:58:47,349][03358] RunningMeanStd input shape: (3, 72, 128)
[2024-12-03 15:58:47,350][03358] RunningMeanStd input shape: (1,)
[2024-12-03 15:58:47,362][03358] ConvEncoder: input_channels=3
[2024-12-03 15:58:47,466][03358] Conv encoder output size: 512
[2024-12-03 15:58:47,467][03358] Policy head output size: 512
[2024-12-03 15:58:47,517][01348] Inference worker 0-0 is ready!
[2024-12-03 15:58:47,518][01348] All inference workers are ready! Signal rollout workers to start!
[2024-12-03 15:58:47,713][03369] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,717][03367] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,722][03365] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,724][03364] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,731][03362] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,732][03370] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,726][03366] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:47,737][03368] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 15:58:48,770][03366] Decorrelating experience for 0 frames...
[2024-12-03 15:58:48,770][03368] Decorrelating experience for 0 frames...
[2024-12-03 15:58:49,386][03367] Decorrelating experience for 0 frames...
[2024-12-03 15:58:49,389][03365] Decorrelating experience for 0 frames...
[2024-12-03 15:58:49,393][03369] Decorrelating experience for 0 frames...
[2024-12-03 15:58:49,404][03364] Decorrelating experience for 0 frames...
[2024-12-03 15:58:49,533][03368] Decorrelating experience for 32 frames...
[2024-12-03 15:58:49,637][03362] Decorrelating experience for 0 frames...
[2024-12-03 15:58:50,314][03370] Decorrelating experience for 0 frames...
[2024-12-03 15:58:50,427][03362] Decorrelating experience for 32 frames...
[2024-12-03 15:58:50,708][03367] Decorrelating experience for 32 frames...
[2024-12-03 15:58:50,711][01348] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-03 15:58:50,726][03364] Decorrelating experience for 32 frames...
[2024-12-03 15:58:50,925][03366] Decorrelating experience for 32 frames...
[2024-12-03 15:58:51,074][03365] Decorrelating experience for 32 frames...
[2024-12-03 15:58:51,213][03369] Decorrelating experience for 32 frames...
[2024-12-03 15:58:51,665][03366] Decorrelating experience for 64 frames...
[2024-12-03 15:58:52,239][03368] Decorrelating experience for 64 frames...
[2024-12-03 15:58:53,101][03367] Decorrelating experience for 64 frames...
[2024-12-03 15:58:53,131][03364] Decorrelating experience for 64 frames...
[2024-12-03 15:58:53,309][03366] Decorrelating experience for 96 frames...
[2024-12-03 15:58:53,490][03365] Decorrelating experience for 64 frames...
[2024-12-03 15:58:53,605][03369] Decorrelating experience for 64 frames...
[2024-12-03 15:58:54,172][03370] Decorrelating experience for 32 frames...
[2024-12-03 15:58:54,262][03368] Decorrelating experience for 96 frames...
[2024-12-03 15:58:54,879][03364] Decorrelating experience for 96 frames...
[2024-12-03 15:58:55,317][03365] Decorrelating experience for 96 frames...
[2024-12-03 15:58:55,711][01348] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-03 15:58:55,742][03367] Decorrelating experience for 96 frames...
[2024-12-03 15:58:56,508][03369] Decorrelating experience for 96 frames...
[2024-12-03 15:58:57,811][03362] Decorrelating experience for 64 frames...
[2024-12-03 15:59:00,035][03345] Signal inference workers to stop experience collection...
[2024-12-03 15:59:00,053][03358] InferenceWorker_p0-w0: stopping experience collection
[2024-12-03 15:59:00,214][03370] Decorrelating experience for 64 frames...
[2024-12-03 15:59:00,291][03362] Decorrelating experience for 96 frames...
[2024-12-03 15:59:00,700][03370] Decorrelating experience for 96 frames...
[2024-12-03 15:59:00,711][01348] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 210.2. Samples: 2102. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-12-03 15:59:00,713][01348] Avg episode reward: [(0, '2.849')]
[2024-12-03 15:59:02,968][03345] Signal inference workers to resume experience collection...
[2024-12-03 15:59:02,970][03358] InferenceWorker_p0-w0: resuming experience collection
[2024-12-03 15:59:05,711][01348] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 16384. Throughput: 0: 331.5. Samples: 4972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:05,716][01348] Avg episode reward: [(0, '3.353')]
[2024-12-03 15:59:10,711][01348] Fps is (10 sec: 2867.2, 60 sec: 1433.6, 300 sec: 1433.6). Total num frames: 28672. Throughput: 0: 373.2. Samples: 7464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:10,714][01348] Avg episode reward: [(0, '3.628')]
[2024-12-03 15:59:15,711][01348] Fps is (10 sec: 2048.0, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 36864. Throughput: 0: 383.3. Samples: 9582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:15,713][01348] Avg episode reward: [(0, '3.821')]
[2024-12-03 15:59:16,582][03358] Updated weights for policy 0, policy_version 10 (0.0145)
[2024-12-03 15:59:20,711][01348] Fps is (10 sec: 2047.9, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 49152. Throughput: 0: 452.9. Samples: 13586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:20,717][01348] Avg episode reward: [(0, '4.272')]
[2024-12-03 15:59:25,711][01348] Fps is (10 sec: 2867.1, 60 sec: 1872.4, 300 sec: 1872.4). Total num frames: 65536. Throughput: 0: 455.5. Samples: 15944. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-03 15:59:25,713][01348] Avg episode reward: [(0, '4.266')]
[2024-12-03 15:59:30,712][01348] Fps is (10 sec: 2457.5, 60 sec: 1843.2, 300 sec: 1843.2). Total num frames: 73728. Throughput: 0: 472.5. Samples: 18902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:30,714][01348] Avg episode reward: [(0, '4.310')]
[2024-12-03 15:59:31,949][03358] Updated weights for policy 0, policy_version 20 (0.0023)
[2024-12-03 15:59:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 98304. Throughput: 0: 568.6. Samples: 25586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:35,718][01348] Avg episode reward: [(0, '4.354')]
[2024-12-03 15:59:40,325][03358] Updated weights for policy 0, policy_version 30 (0.0019)
[2024-12-03 15:59:40,711][01348] Fps is (10 sec: 4915.6, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 122880. Throughput: 0: 648.8. Samples: 29194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:40,713][01348] Avg episode reward: [(0, '4.434')]
[2024-12-03 15:59:40,715][03345] Saving new best policy, reward=4.434!
[2024-12-03 15:59:45,711][01348] Fps is (10 sec: 4096.1, 60 sec: 2532.1, 300 sec: 2532.1). Total num frames: 139264. Throughput: 0: 720.1. Samples: 34508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 15:59:45,716][01348] Avg episode reward: [(0, '4.362')]
[2024-12-03 15:59:50,715][01348] Fps is (10 sec: 3685.0, 60 sec: 2662.2, 300 sec: 2662.2). Total num frames: 159744. Throughput: 0: 788.4. Samples: 40452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 15:59:50,718][01348] Avg episode reward: [(0, '4.457')]
[2024-12-03 15:59:50,727][03345] Saving new best policy, reward=4.457!
[2024-12-03 15:59:51,599][03358] Updated weights for policy 0, policy_version 40 (0.0034)
[2024-12-03 15:59:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3003.7, 300 sec: 2772.7). Total num frames: 180224. Throughput: 0: 808.2. Samples: 43832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 15:59:55,718][01348] Avg episode reward: [(0, '4.614')]
[2024-12-03 15:59:55,726][03345] Saving new best policy, reward=4.614!
[2024-12-03 16:00:00,713][01348] Fps is (10 sec: 4096.9, 60 sec: 3345.0, 300 sec: 2867.1). Total num frames: 200704. Throughput: 0: 896.9. Samples: 49946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:00:00,718][01348] Avg episode reward: [(0, '4.498')]
[2024-12-03 16:00:02,073][03358] Updated weights for policy 0, policy_version 50 (0.0019)
[2024-12-03 16:00:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2894.5). Total num frames: 217088. Throughput: 0: 917.0. Samples: 54850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:00:05,713][01348] Avg episode reward: [(0, '4.317')]
[2024-12-03 16:00:10,711][01348] Fps is (10 sec: 3687.0, 60 sec: 3481.6, 300 sec: 2969.6). Total num frames: 237568. Throughput: 0: 940.0. Samples: 58242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:00:10,713][01348] Avg episode reward: [(0, '4.423')]
[2024-12-03 16:00:11,774][03358] Updated weights for policy 0, policy_version 60 (0.0014)
[2024-12-03 16:00:15,714][01348] Fps is (10 sec: 4094.8, 60 sec: 3686.2, 300 sec: 3035.8). Total num frames: 258048. Throughput: 0: 1026.4. Samples: 65094. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:00:15,719][01348] Avg episode reward: [(0, '4.525')]
[2024-12-03 16:00:15,733][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth...
[2024-12-03 16:00:20,712][01348] Fps is (10 sec: 3686.1, 60 sec: 3754.6, 300 sec: 3049.2). Total num frames: 274432. Throughput: 0: 968.7. Samples: 69178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:00:20,714][01348] Avg episode reward: [(0, '4.532')]
[2024-12-03 16:00:23,395][03358] Updated weights for policy 0, policy_version 70 (0.0023)
[2024-12-03 16:00:25,711][01348] Fps is (10 sec: 3687.5, 60 sec: 3822.9, 300 sec: 3104.3). Total num frames: 294912. Throughput: 0: 962.0. Samples: 72482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:00:25,718][01348] Avg episode reward: [(0, '4.475')]
[2024-12-03 16:00:30,711][01348] Fps is (10 sec: 4505.9, 60 sec: 4096.1, 300 sec: 3194.9). Total num frames: 319488. Throughput: 0: 998.9. Samples: 79460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:00:30,718][01348] Avg episode reward: [(0, '4.597')]
[2024-12-03 16:00:33,900][03358] Updated weights for policy 0, policy_version 80 (0.0019)
[2024-12-03 16:00:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3159.8). Total num frames: 331776. Throughput: 0: 960.1. Samples: 83654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:00:35,717][01348] Avg episode reward: [(0, '4.568')]
[2024-12-03 16:00:40,711][01348] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3165.1). Total num frames: 348160. Throughput: 0: 941.7. Samples: 86208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:00:40,713][01348] Avg episode reward: [(0, '4.571')]
[2024-12-03 16:00:44,261][03358] Updated weights for policy 0, policy_version 90 (0.0013)
[2024-12-03 16:00:45,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3241.2). Total num frames: 372736. Throughput: 0: 957.7. Samples: 93040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:00:45,713][01348] Avg episode reward: [(0, '4.374')]
[2024-12-03 16:00:50,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.4, 300 sec: 3276.8). Total num frames: 393216. Throughput: 0: 973.7. Samples: 98666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:00:50,714][01348] Avg episode reward: [(0, '4.257')]
[2024-12-03 16:00:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3244.0). Total num frames: 405504. Throughput: 0: 942.8. Samples: 100668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:00:55,717][01348] Avg episode reward: [(0, '4.375')]
[2024-12-03 16:00:55,846][03358] Updated weights for policy 0, policy_version 100 (0.0018)
[2024-12-03 16:01:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3308.3). Total num frames: 430080. Throughput: 0: 931.7. Samples: 107016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:01:00,717][01348] Avg episode reward: [(0, '4.532')]
[2024-12-03 16:01:05,714][01348] Fps is (10 sec: 4094.9, 60 sec: 3822.8, 300 sec: 3307.1). Total num frames: 446464. Throughput: 0: 977.2. Samples: 113152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:01:05,716][01348] Avg episode reward: [(0, '4.487')]
[2024-12-03 16:01:05,747][03358] Updated weights for policy 0, policy_version 110 (0.0029)
[2024-12-03 16:01:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3306.1). Total num frames: 462848. Throughput: 0: 946.9. Samples: 115092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:01:10,718][01348] Avg episode reward: [(0, '4.430')]
[2024-12-03 16:01:15,711][01348] Fps is (10 sec: 3687.4, 60 sec: 3754.8, 300 sec: 3333.3). Total num frames: 483328. Throughput: 0: 915.9. Samples: 120676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:01:15,718][01348] Avg episode reward: [(0, '4.415')]
[2024-12-03 16:01:16,950][03358] Updated weights for policy 0, policy_version 120 (0.0026)
[2024-12-03 16:01:20,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 975.8. Samples: 127566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:01:20,714][01348] Avg episode reward: [(0, '4.676')]
[2024-12-03 16:01:20,718][03345] Saving new best policy, reward=4.676!
[2024-12-03 16:01:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3356.1). Total num frames: 520192. Throughput: 0: 970.2. Samples: 129868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:01:25,720][01348] Avg episode reward: [(0, '4.588')]
[2024-12-03 16:01:28,594][03358] Updated weights for policy 0, policy_version 130 (0.0027)
[2024-12-03 16:01:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3379.2). Total num frames: 540672. Throughput: 0: 927.3. Samples: 134770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:01:30,713][01348] Avg episode reward: [(0, '4.517')]
[2024-12-03 16:01:35,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3425.7). Total num frames: 565248. Throughput: 0: 959.0. Samples: 141820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:01:35,716][01348] Avg episode reward: [(0, '4.672')]
[2024-12-03 16:01:37,136][03358] Updated weights for policy 0, policy_version 140 (0.0019)
[2024-12-03 16:01:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 984.0. Samples: 144946. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-03 16:01:40,716][01348] Avg episode reward: [(0, '4.429')]
[2024-12-03 16:01:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3417.2). Total num frames: 598016. Throughput: 0: 936.8. Samples: 149170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:01:45,717][01348] Avg episode reward: [(0, '4.266')]
[2024-12-03 16:01:48,831][03358] Updated weights for policy 0, policy_version 150 (0.0031)
[2024-12-03 16:01:50,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3436.1). Total num frames: 618496. Throughput: 0: 952.6. Samples: 156016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:01:50,715][01348] Avg episode reward: [(0, '4.423')]
[2024-12-03 16:01:55,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3476.1). Total num frames: 643072. Throughput: 0: 986.2. Samples: 159472. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:01:55,716][01348] Avg episode reward: [(0, '4.545')]
[2024-12-03 16:01:59,561][03358] Updated weights for policy 0, policy_version 160 (0.0026)
[2024-12-03 16:02:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3449.3). Total num frames: 655360. Throughput: 0: 966.0. Samples: 164146. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:02:00,715][01348] Avg episode reward: [(0, '4.551')]
[2024-12-03 16:02:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3465.8). Total num frames: 675840. Throughput: 0: 946.0. Samples: 170138. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-03 16:02:05,718][01348] Avg episode reward: [(0, '4.350')]
[2024-12-03 16:02:09,455][03358] Updated weights for policy 0, policy_version 170 (0.0025)
[2024-12-03 16:02:10,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3502.1). Total num frames: 700416. Throughput: 0: 973.2. Samples: 173660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:10,716][01348] Avg episode reward: [(0, '4.372')]
[2024-12-03 16:02:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3476.6). Total num frames: 712704. Throughput: 0: 980.1. Samples: 178874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:15,713][01348] Avg episode reward: [(0, '4.462')]
[2024-12-03 16:02:15,727][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000174_712704.pth...
[2024-12-03 16:02:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3491.4). Total num frames: 733184. Throughput: 0: 935.0. Samples: 183894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:20,715][01348] Avg episode reward: [(0, '4.462')]
[2024-12-03 16:02:21,487][03358] Updated weights for policy 0, policy_version 180 (0.0024)
[2024-12-03 16:02:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3505.4). Total num frames: 753664. Throughput: 0: 940.1. Samples: 187252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:25,715][01348] Avg episode reward: [(0, '4.598')]
[2024-12-03 16:02:30,711][01348] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3518.8). Total num frames: 774144. Throughput: 0: 983.1. Samples: 193410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:02:30,720][01348] Avg episode reward: [(0, '4.729')]
[2024-12-03 16:02:30,723][03345] Saving new best policy, reward=4.729!
[2024-12-03 16:02:32,209][03358] Updated weights for policy 0, policy_version 190 (0.0022)
[2024-12-03 16:02:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3495.3). Total num frames: 786432. Throughput: 0: 927.5. Samples: 197754. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:35,713][01348] Avg episode reward: [(0, '4.672')]
[2024-12-03 16:02:40,718][01348] Fps is (10 sec: 3686.6, 60 sec: 3822.9, 300 sec: 3526.1). Total num frames: 811008. Throughput: 0: 928.9. Samples: 201272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:40,722][01348] Avg episode reward: [(0, '4.473')]
[2024-12-03 16:02:42,341][03358] Updated weights for policy 0, policy_version 200 (0.0028)
[2024-12-03 16:02:45,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3538.2). Total num frames: 831488. Throughput: 0: 973.2. Samples: 207940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:45,718][01348] Avg episode reward: [(0, '4.409')]
[2024-12-03 16:02:50,714][01348] Fps is (10 sec: 3275.9, 60 sec: 3754.5, 300 sec: 3515.7). Total num frames: 843776. Throughput: 0: 937.1. Samples: 212312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:02:50,721][01348] Avg episode reward: [(0, '4.589')]
[2024-12-03 16:02:54,174][03358] Updated weights for policy 0, policy_version 210 (0.0029)
[2024-12-03 16:02:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3527.6). Total num frames: 864256. Throughput: 0: 914.8. Samples: 214826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:02:55,713][01348] Avg episode reward: [(0, '4.468')]
[2024-12-03 16:03:00,711][01348] Fps is (10 sec: 4506.8, 60 sec: 3891.2, 300 sec: 3555.3). Total num frames: 888832. Throughput: 0: 953.5. Samples: 221780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:03:00,714][01348] Avg episode reward: [(0, '4.392')]
[2024-12-03 16:03:03,693][03358] Updated weights for policy 0, policy_version 220 (0.0023)
[2024-12-03 16:03:05,718][01348] Fps is (10 sec: 4093.2, 60 sec: 3822.5, 300 sec: 3549.8). Total num frames: 905216. Throughput: 0: 959.7. Samples: 227088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:03:05,720][01348] Avg episode reward: [(0, '4.463')]
[2024-12-03 16:03:10,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3544.6). Total num frames: 921600. Throughput: 0: 932.4. Samples: 229212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:03:10,717][01348] Avg episode reward: [(0, '4.426')]
[2024-12-03 16:03:14,722][03358] Updated weights for policy 0, policy_version 230 (0.0028)
[2024-12-03 16:03:15,711][01348] Fps is (10 sec: 4098.7, 60 sec: 3891.2, 300 sec: 3570.5). Total num frames: 946176. Throughput: 0: 945.5. Samples: 235956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:03:15,716][01348] Avg episode reward: [(0, '4.626')]
[2024-12-03 16:03:20,713][01348] Fps is (10 sec: 4504.8, 60 sec: 3891.1, 300 sec: 3580.2). Total num frames: 966656. Throughput: 0: 986.9. Samples: 242168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:03:20,715][01348] Avg episode reward: [(0, '4.810')]
[2024-12-03 16:03:20,721][03345] Saving new best policy, reward=4.810!
[2024-12-03 16:03:25,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3559.8). Total num frames: 978944. Throughput: 0: 954.2. Samples: 244212. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:03:25,715][01348] Avg episode reward: [(0, '4.925')]
[2024-12-03 16:03:25,724][03345] Saving new best policy, reward=4.925!
[2024-12-03 16:03:26,146][03358] Updated weights for policy 0, policy_version 240 (0.0024)
[2024-12-03 16:03:30,711][01348] Fps is (10 sec: 3687.0, 60 sec: 3823.0, 300 sec: 3584.0). Total num frames: 1003520. Throughput: 0: 938.7. Samples: 250182. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:03:30,716][01348] Avg episode reward: [(0, '4.760')]
[2024-12-03 16:03:35,052][03358] Updated weights for policy 0, policy_version 250 (0.0017)
[2024-12-03 16:03:35,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3593.0). Total num frames: 1024000. Throughput: 0: 994.7. Samples: 257072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:03:35,714][01348] Avg episode reward: [(0, '4.525')]
[2024-12-03 16:03:40,712][01348] Fps is (10 sec: 3685.9, 60 sec: 3822.8, 300 sec: 3587.5). Total num frames: 1040384. Throughput: 0: 985.3. Samples: 259166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:03:40,716][01348] Avg episode reward: [(0, '4.943')]
[2024-12-03 16:03:40,722][03345] Saving new best policy, reward=4.943!
[2024-12-03 16:03:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 1056768. Throughput: 0: 944.9. Samples: 264302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:03:45,715][01348] Avg episode reward: [(0, '5.034')]
[2024-12-03 16:03:45,726][03345] Saving new best policy, reward=5.034!
[2024-12-03 16:03:46,829][03358] Updated weights for policy 0, policy_version 260 (0.0032)
[2024-12-03 16:03:50,711][01348] Fps is (10 sec: 4096.6, 60 sec: 3959.7, 300 sec: 3665.6). Total num frames: 1081344. Throughput: 0: 980.6. Samples: 271210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:03:50,713][01348] Avg episode reward: [(0, '4.828')]
[2024-12-03 16:03:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 1097728. Throughput: 0: 999.5. Samples: 274188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:03:55,717][01348] Avg episode reward: [(0, '4.649')]
[2024-12-03 16:03:57,654][03358] Updated weights for policy 0, policy_version 270 (0.0026)
[2024-12-03 16:04:00,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 1114112. Throughput: 0: 945.3. Samples: 278494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:04:00,713][01348] Avg episode reward: [(0, '4.415')]
[2024-12-03 16:04:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.6, 300 sec: 3762.8). Total num frames: 1138688. Throughput: 0: 962.3. Samples: 285470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:05,718][01348] Avg episode reward: [(0, '4.447')]
[2024-12-03 16:04:07,084][03358] Updated weights for policy 0, policy_version 280 (0.0029)
[2024-12-03 16:04:10,711][01348] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 1159168. Throughput: 0: 994.1. Samples: 288948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:10,718][01348] Avg episode reward: [(0, '4.502')]
[2024-12-03 16:04:15,717][01348] Fps is (10 sec: 3274.9, 60 sec: 3754.3, 300 sec: 3804.3). Total num frames: 1171456. Throughput: 0: 957.7. Samples: 293284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:04:15,719][01348] Avg episode reward: [(0, '4.477')]
[2024-12-03 16:04:15,735][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000286_1171456.pth...
[2024-12-03 16:04:15,933][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth
[2024-12-03 16:04:18,966][03358] Updated weights for policy 0, policy_version 290 (0.0022)
[2024-12-03 16:04:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3818.3). Total num frames: 1191936. Throughput: 0: 936.0. Samples: 299192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:20,718][01348] Avg episode reward: [(0, '4.658')]
[2024-12-03 16:04:25,713][01348] Fps is (10 sec: 4507.4, 60 sec: 3959.3, 300 sec: 3873.8). Total num frames: 1216512. Throughput: 0: 963.4. Samples: 302520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:25,721][01348] Avg episode reward: [(0, '4.645')]
[2024-12-03 16:04:29,773][03358] Updated weights for policy 0, policy_version 300 (0.0026)
[2024-12-03 16:04:30,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 1228800. Throughput: 0: 961.8. Samples: 307582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:30,714][01348] Avg episode reward: [(0, '4.726')]
[2024-12-03 16:04:35,711][01348] Fps is (10 sec: 3277.4, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 1249280. Throughput: 0: 923.6. Samples: 312772. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:04:35,718][01348] Avg episode reward: [(0, '5.005')]
[2024-12-03 16:04:40,414][03358] Updated weights for policy 0, policy_version 310 (0.0015)
[2024-12-03 16:04:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 1269760. Throughput: 0: 928.8. Samples: 315984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:04:40,717][01348] Avg episode reward: [(0, '5.099')]
[2024-12-03 16:04:40,722][03345] Saving new best policy, reward=5.099!
[2024-12-03 16:04:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 1286144. Throughput: 0: 959.3. Samples: 321662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:04:45,713][01348] Avg episode reward: [(0, '4.815')]
[2024-12-03 16:04:50,713][01348] Fps is (10 sec: 2866.7, 60 sec: 3618.0, 300 sec: 3790.5). Total num frames: 1298432. Throughput: 0: 900.5. Samples: 325996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:04:50,715][01348] Avg episode reward: [(0, '4.752')]
[2024-12-03 16:04:52,643][03358] Updated weights for policy 0, policy_version 320 (0.0014)
[2024-12-03 16:04:55,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1323008. Throughput: 0: 895.2. Samples: 329230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:04:55,713][01348] Avg episode reward: [(0, '4.712')]
[2024-12-03 16:05:00,711][01348] Fps is (10 sec: 4506.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1343488. Throughput: 0: 943.4. Samples: 335732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:00,716][01348] Avg episode reward: [(0, '4.475')]
[2024-12-03 16:05:03,683][03358] Updated weights for policy 0, policy_version 330 (0.0035)
[2024-12-03 16:05:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 1355776. Throughput: 0: 901.1. Samples: 339742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:05,719][01348] Avg episode reward: [(0, '4.369')]
[2024-12-03 16:05:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3790.6). Total num frames: 1376256. Throughput: 0: 892.8. Samples: 342696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:10,718][01348] Avg episode reward: [(0, '4.464')]
[2024-12-03 16:05:13,984][03358] Updated weights for policy 0, policy_version 340 (0.0045)
[2024-12-03 16:05:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3755.0, 300 sec: 3804.4). Total num frames: 1396736. Throughput: 0: 926.7. Samples: 349282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:15,717][01348] Avg episode reward: [(0, '4.620')]
[2024-12-03 16:05:20,712][01348] Fps is (10 sec: 3686.0, 60 sec: 3686.3, 300 sec: 3790.5). Total num frames: 1413120. Throughput: 0: 914.0. Samples: 353902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:05:20,714][01348] Avg episode reward: [(0, '4.757')]
[2024-12-03 16:05:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3762.8). Total num frames: 1429504. Throughput: 0: 894.4. Samples: 356234. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:05:25,713][01348] Avg episode reward: [(0, '4.763')]
[2024-12-03 16:05:26,033][03358] Updated weights for policy 0, policy_version 350 (0.0021)
[2024-12-03 16:05:30,711][01348] Fps is (10 sec: 4096.5, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 1454080. Throughput: 0: 915.6. Samples: 362864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:05:30,714][01348] Avg episode reward: [(0, '4.618')]
[2024-12-03 16:05:35,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1470464. Throughput: 0: 946.9. Samples: 368606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:05:35,713][01348] Avg episode reward: [(0, '4.577')]
[2024-12-03 16:05:36,254][03358] Updated weights for policy 0, policy_version 360 (0.0028)
[2024-12-03 16:05:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 1486848. Throughput: 0: 924.5. Samples: 370832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:05:40,713][01348] Avg episode reward: [(0, '4.336')]
[2024-12-03 16:05:45,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 1511424. Throughput: 0: 927.6. Samples: 377474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:45,718][01348] Avg episode reward: [(0, '4.283')]
[2024-12-03 16:05:46,188][03358] Updated weights for policy 0, policy_version 370 (0.0041)
[2024-12-03 16:05:50,715][01348] Fps is (10 sec: 4503.9, 60 sec: 3891.1, 300 sec: 3818.3). Total num frames: 1531904. Throughput: 0: 991.1. Samples: 384344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:05:50,726][01348] Avg episode reward: [(0, '4.428')]
[2024-12-03 16:05:55,713][01348] Fps is (10 sec: 3685.7, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 1548288. Throughput: 0: 972.7. Samples: 386468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:05:55,719][01348] Avg episode reward: [(0, '4.583')]
[2024-12-03 16:05:57,367][03358] Updated weights for policy 0, policy_version 380 (0.0016)
[2024-12-03 16:06:00,711][01348] Fps is (10 sec: 3687.8, 60 sec: 3754.7, 300 sec: 3804.5). Total num frames: 1568768. Throughput: 0: 953.9. Samples: 392206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:06:00,717][01348] Avg episode reward: [(0, '4.759')]
[2024-12-03 16:06:05,714][01348] Fps is (10 sec: 4504.9, 60 sec: 3959.2, 300 sec: 3832.1). Total num frames: 1593344. Throughput: 0: 1007.4. Samples: 399236. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:06:05,719][01348] Avg episode reward: [(0, '4.725')]
[2024-12-03 16:06:06,020][03358] Updated weights for policy 0, policy_version 390 (0.0020)
[2024-12-03 16:06:10,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1609728. Throughput: 0: 1008.8. Samples: 401632. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:06:10,713][01348] Avg episode reward: [(0, '4.739')]
[2024-12-03 16:06:15,711][01348] Fps is (10 sec: 3277.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1626112. Throughput: 0: 968.4. Samples: 406440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:06:15,713][01348] Avg episode reward: [(0, '4.650')]
[2024-12-03 16:06:15,729][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth...
[2024-12-03 16:06:15,852][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000174_712704.pth
[2024-12-03 16:06:17,876][03358] Updated weights for policy 0, policy_version 400 (0.0030)
[2024-12-03 16:06:20,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1650688. Throughput: 0: 994.6. Samples: 413364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:06:20,717][01348] Avg episode reward: [(0, '4.481')]
[2024-12-03 16:06:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 1667072. Throughput: 0: 1017.3. Samples: 416612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:06:25,715][01348] Avg episode reward: [(0, '4.472')]
[2024-12-03 16:06:29,131][03358] Updated weights for policy 0, policy_version 410 (0.0028)
[2024-12-03 16:06:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1683456. Throughput: 0: 962.6. Samples: 420790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:06:30,716][01348] Avg episode reward: [(0, '4.612')]
[2024-12-03 16:06:35,711][01348] Fps is (10 sec: 4095.8, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 1708032. Throughput: 0: 962.9. Samples: 427672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:06:35,716][01348] Avg episode reward: [(0, '4.705')]
[2024-12-03 16:06:38,184][03358] Updated weights for policy 0, policy_version 420 (0.0019)
[2024-12-03 16:06:40,718][01348] Fps is (10 sec: 4502.5, 60 sec: 4027.3, 300 sec: 3832.1). Total num frames: 1728512. Throughput: 0: 993.3. Samples: 431172. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:06:40,720][01348] Avg episode reward: [(0, '4.944')]
[2024-12-03 16:06:45,716][01348] Fps is (10 sec: 3684.8, 60 sec: 3890.9, 300 sec: 3818.2). Total num frames: 1744896. Throughput: 0: 971.6. Samples: 435934. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:06:45,719][01348] Avg episode reward: [(0, '4.923')]
[2024-12-03 16:06:49,642][03358] Updated weights for policy 0, policy_version 430 (0.0019)
[2024-12-03 16:06:50,711][01348] Fps is (10 sec: 3689.0, 60 sec: 3891.4, 300 sec: 3804.4). Total num frames: 1765376. Throughput: 0: 949.3. Samples: 441952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:06:50,718][01348] Avg episode reward: [(0, '4.851')]
[2024-12-03 16:06:55,711][01348] Fps is (10 sec: 4098.0, 60 sec: 3959.6, 300 sec: 3832.2). Total num frames: 1785856. Throughput: 0: 974.4. Samples: 445482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:06:55,713][01348] Avg episode reward: [(0, '5.013')]
[2024-12-03 16:06:59,430][03358] Updated weights for policy 0, policy_version 440 (0.0023)
[2024-12-03 16:07:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1802240. Throughput: 0: 993.2. Samples: 451134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:07:00,717][01348] Avg episode reward: [(0, '4.973')]
[2024-12-03 16:07:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3804.4). Total num frames: 1822720. Throughput: 0: 956.1. Samples: 456388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:07:05,718][01348] Avg episode reward: [(0, '5.003')]
[2024-12-03 16:07:09,700][03358] Updated weights for policy 0, policy_version 450 (0.0030)
[2024-12-03 16:07:10,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1847296. Throughput: 0: 962.7. Samples: 459932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:07:10,717][01348] Avg episode reward: [(0, '5.051')]
[2024-12-03 16:07:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1863680. Throughput: 0: 1007.8. Samples: 466140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:07:15,717][01348] Avg episode reward: [(0, '4.804')]
[2024-12-03 16:07:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1880064. Throughput: 0: 952.9. Samples: 470552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:07:20,715][01348] Avg episode reward: [(0, '4.966')]
[2024-12-03 16:07:21,379][03358] Updated weights for policy 0, policy_version 460 (0.0035)
[2024-12-03 16:07:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1900544. Throughput: 0: 951.8. Samples: 473996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:07:25,713][01348] Avg episode reward: [(0, '5.047')]
[2024-12-03 16:07:30,461][03358] Updated weights for policy 0, policy_version 470 (0.0030)
[2024-12-03 16:07:30,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 1925120. Throughput: 0: 998.9. Samples: 480878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:07:30,716][01348] Avg episode reward: [(0, '5.265')]
[2024-12-03 16:07:30,720][03345] Saving new best policy, reward=5.265!
[2024-12-03 16:07:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 1937408. Throughput: 0: 961.1. Samples: 485200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:07:35,717][01348] Avg episode reward: [(0, '5.449')]
[2024-12-03 16:07:35,728][03345] Saving new best policy, reward=5.449!
[2024-12-03 16:07:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3823.4, 300 sec: 3818.3). Total num frames: 1957888. Throughput: 0: 945.1. Samples: 488010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:07:40,717][01348] Avg episode reward: [(0, '5.323')]
[2024-12-03 16:07:41,915][03358] Updated weights for policy 0, policy_version 480 (0.0031)
[2024-12-03 16:07:45,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.8, 300 sec: 3860.0). Total num frames: 1982464. Throughput: 0: 975.0. Samples: 495008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:07:45,716][01348] Avg episode reward: [(0, '5.430')]
[2024-12-03 16:07:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1998848. Throughput: 0: 973.5. Samples: 500196. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:07:50,715][01348] Avg episode reward: [(0, '5.499')]
[2024-12-03 16:07:50,718][03345] Saving new best policy, reward=5.499!
[2024-12-03 16:07:53,520][03358] Updated weights for policy 0, policy_version 490 (0.0026)
[2024-12-03 16:07:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2015232. Throughput: 0: 942.8. Samples: 502358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:07:55,720][01348] Avg episode reward: [(0, '5.589')]
[2024-12-03 16:07:55,728][03345] Saving new best policy, reward=5.589!
[2024-12-03 16:08:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3846.2). Total num frames: 2039808. Throughput: 0: 961.5. Samples: 509408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:00,714][01348] Avg episode reward: [(0, '5.597')]
[2024-12-03 16:08:00,718][03345] Saving new best policy, reward=5.597!
[2024-12-03 16:08:02,576][03358] Updated weights for policy 0, policy_version 500 (0.0014)
[2024-12-03 16:08:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2056192. Throughput: 0: 987.8. Samples: 515002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:05,715][01348] Avg episode reward: [(0, '5.494')]
[2024-12-03 16:08:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2072576. Throughput: 0: 959.0. Samples: 517150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:08:10,713][01348] Avg episode reward: [(0, '5.520')]
[2024-12-03 16:08:14,130][03358] Updated weights for policy 0, policy_version 510 (0.0028)
[2024-12-03 16:08:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2093056. Throughput: 0: 943.8. Samples: 523350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:15,720][01348] Avg episode reward: [(0, '5.715')]
[2024-12-03 16:08:15,731][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000511_2093056.pth...
[2024-12-03 16:08:15,850][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000286_1171456.pth
[2024-12-03 16:08:15,864][03345] Saving new best policy, reward=5.715!
[2024-12-03 16:08:20,713][01348] Fps is (10 sec: 4504.6, 60 sec: 3959.3, 300 sec: 3859.9). Total num frames: 2117632. Throughput: 0: 993.4. Samples: 529904. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:08:20,718][01348] Avg episode reward: [(0, '5.670')]
[2024-12-03 16:08:24,884][03358] Updated weights for policy 0, policy_version 520 (0.0028)
[2024-12-03 16:08:25,712][01348] Fps is (10 sec: 3686.1, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2129920. Throughput: 0: 977.1. Samples: 531980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:08:25,719][01348] Avg episode reward: [(0, '5.620')]
[2024-12-03 16:08:30,711][01348] Fps is (10 sec: 3277.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2150400. Throughput: 0: 945.4. Samples: 537552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:30,714][01348] Avg episode reward: [(0, '6.085')]
[2024-12-03 16:08:30,716][03345] Saving new best policy, reward=6.085!
[2024-12-03 16:08:34,568][03358] Updated weights for policy 0, policy_version 530 (0.0023)
[2024-12-03 16:08:35,711][01348] Fps is (10 sec: 4505.9, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 2174976. Throughput: 0: 982.7. Samples: 544416. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:08:35,713][01348] Avg episode reward: [(0, '6.367')]
[2024-12-03 16:08:35,721][03345] Saving new best policy, reward=6.367!
[2024-12-03 16:08:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2191360. Throughput: 0: 992.4. Samples: 547018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:40,720][01348] Avg episode reward: [(0, '6.150')]
[2024-12-03 16:08:45,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2207744. Throughput: 0: 936.3. Samples: 551540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:08:45,713][01348] Avg episode reward: [(0, '6.525')]
[2024-12-03 16:08:45,721][03345] Saving new best policy, reward=6.525!
[2024-12-03 16:08:46,351][03358] Updated weights for policy 0, policy_version 540 (0.0019)
[2024-12-03 16:08:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2232320. Throughput: 0: 965.0. Samples: 558428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:08:50,715][01348] Avg episode reward: [(0, '6.512')]
[2024-12-03 16:08:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2248704. Throughput: 0: 993.2. Samples: 561842. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-12-03 16:08:55,715][01348] Avg episode reward: [(0, '6.665')]
[2024-12-03 16:08:55,723][03345] Saving new best policy, reward=6.665!
[2024-12-03 16:08:56,326][03358] Updated weights for policy 0, policy_version 550 (0.0032)
[2024-12-03 16:09:00,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 2265088. Throughput: 0: 948.1. Samples: 566014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:09:00,718][01348] Avg episode reward: [(0, '6.625')]
[2024-12-03 16:09:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 2285568. Throughput: 0: 953.1. Samples: 572792. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:09:05,717][01348] Avg episode reward: [(0, '6.972')]
[2024-12-03 16:09:05,767][03345] Saving new best policy, reward=6.972!
[2024-12-03 16:09:06,587][03358] Updated weights for policy 0, policy_version 560 (0.0048)
[2024-12-03 16:09:10,711][01348] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2310144. Throughput: 0: 983.6. Samples: 576240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:09:10,713][01348] Avg episode reward: [(0, '7.352')]
[2024-12-03 16:09:10,715][03345] Saving new best policy, reward=7.352!
[2024-12-03 16:09:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2322432. Throughput: 0: 964.9. Samples: 580974. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:09:15,716][01348] Avg episode reward: [(0, '7.301')]
[2024-12-03 16:09:18,181][03358] Updated weights for policy 0, policy_version 570 (0.0028)
[2024-12-03 16:09:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3832.2). Total num frames: 2347008. Throughput: 0: 947.2. Samples: 587040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:09:20,713][01348] Avg episode reward: [(0, '7.346')]
[2024-12-03 16:09:25,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2367488. Throughput: 0: 965.9. Samples: 590482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:09:25,717][01348] Avg episode reward: [(0, '7.831')]
[2024-12-03 16:09:25,730][03345] Saving new best policy, reward=7.831!
[2024-12-03 16:09:27,194][03358] Updated weights for policy 0, policy_version 580 (0.0038)
[2024-12-03 16:09:30,712][01348] Fps is (10 sec: 3686.1, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2383872. Throughput: 0: 991.0. Samples: 596138. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:09:30,714][01348] Avg episode reward: [(0, '8.640')]
[2024-12-03 16:09:30,719][03345] Saving new best policy, reward=8.640!
[2024-12-03 16:09:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 2400256. Throughput: 0: 952.1. Samples: 601272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:09:35,715][01348] Avg episode reward: [(0, '9.161')]
[2024-12-03 16:09:35,722][03345] Saving new best policy, reward=9.161!
[2024-12-03 16:09:38,411][03358] Updated weights for policy 0, policy_version 590 (0.0023)
[2024-12-03 16:09:40,711][01348] Fps is (10 sec: 4096.3, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2424832. Throughput: 0: 952.8. Samples: 604718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:09:40,721][01348] Avg episode reward: [(0, '9.322')]
[2024-12-03 16:09:40,726][03345] Saving new best policy, reward=9.322!
[2024-12-03 16:09:45,713][01348] Fps is (10 sec: 4504.8, 60 sec: 3959.3, 300 sec: 3887.7). Total num frames: 2445312. Throughput: 0: 1004.1. Samples: 611202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:09:45,718][01348] Avg episode reward: [(0, '8.700')]
[2024-12-03 16:09:50,004][03358] Updated weights for policy 0, policy_version 600 (0.0019)
[2024-12-03 16:09:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2457600. Throughput: 0: 950.8. Samples: 615576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:09:50,718][01348] Avg episode reward: [(0, '8.748')]
[2024-12-03 16:09:55,711][01348] Fps is (10 sec: 3687.1, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2482176. Throughput: 0: 951.8. Samples: 619070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:09:55,713][01348] Avg episode reward: [(0, '8.732')]
[2024-12-03 16:09:58,744][03358] Updated weights for policy 0, policy_version 610 (0.0017)
[2024-12-03 16:10:00,711][01348] Fps is (10 sec: 4915.2, 60 sec: 4027.8, 300 sec: 3901.6). Total num frames: 2506752. Throughput: 0: 1003.8. Samples: 626144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:00,713][01348] Avg episode reward: [(0, '9.432')]
[2024-12-03 16:10:00,715][03345] Saving new best policy, reward=9.432!
[2024-12-03 16:10:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2519040. Throughput: 0: 969.5. Samples: 630666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:05,721][01348] Avg episode reward: [(0, '10.043')]
[2024-12-03 16:10:05,733][03345] Saving new best policy, reward=10.043!
[2024-12-03 16:10:10,287][03358] Updated weights for policy 0, policy_version 620 (0.0060)
[2024-12-03 16:10:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 2539520. Throughput: 0: 953.7. Samples: 633400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:10,718][01348] Avg episode reward: [(0, '9.941')]
[2024-12-03 16:10:15,711][01348] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 2564096. Throughput: 0: 986.1. Samples: 640514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:10:15,714][01348] Avg episode reward: [(0, '10.427')]
[2024-12-03 16:10:15,722][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000626_2564096.pth...
[2024-12-03 16:10:15,841][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth
[2024-12-03 16:10:15,853][03345] Saving new best policy, reward=10.427!
[2024-12-03 16:10:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 2576384. Throughput: 0: 985.2. Samples: 645606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:10:20,718][01348] Avg episode reward: [(0, '10.228')]
[2024-12-03 16:10:20,744][03358] Updated weights for policy 0, policy_version 630 (0.0023)
[2024-12-03 16:10:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 2596864. Throughput: 0: 955.9. Samples: 647732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:25,714][01348] Avg episode reward: [(0, '10.329')]
[2024-12-03 16:10:30,601][03358] Updated weights for policy 0, policy_version 640 (0.0026)
[2024-12-03 16:10:30,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 2621440. Throughput: 0: 966.9. Samples: 654712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:10:30,719][01348] Avg episode reward: [(0, '9.489')]
[2024-12-03 16:10:35,712][01348] Fps is (10 sec: 4095.8, 60 sec: 3959.4, 300 sec: 3901.6). Total num frames: 2637824. Throughput: 0: 1006.6. Samples: 660876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:35,716][01348] Avg episode reward: [(0, '10.458')]
[2024-12-03 16:10:35,762][03345] Saving new best policy, reward=10.458!
[2024-12-03 16:10:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 2654208. Throughput: 0: 974.8. Samples: 662938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:40,715][01348] Avg episode reward: [(0, '10.184')]
[2024-12-03 16:10:41,975][03358] Updated weights for policy 0, policy_version 650 (0.0027)
[2024-12-03 16:10:45,711][01348] Fps is (10 sec: 3686.7, 60 sec: 3823.1, 300 sec: 3873.9). Total num frames: 2674688. Throughput: 0: 953.8. Samples: 669066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:10:45,713][01348] Avg episode reward: [(0, '11.348')]
[2024-12-03 16:10:45,794][03345] Saving new best policy, reward=11.348!
[2024-12-03 16:10:50,711][01348] Fps is (10 sec: 4505.4, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 2699264. Throughput: 0: 1004.4. Samples: 675866. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:10:50,714][01348] Avg episode reward: [(0, '11.261')]
[2024-12-03 16:10:51,403][03358] Updated weights for policy 0, policy_version 660 (0.0019)
[2024-12-03 16:10:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2715648. Throughput: 0: 989.4. Samples: 677922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:10:55,713][01348] Avg episode reward: [(0, '11.117')]
[2024-12-03 16:11:00,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3873.9). Total num frames: 2736128. Throughput: 0: 951.9. Samples: 683350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:11:00,714][01348] Avg episode reward: [(0, '11.319')]
[2024-12-03 16:11:02,548][03358] Updated weights for policy 0, policy_version 670 (0.0023)
[2024-12-03 16:11:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 2756608. Throughput: 0: 988.9. Samples: 690108. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:11:05,713][01348] Avg episode reward: [(0, '11.845')]
[2024-12-03 16:11:05,721][03345] Saving new best policy, reward=11.845!
[2024-12-03 16:11:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2772992. Throughput: 0: 1001.9. Samples: 692816. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:11:10,715][01348] Avg episode reward: [(0, '12.397')]
[2024-12-03 16:11:10,720][03345] Saving new best policy, reward=12.397!
[2024-12-03 16:11:14,515][03358] Updated weights for policy 0, policy_version 680 (0.0021)
[2024-12-03 16:11:15,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 2789376. Throughput: 0: 942.0. Samples: 697104. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:11:15,713][01348] Avg episode reward: [(0, '12.214')]
[2024-12-03 16:11:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2809856. Throughput: 0: 947.4. Samples: 703506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:11:20,717][01348] Avg episode reward: [(0, '12.262')]
[2024-12-03 16:11:23,623][03358] Updated weights for policy 0, policy_version 690 (0.0015)
[2024-12-03 16:11:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2830336. Throughput: 0: 977.3. Samples: 706918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:11:25,720][01348] Avg episode reward: [(0, '12.597')]
[2024-12-03 16:11:25,729][03345] Saving new best policy, reward=12.597!
[2024-12-03 16:11:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 2842624. Throughput: 0: 931.2. Samples: 710968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:11:30,713][01348] Avg episode reward: [(0, '12.831')]
[2024-12-03 16:11:30,716][03345] Saving new best policy, reward=12.831!
[2024-12-03 16:11:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.2). Total num frames: 2863104. Throughput: 0: 917.8. Samples: 717168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:11:35,718][01348] Avg episode reward: [(0, '14.345')]
[2024-12-03 16:11:35,728][03345] Saving new best policy, reward=14.345!
[2024-12-03 16:11:35,999][03358] Updated weights for policy 0, policy_version 700 (0.0016)
[2024-12-03 16:11:40,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3873.9). Total num frames: 2887680. Throughput: 0: 943.9. Samples: 720398. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:11:40,713][01348] Avg episode reward: [(0, '13.304')]
[2024-12-03 16:11:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2899968. Throughput: 0: 930.1. Samples: 725204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:11:45,714][01348] Avg episode reward: [(0, '13.632')]
[2024-12-03 16:11:47,812][03358] Updated weights for policy 0, policy_version 710 (0.0021)
[2024-12-03 16:11:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 2920448. Throughput: 0: 900.8. Samples: 730644. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:11:50,720][01348] Avg episode reward: [(0, '13.641')]
[2024-12-03 16:11:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 2940928. Throughput: 0: 913.2. Samples: 733912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:11:55,716][01348] Avg episode reward: [(0, '13.660')]
[2024-12-03 16:11:56,967][03358] Updated weights for policy 0, policy_version 720 (0.0035)
[2024-12-03 16:12:00,715][01348] Fps is (10 sec: 3685.1, 60 sec: 3686.2, 300 sec: 3846.0). Total num frames: 2957312. Throughput: 0: 945.8. Samples: 739668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:12:00,719][01348] Avg episode reward: [(0, '14.868')]
[2024-12-03 16:12:00,721][03345] Saving new best policy, reward=14.868!
[2024-12-03 16:12:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 2973696. Throughput: 0: 903.9. Samples: 744182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:12:05,713][01348] Avg episode reward: [(0, '15.556')]
[2024-12-03 16:12:05,720][03345] Saving new best policy, reward=15.556!
[2024-12-03 16:12:09,098][03358] Updated weights for policy 0, policy_version 730 (0.0022)
[2024-12-03 16:12:10,711][01348] Fps is (10 sec: 3687.7, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 2994176. Throughput: 0: 899.8. Samples: 747410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:12:10,718][01348] Avg episode reward: [(0, '15.237')]
[2024-12-03 16:12:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 3014656. Throughput: 0: 957.1. Samples: 754038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:12:15,717][01348] Avg episode reward: [(0, '14.838')]
[2024-12-03 16:12:15,727][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000736_3014656.pth...
[2024-12-03 16:12:15,888][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000511_2093056.pth
[2024-12-03 16:12:20,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 3026944. Throughput: 0: 906.0. Samples: 757940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:12:20,717][01348] Avg episode reward: [(0, '14.921')]
[2024-12-03 16:12:21,109][03358] Updated weights for policy 0, policy_version 740 (0.0031)
[2024-12-03 16:12:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 3051520. Throughput: 0: 909.0. Samples: 761304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:12:25,714][01348] Avg episode reward: [(0, '15.474')]
[2024-12-03 16:12:30,021][03358] Updated weights for policy 0, policy_version 750 (0.0028)
[2024-12-03 16:12:30,711][01348] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3072000. Throughput: 0: 953.3. Samples: 768102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:12:30,713][01348] Avg episode reward: [(0, '16.160')]
[2024-12-03 16:12:30,715][03345] Saving new best policy, reward=16.160!
[2024-12-03 16:12:35,717][01348] Fps is (10 sec: 3684.3, 60 sec: 3754.3, 300 sec: 3832.1). Total num frames: 3088384. Throughput: 0: 936.1. Samples: 772774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:12:35,719][01348] Avg episode reward: [(0, '16.339')]
[2024-12-03 16:12:35,738][03345] Saving new best policy, reward=16.339!
[2024-12-03 16:12:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 3104768. Throughput: 0: 915.5. Samples: 775110. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:12:40,713][01348] Avg episode reward: [(0, '17.189')]
[2024-12-03 16:12:40,720][03345] Saving new best policy, reward=17.189!
[2024-12-03 16:12:41,927][03358] Updated weights for policy 0, policy_version 760 (0.0019)
[2024-12-03 16:12:45,711][01348] Fps is (10 sec: 4098.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3129344. Throughput: 0: 936.4. Samples: 781802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:12:45,713][01348] Avg episode reward: [(0, '16.769')]
[2024-12-03 16:12:50,713][01348] Fps is (10 sec: 4095.3, 60 sec: 3754.5, 300 sec: 3832.2). Total num frames: 3145728. Throughput: 0: 956.4. Samples: 787222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:12:50,717][01348] Avg episode reward: [(0, '15.491')]
[2024-12-03 16:12:53,098][03358] Updated weights for policy 0, policy_version 770 (0.0020)
[2024-12-03 16:12:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 3162112. Throughput: 0: 929.0. Samples: 789214. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:12:55,719][01348] Avg episode reward: [(0, '17.471')]
[2024-12-03 16:12:55,727][03345] Saving new best policy, reward=17.471!
[2024-12-03 16:13:00,711][01348] Fps is (10 sec: 3687.1, 60 sec: 3754.9, 300 sec: 3818.3). Total num frames: 3182592. Throughput: 0: 924.4. Samples: 795636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:13:00,713][01348] Avg episode reward: [(0, '16.416')]
[2024-12-03 16:13:02,887][03358] Updated weights for policy 0, policy_version 780 (0.0033)
[2024-12-03 16:13:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3203072. Throughput: 0: 976.0. Samples: 801860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:13:05,716][01348] Avg episode reward: [(0, '17.447')]
[2024-12-03 16:13:10,713][01348] Fps is (10 sec: 3685.7, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 3219456. Throughput: 0: 945.9. Samples: 803872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:13:10,715][01348] Avg episode reward: [(0, '17.680')]
[2024-12-03 16:13:10,717][03345] Saving new best policy, reward=17.680!
[2024-12-03 16:13:14,533][03358] Updated weights for policy 0, policy_version 790 (0.0013)
[2024-12-03 16:13:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3239936. Throughput: 0: 920.8. Samples: 809538. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:13:15,720][01348] Avg episode reward: [(0, '17.256')]
[2024-12-03 16:13:20,711][01348] Fps is (10 sec: 4096.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3260416. Throughput: 0: 966.3. Samples: 816252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:13:20,718][01348] Avg episode reward: [(0, '17.816')]
[2024-12-03 16:13:20,720][03345] Saving new best policy, reward=17.816!
[2024-12-03 16:13:25,713][01348] Fps is (10 sec: 3276.2, 60 sec: 3686.3, 300 sec: 3804.4). Total num frames: 3272704. Throughput: 0: 962.5. Samples: 818424. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:13:25,715][01348] Avg episode reward: [(0, '18.601')]
[2024-12-03 16:13:25,734][03358] Updated weights for policy 0, policy_version 800 (0.0043)
[2024-12-03 16:13:25,734][03345] Saving new best policy, reward=18.601!
[2024-12-03 16:13:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 3293184. Throughput: 0: 926.4. Samples: 823488. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:13:30,713][01348] Avg episode reward: [(0, '17.583')]
[2024-12-03 16:13:35,043][03358] Updated weights for policy 0, policy_version 810 (0.0036)
[2024-12-03 16:13:35,711][01348] Fps is (10 sec: 4506.5, 60 sec: 3823.3, 300 sec: 3818.3). Total num frames: 3317760. Throughput: 0: 962.3. Samples: 830526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:13:35,718][01348] Avg episode reward: [(0, '19.272')]
[2024-12-03 16:13:35,727][03345] Saving new best policy, reward=19.272!
[2024-12-03 16:13:40,711][01348] Fps is (10 sec: 4505.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3338240. Throughput: 0: 984.7. Samples: 833526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:13:40,714][01348] Avg episode reward: [(0, '19.340')]
[2024-12-03 16:13:40,716][03345] Saving new best policy, reward=19.340!
[2024-12-03 16:13:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 3354624. Throughput: 0: 937.2. Samples: 837812. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:13:45,713][01348] Avg episode reward: [(0, '18.946')]
[2024-12-03 16:13:46,589][03358] Updated weights for policy 0, policy_version 820 (0.0022)
[2024-12-03 16:13:50,711][01348] Fps is (10 sec: 3686.6, 60 sec: 3823.1, 300 sec: 3818.3). Total num frames: 3375104. Throughput: 0: 949.9. Samples: 844604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:13:50,713][01348] Avg episode reward: [(0, '20.102')]
[2024-12-03 16:13:50,718][03345] Saving new best policy, reward=20.102!
[2024-12-03 16:13:55,711][01348] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3395584. Throughput: 0: 982.8. Samples: 848098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:13:55,719][01348] Avg episode reward: [(0, '18.794')]
[2024-12-03 16:13:56,265][03358] Updated weights for policy 0, policy_version 830 (0.0025)
[2024-12-03 16:14:00,713][01348] Fps is (10 sec: 3685.5, 60 sec: 3822.8, 300 sec: 3818.3). Total num frames: 3411968. Throughput: 0: 961.3. Samples: 852800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:14:00,718][01348] Avg episode reward: [(0, '18.885')]
[2024-12-03 16:14:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3432448. Throughput: 0: 945.1. Samples: 858782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:14:05,714][01348] Avg episode reward: [(0, '18.834')]
[2024-12-03 16:14:07,308][03358] Updated weights for policy 0, policy_version 840 (0.0026)
[2024-12-03 16:14:10,711][01348] Fps is (10 sec: 4097.0, 60 sec: 3891.3, 300 sec: 3832.2). Total num frames: 3452928. Throughput: 0: 970.9. Samples: 862114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:14:10,715][01348] Avg episode reward: [(0, '19.286')]
[2024-12-03 16:14:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3469312. Throughput: 0: 970.7. Samples: 867168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:14:15,714][01348] Avg episode reward: [(0, '18.850')]
[2024-12-03 16:14:15,721][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000847_3469312.pth...
[2024-12-03 16:14:15,900][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000626_2564096.pth
[2024-12-03 16:14:19,363][03358] Updated weights for policy 0, policy_version 850 (0.0030)
[2024-12-03 16:14:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 3485696. Throughput: 0: 929.5. Samples: 872352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:14:20,714][01348] Avg episode reward: [(0, '20.245')]
[2024-12-03 16:14:20,721][03345] Saving new best policy, reward=20.245!
[2024-12-03 16:14:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3804.4). Total num frames: 3506176. Throughput: 0: 934.2. Samples: 875566. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:14:25,717][01348] Avg episode reward: [(0, '19.360')]
[2024-12-03 16:14:29,258][03358] Updated weights for policy 0, policy_version 860 (0.0023)
[2024-12-03 16:14:30,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3522560. Throughput: 0: 968.7. Samples: 881404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:14:30,714][01348] Avg episode reward: [(0, '20.361')]
[2024-12-03 16:14:30,718][03345] Saving new best policy, reward=20.361!
[2024-12-03 16:14:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.6). Total num frames: 3538944. Throughput: 0: 913.4. Samples: 885706. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:14:35,716][01348] Avg episode reward: [(0, '19.531')]
[2024-12-03 16:14:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 3559424. Throughput: 0: 910.4. Samples: 889064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:14:40,717][01348] Avg episode reward: [(0, '20.385')]
[2024-12-03 16:14:40,723][03345] Saving new best policy, reward=20.385!
[2024-12-03 16:14:40,726][03358] Updated weights for policy 0, policy_version 870 (0.0046)
[2024-12-03 16:14:45,713][01348] Fps is (10 sec: 4095.1, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 3579904. Throughput: 0: 950.5. Samples: 895572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:14:45,719][01348] Avg episode reward: [(0, '20.773')]
[2024-12-03 16:14:45,730][03345] Saving new best policy, reward=20.773!
[2024-12-03 16:14:50,715][01348] Fps is (10 sec: 3275.6, 60 sec: 3617.9, 300 sec: 3762.7). Total num frames: 3592192. Throughput: 0: 903.3. Samples: 899432. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:14:50,720][01348] Avg episode reward: [(0, '21.275')]
[2024-12-03 16:14:50,725][03345] Saving new best policy, reward=21.275!
[2024-12-03 16:14:53,059][03358] Updated weights for policy 0, policy_version 880 (0.0016)
[2024-12-03 16:14:55,711][01348] Fps is (10 sec: 3277.5, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3612672. Throughput: 0: 893.1. Samples: 902302. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:14:55,713][01348] Avg episode reward: [(0, '20.950')]
[2024-12-03 16:15:00,711][01348] Fps is (10 sec: 4507.3, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 3637248. Throughput: 0: 926.9. Samples: 908878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:15:00,718][01348] Avg episode reward: [(0, '20.360')]
[2024-12-03 16:15:02,809][03358] Updated weights for policy 0, policy_version 890 (0.0027)
[2024-12-03 16:15:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 3649536. Throughput: 0: 917.2. Samples: 913628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:15:05,714][01348] Avg episode reward: [(0, '19.198')]
[2024-12-03 16:15:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 3670016. Throughput: 0: 892.9. Samples: 915746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:15:10,713][01348] Avg episode reward: [(0, '19.408')]
[2024-12-03 16:15:14,322][03358] Updated weights for policy 0, policy_version 900 (0.0038)
[2024-12-03 16:15:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 3690496. Throughput: 0: 911.6. Samples: 922424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:15:15,713][01348] Avg episode reward: [(0, '19.835')]
[2024-12-03 16:15:20,712][01348] Fps is (10 sec: 4095.7, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 3710976. Throughput: 0: 946.3. Samples: 928288. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:15:20,715][01348] Avg episode reward: [(0, '20.834')]
[2024-12-03 16:15:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 3723264. Throughput: 0: 917.9. Samples: 930368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:15:25,713][01348] Avg episode reward: [(0, '20.610')]
[2024-12-03 16:15:25,780][03358] Updated weights for policy 0, policy_version 910 (0.0026)
[2024-12-03 16:15:30,711][01348] Fps is (10 sec: 3686.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3747840. Throughput: 0: 921.2. Samples: 937022. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:15:30,713][01348] Avg episode reward: [(0, '20.878')]
[2024-12-03 16:15:34,286][03358] Updated weights for policy 0, policy_version 920 (0.0021)
[2024-12-03 16:15:35,716][01348] Fps is (10 sec: 4912.9, 60 sec: 3890.9, 300 sec: 3790.5). Total num frames: 3772416. Throughput: 0: 984.4. Samples: 943732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:15:35,717][01348] Avg episode reward: [(0, '21.527')]
[2024-12-03 16:15:35,724][03345] Saving new best policy, reward=21.527!
[2024-12-03 16:15:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3784704. Throughput: 0: 965.1. Samples: 945732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:15:40,718][01348] Avg episode reward: [(0, '21.325')]
[2024-12-03 16:15:45,655][03358] Updated weights for policy 0, policy_version 930 (0.0017)
[2024-12-03 16:15:45,711][01348] Fps is (10 sec: 3688.2, 60 sec: 3823.1, 300 sec: 3762.8). Total num frames: 3809280. Throughput: 0: 949.0. Samples: 951584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:15:45,713][01348] Avg episode reward: [(0, '20.952')]
[2024-12-03 16:15:50,711][01348] Fps is (10 sec: 4505.5, 60 sec: 3959.7, 300 sec: 3776.6). Total num frames: 3829760. Throughput: 0: 998.4. Samples: 958556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:15:50,713][01348] Avg episode reward: [(0, '21.876')]
[2024-12-03 16:15:50,720][03345] Saving new best policy, reward=21.876!
[2024-12-03 16:15:55,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 3846144. Throughput: 0: 1005.4. Samples: 960990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:15:55,716][01348] Avg episode reward: [(0, '22.917')]
[2024-12-03 16:15:55,731][03345] Saving new best policy, reward=22.917!
[2024-12-03 16:15:56,674][03358] Updated weights for policy 0, policy_version 940 (0.0026)
[2024-12-03 16:16:00,715][01348] Fps is (10 sec: 3275.5, 60 sec: 3754.4, 300 sec: 3748.8). Total num frames: 3862528. Throughput: 0: 965.9. Samples: 965894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:00,718][01348] Avg episode reward: [(0, '22.599')]
[2024-12-03 16:16:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3887104. Throughput: 0: 992.6. Samples: 972956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:05,716][01348] Avg episode reward: [(0, '21.371')]
[2024-12-03 16:16:05,901][03358] Updated weights for policy 0, policy_version 950 (0.0033)
[2024-12-03 16:16:10,711][01348] Fps is (10 sec: 4507.3, 60 sec: 3959.4, 300 sec: 3790.5). Total num frames: 3907584. Throughput: 0: 1017.3. Samples: 976148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:10,715][01348] Avg episode reward: [(0, '21.117')]
[2024-12-03 16:16:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3923968. Throughput: 0: 964.1. Samples: 980406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:16:15,718][01348] Avg episode reward: [(0, '21.991')]
[2024-12-03 16:16:15,729][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000958_3923968.pth...
[2024-12-03 16:16:15,852][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000736_3014656.pth
[2024-12-03 16:16:17,369][03358] Updated weights for policy 0, policy_version 960 (0.0039)
[2024-12-03 16:16:20,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3944448. Throughput: 0: 966.5. Samples: 987220. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:16:20,718][01348] Avg episode reward: [(0, '21.229')]
[2024-12-03 16:16:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3804.4). Total num frames: 3964928. Throughput: 0: 998.9. Samples: 990682. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:25,724][01348] Avg episode reward: [(0, '21.005')]
[2024-12-03 16:16:27,217][03358] Updated weights for policy 0, policy_version 970 (0.0019)
[2024-12-03 16:16:30,714][01348] Fps is (10 sec: 3685.1, 60 sec: 3891.0, 300 sec: 3790.5). Total num frames: 3981312. Throughput: 0: 974.0. Samples: 995416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:30,717][01348] Avg episode reward: [(0, '21.584')]
[2024-12-03 16:16:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3776.7). Total num frames: 4001792. Throughput: 0: 958.5. Samples: 1001690. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:16:35,713][01348] Avg episode reward: [(0, '22.265')]
[2024-12-03 16:16:37,643][03358] Updated weights for policy 0, policy_version 980 (0.0031)
[2024-12-03 16:16:40,711][01348] Fps is (10 sec: 4097.4, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 4022272. Throughput: 0: 970.5. Samples: 1004664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:16:40,713][01348] Avg episode reward: [(0, '22.298')]
[2024-12-03 16:16:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 4038656. Throughput: 0: 966.3. Samples: 1009374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:16:45,713][01348] Avg episode reward: [(0, '21.714')]
[2024-12-03 16:16:50,126][03358] Updated weights for policy 0, policy_version 990 (0.0029)
[2024-12-03 16:16:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 4055040. Throughput: 0: 927.9. Samples: 1014712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:16:50,717][01348] Avg episode reward: [(0, '21.195')]
[2024-12-03 16:16:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.5). Total num frames: 4079616. Throughput: 0: 935.5. Samples: 1018244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:16:55,718][01348] Avg episode reward: [(0, '20.416')]
[2024-12-03 16:16:59,395][03358] Updated weights for policy 0, policy_version 1000 (0.0022)
[2024-12-03 16:17:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.5, 300 sec: 3804.4). Total num frames: 4096000. Throughput: 0: 983.3. Samples: 1024656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:17:00,717][01348] Avg episode reward: [(0, '20.771')]
[2024-12-03 16:17:05,714][01348] Fps is (10 sec: 3275.9, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 4112384. Throughput: 0: 934.7. Samples: 1029286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:17:05,718][01348] Avg episode reward: [(0, '19.996')]
[2024-12-03 16:17:10,180][03358] Updated weights for policy 0, policy_version 1010 (0.0021)
[2024-12-03 16:17:10,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 4136960. Throughput: 0: 936.5. Samples: 1032826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:17:10,715][01348] Avg episode reward: [(0, '20.117')]
[2024-12-03 16:17:15,715][01348] Fps is (10 sec: 4505.2, 60 sec: 3891.0, 300 sec: 3832.1). Total num frames: 4157440. Throughput: 0: 989.9. Samples: 1039960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:17:15,721][01348] Avg episode reward: [(0, '20.366')]
[2024-12-03 16:17:20,718][01348] Fps is (10 sec: 3683.9, 60 sec: 3822.5, 300 sec: 3804.3). Total num frames: 4173824. Throughput: 0: 945.0. Samples: 1044220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:17:20,720][01348] Avg episode reward: [(0, '20.796')]
[2024-12-03 16:17:21,576][03358] Updated weights for policy 0, policy_version 1020 (0.0026)
[2024-12-03 16:17:25,711][01348] Fps is (10 sec: 3687.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 4194304. Throughput: 0: 947.5. Samples: 1047302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:17:25,714][01348] Avg episode reward: [(0, '19.543')]
[2024-12-03 16:17:30,392][03358] Updated weights for policy 0, policy_version 1030 (0.0017)
[2024-12-03 16:17:30,711][01348] Fps is (10 sec: 4508.7, 60 sec: 3959.7, 300 sec: 3832.3). Total num frames: 4218880. Throughput: 0: 997.8. Samples: 1054274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:17:30,715][01348] Avg episode reward: [(0, '21.332')]
[2024-12-03 16:17:35,712][01348] Fps is (10 sec: 4095.7, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 4235264. Throughput: 0: 994.1. Samples: 1059448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:17:35,714][01348] Avg episode reward: [(0, '23.607')]
[2024-12-03 16:17:35,722][03345] Saving new best policy, reward=23.607!
[2024-12-03 16:17:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 4251648. Throughput: 0: 964.0. Samples: 1061624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:17:40,719][01348] Avg episode reward: [(0, '23.301')]
[2024-12-03 16:17:41,882][03358] Updated weights for policy 0, policy_version 1040 (0.0023)
[2024-12-03 16:17:45,711][01348] Fps is (10 sec: 4096.3, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 4276224. Throughput: 0: 980.2. Samples: 1068766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:17:45,720][01348] Avg episode reward: [(0, '24.076')]
[2024-12-03 16:17:45,734][03345] Saving new best policy, reward=24.076!
[2024-12-03 16:17:50,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 4296704. Throughput: 0: 1007.8. Samples: 1074634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:17:50,717][01348] Avg episode reward: [(0, '23.992')]
[2024-12-03 16:17:52,203][03358] Updated weights for policy 0, policy_version 1050 (0.0022)
[2024-12-03 16:17:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 4308992. Throughput: 0: 973.6. Samples: 1076638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:17:55,719][01348] Avg episode reward: [(0, '22.276')]
[2024-12-03 16:18:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 4333568. Throughput: 0: 958.9. Samples: 1083108. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:18:00,715][01348] Avg episode reward: [(0, '20.868')]
[2024-12-03 16:18:01,993][03358] Updated weights for policy 0, policy_version 1060 (0.0023)
[2024-12-03 16:18:05,718][01348] Fps is (10 sec: 4502.5, 60 sec: 4027.5, 300 sec: 3846.0). Total num frames: 4354048. Throughput: 0: 1015.9. Samples: 1089936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:05,723][01348] Avg episode reward: [(0, '21.021')]
[2024-12-03 16:18:10,714][01348] Fps is (10 sec: 3685.4, 60 sec: 3891.0, 300 sec: 3832.2). Total num frames: 4370432. Throughput: 0: 994.0. Samples: 1092034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:10,721][01348] Avg episode reward: [(0, '21.340')]
[2024-12-03 16:18:13,291][03358] Updated weights for policy 0, policy_version 1070 (0.0020)
[2024-12-03 16:18:15,711][01348] Fps is (10 sec: 3688.9, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 4390912. Throughput: 0: 966.5. Samples: 1097768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:18:15,714][01348] Avg episode reward: [(0, '21.552')]
[2024-12-03 16:18:15,720][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001072_4390912.pth...
[2024-12-03 16:18:15,860][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000847_3469312.pth
[2024-12-03 16:18:20,711][01348] Fps is (10 sec: 4506.9, 60 sec: 4028.2, 300 sec: 3873.9). Total num frames: 4415488. Throughput: 0: 1002.6. Samples: 1104564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:20,713][01348] Avg episode reward: [(0, '21.925')]
[2024-12-03 16:18:22,464][03358] Updated weights for policy 0, policy_version 1080 (0.0022)
[2024-12-03 16:18:25,714][01348] Fps is (10 sec: 4094.7, 60 sec: 3959.3, 300 sec: 3859.9). Total num frames: 4431872. Throughput: 0: 1012.5. Samples: 1107190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:18:25,717][01348] Avg episode reward: [(0, '22.516')]
[2024-12-03 16:18:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 4448256. Throughput: 0: 959.1. Samples: 1111926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:30,713][01348] Avg episode reward: [(0, '23.644')]
[2024-12-03 16:18:33,460][03358] Updated weights for policy 0, policy_version 1090 (0.0019)
[2024-12-03 16:18:35,711][01348] Fps is (10 sec: 4097.3, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 4472832. Throughput: 0: 989.0. Samples: 1119140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:35,718][01348] Avg episode reward: [(0, '22.100')]
[2024-12-03 16:18:40,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 4493312. Throughput: 0: 1021.6. Samples: 1122608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:18:40,715][01348] Avg episode reward: [(0, '22.699')]
[2024-12-03 16:18:44,628][03358] Updated weights for policy 0, policy_version 1100 (0.0027)
[2024-12-03 16:18:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 4505600. Throughput: 0: 973.3. Samples: 1126908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:18:45,717][01348] Avg episode reward: [(0, '21.949')]
[2024-12-03 16:18:50,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 4530176. Throughput: 0: 971.7. Samples: 1133658. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:18:50,715][01348] Avg episode reward: [(0, '21.704')]
[2024-12-03 16:18:53,567][03358] Updated weights for policy 0, policy_version 1110 (0.0031)
[2024-12-03 16:18:55,711][01348] Fps is (10 sec: 4915.0, 60 sec: 4096.0, 300 sec: 3873.9). Total num frames: 4554752. Throughput: 0: 1002.9. Samples: 1137160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:18:55,718][01348] Avg episode reward: [(0, '21.355')]
[2024-12-03 16:19:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 4567040. Throughput: 0: 984.1. Samples: 1142052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:19:00,716][01348] Avg episode reward: [(0, '22.748')]
[2024-12-03 16:19:04,983][03358] Updated weights for policy 0, policy_version 1120 (0.0019)
[2024-12-03 16:19:05,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3891.6, 300 sec: 3846.1). Total num frames: 4587520. Throughput: 0: 966.7. Samples: 1148066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:19:05,713][01348] Avg episode reward: [(0, '22.355')]
[2024-12-03 16:19:10,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.9, 300 sec: 3873.8). Total num frames: 4612096. Throughput: 0: 987.1. Samples: 1151608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:19:10,713][01348] Avg episode reward: [(0, '23.792')]
[2024-12-03 16:19:15,046][03358] Updated weights for policy 0, policy_version 1130 (0.0037)
[2024-12-03 16:19:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 4628480. Throughput: 0: 1009.6. Samples: 1157358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:19:15,718][01348] Avg episode reward: [(0, '23.966')]
[2024-12-03 16:19:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 4644864. Throughput: 0: 963.1. Samples: 1162478. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:19:20,713][01348] Avg episode reward: [(0, '23.905')]
[2024-12-03 16:19:25,257][03358] Updated weights for policy 0, policy_version 1140 (0.0034)
[2024-12-03 16:19:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.7, 300 sec: 3887.7). Total num frames: 4669440. Throughput: 0: 962.0. Samples: 1165896. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:19:25,713][01348] Avg episode reward: [(0, '23.133')]
[2024-12-03 16:19:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 4685824. Throughput: 0: 1008.0. Samples: 1172268. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:19:30,721][01348] Avg episode reward: [(0, '22.329')]
[2024-12-03 16:19:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 4702208. Throughput: 0: 954.4. Samples: 1176606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:19:35,717][01348] Avg episode reward: [(0, '22.297')]
[2024-12-03 16:19:36,909][03358] Updated weights for policy 0, policy_version 1150 (0.0025)
[2024-12-03 16:19:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.8). Total num frames: 4726784. Throughput: 0: 953.2. Samples: 1180052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:19:40,719][01348] Avg episode reward: [(0, '21.390')]
[2024-12-03 16:19:45,715][01348] Fps is (10 sec: 4503.8, 60 sec: 4027.5, 300 sec: 3915.5). Total num frames: 4747264. Throughput: 0: 999.1. Samples: 1187014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:19:45,721][01348] Avg episode reward: [(0, '22.744')]
[2024-12-03 16:19:45,823][03358] Updated weights for policy 0, policy_version 1160 (0.0022)
[2024-12-03 16:19:50,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 4763648. Throughput: 0: 968.2. Samples: 1191636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:19:50,716][01348] Avg episode reward: [(0, '22.887')]
[2024-12-03 16:19:55,711][01348] Fps is (10 sec: 3687.9, 60 sec: 3823.0, 300 sec: 3887.7). Total num frames: 4784128. Throughput: 0: 949.6. Samples: 1194338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:19:55,716][01348] Avg episode reward: [(0, '23.837')]
[2024-12-03 16:19:57,344][03358] Updated weights for policy 0, policy_version 1170 (0.0016)
[2024-12-03 16:20:00,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 4808704. Throughput: 0: 977.8. Samples: 1201360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:20:00,719][01348] Avg episode reward: [(0, '22.405')]
[2024-12-03 16:20:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 4825088. Throughput: 0: 984.3. Samples: 1206770. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:20:05,717][01348] Avg episode reward: [(0, '21.915')]
[2024-12-03 16:20:08,383][03358] Updated weights for policy 0, policy_version 1180 (0.0019)
[2024-12-03 16:20:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 4841472. Throughput: 0: 956.4. Samples: 1208936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:20:10,716][01348] Avg episode reward: [(0, '21.655')]
[2024-12-03 16:20:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 4866048. Throughput: 0: 973.5. Samples: 1216074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:20:15,718][01348] Avg episode reward: [(0, '21.119')]
[2024-12-03 16:20:15,729][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001188_4866048.pth...
[2024-12-03 16:20:15,866][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000958_3923968.pth
[2024-12-03 16:20:17,217][03358] Updated weights for policy 0, policy_version 1190 (0.0014)
[2024-12-03 16:20:20,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 4886528. Throughput: 0: 1016.9. Samples: 1222366. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:20:20,717][01348] Avg episode reward: [(0, '21.376')]
[2024-12-03 16:20:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 4898816. Throughput: 0: 983.8. Samples: 1224322. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:20:25,714][01348] Avg episode reward: [(0, '22.305')]
[2024-12-03 16:20:28,915][03358] Updated weights for policy 0, policy_version 1200 (0.0026)
[2024-12-03 16:20:30,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.7). Total num frames: 4923392. Throughput: 0: 960.8. Samples: 1230248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:20:30,715][01348] Avg episode reward: [(0, '23.218')]
[2024-12-03 16:20:35,713][01348] Fps is (10 sec: 4505.0, 60 sec: 4027.6, 300 sec: 3929.4). Total num frames: 4943872. Throughput: 0: 1013.3. Samples: 1237238. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:20:35,715][01348] Avg episode reward: [(0, '23.134')]
[2024-12-03 16:20:38,870][03358] Updated weights for policy 0, policy_version 1210 (0.0027)
[2024-12-03 16:20:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 4960256. Throughput: 0: 1002.0. Samples: 1239426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:20:40,718][01348] Avg episode reward: [(0, '24.940')]
[2024-12-03 16:20:40,721][03345] Saving new best policy, reward=24.940!
[2024-12-03 16:20:45,711][01348] Fps is (10 sec: 3687.0, 60 sec: 3891.5, 300 sec: 3901.6). Total num frames: 4980736. Throughput: 0: 959.2. Samples: 1244526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:20:45,716][01348] Avg episode reward: [(0, '25.332')]
[2024-12-03 16:20:45,725][03345] Saving new best policy, reward=25.332!
[2024-12-03 16:20:49,034][03358] Updated weights for policy 0, policy_version 1220 (0.0015)
[2024-12-03 16:20:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 5001216. Throughput: 0: 995.5. Samples: 1251568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:20:50,716][01348] Avg episode reward: [(0, '26.026')]
[2024-12-03 16:20:50,801][03345] Saving new best policy, reward=26.026!
[2024-12-03 16:20:55,713][01348] Fps is (10 sec: 3685.8, 60 sec: 3891.1, 300 sec: 3915.5). Total num frames: 5017600. Throughput: 0: 1009.0. Samples: 1254342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:20:55,717][01348] Avg episode reward: [(0, '25.042')]
[2024-12-03 16:21:00,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 5033984. Throughput: 0: 948.4. Samples: 1258754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:21:00,713][01348] Avg episode reward: [(0, '24.368')]
[2024-12-03 16:21:00,829][03358] Updated weights for policy 0, policy_version 1230 (0.0020)
[2024-12-03 16:21:05,711][01348] Fps is (10 sec: 4096.7, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 5058560. Throughput: 0: 964.7. Samples: 1265776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:21:05,713][01348] Avg episode reward: [(0, '24.828')]
[2024-12-03 16:21:09,655][03358] Updated weights for policy 0, policy_version 1240 (0.0016)
[2024-12-03 16:21:10,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 5079040. Throughput: 0: 999.8. Samples: 1269312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:21:10,713][01348] Avg episode reward: [(0, '23.774')]
[2024-12-03 16:21:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 5095424. Throughput: 0: 967.6. Samples: 1273792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:21:15,716][01348] Avg episode reward: [(0, '22.856')]
[2024-12-03 16:21:20,457][03358] Updated weights for policy 0, policy_version 1250 (0.0030)
[2024-12-03 16:21:20,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5120000. Throughput: 0: 958.8. Samples: 1280382. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:21:20,713][01348] Avg episode reward: [(0, '23.659')]
[2024-12-03 16:21:25,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 5140480. Throughput: 0: 986.1. Samples: 1283800. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:21:25,718][01348] Avg episode reward: [(0, '23.780')]
[2024-12-03 16:21:30,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5156864. Throughput: 0: 991.6. Samples: 1289148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:21:30,713][01348] Avg episode reward: [(0, '23.525')]
[2024-12-03 16:21:31,559][03358] Updated weights for policy 0, policy_version 1260 (0.0015)
[2024-12-03 16:21:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 5177344. Throughput: 0: 964.4. Samples: 1294966. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-12-03 16:21:35,719][01348] Avg episode reward: [(0, '23.452')]
[2024-12-03 16:21:40,466][03358] Updated weights for policy 0, policy_version 1270 (0.0031)
[2024-12-03 16:21:40,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 5201920. Throughput: 0: 981.9. Samples: 1298524. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:21:40,715][01348] Avg episode reward: [(0, '24.034')]
[2024-12-03 16:21:45,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 5218304. Throughput: 0: 1021.2. Samples: 1304710. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:21:45,719][01348] Avg episode reward: [(0, '22.983')]
[2024-12-03 16:21:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5234688. Throughput: 0: 974.8. Samples: 1309640. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-12-03 16:21:50,716][01348] Avg episode reward: [(0, '22.490')]
[2024-12-03 16:21:51,788][03358] Updated weights for policy 0, policy_version 1280 (0.0037)
[2024-12-03 16:21:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 5259264. Throughput: 0: 971.5. Samples: 1313028. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:21:55,713][01348] Avg episode reward: [(0, '20.306')]
[2024-12-03 16:22:00,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 5279744. Throughput: 0: 1024.2. Samples: 1319882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:22:00,715][01348] Avg episode reward: [(0, '18.348')]
[2024-12-03 16:22:01,542][03358] Updated weights for policy 0, policy_version 1290 (0.0019)
[2024-12-03 16:22:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5292032. Throughput: 0: 973.4. Samples: 1324184. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-12-03 16:22:05,713][01348] Avg episode reward: [(0, '18.271')]
[2024-12-03 16:22:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 5316608. Throughput: 0: 975.4. Samples: 1327694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:22:10,718][01348] Avg episode reward: [(0, '20.170')]
[2024-12-03 16:22:12,061][03358] Updated weights for policy 0, policy_version 1300 (0.0017)
[2024-12-03 16:22:15,711][01348] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 5341184. Throughput: 0: 1013.3. Samples: 1334746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:22:15,714][01348] Avg episode reward: [(0, '20.969')]
[2024-12-03 16:22:15,722][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001304_5341184.pth...
[2024-12-03 16:22:15,888][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001072_4390912.pth
[2024-12-03 16:22:20,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 5353472. Throughput: 0: 986.3. Samples: 1339348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:22:20,714][01348] Avg episode reward: [(0, '21.440')]
[2024-12-03 16:22:23,616][03358] Updated weights for policy 0, policy_version 1310 (0.0026)
[2024-12-03 16:22:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5373952. Throughput: 0: 963.6. Samples: 1341884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:22:25,714][01348] Avg episode reward: [(0, '21.997')]
[2024-12-03 16:22:30,711][01348] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 5398528. Throughput: 0: 985.2. Samples: 1349042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:22:30,714][01348] Avg episode reward: [(0, '23.584')]
[2024-12-03 16:22:32,166][03358] Updated weights for policy 0, policy_version 1320 (0.0027)
[2024-12-03 16:22:35,715][01348] Fps is (10 sec: 4094.4, 60 sec: 3959.2, 300 sec: 3943.2). Total num frames: 5414912. Throughput: 0: 996.7. Samples: 1354496. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:22:35,717][01348] Avg episode reward: [(0, '23.949')]
[2024-12-03 16:22:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5431296. Throughput: 0: 970.5. Samples: 1356702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:22:40,713][01348] Avg episode reward: [(0, '24.006')]
[2024-12-03 16:22:43,517][03358] Updated weights for policy 0, policy_version 1330 (0.0020)
[2024-12-03 16:22:45,711][01348] Fps is (10 sec: 4097.5, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 5455872. Throughput: 0: 974.4. Samples: 1363732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:22:45,714][01348] Avg episode reward: [(0, '24.566')]
[2024-12-03 16:22:50,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3957.2). Total num frames: 5476352. Throughput: 0: 1016.5. Samples: 1369928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:22:50,713][01348] Avg episode reward: [(0, '24.368')]
[2024-12-03 16:22:54,395][03358] Updated weights for policy 0, policy_version 1340 (0.0023)
[2024-12-03 16:22:55,713][01348] Fps is (10 sec: 3276.2, 60 sec: 3822.8, 300 sec: 3915.5). Total num frames: 5488640. Throughput: 0: 984.4. Samples: 1371992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:22:55,715][01348] Avg episode reward: [(0, '24.104')]
[2024-12-03 16:23:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3929.5). Total num frames: 5513216. Throughput: 0: 962.2. Samples: 1378046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:23:00,721][01348] Avg episode reward: [(0, '23.724')]
[2024-12-03 16:23:03,773][03358] Updated weights for policy 0, policy_version 1350 (0.0025)
[2024-12-03 16:23:05,712][01348] Fps is (10 sec: 4915.4, 60 sec: 4095.9, 300 sec: 3957.2). Total num frames: 5537792. Throughput: 0: 1016.1. Samples: 1385072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:23:05,722][01348] Avg episode reward: [(0, '25.171')]
[2024-12-03 16:23:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 5550080. Throughput: 0: 1008.4. Samples: 1387260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:23:10,719][01348] Avg episode reward: [(0, '24.865')]
[2024-12-03 16:23:15,017][03358] Updated weights for policy 0, policy_version 1360 (0.0018)
[2024-12-03 16:23:15,711][01348] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5570560. Throughput: 0: 968.8. Samples: 1392640. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:23:15,713][01348] Avg episode reward: [(0, '24.966')]
[2024-12-03 16:23:20,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 5595136. Throughput: 0: 1005.6. Samples: 1399746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:23:20,713][01348] Avg episode reward: [(0, '25.764')]
[2024-12-03 16:23:24,725][03358] Updated weights for policy 0, policy_version 1370 (0.0016)
[2024-12-03 16:23:25,711][01348] Fps is (10 sec: 4095.9, 60 sec: 3959.4, 300 sec: 3943.3). Total num frames: 5611520. Throughput: 0: 1020.2. Samples: 1402612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:23:25,716][01348] Avg episode reward: [(0, '24.125')]
[2024-12-03 16:23:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5627904. Throughput: 0: 964.4. Samples: 1407128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:23:30,713][01348] Avg episode reward: [(0, '22.930')]
[2024-12-03 16:23:35,203][03358] Updated weights for policy 0, policy_version 1380 (0.0028)
[2024-12-03 16:23:35,711][01348] Fps is (10 sec: 4096.2, 60 sec: 3959.7, 300 sec: 3929.4). Total num frames: 5652480. Throughput: 0: 981.2. Samples: 1414084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:23:35,713][01348] Avg episode reward: [(0, '22.367')]
[2024-12-03 16:23:40,715][01348] Fps is (10 sec: 4503.9, 60 sec: 4027.5, 300 sec: 3957.1). Total num frames: 5672960. Throughput: 0: 1012.8. Samples: 1417570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:23:40,717][01348] Avg episode reward: [(0, '23.009')]
[2024-12-03 16:23:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 5689344. Throughput: 0: 976.5. Samples: 1421990. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:23:45,715][01348] Avg episode reward: [(0, '23.092')]
[2024-12-03 16:23:46,552][03358] Updated weights for policy 0, policy_version 1390 (0.0017)
[2024-12-03 16:23:50,711][01348] Fps is (10 sec: 3687.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5709824. Throughput: 0: 965.0. Samples: 1428496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:23:50,715][01348] Avg episode reward: [(0, '22.655')]
[2024-12-03 16:23:55,560][03358] Updated weights for policy 0, policy_version 1400 (0.0018)
[2024-12-03 16:23:55,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.1, 300 sec: 3957.2). Total num frames: 5734400. Throughput: 0: 992.5. Samples: 1431924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:23:55,717][01348] Avg episode reward: [(0, '24.891')]
[2024-12-03 16:24:00,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 5746688. Throughput: 0: 988.2. Samples: 1437108. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:24:00,714][01348] Avg episode reward: [(0, '24.043')]
[2024-12-03 16:24:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3915.5). Total num frames: 5767168. Throughput: 0: 954.4. Samples: 1442696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:24:05,717][01348] Avg episode reward: [(0, '23.508')]
[2024-12-03 16:24:07,020][03358] Updated weights for policy 0, policy_version 1410 (0.0018)
[2024-12-03 16:24:10,711][01348] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 5791744. Throughput: 0: 967.7. Samples: 1446160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:24:10,718][01348] Avg episode reward: [(0, '22.266')]
[2024-12-03 16:24:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 5808128. Throughput: 0: 997.8. Samples: 1452030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:24:15,718][01348] Avg episode reward: [(0, '22.390')]
[2024-12-03 16:24:15,729][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001418_5808128.pth...
[2024-12-03 16:24:15,914][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001188_4866048.pth
[2024-12-03 16:24:18,433][03358] Updated weights for policy 0, policy_version 1420 (0.0022)
[2024-12-03 16:24:20,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5824512. Throughput: 0: 948.7. Samples: 1456774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:24:20,717][01348] Avg episode reward: [(0, '22.166')]
[2024-12-03 16:24:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 5844992. Throughput: 0: 946.7. Samples: 1460170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:24:25,720][01348] Avg episode reward: [(0, '21.961')]
[2024-12-03 16:24:27,820][03358] Updated weights for policy 0, policy_version 1430 (0.0014)
[2024-12-03 16:24:30,711][01348] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 5865472. Throughput: 0: 987.2. Samples: 1466416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:24:30,713][01348] Avg episode reward: [(0, '22.207')]
[2024-12-03 16:24:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 5877760. Throughput: 0: 934.0. Samples: 1470526. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:24:35,718][01348] Avg episode reward: [(0, '22.772')]
[2024-12-03 16:24:39,452][03358] Updated weights for policy 0, policy_version 1440 (0.0024)
[2024-12-03 16:24:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3915.6). Total num frames: 5902336. Throughput: 0: 934.8. Samples: 1473992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:24:40,717][01348] Avg episode reward: [(0, '23.132')]
[2024-12-03 16:24:45,713][01348] Fps is (10 sec: 4504.8, 60 sec: 3891.1, 300 sec: 3929.4). Total num frames: 5922816. Throughput: 0: 968.4. Samples: 1480688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:24:45,717][01348] Avg episode reward: [(0, '22.235')]
[2024-12-03 16:24:50,580][03358] Updated weights for policy 0, policy_version 1450 (0.0019)
[2024-12-03 16:24:50,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5939200. Throughput: 0: 944.6. Samples: 1485202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:24:50,716][01348] Avg episode reward: [(0, '23.321')]
[2024-12-03 16:24:55,711][01348] Fps is (10 sec: 3277.4, 60 sec: 3686.4, 300 sec: 3887.7). Total num frames: 5955584. Throughput: 0: 928.2. Samples: 1487928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:24:55,715][01348] Avg episode reward: [(0, '24.029')]
[2024-12-03 16:25:00,500][03358] Updated weights for policy 0, policy_version 1460 (0.0023)
[2024-12-03 16:25:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 5980160. Throughput: 0: 946.5. Samples: 1494622. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:25:00,720][01348] Avg episode reward: [(0, '23.267')]
[2024-12-03 16:25:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 5996544. Throughput: 0: 955.8. Samples: 1499786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:25:05,715][01348] Avg episode reward: [(0, '24.317')]
[2024-12-03 16:25:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3887.7). Total num frames: 6012928. Throughput: 0: 929.2. Samples: 1501984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:10,713][01348] Avg episode reward: [(0, '24.785')]
[2024-12-03 16:25:11,897][03358] Updated weights for policy 0, policy_version 1470 (0.0032)
[2024-12-03 16:25:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6037504. Throughput: 0: 945.0. Samples: 1508942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:15,716][01348] Avg episode reward: [(0, '25.698')]
[2024-12-03 16:25:20,712][01348] Fps is (10 sec: 4505.2, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 6057984. Throughput: 0: 992.0. Samples: 1515168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:25:20,719][01348] Avg episode reward: [(0, '24.847')]
[2024-12-03 16:25:21,795][03358] Updated weights for policy 0, policy_version 1480 (0.0032)
[2024-12-03 16:25:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 6070272. Throughput: 0: 961.8. Samples: 1517272. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:25,718][01348] Avg episode reward: [(0, '25.480')]
[2024-12-03 16:25:30,711][01348] Fps is (10 sec: 3686.7, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6094848. Throughput: 0: 947.2. Samples: 1523310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:30,713][01348] Avg episode reward: [(0, '25.799')]
[2024-12-03 16:25:32,139][03358] Updated weights for policy 0, policy_version 1490 (0.0020)
[2024-12-03 16:25:35,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 6115328. Throughput: 0: 998.7. Samples: 1530142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:35,715][01348] Avg episode reward: [(0, '26.638')]
[2024-12-03 16:25:35,729][03345] Saving new best policy, reward=26.638!
[2024-12-03 16:25:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6131712. Throughput: 0: 980.5. Samples: 1532052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:25:40,714][01348] Avg episode reward: [(0, '25.533')]
[2024-12-03 16:25:43,827][03358] Updated weights for policy 0, policy_version 1500 (0.0024)
[2024-12-03 16:25:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3887.7). Total num frames: 6148096. Throughput: 0: 949.2. Samples: 1537336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:45,713][01348] Avg episode reward: [(0, '24.695')]
[2024-12-03 16:25:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 6172672. Throughput: 0: 983.6. Samples: 1544048. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:25:50,720][01348] Avg episode reward: [(0, '23.440')]
[2024-12-03 16:25:53,623][03358] Updated weights for policy 0, policy_version 1510 (0.0020)
[2024-12-03 16:25:55,714][01348] Fps is (10 sec: 4094.9, 60 sec: 3891.0, 300 sec: 3915.5). Total num frames: 6189056. Throughput: 0: 993.9. Samples: 1546714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:25:55,716][01348] Avg episode reward: [(0, '22.622')]
[2024-12-03 16:26:00,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 6205440. Throughput: 0: 932.9. Samples: 1550922. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:26:00,718][01348] Avg episode reward: [(0, '21.965')]
[2024-12-03 16:26:05,028][03358] Updated weights for policy 0, policy_version 1520 (0.0013)
[2024-12-03 16:26:05,711][01348] Fps is (10 sec: 3687.3, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 6225920. Throughput: 0: 946.1. Samples: 1557742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:26:05,719][01348] Avg episode reward: [(0, '21.989')]
[2024-12-03 16:26:10,715][01348] Fps is (10 sec: 4094.3, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 6246400. Throughput: 0: 971.8. Samples: 1561008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:26:10,725][01348] Avg episode reward: [(0, '23.251')]
[2024-12-03 16:26:15,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3860.0). Total num frames: 6258688. Throughput: 0: 930.6. Samples: 1565186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:26:15,719][01348] Avg episode reward: [(0, '21.774')]
[2024-12-03 16:26:15,727][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001528_6258688.pth...
[2024-12-03 16:26:15,851][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001304_5341184.pth
[2024-12-03 16:26:16,907][03358] Updated weights for policy 0, policy_version 1530 (0.0015)
[2024-12-03 16:26:20,711][01348] Fps is (10 sec: 3687.9, 60 sec: 3754.7, 300 sec: 3873.8). Total num frames: 6283264. Throughput: 0: 915.0. Samples: 1571318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:26:20,721][01348] Avg episode reward: [(0, '22.263')]
[2024-12-03 16:26:25,712][01348] Fps is (10 sec: 4505.2, 60 sec: 3891.1, 300 sec: 3887.7). Total num frames: 6303744. Throughput: 0: 944.3. Samples: 1574546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:26:25,714][01348] Avg episode reward: [(0, '21.402')]
[2024-12-03 16:26:26,691][03358] Updated weights for policy 0, policy_version 1540 (0.0016)
[2024-12-03 16:26:30,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3860.0). Total num frames: 6316032. Throughput: 0: 932.9. Samples: 1579318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:26:30,713][01348] Avg episode reward: [(0, '20.886')]
[2024-12-03 16:26:35,711][01348] Fps is (10 sec: 3277.1, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 6336512. Throughput: 0: 904.1. Samples: 1584734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:26:35,713][01348] Avg episode reward: [(0, '20.568')]
[2024-12-03 16:26:38,211][03358] Updated weights for policy 0, policy_version 1550 (0.0020)
[2024-12-03 16:26:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 6356992. Throughput: 0: 919.3. Samples: 1588082. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:26:40,718][01348] Avg episode reward: [(0, '19.925')]
[2024-12-03 16:26:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 6373376. Throughput: 0: 953.5. Samples: 1593828. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:26:45,715][01348] Avg episode reward: [(0, '21.285')]
[2024-12-03 16:26:49,987][03358] Updated weights for policy 0, policy_version 1560 (0.0015)
[2024-12-03 16:26:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3832.2). Total num frames: 6389760. Throughput: 0: 908.1. Samples: 1598604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:26:50,713][01348] Avg episode reward: [(0, '21.241')]
[2024-12-03 16:26:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 6414336. Throughput: 0: 909.4. Samples: 1601928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:26:55,718][01348] Avg episode reward: [(0, '22.311')]
[2024-12-03 16:26:59,695][03358] Updated weights for policy 0, policy_version 1570 (0.0013)
[2024-12-03 16:27:00,715][01348] Fps is (10 sec: 4094.5, 60 sec: 3754.4, 300 sec: 3859.9). Total num frames: 6430720. Throughput: 0: 956.2. Samples: 1608220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:27:00,726][01348] Avg episode reward: [(0, '22.789')]
[2024-12-03 16:27:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 6447104. Throughput: 0: 910.6. Samples: 1612294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:27:05,713][01348] Avg episode reward: [(0, '23.349')]
[2024-12-03 16:27:10,711][01348] Fps is (10 sec: 3687.7, 60 sec: 3686.6, 300 sec: 3818.3). Total num frames: 6467584. Throughput: 0: 912.6. Samples: 1615614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:27:10,719][01348] Avg episode reward: [(0, '24.432')]
[2024-12-03 16:27:10,978][03358] Updated weights for policy 0, policy_version 1580 (0.0020)
[2024-12-03 16:27:15,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 6492160. Throughput: 0: 963.8. Samples: 1622690. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:27:15,716][01348] Avg episode reward: [(0, '24.654')]
[2024-12-03 16:27:20,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 6504448. Throughput: 0: 949.4. Samples: 1627458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:27:20,718][01348] Avg episode reward: [(0, '25.914')]
[2024-12-03 16:27:22,312][03358] Updated weights for policy 0, policy_version 1590 (0.0018)
[2024-12-03 16:27:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3818.3). Total num frames: 6524928. Throughput: 0: 937.4. Samples: 1630264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:27:25,718][01348] Avg episode reward: [(0, '26.538')]
[2024-12-03 16:27:30,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 6549504. Throughput: 0: 964.3. Samples: 1637220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:27:30,713][01348] Avg episode reward: [(0, '26.045')]
[2024-12-03 16:27:30,951][03358] Updated weights for policy 0, policy_version 1600 (0.0018)
[2024-12-03 16:27:35,715][01348] Fps is (10 sec: 4094.5, 60 sec: 3822.7, 300 sec: 3846.0). Total num frames: 6565888. Throughput: 0: 983.3. Samples: 1642854. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:27:35,717][01348] Avg episode reward: [(0, '26.430')]
[2024-12-03 16:27:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 6586368. Throughput: 0: 958.1. Samples: 1645044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:27:40,714][01348] Avg episode reward: [(0, '25.548')]
[2024-12-03 16:27:42,079][03358] Updated weights for policy 0, policy_version 1610 (0.0026)
[2024-12-03 16:27:45,711][01348] Fps is (10 sec: 4097.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 6606848. Throughput: 0: 972.8. Samples: 1651992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:27:45,714][01348] Avg episode reward: [(0, '23.673')]
[2024-12-03 16:27:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 6627328. Throughput: 0: 1022.0. Samples: 1658282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:27:50,717][01348] Avg episode reward: [(0, '23.343')]
[2024-12-03 16:27:52,260][03358] Updated weights for policy 0, policy_version 1620 (0.0020)
[2024-12-03 16:27:55,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 6643712. Throughput: 0: 994.9. Samples: 1660386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:27:55,714][01348] Avg episode reward: [(0, '22.242')]
[2024-12-03 16:28:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.7, 300 sec: 3832.2). Total num frames: 6668288. Throughput: 0: 974.6. Samples: 1666546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:28:00,713][01348] Avg episode reward: [(0, '22.378')]
[2024-12-03 16:28:02,309][03358] Updated weights for policy 0, policy_version 1630 (0.0018)
[2024-12-03 16:28:05,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 6688768. Throughput: 0: 1024.6. Samples: 1673564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:28:05,713][01348] Avg episode reward: [(0, '22.921')]
[2024-12-03 16:28:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 6705152. Throughput: 0: 1010.0. Samples: 1675714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:28:10,715][01348] Avg episode reward: [(0, '23.030')]
[2024-12-03 16:28:13,380][03358] Updated weights for policy 0, policy_version 1640 (0.0017)
[2024-12-03 16:28:15,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 6725632. Throughput: 0: 977.8. Samples: 1681222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:28:15,715][01348] Avg episode reward: [(0, '23.125')]
[2024-12-03 16:28:15,725][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001642_6725632.pth...
[2024-12-03 16:28:15,876][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001418_5808128.pth
[2024-12-03 16:28:20,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3860.0). Total num frames: 6750208. Throughput: 0: 1010.0. Samples: 1688300. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:28:20,719][01348] Avg episode reward: [(0, '22.681')]
[2024-12-03 16:28:22,137][03358] Updated weights for policy 0, policy_version 1650 (0.0014)
[2024-12-03 16:28:25,711][01348] Fps is (10 sec: 4096.2, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 6766592. Throughput: 0: 1024.4. Samples: 1691142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:28:25,713][01348] Avg episode reward: [(0, '23.304')]
[2024-12-03 16:28:30,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 6782976. Throughput: 0: 969.0. Samples: 1695596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:28:30,718][01348] Avg episode reward: [(0, '23.004')]
[2024-12-03 16:28:33,657][03358] Updated weights for policy 0, policy_version 1660 (0.0021)
[2024-12-03 16:28:35,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4028.0, 300 sec: 3846.1). Total num frames: 6807552. Throughput: 0: 988.4. Samples: 1702760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:28:35,716][01348] Avg episode reward: [(0, '23.903')]
[2024-12-03 16:28:40,718][01348] Fps is (10 sec: 4502.4, 60 sec: 4027.2, 300 sec: 3859.9). Total num frames: 6828032. Throughput: 0: 1020.1. Samples: 1706296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:28:40,725][01348] Avg episode reward: [(0, '24.683')]
[2024-12-03 16:28:44,429][03358] Updated weights for policy 0, policy_version 1670 (0.0019)
[2024-12-03 16:28:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 6840320. Throughput: 0: 980.9. Samples: 1710688. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:28:45,718][01348] Avg episode reward: [(0, '24.513')]
[2024-12-03 16:28:50,711][01348] Fps is (10 sec: 3689.1, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 6864896. Throughput: 0: 973.2. Samples: 1717358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:28:50,719][01348] Avg episode reward: [(0, '25.243')]
[2024-12-03 16:28:53,451][03358] Updated weights for policy 0, policy_version 1680 (0.0018)
[2024-12-03 16:28:55,711][01348] Fps is (10 sec: 4915.1, 60 sec: 4096.0, 300 sec: 3873.8). Total num frames: 6889472. Throughput: 0: 1004.8. Samples: 1720928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:28:55,713][01348] Avg episode reward: [(0, '25.147')]
[2024-12-03 16:29:00,713][01348] Fps is (10 sec: 4095.2, 60 sec: 3959.3, 300 sec: 3859.9). Total num frames: 6905856. Throughput: 0: 997.8. Samples: 1726126. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:29:00,717][01348] Avg episode reward: [(0, '25.796')]
[2024-12-03 16:29:04,682][03358] Updated weights for policy 0, policy_version 1690 (0.0018)
[2024-12-03 16:29:05,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 6926336. Throughput: 0: 972.5. Samples: 1732064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:29:05,714][01348] Avg episode reward: [(0, '25.326')]
[2024-12-03 16:29:10,711][01348] Fps is (10 sec: 4096.8, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 6946816. Throughput: 0: 986.8. Samples: 1735548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:29:10,713][01348] Avg episode reward: [(0, '25.743')]
[2024-12-03 16:29:14,302][03358] Updated weights for policy 0, policy_version 1700 (0.0019)
[2024-12-03 16:29:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 6963200. Throughput: 0: 1022.0. Samples: 1741584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:29:15,718][01348] Avg episode reward: [(0, '23.921')]
[2024-12-03 16:29:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 6983680. Throughput: 0: 974.0. Samples: 1746590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:29:20,713][01348] Avg episode reward: [(0, '25.027')]
[2024-12-03 16:29:24,716][03358] Updated weights for policy 0, policy_version 1710 (0.0021)
[2024-12-03 16:29:25,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 7008256. Throughput: 0: 975.6. Samples: 1750192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:29:25,719][01348] Avg episode reward: [(0, '24.045')]
[2024-12-03 16:29:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3887.7). Total num frames: 7024640. Throughput: 0: 1023.9. Samples: 1756762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:29:30,717][01348] Avg episode reward: [(0, '24.138')]
[2024-12-03 16:29:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 7041024. Throughput: 0: 972.0. Samples: 1761100. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:29:35,719][01348] Avg episode reward: [(0, '23.034')]
[2024-12-03 16:29:36,127][03358] Updated weights for policy 0, policy_version 1720 (0.0022)
[2024-12-03 16:29:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3960.0, 300 sec: 3873.9). Total num frames: 7065600. Throughput: 0: 971.6. Samples: 1764648. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:29:40,713][01348] Avg episode reward: [(0, '23.743')]
[2024-12-03 16:29:44,757][03358] Updated weights for policy 0, policy_version 1730 (0.0016)
[2024-12-03 16:29:45,715][01348] Fps is (10 sec: 4503.8, 60 sec: 4095.7, 300 sec: 3887.7). Total num frames: 7086080. Throughput: 0: 1014.4. Samples: 1771778. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:29:45,722][01348] Avg episode reward: [(0, '23.890')]
[2024-12-03 16:29:50,714][01348] Fps is (10 sec: 3685.4, 60 sec: 3959.3, 300 sec: 3887.7). Total num frames: 7102464. Throughput: 0: 985.2. Samples: 1776402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:29:50,719][01348] Avg episode reward: [(0, '24.480')]
[2024-12-03 16:29:55,711][01348] Fps is (10 sec: 3687.8, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 7122944. Throughput: 0: 971.3. Samples: 1779258. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:29:55,716][01348] Avg episode reward: [(0, '25.121')]
[2024-12-03 16:29:56,184][03358] Updated weights for policy 0, policy_version 1740 (0.0015)
[2024-12-03 16:30:00,711][01348] Fps is (10 sec: 4506.9, 60 sec: 4027.9, 300 sec: 3901.6). Total num frames: 7147520. Throughput: 0: 994.0. Samples: 1786312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:30:00,713][01348] Avg episode reward: [(0, '23.950')]
[2024-12-03 16:30:05,713][01348] Fps is (10 sec: 4095.3, 60 sec: 3959.3, 300 sec: 3901.6). Total num frames: 7163904. Throughput: 0: 1001.5. Samples: 1791660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:30:05,715][01348] Avg episode reward: [(0, '24.146')]
[2024-12-03 16:30:06,893][03358] Updated weights for policy 0, policy_version 1750 (0.0016)
[2024-12-03 16:30:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 7180288. Throughput: 0: 971.2. Samples: 1793894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:30:10,716][01348] Avg episode reward: [(0, '22.862')]
[2024-12-03 16:30:15,711][01348] Fps is (10 sec: 4096.7, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 7204864. Throughput: 0: 981.6. Samples: 1800934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:30:15,713][01348] Avg episode reward: [(0, '22.649')]
[2024-12-03 16:30:15,728][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001759_7204864.pth...
[2024-12-03 16:30:15,856][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001528_6258688.pth
[2024-12-03 16:30:16,334][03358] Updated weights for policy 0, policy_version 1760 (0.0024)
[2024-12-03 16:30:20,712][01348] Fps is (10 sec: 4505.2, 60 sec: 4027.7, 300 sec: 3915.5). Total num frames: 7225344. Throughput: 0: 1024.2. Samples: 1807188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:30:20,717][01348] Avg episode reward: [(0, '22.214')]
[2024-12-03 16:30:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 7237632. Throughput: 0: 991.4. Samples: 1809260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:30:25,713][01348] Avg episode reward: [(0, '23.231')]
[2024-12-03 16:30:27,669][03358] Updated weights for policy 0, policy_version 1770 (0.0027)
[2024-12-03 16:30:30,711][01348] Fps is (10 sec: 3686.6, 60 sec: 3959.4, 300 sec: 3887.7). Total num frames: 7262208. Throughput: 0: 965.6. Samples: 1815228. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:30:30,714][01348] Avg episode reward: [(0, '22.734')]
[2024-12-03 16:30:35,711][01348] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3915.5). Total num frames: 7286784. Throughput: 0: 1021.7. Samples: 1822374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:30:35,714][01348] Avg episode reward: [(0, '23.943')]
[2024-12-03 16:30:36,709][03358] Updated weights for policy 0, policy_version 1780 (0.0016)
[2024-12-03 16:30:40,717][01348] Fps is (10 sec: 3684.4, 60 sec: 3890.8, 300 sec: 3901.5). Total num frames: 7299072. Throughput: 0: 1005.3. Samples: 1824504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:30:40,719][01348] Avg episode reward: [(0, '23.606')]
[2024-12-03 16:30:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.5, 300 sec: 3887.7). Total num frames: 7319552. Throughput: 0: 967.2. Samples: 1829836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:30:45,717][01348] Avg episode reward: [(0, '23.523')]
[2024-12-03 16:30:47,631][03358] Updated weights for policy 0, policy_version 1790 (0.0021)
[2024-12-03 16:30:50,711][01348] Fps is (10 sec: 4508.3, 60 sec: 4027.9, 300 sec: 3915.5). Total num frames: 7344128. Throughput: 0: 1006.1. Samples: 1836932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:30:50,713][01348] Avg episode reward: [(0, '23.145')]
[2024-12-03 16:30:55,711][01348] Fps is (10 sec: 4095.8, 60 sec: 3959.4, 300 sec: 3915.5). Total num frames: 7360512. Throughput: 0: 1022.9. Samples: 1839926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:30:55,714][01348] Avg episode reward: [(0, '22.647')]
[2024-12-03 16:30:58,977][03358] Updated weights for policy 0, policy_version 1800 (0.0026)
[2024-12-03 16:31:00,711][01348] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 7376896. Throughput: 0: 963.3. Samples: 1844284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:31:00,714][01348] Avg episode reward: [(0, '23.387')]
[2024-12-03 16:31:05,711][01348] Fps is (10 sec: 4096.2, 60 sec: 3959.6, 300 sec: 3915.6). Total num frames: 7401472. Throughput: 0: 974.5. Samples: 1851040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:31:05,715][01348] Avg episode reward: [(0, '24.509')]
[2024-12-03 16:31:08,114][03358] Updated weights for policy 0, policy_version 1810 (0.0026)
[2024-12-03 16:31:10,711][01348] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 7421952. Throughput: 0: 1004.6. Samples: 1854466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:31:10,718][01348] Avg episode reward: [(0, '24.408')]
[2024-12-03 16:31:15,712][01348] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 7434240. Throughput: 0: 973.6. Samples: 1859042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:31:15,718][01348] Avg episode reward: [(0, '24.212')]
[2024-12-03 16:31:19,508][03358] Updated weights for policy 0, policy_version 1820 (0.0027)
[2024-12-03 16:31:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 7458816. Throughput: 0: 958.5. Samples: 1865508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:31:20,713][01348] Avg episode reward: [(0, '23.861')]
[2024-12-03 16:31:25,711][01348] Fps is (10 sec: 4915.5, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 7483392. Throughput: 0: 991.1. Samples: 1869098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:31:25,713][01348] Avg episode reward: [(0, '23.653')]
[2024-12-03 16:31:29,520][03358] Updated weights for policy 0, policy_version 1830 (0.0020)
[2024-12-03 16:31:30,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 7495680. Throughput: 0: 991.1. Samples: 1874436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:31:30,713][01348] Avg episode reward: [(0, '23.577')]
[2024-12-03 16:31:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 7516160. Throughput: 0: 958.7. Samples: 1880072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:31:35,720][01348] Avg episode reward: [(0, '22.982')]
[2024-12-03 16:31:39,563][03358] Updated weights for policy 0, policy_version 1840 (0.0016)
[2024-12-03 16:31:40,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4028.1, 300 sec: 3957.2). Total num frames: 7540736. Throughput: 0: 971.5. Samples: 1883644. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:31:40,718][01348] Avg episode reward: [(0, '22.934')]
[2024-12-03 16:31:45,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 7557120. Throughput: 0: 1014.3. Samples: 1889926. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-12-03 16:31:45,713][01348] Avg episode reward: [(0, '23.934')]
[2024-12-03 16:31:50,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 7573504. Throughput: 0: 969.0. Samples: 1894646. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:31:50,717][01348] Avg episode reward: [(0, '22.685')]
[2024-12-03 16:31:50,894][03358] Updated weights for policy 0, policy_version 1850 (0.0016)
[2024-12-03 16:31:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 7598080. Throughput: 0: 970.9. Samples: 1898156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:31:55,719][01348] Avg episode reward: [(0, '22.860')]
[2024-12-03 16:31:59,755][03358] Updated weights for policy 0, policy_version 1860 (0.0014)
[2024-12-03 16:32:00,718][01348] Fps is (10 sec: 4502.5, 60 sec: 4027.3, 300 sec: 3970.9). Total num frames: 7618560. Throughput: 0: 1023.0. Samples: 1905082. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:32:00,724][01348] Avg episode reward: [(0, '23.157')]
[2024-12-03 16:32:05,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 7634944. Throughput: 0: 974.1. Samples: 1909344. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:32:05,717][01348] Avg episode reward: [(0, '24.535')]
[2024-12-03 16:32:10,711][01348] Fps is (10 sec: 3688.9, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 7655424. Throughput: 0: 963.8. Samples: 1912470. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:32:10,713][01348] Avg episode reward: [(0, '22.669')]
[2024-12-03 16:32:11,124][03358] Updated weights for policy 0, policy_version 1870 (0.0022)
[2024-12-03 16:32:15,711][01348] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 3984.9). Total num frames: 7680000. Throughput: 0: 1003.8. Samples: 1919606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:32:15,718][01348] Avg episode reward: [(0, '22.973')]
[2024-12-03 16:32:15,732][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001875_7680000.pth...
[2024-12-03 16:32:15,885][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001642_6725632.pth
[2024-12-03 16:32:20,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 7692288. Throughput: 0: 988.9. Samples: 1924574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:32:20,718][01348] Avg episode reward: [(0, '23.287')]
[2024-12-03 16:32:22,099][03358] Updated weights for policy 0, policy_version 1880 (0.0027)
[2024-12-03 16:32:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 7712768. Throughput: 0: 962.7. Samples: 1926964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:32:25,713][01348] Avg episode reward: [(0, '22.580')]
[2024-12-03 16:32:30,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.1). Total num frames: 7737344. Throughput: 0: 976.4. Samples: 1933862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:32:30,713][01348] Avg episode reward: [(0, '22.768')]
[2024-12-03 16:32:31,465][03358] Updated weights for policy 0, policy_version 1890 (0.0015)
[2024-12-03 16:32:35,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 7753728. Throughput: 0: 998.0. Samples: 1939558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:32:35,714][01348] Avg episode reward: [(0, '22.642')]
[2024-12-03 16:32:40,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 7770112. Throughput: 0: 965.6. Samples: 1941606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:32:40,713][01348] Avg episode reward: [(0, '23.602')]
[2024-12-03 16:32:42,811][03358] Updated weights for policy 0, policy_version 1900 (0.0021)
[2024-12-03 16:32:45,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 7794688. Throughput: 0: 957.8. Samples: 1948176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:32:45,717][01348] Avg episode reward: [(0, '24.142')]
[2024-12-03 16:32:50,713][01348] Fps is (10 sec: 4504.8, 60 sec: 4027.6, 300 sec: 3971.0). Total num frames: 7815168. Throughput: 0: 1007.8. Samples: 1954698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:32:50,720][01348] Avg episode reward: [(0, '22.772')]
[2024-12-03 16:32:53,028][03358] Updated weights for policy 0, policy_version 1910 (0.0018)
[2024-12-03 16:32:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 7827456. Throughput: 0: 984.0. Samples: 1956752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:32:55,716][01348] Avg episode reward: [(0, '22.100')]
[2024-12-03 16:33:00,711][01348] Fps is (10 sec: 3687.1, 60 sec: 3891.6, 300 sec: 3943.3). Total num frames: 7852032. Throughput: 0: 953.7. Samples: 1962522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:33:00,715][01348] Avg episode reward: [(0, '22.965')]
[2024-12-03 16:33:03,217][03358] Updated weights for policy 0, policy_version 1920 (0.0017)
[2024-12-03 16:33:05,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 7872512. Throughput: 0: 999.3. Samples: 1969544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:33:05,715][01348] Avg episode reward: [(0, '23.958')]
[2024-12-03 16:33:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 7888896. Throughput: 0: 995.6. Samples: 1971768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:33:10,719][01348] Avg episode reward: [(0, '24.057')]
[2024-12-03 16:33:14,795][03358] Updated weights for policy 0, policy_version 1930 (0.0014)
[2024-12-03 16:33:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 7909376. Throughput: 0: 948.9. Samples: 1976562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:33:15,717][01348] Avg episode reward: [(0, '24.757')]
[2024-12-03 16:33:20,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 7929856. Throughput: 0: 974.0. Samples: 1983386. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:33:20,713][01348] Avg episode reward: [(0, '26.709')]
[2024-12-03 16:33:20,719][03345] Saving new best policy, reward=26.709!
[2024-12-03 16:33:24,818][03358] Updated weights for policy 0, policy_version 1940 (0.0023)
[2024-12-03 16:33:25,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 7946240. Throughput: 0: 992.0. Samples: 1986244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:33:25,713][01348] Avg episode reward: [(0, '27.610')]
[2024-12-03 16:33:25,719][03345] Saving new best policy, reward=27.610!
[2024-12-03 16:33:30,711][01348] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3901.6). Total num frames: 7958528. Throughput: 0: 930.9. Samples: 1990066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:33:30,714][01348] Avg episode reward: [(0, '26.175')]
[2024-12-03 16:33:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3915.6). Total num frames: 7983104. Throughput: 0: 927.3. Samples: 1996426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:33:35,714][01348] Avg episode reward: [(0, '25.389')]
[2024-12-03 16:33:36,252][03358] Updated weights for policy 0, policy_version 1950 (0.0037)
[2024-12-03 16:33:40,711][01348] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 8003584. Throughput: 0: 960.3. Samples: 1999964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:33:40,717][01348] Avg episode reward: [(0, '23.791')]
[2024-12-03 16:33:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3915.5). Total num frames: 8019968. Throughput: 0: 943.0. Samples: 2004958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:33:45,713][01348] Avg episode reward: [(0, '22.920')]
[2024-12-03 16:33:47,636][03358] Updated weights for policy 0, policy_version 1960 (0.0013)
[2024-12-03 16:33:50,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3754.8, 300 sec: 3901.6). Total num frames: 8040448. Throughput: 0: 924.2. Samples: 2011134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:33:50,713][01348] Avg episode reward: [(0, '21.460')]
[2024-12-03 16:33:55,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 8065024. Throughput: 0: 952.8. Samples: 2014646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:33:55,714][01348] Avg episode reward: [(0, '22.009')]
[2024-12-03 16:33:56,332][03358] Updated weights for policy 0, policy_version 1970 (0.0029)
[2024-12-03 16:34:00,711][01348] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 8081408. Throughput: 0: 972.5. Samples: 2020324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:34:00,714][01348] Avg episode reward: [(0, '21.686')]
[2024-12-03 16:34:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 8097792. Throughput: 0: 928.4. Samples: 2025162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:34:05,720][01348] Avg episode reward: [(0, '22.318')]
[2024-12-03 16:34:08,034][03358] Updated weights for policy 0, policy_version 1980 (0.0024)
[2024-12-03 16:34:10,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 8118272. Throughput: 0: 942.7. Samples: 2028666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:34:10,718][01348] Avg episode reward: [(0, '22.791')]
[2024-12-03 16:34:15,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 8138752. Throughput: 0: 1004.7. Samples: 2035278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:34:15,715][01348] Avg episode reward: [(0, '22.143')]
[2024-12-03 16:34:15,726][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001987_8138752.pth...
[2024-12-03 16:34:15,891][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001759_7204864.pth
[2024-12-03 16:34:19,071][03358] Updated weights for policy 0, policy_version 1990 (0.0014)
[2024-12-03 16:34:20,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3887.7). Total num frames: 8155136. Throughput: 0: 955.0. Samples: 2039402. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:34:20,713][01348] Avg episode reward: [(0, '22.316')]
[2024-12-03 16:34:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 8179712. Throughput: 0: 951.7. Samples: 2042790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:34:25,713][01348] Avg episode reward: [(0, '23.224')]
[2024-12-03 16:34:28,472][03358] Updated weights for policy 0, policy_version 2000 (0.0038)
[2024-12-03 16:34:30,716][01348] Fps is (10 sec: 4503.5, 60 sec: 4027.4, 300 sec: 3929.3). Total num frames: 8200192. Throughput: 0: 997.1. Samples: 2049832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:34:30,718][01348] Avg episode reward: [(0, '23.342')]
[2024-12-03 16:34:35,716][01348] Fps is (10 sec: 3275.2, 60 sec: 3822.6, 300 sec: 3887.7). Total num frames: 8212480. Throughput: 0: 958.3. Samples: 2054264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:34:35,718][01348] Avg episode reward: [(0, '23.482')]
[2024-12-03 16:34:39,872][03358] Updated weights for policy 0, policy_version 2010 (0.0020)
[2024-12-03 16:34:40,711][01348] Fps is (10 sec: 3688.2, 60 sec: 3891.2, 300 sec: 3901.7). Total num frames: 8237056. Throughput: 0: 945.2. Samples: 2057178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:34:40,719][01348] Avg episode reward: [(0, '23.940')]
[2024-12-03 16:34:45,711][01348] Fps is (10 sec: 4507.8, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8257536. Throughput: 0: 980.7. Samples: 2064456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:34:45,721][01348] Avg episode reward: [(0, '24.127')]
[2024-12-03 16:34:49,185][03358] Updated weights for policy 0, policy_version 2020 (0.0022)
[2024-12-03 16:34:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8278016. Throughput: 0: 998.7. Samples: 2070104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:34:50,716][01348] Avg episode reward: [(0, '23.219')]
[2024-12-03 16:34:55,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 8294400. Throughput: 0: 970.6. Samples: 2072344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:34:55,718][01348] Avg episode reward: [(0, '24.280')]
[2024-12-03 16:34:59,474][03358] Updated weights for policy 0, policy_version 2030 (0.0029)
[2024-12-03 16:35:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8318976. Throughput: 0: 978.6. Samples: 2079316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:35:00,713][01348] Avg episode reward: [(0, '23.489')]
[2024-12-03 16:35:05,711][01348] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 8339456. Throughput: 0: 1022.1. Samples: 2085396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:35:05,714][01348] Avg episode reward: [(0, '22.241')]
[2024-12-03 16:35:10,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 8351744. Throughput: 0: 991.2. Samples: 2087396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:35:10,714][01348] Avg episode reward: [(0, '22.016')]
[2024-12-03 16:35:11,200][03358] Updated weights for policy 0, policy_version 2040 (0.0030)
[2024-12-03 16:35:15,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 8376320. Throughput: 0: 968.8. Samples: 2093424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:35:15,721][01348] Avg episode reward: [(0, '22.548')]
[2024-12-03 16:35:19,900][03358] Updated weights for policy 0, policy_version 2050 (0.0020)
[2024-12-03 16:35:20,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 8396800. Throughput: 0: 1028.1. Samples: 2100522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:35:20,718][01348] Avg episode reward: [(0, '21.739')]
[2024-12-03 16:35:25,717][01348] Fps is (10 sec: 3684.2, 60 sec: 3890.8, 300 sec: 3901.5). Total num frames: 8413184. Throughput: 0: 1009.4. Samples: 2102606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:35:25,719][01348] Avg episode reward: [(0, '21.460')]
[2024-12-03 16:35:30,714][01348] Fps is (10 sec: 3685.3, 60 sec: 3891.3, 300 sec: 3887.7). Total num frames: 8433664. Throughput: 0: 966.4. Samples: 2107946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:35:30,719][01348] Avg episode reward: [(0, '21.853')]
[2024-12-03 16:35:31,320][03358] Updated weights for policy 0, policy_version 2060 (0.0034)
[2024-12-03 16:35:35,711][01348] Fps is (10 sec: 4098.4, 60 sec: 4028.1, 300 sec: 3915.6). Total num frames: 8454144. Throughput: 0: 994.8. Samples: 2114868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:35:35,713][01348] Avg episode reward: [(0, '22.584')]
[2024-12-03 16:35:40,711][01348] Fps is (10 sec: 4097.2, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8474624. Throughput: 0: 1008.7. Samples: 2117736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:35:40,718][01348] Avg episode reward: [(0, '23.663')]
[2024-12-03 16:35:42,136][03358] Updated weights for policy 0, policy_version 2070 (0.0022)
[2024-12-03 16:35:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 8491008. Throughput: 0: 955.8. Samples: 2122326. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:35:45,719][01348] Avg episode reward: [(0, '23.282')]
[2024-12-03 16:35:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8515584. Throughput: 0: 979.3. Samples: 2129464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:35:50,719][01348] Avg episode reward: [(0, '24.015')]
[2024-12-03 16:35:51,545][03358] Updated weights for policy 0, policy_version 2080 (0.0029)
[2024-12-03 16:35:55,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 8536064. Throughput: 0: 1012.8. Samples: 2132970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:35:55,713][01348] Avg episode reward: [(0, '22.876')]
[2024-12-03 16:36:00,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 8548352. Throughput: 0: 975.2. Samples: 2137306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:36:00,719][01348] Avg episode reward: [(0, '22.753')]
[2024-12-03 16:36:02,954][03358] Updated weights for policy 0, policy_version 2090 (0.0020)
[2024-12-03 16:36:05,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 8572928. Throughput: 0: 964.0. Samples: 2143900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:36:05,719][01348] Avg episode reward: [(0, '20.739')]
[2024-12-03 16:36:10,711][01348] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3943.3). Total num frames: 8597504. Throughput: 0: 998.8. Samples: 2147546. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:36:10,719][01348] Avg episode reward: [(0, '22.106')]
[2024-12-03 16:36:11,853][03358] Updated weights for policy 0, policy_version 2100 (0.0033)
[2024-12-03 16:36:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 8609792. Throughput: 0: 998.9. Samples: 2152892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:36:15,717][01348] Avg episode reward: [(0, '21.939')]
[2024-12-03 16:36:15,730][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002102_8609792.pth...
[2024-12-03 16:36:15,922][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001875_7680000.pth
[2024-12-03 16:36:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 8630272. Throughput: 0: 975.8. Samples: 2158778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:36:20,713][01348] Avg episode reward: [(0, '21.893')]
[2024-12-03 16:36:22,596][03358] Updated weights for policy 0, policy_version 2110 (0.0024)
[2024-12-03 16:36:25,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4028.1, 300 sec: 3929.4). Total num frames: 8654848. Throughput: 0: 987.8. Samples: 2162188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:36:25,716][01348] Avg episode reward: [(0, '21.602')]
[2024-12-03 16:36:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.7, 300 sec: 3915.5). Total num frames: 8671232. Throughput: 0: 1021.4. Samples: 2168288. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:36:30,716][01348] Avg episode reward: [(0, '21.343')]
[2024-12-03 16:36:34,037][03358] Updated weights for policy 0, policy_version 2120 (0.0015)
[2024-12-03 16:36:35,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 8687616. Throughput: 0: 967.0. Samples: 2172978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:36:35,713][01348] Avg episode reward: [(0, '21.103')]
[2024-12-03 16:36:40,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8712192. Throughput: 0: 970.2. Samples: 2176628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:36:40,717][01348] Avg episode reward: [(0, '19.931')]
[2024-12-03 16:36:42,808][03358] Updated weights for policy 0, policy_version 2130 (0.0014)
[2024-12-03 16:36:45,713][01348] Fps is (10 sec: 4504.8, 60 sec: 4027.6, 300 sec: 3929.4). Total num frames: 8732672. Throughput: 0: 1028.9. Samples: 2183610. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:36:45,723][01348] Avg episode reward: [(0, '20.454')]
[2024-12-03 16:36:50,714][01348] Fps is (10 sec: 3685.4, 60 sec: 3891.0, 300 sec: 3901.6). Total num frames: 8749056. Throughput: 0: 976.8. Samples: 2187858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:36:50,717][01348] Avg episode reward: [(0, '20.819')]
[2024-12-03 16:36:54,105][03358] Updated weights for policy 0, policy_version 2140 (0.0021)
[2024-12-03 16:36:55,711][01348] Fps is (10 sec: 3687.1, 60 sec: 3891.2, 300 sec: 3901.7). Total num frames: 8769536. Throughput: 0: 971.1. Samples: 2191246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:36:55,718][01348] Avg episode reward: [(0, '22.412')]
[2024-12-03 16:37:00,711][01348] Fps is (10 sec: 4506.9, 60 sec: 4096.0, 300 sec: 3929.4). Total num frames: 8794112. Throughput: 0: 1010.6. Samples: 2198368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:37:00,718][01348] Avg episode reward: [(0, '24.410')]
[2024-12-03 16:37:03,811][03358] Updated weights for policy 0, policy_version 2150 (0.0019)
[2024-12-03 16:37:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8810496. Throughput: 0: 994.3. Samples: 2203520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:37:05,713][01348] Avg episode reward: [(0, '24.499')]
[2024-12-03 16:37:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 8830976. Throughput: 0: 971.8. Samples: 2205920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:37:10,713][01348] Avg episode reward: [(0, '23.048')]
[2024-12-03 16:37:13,927][03358] Updated weights for policy 0, policy_version 2160 (0.0021)
[2024-12-03 16:37:15,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3943.3). Total num frames: 8855552. Throughput: 0: 998.4. Samples: 2213214. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:37:15,721][01348] Avg episode reward: [(0, '23.640')]
[2024-12-03 16:37:20,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 8871936. Throughput: 0: 1031.1. Samples: 2219378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:37:20,714][01348] Avg episode reward: [(0, '22.220')]
[2024-12-03 16:37:24,850][03358] Updated weights for policy 0, policy_version 2170 (0.0032)
[2024-12-03 16:37:25,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 8888320. Throughput: 0: 997.2. Samples: 2221502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:37:25,715][01348] Avg episode reward: [(0, '23.254')]
[2024-12-03 16:37:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 8912896. Throughput: 0: 987.0. Samples: 2228024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:37:30,724][01348] Avg episode reward: [(0, '22.300')]
[2024-12-03 16:37:33,648][03358] Updated weights for policy 0, policy_version 2180 (0.0019)
[2024-12-03 16:37:35,711][01348] Fps is (10 sec: 4915.2, 60 sec: 4164.3, 300 sec: 3957.2). Total num frames: 8937472. Throughput: 0: 1042.2. Samples: 2234754. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:37:35,717][01348] Avg episode reward: [(0, '23.839')]
[2024-12-03 16:37:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 8949760. Throughput: 0: 1014.7. Samples: 2236908. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:37:40,714][01348] Avg episode reward: [(0, '23.189')]
[2024-12-03 16:37:44,971][03358] Updated weights for policy 0, policy_version 2190 (0.0024)
[2024-12-03 16:37:45,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3959.6, 300 sec: 3915.5). Total num frames: 8970240. Throughput: 0: 984.2. Samples: 2242656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:37:45,713][01348] Avg episode reward: [(0, '24.775')]
[2024-12-03 16:37:50,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.2, 300 sec: 3957.2). Total num frames: 8994816. Throughput: 0: 1028.8. Samples: 2249818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:37:50,719][01348] Avg episode reward: [(0, '23.393')]
[2024-12-03 16:37:54,584][03358] Updated weights for policy 0, policy_version 2200 (0.0017)
[2024-12-03 16:37:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 9011200. Throughput: 0: 1036.8. Samples: 2252576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:37:55,717][01348] Avg episode reward: [(0, '23.963')]
[2024-12-03 16:38:00,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 9031680. Throughput: 0: 981.5. Samples: 2257382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:38:00,713][01348] Avg episode reward: [(0, '25.237')]
[2024-12-03 16:38:04,811][03358] Updated weights for policy 0, policy_version 2210 (0.0035)
[2024-12-03 16:38:05,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 9056256. Throughput: 0: 1004.4. Samples: 2264578. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:38:05,713][01348] Avg episode reward: [(0, '25.372')]
[2024-12-03 16:38:10,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 9076736. Throughput: 0: 1038.6. Samples: 2268240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:38:10,718][01348] Avg episode reward: [(0, '26.074')]
[2024-12-03 16:38:15,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 9089024. Throughput: 0: 993.2. Samples: 2272718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:38:15,715][01348] Avg episode reward: [(0, '25.446')]
[2024-12-03 16:38:15,724][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002219_9089024.pth...
[2024-12-03 16:38:15,843][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001987_8138752.pth
[2024-12-03 16:38:15,992][03358] Updated weights for policy 0, policy_version 2220 (0.0033)
[2024-12-03 16:38:20,711][01348] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 3957.1). Total num frames: 9113600. Throughput: 0: 993.0. Samples: 2279440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:38:20,719][01348] Avg episode reward: [(0, '24.350')]
[2024-12-03 16:38:24,483][03358] Updated weights for policy 0, policy_version 2230 (0.0021)
[2024-12-03 16:38:25,714][01348] Fps is (10 sec: 4913.8, 60 sec: 4164.1, 300 sec: 3998.8). Total num frames: 9138176. Throughput: 0: 1024.2. Samples: 2283002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:38:25,720][01348] Avg episode reward: [(0, '24.986')]
[2024-12-03 16:38:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 9154560. Throughput: 0: 1016.5. Samples: 2288398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:38:30,718][01348] Avg episode reward: [(0, '25.122')]
[2024-12-03 16:38:35,711][01348] Fps is (10 sec: 3277.7, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 9170944. Throughput: 0: 984.1. Samples: 2294104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:38:35,715][01348] Avg episode reward: [(0, '24.237')]
[2024-12-03 16:38:35,787][03358] Updated weights for policy 0, policy_version 2240 (0.0030)
[2024-12-03 16:38:40,711][01348] Fps is (10 sec: 4096.1, 60 sec: 4096.0, 300 sec: 3984.9). Total num frames: 9195520. Throughput: 0: 1001.7. Samples: 2297652. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:38:40,714][01348] Avg episode reward: [(0, '24.571')]
[2024-12-03 16:38:45,522][03358] Updated weights for policy 0, policy_version 2250 (0.0015)
[2024-12-03 16:38:45,711][01348] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3984.9). Total num frames: 9216000. Throughput: 0: 1030.8. Samples: 2303768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:38:45,713][01348] Avg episode reward: [(0, '26.319')]
[2024-12-03 16:38:50,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 9232384. Throughput: 0: 977.1. Samples: 2308546. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:38:50,716][01348] Avg episode reward: [(0, '25.569')]
[2024-12-03 16:38:55,711][01348] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 9252864. Throughput: 0: 974.2. Samples: 2312080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:38:55,713][01348] Avg episode reward: [(0, '25.312')]
[2024-12-03 16:38:55,793][03358] Updated weights for policy 0, policy_version 2260 (0.0023)
[2024-12-03 16:39:00,712][01348] Fps is (10 sec: 4505.1, 60 sec: 4095.9, 300 sec: 3998.8). Total num frames: 9277440. Throughput: 0: 1034.1. Samples: 2319254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:39:00,714][01348] Avg episode reward: [(0, '25.592')]
[2024-12-03 16:39:05,712][01348] Fps is (10 sec: 3686.1, 60 sec: 3891.1, 300 sec: 3971.0). Total num frames: 9289728. Throughput: 0: 977.5. Samples: 2323430. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:05,716][01348] Avg episode reward: [(0, '25.794')]
[2024-12-03 16:39:07,361][03358] Updated weights for policy 0, policy_version 2270 (0.0028)
[2024-12-03 16:39:10,711][01348] Fps is (10 sec: 3686.8, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 9314304. Throughput: 0: 968.1. Samples: 2326566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:39:10,721][01348] Avg episode reward: [(0, '25.290')]
[2024-12-03 16:39:15,711][01348] Fps is (10 sec: 4505.9, 60 sec: 4096.0, 300 sec: 3998.8). Total num frames: 9334784. Throughput: 0: 999.7. Samples: 2333384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:15,720][01348] Avg episode reward: [(0, '24.936')]
[2024-12-03 16:39:16,117][03358] Updated weights for policy 0, policy_version 2280 (0.0018)
[2024-12-03 16:39:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 9347072. Throughput: 0: 981.4. Samples: 2338268. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:39:20,713][01348] Avg episode reward: [(0, '23.485')]
[2024-12-03 16:39:25,711][01348] Fps is (10 sec: 3276.9, 60 sec: 3823.1, 300 sec: 3957.2). Total num frames: 9367552. Throughput: 0: 950.7. Samples: 2340432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:39:25,713][01348] Avg episode reward: [(0, '24.868')]
[2024-12-03 16:39:28,102][03358] Updated weights for policy 0, policy_version 2290 (0.0029)
[2024-12-03 16:39:30,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3985.0). Total num frames: 9388032. Throughput: 0: 967.2. Samples: 2347294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:30,716][01348] Avg episode reward: [(0, '23.732')]
[2024-12-03 16:39:35,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 9408512. Throughput: 0: 989.8. Samples: 2353086. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:35,715][01348] Avg episode reward: [(0, '23.198')]
[2024-12-03 16:39:39,581][03358] Updated weights for policy 0, policy_version 2300 (0.0018)
[2024-12-03 16:39:40,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3957.2). Total num frames: 9424896. Throughput: 0: 955.5. Samples: 2355076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:39:40,715][01348] Avg episode reward: [(0, '23.036')]
[2024-12-03 16:39:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3957.2). Total num frames: 9445376. Throughput: 0: 932.6. Samples: 2361220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:39:45,721][01348] Avg episode reward: [(0, '24.318')]
[2024-12-03 16:39:48,759][03358] Updated weights for policy 0, policy_version 2310 (0.0020)
[2024-12-03 16:39:50,714][01348] Fps is (10 sec: 4094.8, 60 sec: 3891.0, 300 sec: 3971.0). Total num frames: 9465856. Throughput: 0: 987.0. Samples: 2367848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:50,716][01348] Avg episode reward: [(0, '22.620')]
[2024-12-03 16:39:55,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 9482240. Throughput: 0: 964.3. Samples: 2369958. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:39:55,718][01348] Avg episode reward: [(0, '23.145')]
[2024-12-03 16:40:00,325][03358] Updated weights for policy 0, policy_version 2320 (0.0023)
[2024-12-03 16:40:00,711][01348] Fps is (10 sec: 3687.4, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 9502720. Throughput: 0: 933.3. Samples: 2375384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:40:00,717][01348] Avg episode reward: [(0, '23.875')]
[2024-12-03 16:40:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.3, 300 sec: 3971.0). Total num frames: 9523200. Throughput: 0: 974.8. Samples: 2382136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:40:05,714][01348] Avg episode reward: [(0, '25.596')]
[2024-12-03 16:40:10,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 9539584. Throughput: 0: 985.8. Samples: 2384792. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:40:10,713][01348] Avg episode reward: [(0, '25.759')]
[2024-12-03 16:40:11,043][03358] Updated weights for policy 0, policy_version 2330 (0.0029)
[2024-12-03 16:40:15,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3943.3). Total num frames: 9560064. Throughput: 0: 933.5. Samples: 2389300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:40:15,719][01348] Avg episode reward: [(0, '25.156')]
[2024-12-03 16:40:15,729][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002334_9560064.pth...
[2024-12-03 16:40:15,862][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002102_8609792.pth
[2024-12-03 16:40:20,712][01348] Fps is (10 sec: 4095.6, 60 sec: 3891.1, 300 sec: 3957.2). Total num frames: 9580544. Throughput: 0: 957.0. Samples: 2396152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:40:20,721][01348] Avg episode reward: [(0, '25.184')]
[2024-12-03 16:40:20,999][03358] Updated weights for policy 0, policy_version 2340 (0.0018)
[2024-12-03 16:40:25,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 9601024. Throughput: 0: 987.9. Samples: 2399532. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:40:25,717][01348] Avg episode reward: [(0, '25.577')]
[2024-12-03 16:40:30,711][01348] Fps is (10 sec: 3277.1, 60 sec: 3754.7, 300 sec: 3929.4). Total num frames: 9613312. Throughput: 0: 943.8. Samples: 2403690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:40:30,715][01348] Avg episode reward: [(0, '24.919')]
[2024-12-03 16:40:32,778][03358] Updated weights for policy 0, policy_version 2350 (0.0029)
[2024-12-03 16:40:35,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3943.3). Total num frames: 9637888. Throughput: 0: 937.2. Samples: 2410020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:40:35,720][01348] Avg episode reward: [(0, '23.736')]
[2024-12-03 16:40:40,714][01348] Fps is (10 sec: 4504.3, 60 sec: 3891.0, 300 sec: 3957.1). Total num frames: 9658368. Throughput: 0: 966.9. Samples: 2413472. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:40:40,716][01348] Avg episode reward: [(0, '24.058')]
[2024-12-03 16:40:42,313][03358] Updated weights for policy 0, policy_version 2360 (0.0023)
[2024-12-03 16:40:45,711][01348] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 9674752. Throughput: 0: 958.7. Samples: 2418524. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:40:45,724][01348] Avg episode reward: [(0, '25.064')]
[2024-12-03 16:40:50,711][01348] Fps is (10 sec: 3687.3, 60 sec: 3823.1, 300 sec: 3929.4). Total num frames: 9695232. Throughput: 0: 937.6. Samples: 2424330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:40:50,714][01348] Avg episode reward: [(0, '26.209')]
[2024-12-03 16:40:53,303][03358] Updated weights for policy 0, policy_version 2370 (0.0030)
[2024-12-03 16:40:55,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 9715712. Throughput: 0: 954.1. Samples: 2427726. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:40:55,714][01348] Avg episode reward: [(0, '25.762')]
[2024-12-03 16:41:00,715][01348] Fps is (10 sec: 3685.1, 60 sec: 3822.7, 300 sec: 3929.3). Total num frames: 9732096. Throughput: 0: 980.9. Samples: 2433442. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:41:00,717][01348] Avg episode reward: [(0, '25.803')]
[2024-12-03 16:41:05,022][03358] Updated weights for policy 0, policy_version 2380 (0.0015)
[2024-12-03 16:41:05,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 9748480. Throughput: 0: 934.0. Samples: 2438180. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-12-03 16:41:05,720][01348] Avg episode reward: [(0, '26.791')]
[2024-12-03 16:41:10,711][01348] Fps is (10 sec: 4097.6, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 9773056. Throughput: 0: 934.3. Samples: 2441574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:41:10,718][01348] Avg episode reward: [(0, '25.941')]
[2024-12-03 16:41:14,166][03358] Updated weights for policy 0, policy_version 2390 (0.0016)
[2024-12-03 16:41:15,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3943.3). Total num frames: 9793536. Throughput: 0: 990.0. Samples: 2448242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-12-03 16:41:15,713][01348] Avg episode reward: [(0, '26.142')]
[2024-12-03 16:41:20,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 9805824. Throughput: 0: 946.8. Samples: 2452628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:41:20,722][01348] Avg episode reward: [(0, '25.391')]
[2024-12-03 16:41:25,231][03358] Updated weights for policy 0, policy_version 2400 (0.0015)
[2024-12-03 16:41:25,711][01348] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3929.4). Total num frames: 9830400. Throughput: 0: 950.0. Samples: 2456220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:41:25,719][01348] Avg episode reward: [(0, '26.233')]
[2024-12-03 16:41:30,712][01348] Fps is (10 sec: 4914.9, 60 sec: 4027.7, 300 sec: 3957.1). Total num frames: 9854976. Throughput: 0: 996.7. Samples: 2463376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:41:30,715][01348] Avg episode reward: [(0, '25.334')]
[2024-12-03 16:41:35,711][01348] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3915.5). Total num frames: 9867264. Throughput: 0: 970.6. Samples: 2468006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:41:35,718][01348] Avg episode reward: [(0, '25.055')]
[2024-12-03 16:41:36,124][03358] Updated weights for policy 0, policy_version 2410 (0.0013)
[2024-12-03 16:41:40,711][01348] Fps is (10 sec: 3277.0, 60 sec: 3823.1, 300 sec: 3915.5). Total num frames: 9887744. Throughput: 0: 952.3. Samples: 2470578. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-12-03 16:41:40,718][01348] Avg episode reward: [(0, '24.268')]
[2024-12-03 16:41:45,347][03358] Updated weights for policy 0, policy_version 2420 (0.0019)
[2024-12-03 16:41:45,711][01348] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3943.3). Total num frames: 9912320. Throughput: 0: 979.8. Samples: 2477528. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:41:45,718][01348] Avg episode reward: [(0, '25.326')]
[2024-12-03 16:41:50,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 9928704. Throughput: 0: 995.9. Samples: 2482994. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-12-03 16:41:50,719][01348] Avg episode reward: [(0, '25.116')]
[2024-12-03 16:41:55,711][01348] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 9945088. Throughput: 0: 966.4. Samples: 2485064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:41:55,713][01348] Avg episode reward: [(0, '24.875')]
[2024-12-03 16:41:57,190][03358] Updated weights for policy 0, policy_version 2430 (0.0030)
[2024-12-03 16:42:00,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.7, 300 sec: 3929.4). Total num frames: 9969664. Throughput: 0: 965.9. Samples: 2491708. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-12-03 16:42:00,716][01348] Avg episode reward: [(0, '24.585')]
[2024-12-03 16:42:05,711][01348] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 9986048. Throughput: 0: 1007.3. Samples: 2497956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-12-03 16:42:05,721][01348] Avg episode reward: [(0, '25.749')]
[2024-12-03 16:42:07,319][03358] Updated weights for policy 0, policy_version 2440 (0.0028)
[2024-12-03 16:42:10,717][01348] Fps is (10 sec: 3274.9, 60 sec: 3822.6, 300 sec: 3887.7). Total num frames: 10002432. Throughput: 0: 973.1. Samples: 2500016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-12-03 16:42:10,719][01348] Avg episode reward: [(0, '25.516')]
[2024-12-03 16:42:11,513][03345] Stopping Batcher_0...
[2024-12-03 16:42:11,513][03345] Loop batcher_evt_loop terminating...
[2024-12-03 16:42:11,514][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-12-03 16:42:11,513][01348] Component Batcher_0 stopped!
[2024-12-03 16:42:11,573][03358] Weights refcount: 2 0
[2024-12-03 16:42:11,578][01348] Component InferenceWorker_p0-w0 stopped!
[2024-12-03 16:42:11,588][03358] Stopping InferenceWorker_p0-w0...
[2024-12-03 16:42:11,588][03358] Loop inference_proc0-0_evt_loop terminating...
[2024-12-03 16:42:11,639][03345] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002219_9089024.pth
[2024-12-03 16:42:11,651][03345] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-12-03 16:42:11,834][03345] Stopping LearnerWorker_p0...
[2024-12-03 16:42:11,835][03345] Loop learner_proc0_evt_loop terminating...
[2024-12-03 16:42:11,834][01348] Component LearnerWorker_p0 stopped!
[2024-12-03 16:42:11,961][01348] Component RolloutWorker_w3 stopped!
[2024-12-03 16:42:11,965][03366] Stopping RolloutWorker_w3...
[2024-12-03 16:42:11,969][03366] Loop rollout_proc3_evt_loop terminating...
[2024-12-03 16:42:11,980][01348] Component RolloutWorker_w5 stopped!
[2024-12-03 16:42:11,990][03368] Stopping RolloutWorker_w5...
[2024-12-03 16:42:11,993][03368] Loop rollout_proc5_evt_loop terminating...
[2024-12-03 16:42:11,998][01348] Component RolloutWorker_w2 stopped!
[2024-12-03 16:42:12,000][03364] Stopping RolloutWorker_w2...
[2024-12-03 16:42:12,008][03370] Stopping RolloutWorker_w7...
[2024-12-03 16:42:12,008][01348] Component RolloutWorker_w4 stopped!
[2024-12-03 16:42:12,010][01348] Component RolloutWorker_w7 stopped!
[2024-12-03 16:42:12,012][03367] Stopping RolloutWorker_w4...
[2024-12-03 16:42:12,018][01348] Component RolloutWorker_w6 stopped!
[2024-12-03 16:42:12,019][03369] Stopping RolloutWorker_w6...
[2024-12-03 16:42:12,002][03364] Loop rollout_proc2_evt_loop terminating...
[2024-12-03 16:42:12,023][03370] Loop rollout_proc7_evt_loop terminating...
[2024-12-03 16:42:12,023][03367] Loop rollout_proc4_evt_loop terminating...
[2024-12-03 16:42:12,022][03369] Loop rollout_proc6_evt_loop terminating...
[2024-12-03 16:42:12,042][01348] Component RolloutWorker_w0 stopped!
[2024-12-03 16:42:12,044][03365] Stopping RolloutWorker_w0...
[2024-12-03 16:42:12,055][03362] Stopping RolloutWorker_w1...
[2024-12-03 16:42:12,055][01348] Component RolloutWorker_w1 stopped!
[2024-12-03 16:42:12,058][01348] Waiting for process learner_proc0 to stop...
[2024-12-03 16:42:12,046][03365] Loop rollout_proc0_evt_loop terminating...
[2024-12-03 16:42:12,066][03362] Loop rollout_proc1_evt_loop terminating...
[2024-12-03 16:42:13,513][01348] Waiting for process inference_proc0-0 to join...
[2024-12-03 16:42:13,517][01348] Waiting for process rollout_proc0 to join...
[2024-12-03 16:42:15,550][01348] Waiting for process rollout_proc1 to join...
[2024-12-03 16:42:15,554][01348] Waiting for process rollout_proc2 to join...
[2024-12-03 16:42:15,558][01348] Waiting for process rollout_proc3 to join...
[2024-12-03 16:42:15,562][01348] Waiting for process rollout_proc4 to join...
[2024-12-03 16:42:15,566][01348] Waiting for process rollout_proc5 to join...
[2024-12-03 16:42:15,570][01348] Waiting for process rollout_proc6 to join...
[2024-12-03 16:42:15,575][01348] Waiting for process rollout_proc7 to join...
[2024-12-03 16:42:15,579][01348] Batcher 0 profile tree view:
batching: 64.7918, releasing_batches: 0.0709
[2024-12-03 16:42:15,582][01348] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 1029.0098
update_model: 22.3327
  weight_update: 0.0026
one_step: 0.0029
  handle_policy_step: 1446.1107
    deserialize: 36.3543, stack: 8.2491, obs_to_device_normalize: 306.7894, forward: 723.3391, send_messages: 73.0594
    prepare_outputs: 224.4219
      to_cpu: 135.6385
[2024-12-03 16:42:15,584][01348] Learner 0 profile tree view:
misc: 0.0144, prepare_batch: 29.0485
train: 176.8815
  epoch_init: 0.0270, minibatch_init: 0.0176, losses_postprocess: 1.4961, kl_divergence: 1.4582, after_optimizer: 83.6889
  calculate_losses: 61.9419
    losses_init: 0.0140, forward_head: 2.6845, bptt_initial: 41.0059, tail: 2.5905, advantages_returns: 0.6901, losses: 9.2982
    bptt: 4.8331
      bptt_forward_core: 4.5871
  update: 26.6811
    clip: 2.0869
[2024-12-03 16:42:15,586][01348] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.8423, enqueue_policy_requests: 245.3620, env_step: 2053.9667, overhead: 33.0907, complete_rollouts: 18.5961
save_policy_outputs: 51.5090
  split_output_tensors: 20.2507
[2024-12-03 16:42:15,588][01348] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.7889, enqueue_policy_requests: 253.6999, env_step: 2042.3482, overhead: 34.1669, complete_rollouts: 17.0680
save_policy_outputs: 50.2890
  split_output_tensors: 19.9981
[2024-12-03 16:42:15,589][01348] Loop Runner_EvtLoop terminating...
[2024-12-03 16:42:15,591][01348] Runner profile tree view:
main_loop: 2631.8217
[2024-12-03 16:42:15,591][01348] Collected {0: 10006528}, FPS: 3802.1
[2024-12-03 16:42:16,129][01348] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-12-03 16:42:16,131][01348] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-03 16:42:16,133][01348] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-03 16:42:16,134][01348] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-03 16:42:16,136][01348] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-03 16:42:16,138][01348] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-03 16:42:16,139][01348] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2024-12-03 16:42:16,141][01348] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-03 16:42:16,142][01348] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2024-12-03 16:42:16,143][01348] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2024-12-03 16:42:16,144][01348] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-03 16:42:16,145][01348] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-03 16:42:16,146][01348] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-03 16:42:16,147][01348] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-03 16:42:16,148][01348] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-03 16:42:16,189][01348] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-12-03 16:42:16,192][01348] RunningMeanStd input shape: (3, 72, 128)
[2024-12-03 16:42:16,196][01348] RunningMeanStd input shape: (1,)
[2024-12-03 16:42:16,217][01348] ConvEncoder: input_channels=3
[2024-12-03 16:42:16,349][01348] Conv encoder output size: 512
[2024-12-03 16:42:16,351][01348] Policy head output size: 512
[2024-12-03 16:42:16,540][01348] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-12-03 16:42:17,296][01348] Num frames 100...
[2024-12-03 16:42:17,426][01348] Num frames 200...
[2024-12-03 16:42:17,544][01348] Num frames 300...
[2024-12-03 16:42:17,664][01348] Num frames 400...
[2024-12-03 16:42:17,784][01348] Num frames 500...
[2024-12-03 16:42:17,902][01348] Num frames 600...
[2024-12-03 16:42:18,028][01348] Num frames 700...
[2024-12-03 16:42:18,152][01348] Num frames 800...
[2024-12-03 16:42:18,276][01348] Num frames 900...
[2024-12-03 16:42:18,398][01348] Num frames 1000...
[2024-12-03 16:42:18,528][01348] Num frames 1100...
[2024-12-03 16:42:18,651][01348] Num frames 1200...
[2024-12-03 16:42:18,770][01348] Num frames 1300...
[2024-12-03 16:42:18,894][01348] Num frames 1400...
[2024-12-03 16:42:19,023][01348] Num frames 1500...
[2024-12-03 16:42:19,142][01348] Num frames 1600...
[2024-12-03 16:42:19,273][01348] Num frames 1700...
[2024-12-03 16:42:19,396][01348] Num frames 1800...
[2024-12-03 16:42:19,523][01348] Num frames 1900...
[2024-12-03 16:42:19,647][01348] Num frames 2000...
[2024-12-03 16:42:19,772][01348] Num frames 2100...
[2024-12-03 16:42:19,824][01348] Avg episode rewards: #0: 58.999, true rewards: #0: 21.000
[2024-12-03 16:42:19,826][01348] Avg episode reward: 58.999, avg true_objective: 21.000
[2024-12-03 16:42:19,947][01348] Num frames 2200...
[2024-12-03 16:42:20,077][01348] Num frames 2300...
[2024-12-03 16:42:20,198][01348] Num frames 2400...
[2024-12-03 16:42:20,317][01348] Num frames 2500...
[2024-12-03 16:42:20,446][01348] Num frames 2600...
[2024-12-03 16:42:20,571][01348] Num frames 2700...
[2024-12-03 16:42:20,688][01348] Num frames 2800...
[2024-12-03 16:42:20,804][01348] Num frames 2900...
[2024-12-03 16:42:20,923][01348] Num frames 3000...
[2024-12-03 16:42:21,064][01348] Num frames 3100...
[2024-12-03 16:42:21,229][01348] Num frames 3200...
[2024-12-03 16:42:21,431][01348] Avg episode rewards: #0: 42.919, true rewards: #0: 16.420
[2024-12-03 16:42:21,433][01348] Avg episode reward: 42.919, avg true_objective: 16.420
[2024-12-03 16:42:21,463][01348] Num frames 3300...
[2024-12-03 16:42:21,631][01348] Num frames 3400...
[2024-12-03 16:42:21,797][01348] Num frames 3500...
[2024-12-03 16:42:21,965][01348] Num frames 3600...
[2024-12-03 16:42:22,137][01348] Num frames 3700...
[2024-12-03 16:42:22,302][01348] Num frames 3800...
[2024-12-03 16:42:22,479][01348] Num frames 3900...
[2024-12-03 16:42:22,662][01348] Num frames 4000...
[2024-12-03 16:42:22,827][01348] Num frames 4100...
[2024-12-03 16:42:22,989][01348] Num frames 4200...
[2024-12-03 16:42:23,170][01348] Num frames 4300...
[2024-12-03 16:42:23,342][01348] Num frames 4400...
[2024-12-03 16:42:23,507][01348] Num frames 4500...
[2024-12-03 16:42:23,637][01348] Num frames 4600...
[2024-12-03 16:42:23,761][01348] Num frames 4700...
[2024-12-03 16:42:23,880][01348] Num frames 4800...
[2024-12-03 16:42:24,003][01348] Num frames 4900...
[2024-12-03 16:42:24,126][01348] Num frames 5000...
[2024-12-03 16:42:24,245][01348] Num frames 5100...
[2024-12-03 16:42:24,364][01348] Num frames 5200...
[2024-12-03 16:42:24,489][01348] Num frames 5300...
[2024-12-03 16:42:24,621][01348] Avg episode rewards: #0: 47.876, true rewards: #0: 17.877
[2024-12-03 16:42:24,622][01348] Avg episode reward: 47.876, avg true_objective: 17.877
[2024-12-03 16:42:24,671][01348] Num frames 5400...
[2024-12-03 16:42:24,793][01348] Num frames 5500...
[2024-12-03 16:42:24,912][01348] Num frames 5600...
[2024-12-03 16:42:25,040][01348] Num frames 5700...
[2024-12-03 16:42:25,162][01348] Num frames 5800...
[2024-12-03 16:42:25,281][01348] Num frames 5900...
[2024-12-03 16:42:25,400][01348] Num frames 6000...
[2024-12-03 16:42:25,552][01348] Avg episode rewards: #0: 39.434, true rewards: #0: 15.185
[2024-12-03 16:42:25,553][01348] Avg episode reward: 39.434, avg true_objective: 15.185
[2024-12-03 16:42:25,587][01348] Num frames 6100...
[2024-12-03 16:42:25,713][01348] Num frames 6200...
[2024-12-03 16:42:25,834][01348] Num frames 6300...
[2024-12-03 16:42:25,955][01348] Num frames 6400...
[2024-12-03 16:42:26,084][01348] Num frames 6500...
[2024-12-03 16:42:26,209][01348] Num frames 6600...
[2024-12-03 16:42:26,328][01348] Num frames 6700...
[2024-12-03 16:42:26,451][01348] Num frames 6800...
[2024-12-03 16:42:26,573][01348] Num frames 6900...
[2024-12-03 16:42:26,702][01348] Num frames 7000...
[2024-12-03 16:42:26,821][01348] Num frames 7100...
[2024-12-03 16:42:26,988][01348] Avg episode rewards: #0: 36.390, true rewards: #0: 14.390
[2024-12-03 16:42:26,990][01348] Avg episode reward: 36.390, avg true_objective: 14.390
[2024-12-03 16:42:27,003][01348] Num frames 7200...
[2024-12-03 16:42:27,127][01348] Num frames 7300...
[2024-12-03 16:42:27,248][01348] Num frames 7400...
[2024-12-03 16:42:27,375][01348] Num frames 7500...
[2024-12-03 16:42:27,496][01348] Num frames 7600...
[2024-12-03 16:42:27,625][01348] Num frames 7700...
[2024-12-03 16:42:27,758][01348] Num frames 7800...
[2024-12-03 16:42:27,878][01348] Num frames 7900...
[2024-12-03 16:42:28,002][01348] Num frames 8000...
[2024-12-03 16:42:28,129][01348] Num frames 8100...
[2024-12-03 16:42:28,250][01348] Num frames 8200...
[2024-12-03 16:42:28,372][01348] Num frames 8300...
[2024-12-03 16:42:28,498][01348] Num frames 8400...
[2024-12-03 16:42:28,620][01348] Num frames 8500...
[2024-12-03 16:42:28,751][01348] Num frames 8600...
[2024-12-03 16:42:28,873][01348] Num frames 8700...
[2024-12-03 16:42:28,994][01348] Num frames 8800...
[2024-12-03 16:42:29,116][01348] Avg episode rewards: #0: 36.915, true rewards: #0: 14.748
[2024-12-03 16:42:29,118][01348] Avg episode reward: 36.915, avg true_objective: 14.748
[2024-12-03 16:42:29,183][01348] Num frames 8900...
[2024-12-03 16:42:29,301][01348] Num frames 9000...
[2024-12-03 16:42:29,425][01348] Num frames 9100...
[2024-12-03 16:42:29,547][01348] Num frames 9200...
[2024-12-03 16:42:29,666][01348] Num frames 9300...
[2024-12-03 16:42:29,804][01348] Num frames 9400...
[2024-12-03 16:42:29,934][01348] Num frames 9500...
[2024-12-03 16:42:30,076][01348] Num frames 9600...
[2024-12-03 16:42:30,197][01348] Num frames 9700...
[2024-12-03 16:42:30,318][01348] Num frames 9800...
[2024-12-03 16:42:30,440][01348] Num frames 9900...
[2024-12-03 16:42:30,562][01348] Num frames 10000...
[2024-12-03 16:42:30,621][01348] Avg episode rewards: #0: 35.287, true rewards: #0: 14.287
[2024-12-03 16:42:30,622][01348] Avg episode reward: 35.287, avg true_objective: 14.287
[2024-12-03 16:42:30,741][01348] Num frames 10100...
[2024-12-03 16:42:30,869][01348] Num frames 10200...
[2024-12-03 16:42:30,994][01348] Num frames 10300...
[2024-12-03 16:42:31,121][01348] Num frames 10400...
[2024-12-03 16:42:31,241][01348] Num frames 10500...
[2024-12-03 16:42:31,323][01348] Avg episode rewards: #0: 32.027, true rewards: #0: 13.152
[2024-12-03 16:42:31,325][01348] Avg episode reward: 32.027, avg true_objective: 13.152
[2024-12-03 16:42:31,422][01348] Num frames 10600...
[2024-12-03 16:42:31,543][01348] Num frames 10700...
[2024-12-03 16:42:31,666][01348] Num frames 10800...
[2024-12-03 16:42:31,790][01348] Num frames 10900...
[2024-12-03 16:42:31,912][01348] Num frames 11000...
[2024-12-03 16:42:32,044][01348] Num frames 11100...
[2024-12-03 16:42:32,167][01348] Num frames 11200...
[2024-12-03 16:42:32,288][01348] Num frames 11300...
[2024-12-03 16:42:32,412][01348] Num frames 11400...
[2024-12-03 16:42:32,508][01348] Avg episode rewards: #0: 30.701, true rewards: #0: 12.701
[2024-12-03 16:42:32,509][01348] Avg episode reward: 30.701, avg true_objective: 12.701
[2024-12-03 16:42:32,591][01348] Num frames 11500...
[2024-12-03 16:42:32,711][01348] Num frames 11600...
[2024-12-03 16:42:32,842][01348] Num frames 11700...
[2024-12-03 16:42:32,960][01348] Num frames 11800...
[2024-12-03 16:42:33,089][01348] Num frames 11900...
[2024-12-03 16:42:33,210][01348] Num frames 12000...
[2024-12-03 16:42:33,357][01348] Avg episode rewards: #0: 29.277, true rewards: #0: 12.077
[2024-12-03 16:42:33,358][01348] Avg episode reward: 29.277, avg true_objective: 12.077
[2024-12-03 16:43:47,025][01348] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-12-03 16:43:47,703][01348] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-12-03 16:43:47,705][01348] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-03 16:43:47,707][01348] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-03 16:43:47,708][01348] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-03 16:43:47,710][01348] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-03 16:43:47,712][01348] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-03 16:43:47,713][01348] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-12-03 16:43:47,715][01348] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-03 16:43:47,716][01348] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-12-03 16:43:47,717][01348] Adding new argument 'hf_repository'='ahmadsy/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-12-03 16:43:47,718][01348] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-03 16:43:47,719][01348] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-03 16:43:47,720][01348] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-03 16:43:47,721][01348] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-03 16:43:47,722][01348] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-03 16:43:47,760][01348] RunningMeanStd input shape: (3, 72, 128)
[2024-12-03 16:43:47,762][01348] RunningMeanStd input shape: (1,)
[2024-12-03 16:43:47,781][01348] ConvEncoder: input_channels=3
[2024-12-03 16:43:47,838][01348] Conv encoder output size: 512
[2024-12-03 16:43:47,840][01348] Policy head output size: 512
[2024-12-03 16:43:47,868][01348] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-12-03 16:43:48,469][01348] Num frames 100...
[2024-12-03 16:43:48,623][01348] Num frames 200...
[2024-12-03 16:43:48,775][01348] Num frames 300...
[2024-12-03 16:43:48,930][01348] Num frames 400...
[2024-12-03 16:43:49,117][01348] Num frames 500...
[2024-12-03 16:43:49,194][01348] Avg episode rewards: #0: 8.120, true rewards: #0: 5.120
[2024-12-03 16:43:49,195][01348] Avg episode reward: 8.120, avg true_objective: 5.120
[2024-12-03 16:43:49,329][01348] Num frames 600...
[2024-12-03 16:43:49,490][01348] Num frames 700...
[2024-12-03 16:43:49,648][01348] Num frames 800...
[2024-12-03 16:43:49,801][01348] Num frames 900...
[2024-12-03 16:43:49,951][01348] Num frames 1000...
[2024-12-03 16:43:50,115][01348] Num frames 1100...
[2024-12-03 16:43:50,273][01348] Num frames 1200...
[2024-12-03 16:43:50,428][01348] Num frames 1300...
[2024-12-03 16:43:50,504][01348] Avg episode rewards: #0: 14.560, true rewards: #0: 6.560
[2024-12-03 16:43:50,506][01348] Avg episode reward: 14.560, avg true_objective: 6.560
[2024-12-03 16:43:50,643][01348] Num frames 1400...
[2024-12-03 16:43:50,803][01348] Num frames 1500...
[2024-12-03 16:43:50,959][01348] Num frames 1600...
[2024-12-03 16:43:51,143][01348] Num frames 1700...
[2024-12-03 16:43:51,303][01348] Num frames 1800...
[2024-12-03 16:43:51,474][01348] Num frames 1900...
[2024-12-03 16:43:51,653][01348] Num frames 2000...
[2024-12-03 16:43:51,817][01348] Num frames 2100...
[2024-12-03 16:43:52,029][01348] Avg episode rewards: #0: 15.990, true rewards: #0: 7.323
[2024-12-03 16:43:52,032][01348] Avg episode reward: 15.990, avg true_objective: 7.323
[2024-12-03 16:43:52,041][01348] Num frames 2200...
[2024-12-03 16:43:52,249][01348] Num frames 2300...
[2024-12-03 16:43:52,460][01348] Num frames 2400...
[2024-12-03 16:43:52,655][01348] Num frames 2500...
[2024-12-03 16:43:52,842][01348] Num frames 2600...
[2024-12-03 16:43:53,042][01348] Num frames 2700...
[2024-12-03 16:43:53,303][01348] Num frames 2800...
[2024-12-03 16:43:53,581][01348] Num frames 2900...
[2024-12-03 16:43:53,837][01348] Num frames 3000...
[2024-12-03 16:43:54,055][01348] Avg episode rewards: #0: 16.403, true rewards: #0: 7.652
[2024-12-03 16:43:54,057][01348] Avg episode reward: 16.403, avg true_objective: 7.652
[2024-12-03 16:43:54,141][01348] Num frames 3100...
[2024-12-03 16:43:54,365][01348] Num frames 3200...
[2024-12-03 16:43:54,586][01348] Num frames 3300...
[2024-12-03 16:43:54,783][01348] Num frames 3400...
[2024-12-03 16:43:54,989][01348] Num frames 3500...
[2024-12-03 16:43:55,139][01348] Avg episode rewards: #0: 15.082, true rewards: #0: 7.082
[2024-12-03 16:43:55,141][01348] Avg episode reward: 15.082, avg true_objective: 7.082
[2024-12-03 16:43:55,258][01348] Num frames 3600...
[2024-12-03 16:43:55,464][01348] Num frames 3700...
[2024-12-03 16:43:55,647][01348] Num frames 3800...
[2024-12-03 16:43:55,858][01348] Num frames 3900...
[2024-12-03 16:43:55,966][01348] Avg episode rewards: #0: 13.208, true rewards: #0: 6.542
[2024-12-03 16:43:55,968][01348] Avg episode reward: 13.208, avg true_objective: 6.542
[2024-12-03 16:43:56,153][01348] Num frames 4000...
[2024-12-03 16:43:56,350][01348] Num frames 4100...
[2024-12-03 16:43:56,574][01348] Num frames 4200...
[2024-12-03 16:43:56,718][01348] Num frames 4300...
[2024-12-03 16:43:56,839][01348] Num frames 4400...
[2024-12-03 16:43:56,961][01348] Num frames 4500...
[2024-12-03 16:43:57,019][01348] Avg episode rewards: #0: 13.287, true rewards: #0: 6.430
[2024-12-03 16:43:57,021][01348] Avg episode reward: 13.287, avg true_objective: 6.430
[2024-12-03 16:43:57,138][01348] Num frames 4600...
[2024-12-03 16:43:57,263][01348] Num frames 4700...
[2024-12-03 16:43:57,384][01348] Num frames 4800...
[2024-12-03 16:43:57,504][01348] Num frames 4900...
[2024-12-03 16:43:57,629][01348] Num frames 5000...
[2024-12-03 16:43:57,748][01348] Num frames 5100...
[2024-12-03 16:43:57,867][01348] Num frames 5200...
[2024-12-03 16:43:57,990][01348] Num frames 5300...
[2024-12-03 16:43:58,162][01348] Avg episode rewards: #0: 14.246, true rewards: #0: 6.746
[2024-12-03 16:43:58,163][01348] Avg episode reward: 14.246, avg true_objective: 6.746
[2024-12-03 16:43:58,170][01348] Num frames 5400...
[2024-12-03 16:43:58,295][01348] Num frames 5500...
[2024-12-03 16:43:58,421][01348] Num frames 5600...
[2024-12-03 16:43:58,540][01348] Num frames 5700...
[2024-12-03 16:43:58,669][01348] Num frames 5800...
[2024-12-03 16:43:58,788][01348] Num frames 5900...
[2024-12-03 16:43:58,906][01348] Num frames 6000...
[2024-12-03 16:43:59,028][01348] Num frames 6100...
[2024-12-03 16:43:59,148][01348] Num frames 6200...
[2024-12-03 16:43:59,277][01348] Num frames 6300...
[2024-12-03 16:43:59,398][01348] Num frames 6400...
[2024-12-03 16:43:59,519][01348] Num frames 6500...
[2024-12-03 16:43:59,647][01348] Num frames 6600...
[2024-12-03 16:43:59,767][01348] Num frames 6700...
[2024-12-03 16:43:59,887][01348] Num frames 6800...
[2024-12-03 16:44:00,015][01348] Num frames 6900...
[2024-12-03 16:44:00,134][01348] Num frames 7000...
[2024-12-03 16:44:00,260][01348] Num frames 7100...
[2024-12-03 16:44:00,388][01348] Avg episode rewards: #0: 17.174, true rewards: #0: 7.952
[2024-12-03 16:44:00,389][01348] Avg episode reward: 17.174, avg true_objective: 7.952
[2024-12-03 16:44:00,446][01348] Num frames 7200...
[2024-12-03 16:44:00,565][01348] Num frames 7300...
[2024-12-03 16:44:00,694][01348] Num frames 7400...
[2024-12-03 16:44:00,770][01348] Avg episode rewards: #0: 15.915, true rewards: #0: 7.415
[2024-12-03 16:44:00,771][01348] Avg episode reward: 15.915, avg true_objective: 7.415
[2024-12-03 16:44:43,555][01348] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-12-03 16:45:59,246][01348] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-12-03 16:45:59,248][01348] Overriding arg 'num_workers' with value 1 passed from command line
[2024-12-03 16:45:59,250][01348] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-12-03 16:45:59,251][01348] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-12-03 16:45:59,253][01348] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-12-03 16:45:59,255][01348] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-12-03 16:45:59,256][01348] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-12-03 16:45:59,258][01348] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-12-03 16:45:59,260][01348] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-12-03 16:45:59,260][01348] Adding new argument 'hf_repository'='ahmadsy/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-12-03 16:45:59,268][01348] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-12-03 16:45:59,268][01348] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-12-03 16:45:59,269][01348] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-12-03 16:45:59,270][01348] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-12-03 16:45:59,271][01348] Using frameskip 1 and render_action_repeat=4 for evaluation
[2024-12-03 16:45:59,304][01348] RunningMeanStd input shape: (3, 72, 128)
[2024-12-03 16:45:59,306][01348] RunningMeanStd input shape: (1,)
[2024-12-03 16:45:59,320][01348] ConvEncoder: input_channels=3
[2024-12-03 16:45:59,359][01348] Conv encoder output size: 512
[2024-12-03 16:45:59,361][01348] Policy head output size: 512
[2024-12-03 16:45:59,379][01348] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2024-12-03 16:45:59,786][01348] Num frames 100...
[2024-12-03 16:45:59,904][01348] Num frames 200...
[2024-12-03 16:46:00,030][01348] Num frames 300...
[2024-12-03 16:46:00,150][01348] Num frames 400...
[2024-12-03 16:46:00,286][01348] Num frames 500...
[2024-12-03 16:46:00,420][01348] Num frames 600...
[2024-12-03 16:46:00,540][01348] Num frames 700...
[2024-12-03 16:46:00,669][01348] Num frames 800...
[2024-12-03 16:46:00,788][01348] Num frames 900...
[2024-12-03 16:46:00,907][01348] Num frames 1000...
[2024-12-03 16:46:01,029][01348] Num frames 1100...
[2024-12-03 16:46:01,152][01348] Num frames 1200...
[2024-12-03 16:46:01,280][01348] Num frames 1300...
[2024-12-03 16:46:01,400][01348] Num frames 1400...
[2024-12-03 16:46:01,540][01348] Avg episode rewards: #0: 39.720, true rewards: #0: 14.720
[2024-12-03 16:46:01,542][01348] Avg episode reward: 39.720, avg true_objective: 14.720
[2024-12-03 16:46:01,578][01348] Num frames 1500...
[2024-12-03 16:46:01,711][01348] Num frames 1600...
[2024-12-03 16:46:01,836][01348] Num frames 1700...
[2024-12-03 16:46:01,955][01348] Num frames 1800...
[2024-12-03 16:46:02,085][01348] Num frames 1900...
[2024-12-03 16:46:02,202][01348] Num frames 2000...
[2024-12-03 16:46:02,331][01348] Num frames 2100...
[2024-12-03 16:46:02,450][01348] Num frames 2200...
[2024-12-03 16:46:02,565][01348] Num frames 2300...
[2024-12-03 16:46:02,692][01348] Num frames 2400...
[2024-12-03 16:46:02,809][01348] Num frames 2500...
[2024-12-03 16:46:02,927][01348] Num frames 2600...
[2024-12-03 16:46:03,051][01348] Num frames 2700...
[2024-12-03 16:46:03,173][01348] Num frames 2800...
[2024-12-03 16:46:03,301][01348] Num frames 2900...
[2024-12-03 16:46:03,421][01348] Num frames 3000...
[2024-12-03 16:46:03,540][01348] Num frames 3100...
[2024-12-03 16:46:03,664][01348] Num frames 3200...
[2024-12-03 16:46:03,716][01348] Avg episode rewards: #0: 42.000, true rewards: #0: 16.000
[2024-12-03 16:46:03,717][01348] Avg episode reward: 42.000, avg true_objective: 16.000
[2024-12-03 16:46:03,837][01348] Num frames 3300...
[2024-12-03 16:46:03,959][01348] Num frames 3400...
[2024-12-03 16:46:04,082][01348] Num frames 3500...
[2024-12-03 16:46:04,201][01348] Num frames 3600...
[2024-12-03 16:46:04,329][01348] Num frames 3700...
[2024-12-03 16:46:04,450][01348] Num frames 3800...
[2024-12-03 16:46:04,568][01348] Num frames 3900...
[2024-12-03 16:46:04,692][01348] Num frames 4000...
[2024-12-03 16:46:04,822][01348] Num frames 4100...
[2024-12-03 16:46:04,944][01348] Num frames 4200...
[2024-12-03 16:46:05,068][01348] Num frames 4300...
[2024-12-03 16:46:05,221][01348] Avg episode rewards: #0: 37.613, true rewards: #0: 14.613
[2024-12-03 16:46:05,224][01348] Avg episode reward: 37.613, avg true_objective: 14.613
[2024-12-03 16:46:05,257][01348] Num frames 4400...
[2024-12-03 16:46:05,379][01348] Num frames 4500...
[2024-12-03 16:46:05,498][01348] Num frames 4600...
[2024-12-03 16:46:05,619][01348] Num frames 4700...
[2024-12-03 16:46:05,744][01348] Num frames 4800...
[2024-12-03 16:46:05,865][01348] Num frames 4900...
[2024-12-03 16:46:06,007][01348] Num frames 5000...
[2024-12-03 16:46:06,120][01348] Avg episode rewards: #0: 31.560, true rewards: #0: 12.560
[2024-12-03 16:46:06,122][01348] Avg episode reward: 31.560, avg true_objective: 12.560
[2024-12-03 16:46:06,263][01348] Num frames 5100...
[2024-12-03 16:46:06,433][01348] Num frames 5200...
[2024-12-03 16:46:06,596][01348] Num frames 5300...
[2024-12-03 16:46:06,762][01348] Num frames 5400...
[2024-12-03 16:46:06,945][01348] Num frames 5500...
[2024-12-03 16:46:07,110][01348] Num frames 5600...
[2024-12-03 16:46:07,280][01348] Num frames 5700...
[2024-12-03 16:46:07,455][01348] Num frames 5800...
[2024-12-03 16:46:07,627][01348] Num frames 5900...
[2024-12-03 16:46:07,796][01348] Num frames 6000...
[2024-12-03 16:46:07,983][01348] Num frames 6100...
[2024-12-03 16:46:08,158][01348] Num frames 6200...
[2024-12-03 16:46:08,330][01348] Num frames 6300...
[2024-12-03 16:46:08,507][01348] Num frames 6400...
[2024-12-03 16:46:08,626][01348] Num frames 6500...
[2024-12-03 16:46:08,753][01348] Avg episode rewards: #0: 32.920, true rewards: #0: 13.120
[2024-12-03 16:46:08,755][01348] Avg episode reward: 32.920, avg true_objective: 13.120
[2024-12-03 16:46:08,806][01348] Num frames 6600...
[2024-12-03 16:46:08,934][01348] Num frames 6700...
[2024-12-03 16:46:09,067][01348] Num frames 6800...
[2024-12-03 16:46:09,187][01348] Num frames 6900...
[2024-12-03 16:46:09,314][01348] Num frames 7000...
[2024-12-03 16:46:09,433][01348] Num frames 7100...
[2024-12-03 16:46:09,571][01348] Avg episode rewards: #0: 28.947, true rewards: #0: 11.947
[2024-12-03 16:46:09,574][01348] Avg episode reward: 28.947, avg true_objective: 11.947
[2024-12-03 16:46:09,615][01348] Num frames 7200...
[2024-12-03 16:46:09,733][01348] Num frames 7300...
[2024-12-03 16:46:09,857][01348] Num frames 7400...
[2024-12-03 16:46:09,988][01348] Num frames 7500...
[2024-12-03 16:46:10,111][01348] Num frames 7600...
[2024-12-03 16:46:10,242][01348] Num frames 7700...
[2024-12-03 16:46:10,363][01348] Num frames 7800...
[2024-12-03 16:46:10,486][01348] Num frames 7900...
[2024-12-03 16:46:10,608][01348] Num frames 8000...
[2024-12-03 16:46:10,728][01348] Num frames 8100...
[2024-12-03 16:46:10,849][01348] Num frames 8200...
[2024-12-03 16:46:10,982][01348] Num frames 8300...
[2024-12-03 16:46:11,103][01348] Num frames 8400...
[2024-12-03 16:46:11,229][01348] Num frames 8500...
[2024-12-03 16:46:11,353][01348] Num frames 8600...
[2024-12-03 16:46:11,473][01348] Num frames 8700...
[2024-12-03 16:46:11,592][01348] Num frames 8800...
[2024-12-03 16:46:11,716][01348] Num frames 8900...
[2024-12-03 16:46:11,836][01348] Num frames 9000...
[2024-12-03 16:46:11,967][01348] Num frames 9100...
[2024-12-03 16:46:12,094][01348] Num frames 9200...
[2024-12-03 16:46:12,231][01348] Avg episode rewards: #0: 33.097, true rewards: #0: 13.240
[2024-12-03 16:46:12,232][01348] Avg episode reward: 33.097, avg true_objective: 13.240
[2024-12-03 16:46:12,274][01348] Num frames 9300...
[2024-12-03 16:46:12,395][01348] Num frames 9400...
[2024-12-03 16:46:12,518][01348] Num frames 9500...
[2024-12-03 16:46:12,638][01348] Num frames 9600...
[2024-12-03 16:46:12,757][01348] Num frames 9700...
[2024-12-03 16:46:12,879][01348] Num frames 9800...
[2024-12-03 16:46:13,006][01348] Num frames 9900...
[2024-12-03 16:46:13,127][01348] Num frames 10000...
[2024-12-03 16:46:13,259][01348] Num frames 10100...
[2024-12-03 16:46:13,382][01348] Num frames 10200...
[2024-12-03 16:46:13,502][01348] Num frames 10300...
[2024-12-03 16:46:13,627][01348] Avg episode rewards: #0: 32.320, true rewards: #0: 12.945
[2024-12-03 16:46:13,629][01348] Avg episode reward: 32.320, avg true_objective: 12.945
[2024-12-03 16:46:13,685][01348] Num frames 10400...
[2024-12-03 16:46:13,804][01348] Num frames 10500...
[2024-12-03 16:46:13,926][01348] Num frames 10600...
[2024-12-03 16:46:14,056][01348] Num frames 10700...
[2024-12-03 16:46:14,179][01348] Num frames 10800...
[2024-12-03 16:46:14,308][01348] Num frames 10900...
[2024-12-03 16:46:14,429][01348] Num frames 11000...
[2024-12-03 16:46:14,547][01348] Num frames 11100...
[2024-12-03 16:46:14,673][01348] Avg episode rewards: #0: 30.284, true rewards: #0: 12.396
[2024-12-03 16:46:14,674][01348] Avg episode reward: 30.284, avg true_objective: 12.396
[2024-12-03 16:46:14,728][01348] Num frames 11200...
[2024-12-03 16:46:14,846][01348] Num frames 11300...
[2024-12-03 16:46:14,969][01348] Num frames 11400...
[2024-12-03 16:46:15,100][01348] Num frames 11500...
[2024-12-03 16:46:15,223][01348] Num frames 11600...
[2024-12-03 16:46:15,350][01348] Num frames 11700...
[2024-12-03 16:46:15,472][01348] Num frames 11800...
[2024-12-03 16:46:15,592][01348] Num frames 11900...
[2024-12-03 16:46:15,711][01348] Num frames 12000...
[2024-12-03 16:46:15,834][01348] Num frames 12100...
[2024-12-03 16:46:15,954][01348] Num frames 12200...
[2024-12-03 16:46:16,083][01348] Num frames 12300...
[2024-12-03 16:46:16,203][01348] Num frames 12400...
[2024-12-03 16:46:16,331][01348] Num frames 12500...
[2024-12-03 16:46:16,452][01348] Num frames 12600...
[2024-12-03 16:46:16,572][01348] Num frames 12700...
[2024-12-03 16:46:16,692][01348] Num frames 12800...
[2024-12-03 16:46:16,811][01348] Num frames 12900...
[2024-12-03 16:46:16,936][01348] Num frames 13000...
[2024-12-03 16:46:17,065][01348] Num frames 13100...
[2024-12-03 16:46:17,186][01348] Num frames 13200...
[2024-12-03 16:46:17,317][01348] Avg episode rewards: #0: 33.556, true rewards: #0: 13.256
[2024-12-03 16:46:17,319][01348] Avg episode reward: 33.556, avg true_objective: 13.256
[2024-12-03 16:47:35,363][01348] Replay video saved to /content/train_dir/default_experiment/replay.mp4!