[2024-11-28 11:51:57,223] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,406] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,453] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,541] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,571] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,578] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,588] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-11-28 11:51:57,603] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:118: UserWarning: onnxruntime training package info: package_name: onnxruntime-training
  warnings.warn("onnxruntime training package info: package_name: %s" % package_name)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:119: UserWarning: onnxruntime training package info: __version__: 1.18.0
  warnings.warn("onnxruntime training package info: __version__: %s" % version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:120: UserWarning: onnxruntime training package info: cuda_version: 12.2
  warnings.warn("onnxruntime training package info: cuda_version: %s" % cuda_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:121: UserWarning: onnxruntime build info: cudart_version: 12020
  warnings.warn("onnxruntime build info: cudart_version: %s" % cudart_version)
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:129: UserWarning: WARNING: failed to find cudart version that matches onnxruntime build info
  warnings.warn("WARNING: failed to find cudart version that matches onnxruntime build info")
/opt/conda/envs/ptca/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_validation.py:130: UserWarning: WARNING: found cudart versions: [12040]
  warnings.warn("WARNING: found cudart versions: %s" % local_cudart_versions)
[W1128 11:51:59.866321328 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.866358718 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.068975946 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.068998147 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.146571022 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.146605125 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.169732867 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.169822504 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.287186808 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.287221022 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.339576722 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.339606257 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
[W1128 11:51:59.494867135 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.494897541 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
[W1128 11:51:59.543628853 Utils.hpp:164] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[W1128 11:51:59.543758645 Utils.hpp:135] Warning: Environment variable NCCL_ASYNC_ERROR_HANDLING is deprecated; use TORCH_NCCL_ASYNC_ERROR_HANDLING instead (function operator())
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type t5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'use_exponential_sigmas', 'use_beta_sigmas', 'use_karras_sigmas', 'invert_sigmas'} was not found in config. Values will be initialized to default values.
Downloading shards: 0%| | 0/2 [00:00
node-0:1933430:1933430 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933430:1933430 [0] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933430:1933430 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933430:1933430 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933430:1933430 [0] NCCL INFO cudaDriverVersion 12040
NCCL version 2.19.4+cuda12.4
node-0:1933432:1933432 [2] NCCL INFO cudaDriverVersion 12040
node-0:1933432:1933432 [2] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933432:1933432 [2] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933432:1933432 [2] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933432:1933432 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933432:1933432 [2] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933430:1934228 [0] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933430:1934228 [0] NCCL INFO P2P plugin IBext
node-0:1933430:1934228 [0] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933430:1934228 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933430:1934228 [0] NCCL INFO Using non-device net plugin version 0
node-0:1933430:1934228 [0] NCCL INFO Using network IBext
node-0:1933432:1934230 [2] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933432:1934230 [2] NCCL INFO P2P plugin IBext
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
node-0:1933432:1934230 [2] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933432:1934230 [2] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933432:1934230 [2] NCCL INFO Using non-device net plugin version 0
node-0:1933432:1934230 [2] NCCL INFO Using network IBext
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
node-0:1933431:1933431 [1] NCCL INFO cudaDriverVersion 12040
node-0:1933431:1933431 [1] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933431:1933431 [1] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933431:1933431 [1] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933431:1933431 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933431:1933431 [1] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933436:1933436 [6] NCCL INFO cudaDriverVersion 12040
node-0:1933436:1933436 [6] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933436:1933436 [6] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933436:1933436 [6] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933436:1933436 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933436:1933436 [6] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933433:1933433 [3] NCCL INFO cudaDriverVersion 12040
node-0:1933433:1933433 [3] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933431:1934545 [1] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933431:1934545 [1] NCCL INFO P2P plugin IBext
node-0:1933433:1933433 [3] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933433:1933433 [3] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933433:1933433 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933433:1933433 [3] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933436:1934547 [6] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933436:1934547 [6] NCCL INFO P2P plugin IBext
node-0:1933431:1934545 [1] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933431:1934545 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933431:1934545 [1] NCCL INFO Using non-device net plugin version 0
node-0:1933431:1934545 [1] NCCL INFO Using network IBext
node-0:1933436:1934547 [6] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933436:1934547 [6] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933436:1934547 [6] NCCL INFO Using non-device net plugin version 0
node-0:1933436:1934547 [6] NCCL INFO Using network IBext
Using decoupled weight decay
node-0:1933433:1934551 [3] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933433:1934551 [3] NCCL INFO P2P plugin IBext
Using decoupled weight decay
node-0:1933433:1934551 [3] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933433:1934551 [3] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933433:1934551 [3] NCCL INFO Using non-device net plugin version 0
node-0:1933433:1934551 [3] NCCL INFO Using network IBext
Using decoupled weight decay
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
/opt/conda/envs/ptca/lib/python3.10/site-packages/apex/normalization/fused_layer_norm.py:188: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=False):
node-0:1933435:1933435 [5] NCCL INFO cudaDriverVersion 12040
node-0:1933435:1933435 [5] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933434:1933434 [4] NCCL INFO cudaDriverVersion 12040
node-0:1933434:1933434 [4] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933435:1933435 [5] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933435:1933435 [5] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933435:1933435 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933435:1933435 [5] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933434:1933434 [4] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933434:1933434 [4] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933434:1933434 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933434:1933434 [4] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933437:1933437 [7] NCCL INFO cudaDriverVersion 12040
node-0:1933437:1933437 [7] NCCL INFO Bootstrap : Using eth0:10.1.32.75<0>
node-0:1933437:1933437 [7] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v7 symbol.
node-0:1933437:1933437 [7] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin v6 (v6)
node-0:1933437:1933437 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v7 symbol.
node-0:1933437:1933437 [7] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin symbol (>= v5). ncclCollNetPlugin symbols v4 and lower are not supported.
node-0:1933435:1934848 [5] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933435:1934848 [5] NCCL INFO P2P plugin IBext
node-0:1933434:1934850 [4] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933434:1934850 [4] NCCL INFO P2P plugin IBext
node-0:1933435:1934848 [5] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933435:1934848 [5] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933435:1934848 [5] NCCL INFO Using non-device net plugin version 0
node-0:1933435:1934848 [5] NCCL INFO Using network IBext
node-0:1933434:1934850 [4] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933434:1934850 [4] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933434:1934850 [4] NCCL INFO Using non-device net plugin version 0
node-0:1933434:1934850 [4] NCCL INFO Using network IBext
node-0:1933437:1934852 [7] NCCL INFO Plugin Path : /opt/nccl-rdma-sharp-plugins/lib/libnccl-net.so
node-0:1933437:1934852 [7] NCCL INFO P2P plugin IBext
node-0:1933437:1934852 [7] NCCL INFO NCCL_IB_PCI_RELAXED_ORDERING set by environment to 1.
node-0:1933437:1934852 [7] NCCL INFO NET/IB : Using [0]mlx5_0:1/IB [1]mlx5_1:1/IB [2]mlx5_2:1/IB [3]mlx5_3:1/IB [4]mlx5_5:1/IB [5]mlx5_6:1/IB [6]mlx5_7:1/IB [7]mlx5_8:1/IB [RO]; OOB eth0:10.1.32.75<0>
node-0:1933437:1934852 [7] NCCL INFO Using non-device net plugin version 0
node-0:1933437:1934852 [7] NCCL INFO Using network IBext
node-0:1933437:1934852 [7] NCCL INFO comm 0x2b32c390 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId e00000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933434:1934850 [4] NCCL INFO comm 0x1c4fa290 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId b00000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933435:1934848 [5] NCCL INFO comm 0x18508c00 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId c00000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933430:1934228 [0] NCCL INFO comm 0x2691bc80 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 100000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933433:1934551 [3] NCCL INFO comm 0x280a7fe0 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId 400000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933432:1934230 [2] NCCL INFO comm 0x190895c0 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId 300000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933431:1934545 [1] NCCL INFO comm 0x18aedf80 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 200000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933436:1934547 [6] NCCL INFO comm 0x191c29e0 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId d00000 commId 0xf0f4eccd25f6e07d - Init START
node-0:1933433:1934551 [3] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933437:1934852 [7] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933432:1934230 [2] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933434:1934850 [4] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933435:1934848 [5] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933431:1934545 [1] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933436:1934547 [6] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933430:1934228 [0] NCCL INFO Loading topology file /opt/microsoft/ndv4-topo.xml
node-0:1933437:1934852 [7] NCCL INFO Loading unnamed topology
node-0:1933434:1934850 [4] NCCL INFO Loading unnamed topology
node-0:1933432:1934230 [2] NCCL INFO Loading unnamed topology
node-0:1933433:1934551 [3] NCCL INFO Loading unnamed topology
node-0:1933435:1934848 [5] NCCL INFO Loading unnamed topology
node-0:1933431:1934545 [1] NCCL INFO Loading unnamed topology
node-0:1933436:1934547 [6] NCCL INFO Loading unnamed topology
node-0:1933430:1934228 [0] NCCL INFO Loading unnamed topology
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0
node-0:1933437:1934852 [7] NCCL INFO Could not find real path of 
/sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933437:1934852 [7] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933437:1934852 [7] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933437:1934852 [7] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933437:1934852 [7] NCCL INFO CPU/0 (1/2/-1) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/3 
node-0:1933437:1934852 [7] NCCL INFO CPU/1 (1/2/-1) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933437:1934852 [7] NCCL INFO CPU/2 (1/2/-1) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933437:1934852 [7] NCCL INFO CPU/3 (1/2/-1) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933437:1934852 [7] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933437:1934852 [7] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933437:1934852 [7] NCCL INFO + SYS[16.0] - 
CPU/2 node-0:1933437:1934852 [7] NCCL INFO ========================================== node-0:1933437:1934852 [7] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) 
GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933437:1934852 [7] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933437:1934852 [7] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933437:1934852 [7] NCCL INFO Setting affinity for GPU 7 to ffff,0000ffff node-0:1933437:1934852 [7] NCCL INFO NVLS multicast support is not available on dev 7 node-0:1933437:1934852 [7] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933437:1934852 [7] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL 
INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933437:1934852 [7] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933437:1934852 [7] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 
node-0:1933437:1934852 [7] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933437:1934852 [7] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933434:1934850 [4] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933434:1934850 [4] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of 
/sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933434:1934850 [4] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933434:1934850 [4] NCCL INFO CPU/0 (1/2/-1) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933434:1934850 [4] NCCL INFO CPU/1 (1/2/-1) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933434:1934850 [4] NCCL INFO CPU/2 (1/2/-1) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/B00000 (4) 
node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933434:1934850 [4] NCCL INFO CPU/3 (1/2/-1) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933434:1934850 [4] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933434:1934850 [4] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933434:1934850 [4] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933434:1934850 [4] NCCL INFO ========================================== node-0:1933434:1934850 [4] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933434:1934850 [4] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 
(3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933434:1934850 [4] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933434:1934850 [4] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933434:1934850 [4] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933434:1934850 [4] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933434:1934850 [4] NCCL INFO GPU/D00000 :GPU/100000 
(2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933434:1934850 [4] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933434:1934850 [4] NCCL INFO Setting affinity for GPU 4 to ffff,0000ffff node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933431:1934545 [1] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933434:1934850 [4] NCCL INFO NVLS multicast support is not available on dev 4 node-0:1933434:1934850 [4] NCCL INFO Pattern 4, crossNic 0, nChannels 
12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933434:1934850 [4] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933434:1934850 [4] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933434:1934850 [4] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933434:1934850 [4] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933434:1934850 [4] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933434:1934850 [4] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL 
INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933434:1934850 [4] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933435:1934848 [5] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933432:1934230 [2] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933436:1934547 [6] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:04.0 node-0:1933432:1934230 [2] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933436:1934547 [6] NCCL INFO Could not find real path of 
/sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933432:1934230 [2] NCCL INFO CPU/0 (1/2/-1) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933432:1934230 [2] NCCL INFO CPU/1 (1/2/-1) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933432:1934230 [2] NCCL INFO CPU/2 (1/2/-1) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] 
- NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933432:1934230 [2] NCCL INFO CPU/3 (1/2/-1) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933432:1934230 [2] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933432:1934230 [2] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933432:1934230 [2] NCCL INFO ========================================== node-0:1933432:1934230 [2] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 
(2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933432:1934230 [2] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933432:1934230 [2] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) 
node-0:1933432:1934230 [2] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933432:1934230 [2] NCCL INFO Setting affinity for GPU 2 to ffff,0000ffff node-0:1933433:1934551 [3] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933432:1934230 [2] NCCL INFO NVLS multicast support is not available on dev 2 node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:01.0 node-0:1933432:1934230 [2] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933432:1934230 [2] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 
node-0:1933431:1934545 [1] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933432:1934230 [2] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933432:1934230 [2] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933432:1934230 [2] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO CPU/0 (1/2/-1) node-0:1933432:1934230 [2] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933432:1934230 [2] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:02.0 node-0:1933432:1934230 [2] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933432:1934230 [2] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933432:1934230 [2] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933432:1934230 [2] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933432:1934230 [2] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933432:1934230 [2] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933432:1934230 [2] NCCL 
INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO CPU/1 (1/2/-1) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933431:1934545 [1] NCCL INFO CPU/2 (1/2/-1) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933431:1934545 [1] NCCL INFO CPU/3 (1/2/-1) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933431:1934545 [1] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933431:1934545 [1] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - 
CPU/1 node-0:1933431:1934545 [1] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933431:1934545 [1] NCCL INFO ========================================== node-0:1933431:1934545 [1] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933431:1934545 [1] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933431:1934545 [1] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933431:1934545 [1] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933431:1934545 [1] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 
(2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO Could not find real path of /sys/class/pci_bus/ffff:ff/../../ffff:ff:03.0 node-0:1933431:1934545 [1] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933431:1934545 [1] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933431:1934545 [1] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933431:1934545 [1] NCCL INFO Setting affinity for GPU 1 to ffff,0000ffff node-0:1933431:1934545 [1] NCCL INFO NVLS multicast support is not available on dev 1 node-0:1933431:1934545 [1] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933431:1934545 [1] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 
GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933431:1934545 [1] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 
GPU/5 GPU/6 GPU/7 node-0:1933431:1934545 [1] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933436:1934547 [6] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933435:1934848 [5] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933435:1934848 [5] NCCL INFO CPU/0 (1/2/-1) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933435:1934848 [5] NCCL INFO CPU/1 (1/2/-1) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933435:1934848 [5] NCCL INFO CPU/2 (1/2/-1) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10500000 
node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933435:1934848 [5] NCCL INFO CPU/3 (1/2/-1) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933435:1934848 [5] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933435:1934848 [5] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933435:1934848 [5] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933435:1934848 [5] NCCL INFO ========================================== node-0:1933435:1934848 [5] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/300000 :GPU/100000 
(2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933435:1934848 [5] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) 
node-0:1933435:1934848 [5] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933435:1934848 [5] NCCL INFO Setting affinity for GPU 5 to ffff,0000ffff node-0:1933433:1934551 [3] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933436:1934547 [6] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933436:1934547 [6] NCCL INFO CPU/0 (1/2/-1) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933436:1934547 [6] NCCL INFO CPU/1 (1/2/-1) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/3 
node-0:1933436:1934547 [6] NCCL INFO CPU/2 (1/2/-1) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933436:1934547 [6] NCCL INFO CPU/3 (1/2/-1) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933436:1934547 [6] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933436:1934547 [6] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933436:1934547 [6] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933436:1934547 [6] NCCL INFO ========================================== node-0:1933436:1934547 [6] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) 
GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933436:1934547 [6] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) 
GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933436:1934547 [6] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933430:1934228 [0] NCCL INFO NCCL_NET_GDR_LEVEL set by environment to SYS node-0:1933436:1934547 [6] NCCL INFO Setting affinity for GPU 6 to ffff,0000ffff node-0:1933436:1934547 [6] NCCL INFO NVLS multicast support is not available on dev 6 node-0:1933436:1934547 [6] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933436:1934547 [6] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 
node-0:1933436:1934547 [6] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933436:1934547 [6] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933436:1934547 [6] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933433:1934551 [3] NCCL INFO CPU/0 (1/2/-1) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/200000 (1) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933433:1934551 [3] NCCL INFO + 
SYS[16.0] - CPU/1 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933433:1934551 [3] NCCL INFO CPU/1 (1/2/-1) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933433:1934551 [3] NCCL INFO CPU/2 (1/2/-1) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933433:1934551 [3] NCCL INFO CPU/3 (1/2/-1) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10700000 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933433:1934551 [3] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933433:1934551 [3] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933433:1934551 [3] NCCL 
INFO + SYS[16.0] - CPU/0 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933433:1934551 [3] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933433:1934551 [3] NCCL INFO ========================================== node-0:1933433:1934551 [3] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 
(2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933433:1934551 [3] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933430:1934228 [0] NCCL INFO === System : maxBw 240.0 totalBw 240.0 === node-0:1933430:1934228 [0] NCCL INFO CPU/0 (1/2/-1) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - PCI/FFFFFF010 (0) node-0:1933433:1934551 [3] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/100000 (0) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10100000 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/200000 (1) 
node-0:1933433:1934551 [3] NCCL INFO Setting affinity for GPU 3 to ffff,0000ffff node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10200000 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933430:1934228 [0] NCCL INFO CPU/1 (1/2/-1) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - PCI/FFFFFF020 (0) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/300000 (2) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10300000 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/400000 (3) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10400000 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933430:1934228 [0] NCCL INFO CPU/2 (1/2/-1) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - PCI/FFFFFF030 (0) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/B00000 (4) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10500000 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/C00000 (5) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10600000 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/3 node-0:1933430:1934228 [0] NCCL INFO CPU/3 (1/2/-1) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - PCI/FFFFFF040 (0) node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/D00000 (6) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + 
PCI[24.0] - NIC/10700000 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - GPU/E00000 (7) node-0:1933430:1934228 [0] NCCL INFO + NVL[240.0] - NVS/0 node-0:1933430:1934228 [0] NCCL INFO + PCI[24.0] - NIC/10800000 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/0 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/1 node-0:1933430:1934228 [0] NCCL INFO + SYS[16.0] - CPU/2 node-0:1933430:1934228 [0] NCCL INFO ========================================== node-0:1933430:1934228 [0] NCCL INFO GPU/100000 :GPU/100000 (0/5000.000000/LOC) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/200000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (0/5000.000000/LOC) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (2/24.000000/PHB) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/300000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (0/5000.000000/LOC) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/400000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (0/5000.000000/LOC) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 
(1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (2/24.000000/PHB) CPU/2 (3/16.000000/SYS) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/B00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (0/5000.000000/LOC) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/C00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (0/5000.000000/LOC) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (2/24.000000/PHB) CPU/3 (3/16.000000/SYS) node-0:1933430:1934228 [0] NCCL INFO GPU/D00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (0/5000.000000/LOC) GPU/E00000 (2/240.000000/NVL) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933430:1934228 [0] NCCL INFO GPU/E00000 :GPU/100000 (2/240.000000/NVL) GPU/200000 (2/240.000000/NVL) GPU/300000 (2/240.000000/NVL) GPU/400000 (2/240.000000/NVL) GPU/B00000 (2/240.000000/NVL) GPU/C00000 (2/240.000000/NVL) GPU/D00000 (2/240.000000/NVL) GPU/E00000 (0/5000.000000/LOC) NVS/0 (1/240.000000/NVL) CPU/0 (3/16.000000/SYS) CPU/1 (3/16.000000/SYS) CPU/2 (3/16.000000/SYS) CPU/3 (2/24.000000/PHB) node-0:1933430:1934228 [0] NCCL INFO Setting affinity for GPU 0 to ffff,0000ffff node-0:1933433:1934551 [3] NCCL INFO NVLS multicast support is not available on dev 3 node-0:1933435:1934848 [5] NCCL INFO NVLS multicast support is not 
available on dev 5 node-0:1933435:1934848 [5] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933433:1934551 [3] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933435:1934848 [5] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL 
INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933435:1934848 [5] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933435:1934848 [5] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 
GPU/7 node-0:1933435:1934848 [5] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933435:1934848 [5] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933433:1934551 [3] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO NVLS multicast support is not available on dev 0 node-0:1933430:1934228 [0] NCCL INFO Pattern 4, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933430:1934228 [0] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 
GPU/7 node-0:1933430:1934228 [0] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO Pattern 1, crossNic 0, nChannels 12, bw 20.000000/20.000000, type NVL/PIX, sameChannels 1 node-0:1933430:1934228 [0] NCCL INFO 0 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 1 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 2 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 3 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 4 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 5 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 6 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 7 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 8 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 9 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 10 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO 11 : GPU/0 GPU/1 GPU/2 GPU/3 GPU/4 GPU/5 GPU/6 GPU/7 node-0:1933430:1934228 [0] NCCL INFO Tree 0 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 12 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 1 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 13 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 2 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 14 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 3 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 15 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 4 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 16 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 5 : -1 -> 
0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 17 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 6 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 18 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 7 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 19 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 8 : -1 -> 0 -> 1/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 00 : 5 -> 6 -> 7 node-0:1933430:1934228 [0] NCCL INFO Tree 20 : -1 -> 0 -> 1/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 9 : -1 -> 0 -> 1/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 01 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 0 : 0 -> 1 -> 2/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 21 : -1 -> 0 -> 1/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 02 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 12 : 0 -> 1 -> 2/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 10 : -1 -> 0 -> 1/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 03 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 1 : 0 -> 1 -> 2/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 22 : -1 -> 0 -> 1/-1/-1 node-0:1933437:1934852 [7] NCCL INFO Ring 00 : 6 -> 7 -> 0 node-0:1933431:1934545 [1] NCCL INFO Tree 13 : 0 -> 1 -> 2/-1/-1 node-0:1933430:1934228 [0] NCCL INFO Tree 11 : -1 -> 0 -> 1/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 00 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 01 : 6 -> 7 -> 0 node-0:1933431:1934545 [1] NCCL INFO Tree 2 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 04 : 5 -> 6 -> 7 node-0:1933430:1934228 [0] NCCL INFO Tree 23 : -1 -> 0 -> 1/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 01 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 02 : 6 -> 7 -> 0 node-0:1933436:1934547 [6] NCCL INFO Ring 05 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 14 : 0 -> 1 -> 2/-1/-1 node-0:1933432:1934230 [2] NCCL INFO Ring 00 : 1 -> 2 -> 3 node-0:1933435:1934848 [5] NCCL INFO 
Ring 02 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 03 : 6 -> 7 -> 0 node-0:1933431:1934545 [1] NCCL INFO Tree 3 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 06 : 5 -> 6 -> 7 node-0:1933434:1934850 [4] NCCL INFO Ring 00 : 3 -> 4 -> 5 node-0:1933432:1934230 [2] NCCL INFO Ring 01 : 1 -> 2 -> 3 node-0:1933435:1934848 [5] NCCL INFO Ring 03 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 04 : 6 -> 7 -> 0 node-0:1933431:1934545 [1] NCCL INFO Tree 15 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 07 : 5 -> 6 -> 7 node-0:1933433:1934551 [3] NCCL INFO Ring 00 : 2 -> 3 -> 4 node-0:1933431:1934545 [1] NCCL INFO Tree 4 : 0 -> 1 -> 2/-1/-1 node-0:1933434:1934850 [4] NCCL INFO Ring 01 : 3 -> 4 -> 5 node-0:1933432:1934230 [2] NCCL INFO Ring 02 : 1 -> 2 -> 3 node-0:1933431:1934545 [1] NCCL INFO Tree 16 : 0 -> 1 -> 2/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 04 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 05 : 6 -> 7 -> 0 node-0:1933436:1934547 [6] NCCL INFO Ring 08 : 5 -> 6 -> 7 node-0:1933433:1934551 [3] NCCL INFO Ring 01 : 2 -> 3 -> 4 node-0:1933434:1934850 [4] NCCL INFO Ring 02 : 3 -> 4 -> 5 node-0:1933436:1934547 [6] NCCL INFO Ring 09 : 5 -> 6 -> 7 node-0:1933433:1934551 [3] NCCL INFO Ring 02 : 2 -> 3 -> 4 node-0:1933434:1934850 [4] NCCL INFO Ring 03 : 3 -> 4 -> 5 node-0:1933436:1934547 [6] NCCL INFO Ring 10 : 5 -> 6 -> 7 node-0:1933432:1934230 [2] NCCL INFO Ring 03 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 03 : 2 -> 3 -> 4 node-0:1933434:1934850 [4] NCCL INFO Ring 04 : 3 -> 4 -> 5 node-0:1933431:1934545 [1] NCCL INFO Tree 5 : 0 -> 1 -> 2/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 05 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 06 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 00/24 : 0 1 2 3 4 5 6 7 node-0:1933436:1934547 [6] NCCL INFO Ring 11 : 5 -> 6 -> 7 node-0:1933432:1934230 [2] NCCL INFO Ring 04 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 04 : 2 -> 
3 -> 4 node-0:1933436:1934547 [6] NCCL INFO Ring 12 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 17 : 0 -> 1 -> 2/-1/-1 node-0:1933434:1934850 [4] NCCL INFO Ring 05 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Ring 06 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 07 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 01/24 : 0 1 2 3 4 5 6 7 node-0:1933432:1934230 [2] NCCL INFO Ring 05 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 05 : 2 -> 3 -> 4 node-0:1933431:1934545 [1] NCCL INFO Tree 6 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 13 : 5 -> 6 -> 7 node-0:1933434:1934850 [4] NCCL INFO Ring 06 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Ring 07 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 08 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 02/24 : 0 1 2 3 4 5 6 7 node-0:1933432:1934230 [2] NCCL INFO Ring 06 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 06 : 2 -> 3 -> 4 node-0:1933431:1934545 [1] NCCL INFO Tree 18 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 14 : 5 -> 6 -> 7 node-0:1933434:1934850 [4] NCCL INFO Ring 07 : 3 -> 4 -> 5 node-0:1933431:1934545 [1] NCCL INFO Tree 7 : 0 -> 1 -> 2/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 08 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 09 : 6 -> 7 -> 0 node-0:1933434:1934850 [4] NCCL INFO Ring 08 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 03/24 : 0 1 2 3 4 5 6 7 node-0:1933432:1934230 [2] NCCL INFO Ring 07 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 07 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO Ring 10 : 6 -> 7 -> 0 node-0:1933436:1934547 [6] NCCL INFO Ring 15 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 19 : 0 -> 1 -> 2/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 09 : 4 -> 5 -> 6 node-0:1933434:1934850 [4] NCCL INFO Ring 09 : 3 -> 4 -> 5 node-0:1933432:1934230 [2] NCCL INFO Ring 08 : 1 -> 2 -> 3 node-0:1933430:1934228 [0] NCCL INFO Channel 
04/24 : 0 1 2 3 4 5 6 7 node-0:1933433:1934551 [3] NCCL INFO Ring 08 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO Ring 11 : 6 -> 7 -> 0 node-0:1933436:1934547 [6] NCCL INFO Ring 16 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 8 : 0 -> 1 -> 2/-1/-1 node-0:1933433:1934551 [3] NCCL INFO Ring 09 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 10 : 4 -> 5 -> 6 node-0:1933434:1934850 [4] NCCL INFO Ring 10 : 3 -> 4 -> 5 node-0:1933432:1934230 [2] NCCL INFO Ring 09 : 1 -> 2 -> 3 node-0:1933430:1934228 [0] NCCL INFO Channel 05/24 : 0 1 2 3 4 5 6 7 node-0:1933437:1934852 [7] NCCL INFO Ring 12 : 6 -> 7 -> 0 node-0:1933436:1934547 [6] NCCL INFO Ring 17 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 20 : 0 -> 1 -> 2/-1/-1 node-0:1933432:1934230 [2] NCCL INFO Ring 10 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 10 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 11 : 4 -> 5 -> 6 node-0:1933434:1934850 [4] NCCL INFO Ring 11 : 3 -> 4 -> 5 node-0:1933437:1934852 [7] NCCL INFO Ring 13 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 06/24 : 0 1 2 3 4 5 6 7 node-0:1933436:1934547 [6] NCCL INFO Ring 18 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 9 : 0 -> 1 -> 2/-1/-1 node-0:1933432:1934230 [2] NCCL INFO Ring 11 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 11 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 12 : 4 -> 5 -> 6 node-0:1933434:1934850 [4] NCCL INFO Ring 12 : 3 -> 4 -> 5 node-0:1933437:1934852 [7] NCCL INFO Ring 14 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 07/24 : 0 1 2 3 4 5 6 7 node-0:1933436:1934547 [6] NCCL INFO Ring 19 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 21 : 0 -> 1 -> 2/-1/-1 node-0:1933432:1934230 [2] NCCL INFO Ring 12 : 1 -> 2 -> 3 node-0:1933437:1934852 [7] NCCL INFO Ring 15 : 6 -> 7 -> 0 node-0:1933433:1934551 [3] NCCL INFO Ring 12 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 13 : 4 -> 5 -> 6 node-0:1933434:1934850 [4] NCCL 
INFO Ring 13 : 3 -> 4 -> 5 node-0:1933431:1934545 [1] NCCL INFO Tree 10 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 20 : 5 -> 6 -> 7 node-0:1933430:1934228 [0] NCCL INFO Channel 08/24 : 0 1 2 3 4 5 6 7 node-0:1933435:1934848 [5] NCCL INFO Ring 14 : 4 -> 5 -> 6 node-0:1933431:1934545 [1] NCCL INFO Tree 22 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 21 : 5 -> 6 -> 7 node-0:1933432:1934230 [2] NCCL INFO Ring 13 : 1 -> 2 -> 3 node-0:1933433:1934551 [3] NCCL INFO Ring 13 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 15 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 16 : 6 -> 7 -> 0 node-0:1933432:1934230 [2] NCCL INFO Ring 14 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 14 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 09/24 : 0 1 2 3 4 5 6 7 node-0:1933431:1934545 [1] NCCL INFO Tree 11 : 0 -> 1 -> 2/-1/-1 node-0:1933436:1934547 [6] NCCL INFO Ring 22 : 5 -> 6 -> 7 node-0:1933433:1934551 [3] NCCL INFO Ring 14 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO Ring 16 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Ring 17 : 6 -> 7 -> 0 node-0:1933432:1934230 [2] NCCL INFO Ring 15 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 15 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 10/24 : 0 1 2 3 4 5 6 7 node-0:1933436:1934547 [6] NCCL INFO Ring 23 : 5 -> 6 -> 7 node-0:1933431:1934545 [1] NCCL INFO Tree 23 : 0 -> 1 -> 2/-1/-1 node-0:1933435:1934848 [5] NCCL INFO Ring 17 : 4 -> 5 -> 6 node-0:1933433:1934551 [3] NCCL INFO Ring 15 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO Ring 18 : 6 -> 7 -> 0 node-0:1933432:1934230 [2] NCCL INFO Ring 16 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 16 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 11/24 : 0 1 2 3 4 5 6 7 node-0:1933436:1934547 [6] NCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5 [2] 7/-1/-1->6->5 [3] 7/-1/-1->6->5 [4] 7/-1/-1->6->5 [5] 7/-1/-1->6->5 [6] 7/-1/-1->6->5 [7] 7/-1/-1->6->5 [8] 
7/-1/-1->6->5 [9] 7/-1/-1->6->5 [10] 7/-1/-1->6->5 [11] 7/-1/-1->6->5 [12] 7/-1/-1->6->5 [13] 7/-1/-1->6->5 [14] 7/-1/-1->6->5 [15] 7/-1/-1->6->5 [16] 7/-1/-1->6->5 [17] 7/-1/-1->6->5 [18] 7/-1/-1->6->5 [19] 7/-1/-1->6->5 [20] 7/-1/-1->6->5 [21] 7/-1/-1->6->5 [22] 7/-1/-1->6->5 [23] 7/-1/-1->6->5 node-0:1933435:1934848 [5] NCCL INFO Ring 18 : 4 -> 5 -> 6 node-0:1933433:1934551 [3] NCCL INFO Ring 16 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO Ring 19 : 6 -> 7 -> 0 node-0:1933432:1934230 [2] NCCL INFO Ring 17 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 17 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 12/24 : 0 1 2 3 4 5 6 7 node-0:1933435:1934848 [5] NCCL INFO Ring 19 : 4 -> 5 -> 6 node-0:1933433:1934551 [3] NCCL INFO Ring 17 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO Ring 20 : 6 -> 7 -> 0 node-0:1933430:1934228 [0] NCCL INFO Channel 13/24 : 0 1 2 3 4 5 6 7 node-0:1933432:1934230 [2] NCCL INFO Ring 18 : 1 -> 2 -> 3 node-0:1933436:1934547 [6] NCCL INFO P2P Chunksize set to 524288 node-0:1933437:1934852 [7] NCCL INFO Ring 21 : 6 -> 7 -> 0 node-0:1933434:1934850 [4] NCCL INFO Ring 18 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Ring 20 : 4 -> 5 -> 6 node-0:1933433:1934551 [3] NCCL INFO Ring 18 : 2 -> 3 -> 4 node-0:1933430:1934228 [0] NCCL INFO Channel 14/24 : 0 1 2 3 4 5 6 7 node-0:1933437:1934852 [7] NCCL INFO Ring 22 : 6 -> 7 -> 0 node-0:1933432:1934230 [2] NCCL INFO Ring 19 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 19 : 3 -> 4 -> 5 node-0:1933430:1934228 [0] NCCL INFO Channel 15/24 : 0 1 2 3 4 5 6 7 node-0:1933437:1934852 [7] NCCL INFO Ring 23 : 6 -> 7 -> 0 node-0:1933431:1934545 [1] NCCL INFO Ring 00 : 0 -> 1 -> 2 node-0:1933435:1934848 [5] NCCL INFO Ring 21 : 4 -> 5 -> 6 node-0:1933437:1934852 [7] NCCL INFO Trees [0] -1/-1/-1->7->6 [1] -1/-1/-1->7->6 [2] -1/-1/-1->7->6 [3] -1/-1/-1->7->6 [4] -1/-1/-1->7->6 [5] -1/-1/-1->7->6 [6] -1/-1/-1->7->6 [7] -1/-1/-1->7->6 [8] -1/-1/-1->7->6 [9] -1/-1/-1->7->6 [10] 
-1/-1/-1->7->6 [11] -1/-1/-1->7->6 [12] -1/-1/-1->7->6 [13] -1/-1/-1->7->6 [14] -1/-1/-1->7->6 [15] -1/-1/-1->7->6 [16] -1/-1/-1->7->6 [17] -1/-1/-1->7->6 [18] -1/-1/-1->7->6 [19] -1/-1/-1->7->6 [20] -1/-1/-1->7->6 [21] -1/-1/-1->7->6 [22] -1/-1/-1->7->6 [23] -1/-1/-1->7->6 node-0:1933430:1934228 [0] NCCL INFO Channel 16/24 : 0 1 2 3 4 5 6 7 node-0:1933431:1934545 [1] NCCL INFO Ring 01 : 0 -> 1 -> 2 node-0:1933433:1934551 [3] NCCL INFO Ring 19 : 2 -> 3 -> 4 node-0:1933431:1934545 [1] NCCL INFO Ring 02 : 0 -> 1 -> 2 node-0:1933432:1934230 [2] NCCL INFO Ring 20 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 20 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Ring 22 : 4 -> 5 -> 6 node-0:1933430:1934228 [0] NCCL INFO Channel 17/24 : 0 1 2 3 4 5 6 7 node-0:1933433:1934551 [3] NCCL INFO Ring 20 : 2 -> 3 -> 4 node-0:1933437:1934852 [7] NCCL INFO P2P Chunksize set to 524288 node-0:1933431:1934545 [1] NCCL INFO Ring 03 : 0 -> 1 -> 2 node-0:1933432:1934230 [2] NCCL INFO Ring 21 : 1 -> 2 -> 3 node-0:1933434:1934850 [4] NCCL INFO Ring 21 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Ring 23 : 4 -> 5 -> 6 node-0:1933431:1934545 [1] NCCL INFO Ring 04 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Channel 18/24 : 0 1 2 3 4 5 6 7 node-0:1933434:1934850 [4] NCCL INFO Ring 22 : 3 -> 4 -> 5 node-0:1933435:1934848 [5] NCCL INFO Trees [0] 6/-1/-1->5->4 [1] 6/-1/-1->5->4 [2] 6/-1/-1->5->4 [3] 6/-1/-1->5->4 [4] 6/-1/-1->5->4 [5] 6/-1/-1->5->4 [6] 6/-1/-1->5->4 [7] 6/-1/-1->5->4 [8] 6/-1/-1->5->4 [9] 6/-1/-1->5->4 [10] 6/-1/-1->5->4 [11] 6/-1/-1->5->4 [12] 6/-1/-1->5->4 [13] 6/-1/-1->5->4 [14] 6/-1/-1->5->4 [15] 6/-1/-1->5->4 [16] 6/-1/-1->5->4 [17] 6/-1/-1->5->4 [18] 6/-1/-1->5->4 [19] 6/-1/-1->5->4 [20] 6/-1/-1->5->4 [21] 6/-1/-1->5->4 [22] 6/-1/-1->5->4 [23] 6/-1/-1->5->4 node-0:1933433:1934551 [3] NCCL INFO Ring 21 : 2 -> 3 -> 4 node-0:1933432:1934230 [2] NCCL INFO Ring 22 : 1 -> 2 -> 3 node-0:1933431:1934545 [1] NCCL INFO Ring 05 : 0 -> 1 -> 2 
node-0:1933430:1934228 [0] NCCL INFO Channel 19/24 : 0 1 2 3 4 5 6 7 node-0:1933434:1934850 [4] NCCL INFO Ring 23 : 3 -> 4 -> 5 node-0:1933433:1934551 [3] NCCL INFO Ring 22 : 2 -> 3 -> 4 node-0:1933435:1934848 [5] NCCL INFO P2P Chunksize set to 524288 node-0:1933432:1934230 [2] NCCL INFO Ring 23 : 1 -> 2 -> 3 node-0:1933431:1934545 [1] NCCL INFO Ring 06 : 0 -> 1 -> 2 node-0:1933434:1934850 [4] NCCL INFO Trees [0] 5/-1/-1->4->3 [1] 5/-1/-1->4->3 [2] 5/-1/-1->4->3 [3] 5/-1/-1->4->3 [4] 5/-1/-1->4->3 [5] 5/-1/-1->4->3 [6] 5/-1/-1->4->3 [7] 5/-1/-1->4->3 [8] 5/-1/-1->4->3 [9] 5/-1/-1->4->3 [10] 5/-1/-1->4->3 [11] 5/-1/-1->4->3 [12] 5/-1/-1->4->3 [13] 5/-1/-1->4->3 [14] 5/-1/-1->4->3 [15] 5/-1/-1->4->3 [16] 5/-1/-1->4->3 [17] 5/-1/-1->4->3 [18] 5/-1/-1->4->3 [19] 5/-1/-1->4->3 [20] 5/-1/-1->4->3 [21] 5/-1/-1->4->3 [22] 5/-1/-1->4->3 [23] 5/-1/-1->4->3 node-0:1933432:1934230 [2] NCCL INFO Trees [0] 3/-1/-1->2->1 [1] 3/-1/-1->2->1 [2] 3/-1/-1->2->1 [3] 3/-1/-1->2->1 [4] 3/-1/-1->2->1 [5] 3/-1/-1->2->1 [6] 3/-1/-1->2->1 [7] 3/-1/-1->2->1 [8] 3/-1/-1->2->1 [9] 3/-1/-1->2->1 [10] 3/-1/-1->2->1 [11] 3/-1/-1->2->1 [12] 3/-1/-1->2->1 [13] 3/-1/-1->2->1 [14] 3/-1/-1->2->1 [15] 3/-1/-1->2->1 [16] 3/-1/-1->2->1 [17] 3/-1/-1->2->1 [18] 3/-1/-1->2->1 [19] 3/-1/-1->2->1 [20] 3/-1/-1->2->1 [21] 3/-1/-1->2->1 [22] 3/-1/-1->2->1 [23] 3/-1/-1->2->1 node-0:1933431:1934545 [1] NCCL INFO Ring 07 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Channel 20/24 : 0 1 2 3 4 5 6 7 node-0:1933433:1934551 [3] NCCL INFO Ring 23 : 2 -> 3 -> 4 node-0:1933431:1934545 [1] NCCL INFO Ring 08 : 0 -> 1 -> 2 node-0:1933433:1934551 [3] NCCL INFO Trees [0] 4/-1/-1->3->2 [1] 4/-1/-1->3->2 [2] 4/-1/-1->3->2 [3] 4/-1/-1->3->2 [4] 4/-1/-1->3->2 [5] 4/-1/-1->3->2 [6] 4/-1/-1->3->2 [7] 4/-1/-1->3->2 [8] 4/-1/-1->3->2 [9] 4/-1/-1->3->2 [10] 4/-1/-1->3->2 [11] 4/-1/-1->3->2 [12] 4/-1/-1->3->2 [13] 4/-1/-1->3->2 [14] 4/-1/-1->3->2 [15] 4/-1/-1->3->2 [16] 4/-1/-1->3->2 [17] 4/-1/-1->3->2 [18] 4/-1/-1->3->2 [19] 
4/-1/-1->3->2 [20] 4/-1/-1->3->2 [21] 4/-1/-1->3->2 [22] 4/-1/-1->3->2 [23] 4/-1/-1->3->2 node-0:1933430:1934228 [0] NCCL INFO Channel 21/24 : 0 1 2 3 4 5 6 7 node-0:1933432:1934230 [2] NCCL INFO P2P Chunksize set to 524288 node-0:1933434:1934850 [4] NCCL INFO P2P Chunksize set to 524288 node-0:1933431:1934545 [1] NCCL INFO Ring 09 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Channel 22/24 : 0 1 2 3 4 5 6 7 node-0:1933431:1934545 [1] NCCL INFO Ring 10 : 0 -> 1 -> 2 node-0:1933433:1934551 [3] NCCL INFO P2P Chunksize set to 524288 node-0:1933430:1934228 [0] NCCL INFO Channel 23/24 : 0 1 2 3 4 5 6 7 node-0:1933431:1934545 [1] NCCL INFO Ring 11 : 0 -> 1 -> 2 node-0:1933431:1934545 [1] NCCL INFO Ring 12 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 00 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 13 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 01 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 14 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 02 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 15 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 03 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 16 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 04 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 17 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 05 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 18 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 06 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 19 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 07 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 20 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 08 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 21 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 09 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO Ring 22 : 0 -> 1 -> 2 node-0:1933430:1934228 [0] NCCL INFO Ring 10 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL 
INFO Ring 23 : 0 -> 1 -> 2 node-0:1933431:1934545 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0 [8] 2/-1/-1->1->0 [9] 2/-1/-1->1->0 [10] 2/-1/-1->1->0 [11] 2/-1/-1->1->0 [12] 2/-1/-1->1->0 [13] 2/-1/-1->1->0 [14] 2/-1/-1->1->0 [15] 2/-1/-1->1->0 [16] 2/-1/-1->1->0 [17] 2/-1/-1->1->0 [18] 2/-1/-1->1->0 [19] 2/-1/-1->1->0 [20] 2/-1/-1->1->0 [21] 2/-1/-1->1->0 [22] 2/-1/-1->1->0 [23] 2/-1/-1->1->0 node-0:1933430:1934228 [0] NCCL INFO Ring 11 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 12 : 7 -> 0 -> 1 node-0:1933431:1934545 [1] NCCL INFO P2P Chunksize set to 524288 node-0:1933430:1934228 [0] NCCL INFO Ring 13 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 14 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 15 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 16 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 17 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 18 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 19 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 20 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 21 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 22 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Ring 23 : 7 -> 0 -> 1 node-0:1933430:1934228 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] 1/-1/-1->0->-1 [7] 1/-1/-1->0->-1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 [16] 1/-1/-1->0->-1 [17] 1/-1/-1->0->-1 [18] 1/-1/-1->0->-1 [19] 1/-1/-1->0->-1 [20] 1/-1/-1->0->-1 [21] 1/-1/-1->0->-1 [22] 1/-1/-1->0->-1 [23] 1/-1/-1->0->-1 node-0:1933430:1934228 [0] NCCL INFO P2P Chunksize set to 524288 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 
5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 
node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 
node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 
node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 
node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 
node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 
node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 
node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 
node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 00/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 
0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 01/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 02/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 
canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 03/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 04/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO 
Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 05/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 06/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 7 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 
node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 00/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 07/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 00/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 01/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 
node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 08/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 01/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 02/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 09/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 
1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 02/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 03/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 00/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 10/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 03/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 04/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] 
NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 01/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 00/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 11/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 04/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 05/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 02/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 01/0 : 1[1] -> 2[2] via 
P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 12/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 05/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 06/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 03/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 02/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 13/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] 
NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 06/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 07/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 04/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 03/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 14/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 07/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 
0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 08/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 05/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 00/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 04/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 15/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 08/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 09/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 06/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 
canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 01/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 05/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 16/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 09/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 10/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 07/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 02/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 06/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 
1 node-0:1933434:1934850 [4] NCCL INFO Channel 17/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 10/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 11/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 08/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 03/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 07/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 11/0 : 6[6] -> 7[7] via P2P/CUMEM/read 
node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 12/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 09/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 04/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 08/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 12/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 13/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 10/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 
node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 05/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 09/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 13/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 11/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 14/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 06/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 10/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL 
INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 14/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 12/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 15/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 07/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 11/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 15/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 13/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 
2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 16/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 18/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 08/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 12/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 16/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 16/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 14/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 17/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 19/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 
selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 09/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 13/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 17/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 17/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 15/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 18/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 20/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 14/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 
node-0:1933430:1934228 [0] NCCL INFO Channel 18/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 18/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 16/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 19/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 21/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 15/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933433:1934551 [3] NCCL INFO Channel 10/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 19/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 19/0 : 6[6] -> 7[7] via P2P/CUMEM/read 
node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 17/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 20/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 22/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 16/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 20/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 11/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 20/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 18/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 21/0 : 2[2] -> 3[3] via P2P/CUMEM/read 
node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 23/0 : 4[4] -> 5[5] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 17/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 21/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 12/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 21/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 19/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 22/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 18/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 22/0 : 0[0] -> 1[1] via P2P/CUMEM/read 
node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 13/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 22/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 20/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 23/0 : 2[2] -> 3[3] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 19/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Channel 23/0 : 0[0] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 14/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 23/0 : 6[6] -> 7[7] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 21/0 : 7[7] -> 0[0] via P2P/CUMEM/read 
node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 20/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 15/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 22/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 21/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 16/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 0 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 23/0 : 7[7] -> 0[0] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 22/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 17/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 
node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 23/0 : 1[1] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 18/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 00/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 19/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 01/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 20/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 21/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 02/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL 
INFO Channel 22/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 23/0 : 3[3] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 03/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 04/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 05/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 06/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 07/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 08/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 09/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO 
Channel 10/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 11/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 12/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 13/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 14/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 15/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 16/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 17/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 18/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 
19/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 20/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 21/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 22/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 23/0 : 5[5] -> 6[6] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Connected all rings node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting 
transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Connected all rings node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Connected all rings node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 
node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 
node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 3 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 00/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 01/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 02/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 
3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 03/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 04/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 05/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Connected all rings node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 
node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 06/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 07/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 08/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 09/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 
node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 10/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 11/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 12/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 
2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 13/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 14/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 15/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 16/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 17/0 : 2[2] -> 
1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 4 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 18/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 00/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 19/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 01/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 20/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 
selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 02/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 2 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 21/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 03/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 04/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL 
INFO Channel 05/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 06/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 22/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933432:1934230 [2] NCCL INFO Rank 2 selecting transport for rank 1 node-0:1933432:1934230 [2] NCCL INFO Transport 0 canConnect 1 node-0:1933432:1934230 [2] NCCL INFO Channel 23/0 : 2[2] -> 1[1] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 07/0 : 3[3] -> 2[2] via 
P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Rank 0 selecting transport for rank 1 node-0:1933430:1934228 [0] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 08/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL 
INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 09/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 10/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 11/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 12/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO 
Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 16/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 17/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 13/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 18/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 19/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 20/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 14/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 21/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 
0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 15/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 22/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 16/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933431:1934545 [1] NCCL INFO Rank 1 selecting transport for rank 0 node-0:1933431:1934545 [1] NCCL INFO Transport 0 canConnect 1 node-0:1933431:1934545 [1] NCCL INFO Channel 23/0 : 1[1] -> 0[0] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 17/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 18/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 19/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 20/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 21/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 
canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 22/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933433:1934551 [3] NCCL INFO Rank 3 selecting transport for rank 2 node-0:1933433:1934551 [3] NCCL INFO Transport 0 canConnect 1 node-0:1933433:1934551 [3] NCCL INFO Channel 23/0 : 3[3] -> 2[2] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Connected all rings node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 00/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Connected all rings node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Connected all rings node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933435:1934848 [5] NCCL INFO Connected all rings node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 01/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 02/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] 
NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 03/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 04/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 
node-0:1933437:1934852 [7] NCCL INFO Channel 05/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 06/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 07/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 08/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 09/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 10/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 11/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 12/0 : 7[7] -> 6[6] via P2P/CUMEM/read 
node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 13/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 14/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 15/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] 
NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 16/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 17/0 : 7[7] -> 6[6] via P2P/CUMEM/read 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 18/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 19/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 20/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 5 node-0:1933434:1934850 [4] 
NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 7 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 21/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 00/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 01/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 00/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 22/0 : 7[7] -> 6[6] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 01/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933437:1934852 [7] NCCL INFO Rank 7 selecting transport for rank 6 node-0:1933437:1934852 [7] NCCL INFO Transport 0 canConnect 1 node-0:1933437:1934852 [7] NCCL INFO Channel 23/0 : 7[7] -> 6[6] via P2P/CUMEM/read 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 02/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 02/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 03/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 03/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 04/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 05/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 06/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 
5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 07/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 6 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 04/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 08/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 05/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 06/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 00/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 09/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 07/0 : 4[4] -> 3[3] via P2P/CUMEM/read 
node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 10/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 08/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 11/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 01/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 02/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 12/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 09/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 13/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 10/0 : 4[4] -> 3[3] via P2P/CUMEM/read 
node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 14/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 11/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 15/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 12/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 03/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 16/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 13/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 04/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 17/0 : 6[6] -> 5[5] via P2P/CUMEM/read 
node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 14/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 05/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 15/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 06/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 18/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 07/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 16/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 08/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 17/0 : 4[4] -> 3[3] via P2P/CUMEM/read 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 09/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 10/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 19/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 11/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 20/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 18/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 12/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 21/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 19/0 : 4[4] -> 3[3] via P2P/CUMEM/read 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 13/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 22/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 20/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 14/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 21/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933436:1934547 [6] NCCL INFO Rank 6 selecting transport for rank 5 node-0:1933436:1934547 [6] NCCL INFO Transport 0 canConnect 1 node-0:1933436:1934547 [6] NCCL INFO Channel 23/0 : 6[6] -> 5[5] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 15/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 16/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 22/0 : 4[4] -> 3[3] via P2P/CUMEM/read 
node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 17/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933434:1934850 [4] NCCL INFO Rank 4 selecting transport for rank 3 node-0:1933434:1934850 [4] NCCL INFO Transport 0 canConnect 1 node-0:1933434:1934850 [4] NCCL INFO Channel 23/0 : 4[4] -> 3[3] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 18/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 19/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 20/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 21/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 22/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933435:1934848 [5] NCCL INFO Rank 5 selecting transport for rank 4 node-0:1933435:1934848 [5] NCCL INFO Transport 0 canConnect 1 node-0:1933435:1934848 [5] NCCL INFO Channel 23/0 : 5[5] -> 4[4] via P2P/CUMEM/read node-0:1933430:1934228 [0] NCCL INFO Connected all trees node-0:1933430:1934228 [0] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 node-0:1933430:1934228 [0] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 
p2p channels per peer node-0:1933430:1934228 [0] NCCL INFO MSCCL: No external scheduler found, using internal implementation node-0:1933430:1934228 [0] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory node-0:1933430:1934228 [0] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms node-0:1933431:1934545 [1] NCCL INFO Connected all trees node-0:1933431:1934545 [1] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 node-0:1933431:1934545 [1] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer node-0:1933431:1934545 [1] NCCL INFO MSCCL: No external scheduler found, using internal implementation node-0:1933431:1934545 [1] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory node-0:1933431:1934545 [1] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms node-0:1933430:1934228 [0] NCCL INFO MSCCL: Initialization finished node-0:1933432:1934230 [2] NCCL INFO Connected all trees node-0:1933432:1934230 [2] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 node-0:1933432:1934230 [2] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer node-0:1933437:1934852 [7] NCCL INFO Connected all trees node-0:1933437:1934852 [7] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 node-0:1933437:1934852 [7] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer node-0:1933436:1934547 [6] NCCL INFO Connected all trees node-0:1933436:1934547 [6] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 
512
node-0:1933436:1934547 [6] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
node-0:1933431:1934545 [1] NCCL INFO MSCCL: Initialization finished
node-0:1933433:1934551 [3] NCCL INFO Connected all trees
node-0:1933435:1934848 [5] NCCL INFO Connected all trees
node-0:1933435:1934848 [5] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
node-0:1933435:1934848 [5] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
node-0:1933433:1934551 [3] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
node-0:1933433:1934551 [3] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
node-0:1933434:1934850 [4] NCCL INFO Connected all trees
node-0:1933434:1934850 [4] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512
node-0:1933434:1934850 [4] NCCL INFO 24 coll channels, 0 nvls channels, 32 p2p channels, 32 p2p channels per peer
node-0:1933432:1934230 [2] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933432:1934230 [2] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933432:1934230 [2] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933437:1934852 [7] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933437:1934852 [7] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933437:1934852 [7] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933436:1934547 [6] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933436:1934547 [6] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933436:1934547 [6] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933434:1934850 [4] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933434:1934850 [4] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933434:1934850 [4] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933435:1934848 [5] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933435:1934848 [5] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933433:1934551 [3] NCCL INFO MSCCL: No external scheduler found, using internal implementation
node-0:1933433:1934551 [3] NCCL INFO MSCCL: Internal Scheduler will use /lib/x86_64-linux-gnu/msccl-algorithms as algorithm directory and /lib/x86_64-linux-gnu/../share/nccl/msccl-algorithms as share algorithm directory and /usr/share/nccl/msccl-algorithms as package installed share algorithm directory
node-0:1933435:1934848 [5] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933433:1934551 [3] NCCL INFO Using MSCCL Algo files from /usr/share/nccl/msccl-algorithms
node-0:1933432:1934230 [2] NCCL INFO MSCCL: Initialization finished
node-0:1933436:1934547 [6] NCCL INFO MSCCL: Initialization finished
node-0:1933437:1934852 [7] NCCL INFO MSCCL: Initialization finished
node-0:1933434:1934850 [4] NCCL INFO MSCCL: Initialization finished
node-0:1933435:1934848 [5] NCCL INFO MSCCL: Initialization finished
node-0:1933433:1934551 [3] NCCL INFO MSCCL: Initialization finished
node-0:1933434:1934850 [4] NCCL INFO comm 0x1c4fa290 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId b00000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933436:1934547 [6] NCCL INFO comm 0x191c29e0 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId d00000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933433:1934551 [3] NCCL INFO comm 0x280a7fe0 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId 400000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933437:1934852 [7] NCCL INFO comm 0x2b32c390 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId e00000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933432:1934230 [2] NCCL INFO comm 0x190895c0 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId 300000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933435:1934848 [5] NCCL INFO comm 0x18508c00 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId c00000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933431:1934545 [1] NCCL INFO comm 0x18aedf80 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 200000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
node-0:1933430:1934228 [0] NCCL INFO comm 0x2691bc80 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 100000 commId 0xf0f4eccd25f6e07d - Init COMPLETE
[rank5]:[W1128 11:52:41.869205308 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank4]:[W1128 11:52:41.869354442 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank3]:[W1128 11:52:41.870670729 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank1]:[W1128 11:52:41.872397191 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank6]:[W1128 11:52:41.873209534 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank7]:[W1128 11:52:41.873396798 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank2]:[W1128 11:52:41.874716166 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
[rank0]:[W1128 11:52:41.876778389 Utils.hpp:110] Warning: Environment variable NCCL_BLOCKING_WAIT is deprecated; use TORCH_NCCL_BLOCKING_WAIT instead (function operator())
Steps: 0%| | 0/100 [00:00