[2025-01-15 22:08:51,908 I 8759 8759] (gcs_server) gcs_server_main.cc:52: Ray cluster metadata ray_version=2.40.0 ray_commit=22541c38dbef25286cd6d19f1c151bf4fd62f2ed
[2025-01-15 22:08:51,908 I 8759 8759] (gcs_server) io_service_pool.cc:35: IOServicePool is running with 1 io_service.
[2025-01-15 22:08:51,914 I 8759 8759] (gcs_server) event.cc:493: Ray Event initialized for GCS
[2025-01-15 22:08:51,914 I 8759 8759] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_NODE
[2025-01-15 22:08:51,914 I 8759 8759] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_ACTOR
[2025-01-15 22:08:51,914 I 8759 8759] (gcs_server) event.cc:493: Ray Event initialized for EXPORT_DRIVER_JOB
[2025-01-15 22:08:51,914 I 8759 8759] (gcs_server) event.cc:324: Set ray event level to warning
[2025-01-15 22:08:51,920 I 8759 8759] (gcs_server) gcs_server.cc:73: GCS storage type is StorageType::IN_MEMORY
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:42: Loading job table data.
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:54: Loading node table data.
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:80: Loading actor table data.
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:93: Loading actor task spec table data.
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:66: Loading placement group table data.
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:46: Finished loading job table data, size = 0
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:58: Finished loading node table data, size = 0
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:84: Finished loading actor table data, size = 0
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:97: Finished loading actor task spec table data, size = 0
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_init_data.cc:71: Finished loading placement group table data, size = 0
[2025-01-15 22:08:51,921 I 8759 8759] (gcs_server) gcs_server.cc:162: No existing server cluster ID found. Generating new ID: 2bcf8a6732068ee64ed53aa66cc21831cded5d79d632420e62bd9704
[2025-01-15 22:08:51,922 I 8759 8759] (gcs_server) gcs_server.cc:644: Autoscaler V2 enabled: 0
[2025-01-15 22:08:51,927 I 8759 8759] (gcs_server) grpc_server.cc:134: GcsServer server started, listening on port 48490.
[2025-01-15 22:08:52,184 I 8759 8759] (gcs_server) gcs_server.cc:245: Gcs Debug state:
GcsNodeManager:
- RegisterNode request count: 0
- DrainNode request count: 0
- GetAllNodeInfo request count: 0
GcsActorManager:
- RegisterActor request count: 0
- CreateActor request count: 0
- GetActorInfo request count: 0
- GetNamedActorInfo request count: 0
- GetAllActorInfo request count: 0
- KillActor request count: 0
- ListNamedActors request count: 0
- Registered actors count: 0
- Destroyed actors count: 0
- Named actors count: 0
- Unresolved actors count: 0
- Pending actors count: 0
- Created actors count: 0
- owners_: 0
- actor_to_register_callbacks_: 0
- actor_to_restart_callbacks_: 0
- actor_to_create_callbacks_: 0
- sorted_destroyed_actor_list_: 0
GcsResourceManager:
- GetAllAvailableResources request count: 0
- GetAllTotalResources request count: 0
- GetAllResourceUsage request count: 0
GcsPlacementGroupManager:
- CreatePlacementGroup request count: 0
- RemovePlacementGroup request count: 0
- GetPlacementGroup request count: 0
- GetAllPlacementGroup request count: 0
- WaitPlacementGroupUntilReady request count: 0
- GetNamedPlacementGroup request count: 0
- Scheduling pending placement group count: 0
- Registered placement groups count: 0
- Named placement group count: 0
- Pending placement groups count: 0
- Infeasible placement groups count: 0
Publisher:
[runtime env manager] ID to URIs table:
[runtime env manager] URIs reference table:
GcsTaskManager:
-Total num task events reported: 0
-Total num status task events dropped: 0
-Total num profile events dropped: 0
-Current num of task events stored: 0
-Total num of actor creation tasks: 0
-Total num of actor tasks: 0
-Total num of normal tasks: 0
-Total num of driver tasks: 0
GcsAutoscalerStateManager:
- last_seen_autoscaler_state_version_: 0
- last_cluster_resource_state_version_: 0
- pending demands:
[2025-01-15 22:08:52,185 I 8759 8759] (gcs_server) gcs_server.cc:843: Main service Event stats:
Global stats: 25 total (5 active)
Queueing time: mean = 93.925 ms, max = 259.585 ms, min = 2.187 us, total = 2.348 s
Execution time: mean = 10.507 ms, total = 262.668 ms
Event stats:
GcsInMemoryStore.Put - 9 total (0 active), Execution time: mean = 28.846 ms, total = 259.610 ms, Queueing time: mean = 201.340 ms, max = 259.108 ms, min = 2.187 us, total = 1.812 s
GcsInMemoryStore.GetAll - 5 total (0 active), Execution time: mean = 27.716 us, total = 138.579 us, Queueing time: mean = 57.418 us, max = 61.830 us, min = 53.344 us, total = 287.091 us
PeriodicalRunner.RunFnPeriodically - 4 total (2 active, 1 running), Execution time: mean = 2.295 us, total = 9.180 us, Queueing time: mean = 129.764 ms, max = 259.585 ms, min = 259.470 ms, total = 519.055 ms
event_loop_lag_probe - 2 total (0 active), Execution time: mean = 11.284 us, total = 22.567 us, Queueing time: mean = 6.804 ms, max = 13.423 ms, min = 184.796 us, total = 13.607 ms
NodeInfoGcsService.grpc_server.GetClusterId - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
ClusterResourceManager.ResetRemoteNodeView - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
NodeInfoGcsService.grpc_server.GetClusterId.HandleRequestImpl - 1 total (0 active), Execution time: mean = 2.842 ms, total = 2.842 ms, Queueing time: mean = 3.112 ms, max = 3.112 ms, min = 3.112 ms, total = 3.112 ms
RayletLoadPulled - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
GcsInMemoryStore.Get - 1 total (0 active), Execution time: mean = 46.268 us, total = 46.268 us, Queueing time: mean = 11.333 us, max = 11.333 us, min = 11.333 us, total = 11.333 us
[2025-01-15 22:08:52,185 I 8759 8759] (gcs_server) gcs_server.cc:847: task_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 205.976 us, max = 885.749 us, min = 11.465 us, total = 1.030 ms
Execution time: mean = 760.624 us, total = 3.803 ms
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 1.264 ms, total = 3.791 ms, Queueing time: mean = 311.804 us, max = 885.749 us, min = 11.465 us, total = 935.412 us
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 12.426 us, total = 12.426 us, Queueing time: mean = 94.469 us, max = 94.469 us, min = 94.469 us, total = 94.469 us
GcsTaskManager.GcJobSummary - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 22:08:52,185 I 8759 8759] (gcs_server) gcs_server.cc:847: pubsub_io_context Event stats:
Global stats: 5 total (1 active)
Queueing time: mean = 1.446 ms, max = 7.025 ms, min = 8.892 us, total = 7.230 ms
Execution time: mean = 47.262 us, total = 236.309 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 61.529 us, total = 184.587 us, Queueing time: mean = 2.379 ms, max = 7.025 ms, min = 8.892 us, total = 7.137 ms
PeriodicalRunner.RunFnPeriodically - 1 total (0 active), Execution time: mean = 51.722 us, total = 51.722 us, Queueing time: mean = 92.550 us, max = 92.550 us, min = 92.550 us, total = 92.550 us
Publisher.CheckDeadSubscribers - 1 total (1 active), Execution time: mean = 0.000 s, total = 0.000 s, Queueing time: mean = 0.000 s, max = -0.000 s, min = 9223372036.855 s, total = 0.000 s
[2025-01-15 22:08:52,185 I 8759 8759] (gcs_server) gcs_server.cc:847: ray_syncer_io_context Event stats:
Global stats: 5 total (0 active)
Queueing time: mean = 1.541 ms, max = 7.473 ms, min = 8.937 us, total = 7.703 ms
Execution time: mean = 51.605 us, total = 258.024 us
Event stats:
event_loop_lag_probe - 3 total (0 active), Execution time: mean = 85.401 us, total = 256.204 us, Queueing time: mean = 2.510 ms, max = 7.473 ms, min = 8.937 us, total = 7.529 ms
RaySyncerRegister - 2 total (0 active), Execution time: mean = 910.000 ns, total = 1.820 us, Queueing time: mean = 86.895 us, max = 88.042 us, min = 85.749 us, total = 173.791 us
[2025-01-15 22:08:54,462 I 8759 8759] (gcs_server) gcs_node_manager.cc:85: Registering node info, address = 192.168.0.2, node name = 192.168.0.2 node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:54,463 I 8759 8759] (gcs_server) gcs_node_manager.cc:91: Finished registering node info, address = 192.168.0.2, node name = 192.168.0.2, is_head_node = 1 node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:54,463 I 8759 8759] (gcs_server) gcs_placement_group_manager.cc:819: A new node: b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680 registered, will try to reschedule all the infeasible placement groups.
[2025-01-15 22:08:54,474 I 8759 8837] (gcs_server) ray_syncer.cc:377: Get connection node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,399 I 8759 8759] (gcs_server) gcs_job_manager.cc:90: Adding job, job id = 01000000, driver pid = 8692
[2025-01-15 22:08:55,399 I 8759 8759] (gcs_server) gcs_job_manager.cc:111: Finished adding job, job id = 01000000, driver pid = 8692
[2025-01-15 22:08:55,659 I 8759 8759] (gcs_server) gcs_job_manager.cc:149: Finished marking job state, job id = 01000000
[2025-01-15 22:08:55,751 I 8759 8759] (gcs_server) gcs_node_manager.cc:366: Removing node, node name = 192.168.0.2, death reason = EXPECTED_TERMINATION, death message = received SIGTERM node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,752 I 8759 8759] (gcs_server) gcs_placement_group_manager.cc:789: Node failed, rescheduling the placement groups on the dead node. node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,752 I 8759 8759] (gcs_server) gcs_actor_manager.cc:1274: Node failed, reconstructing actors. node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,752 I 8759 8759] (gcs_server) gcs_job_manager.cc:454: Node failed, mark all jobs from this node as finished node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,993 I 8759 8808] (gcs_server) ray_syncer-inl.h:318: Failed to read the message from: b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:55,993 I 8759 8808] (gcs_server) ray_syncer.cc:373: Connection is broken. node_id=b299581aa87c781b34034c14ce7a6e0e40656f864b369ec0463de680
[2025-01-15 22:08:56,016 I 8759 8759] (gcs_server) gcs_server_main.cc:130: GCS server received SIGTERM, shutting down...
[2025-01-15 22:08:56,017 I 8759 8759] (gcs_server) gcs_server.cc:267: Stopping GCS server.
[2025-01-15 22:08:56,108 I 8759 8759] (gcs_server) gcs_server.cc:284: GCS server stopped.
[2025-01-15 22:08:56,109 I 8759 8759] (gcs_server) io_service_pool.cc:47: IOServicePool is stopped.
[2025-01-15 22:08:56,119 I 8759 8759] (gcs_server) stats.h:120: Stats module has shutdown.