2024-04-11 01:01:03,983 INFO    StreamThr :138 [internal.py:wandb_internal():86] W&B internal server running at pid: 138, started at: 2024-04-11 01:01:03.982415
2024-04-11 01:01:03,984 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status
2024-04-11 01:01:04,351 INFO    WriterThread:138 [datastore.py:open_for_write():87] open: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/run-4b3fzolv.wandb
2024-04-11 01:01:04,352 DEBUG   SenderThread:138 [sender.py:send():379] send: header
2024-04-11 01:01:04,355 DEBUG   SenderThread:138 [sender.py:send():379] send: run
2024-04-11 01:01:04,503 INFO    SenderThread:138 [dir_watcher.py:__init__():211] watching files in: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files
2024-04-11 01:01:04,503 INFO    SenderThread:138 [sender.py:_start_run_threads():1124] run started: 4b3fzolv with start time 1712797263.984466
2024-04-11 01:01:04,511 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: check_version
2024-04-11 01:01:04,511 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: check_version
2024-04-11 01:01:04,599 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: run_start
2024-04-11 01:01:04,610 DEBUG   HandlerThread:138 [system_info.py:__init__():26] System info init
2024-04-11 01:01:04,610 DEBUG   HandlerThread:138 [system_info.py:__init__():41] System info init done
2024-04-11 01:01:04,610 INFO    HandlerThread:138 [system_monitor.py:start():194] Starting system monitor
2024-04-11 01:01:04,610 INFO    SystemMonitor:138 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-04-11 01:01:04,610 INFO    HandlerThread:138 [system_monitor.py:probe():214] Collecting system info
2024-04-11 01:01:04,611 INFO    SystemMonitor:138 [interfaces.py:start():190] Started cpu monitoring
2024-04-11 01:01:04,611 INFO    SystemMonitor:138 [interfaces.py:start():190] Started disk monitoring
2024-04-11 01:01:04,612 INFO    SystemMonitor:138 [interfaces.py:start():190] Started gpu monitoring
2024-04-11 01:01:04,613 INFO    SystemMonitor:138 [interfaces.py:start():190] Started memory monitoring
2024-04-11 01:01:04,614 INFO    SystemMonitor:138 [interfaces.py:start():190] Started network monitoring
2024-04-11 01:01:04,623 DEBUG   HandlerThread:138 [system_info.py:probe():150] Probing system
2024-04-11 01:01:04,625 DEBUG   HandlerThread:138 [gitlib.py:_init_repo():56] git repository is invalid
2024-04-11 01:01:04,625 DEBUG   HandlerThread:138 [system_info.py:probe():198] Probing system done
2024-04-11 01:01:04,626 DEBUG   HandlerThread:138 [system_monitor.py:probe():223] {'os': 'Linux-5.15.133+-x86_64-with-glibc2.31', 'python': '3.10.13', 'heartbeatAt': '2024-04-11T01:01:04.623823', 'startedAt': '2024-04-11T01:01:03.976173', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': 'kaggle.ipynb', 'codePathLocal': None, 'root': '/kaggle/working', 'host': 'c072b7c9e487', 'username': 'root', 'executable': '/opt/conda/bin/python3.10', 'cpu_count': 2, 'cpu_count_logical': 4, 'cpu_freq': {'current': 2000.188, 'min': 0.0, 'max': 0.0}, 'cpu_freq_per_core': [{'current': 2000.188, 'min': 0.0, 'max': 0.0}, {'current': 2000.188, 'min': 0.0, 'max': 0.0}, {'current': 2000.188, 'min': 0.0, 'max': 0.0}, {'current': 2000.188, 'min': 0.0, 'max': 0.0}], 'disk': {'/': {'total': 8062.387607574463, 'used': 5566.689571380615}}, 'gpu': 'Tesla T4', 'gpu_count': 2, 'gpu_devices': [{'name': 'Tesla T4', 'memory_total': 16106127360}, {'name': 'Tesla T4', 'memory_total': 16106127360}], 'memory': {'total': 31.357559204101562}}
2024-04-11 01:01:04,626 INFO    HandlerThread:138 [system_monitor.py:probe():224] Finished collecting system info
2024-04-11 01:01:04,626 INFO    HandlerThread:138 [system_monitor.py:probe():227] Publishing system info
2024-04-11 01:01:04,626 DEBUG   HandlerThread:138 [system_info.py:_save_conda():207] Saving list of conda packages installed into the current environment
2024-04-11 01:01:05,505 INFO    Thread-12 :138 [dir_watcher.py:_on_file_created():271] file/dir created: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/conda-environment.yaml
2024-04-11 01:01:19,640 ERROR   HandlerThread:138 [system_info.py:_save_conda():221] Error saving conda packages: Command '['conda', 'env', 'export']' timed out after 15 seconds
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/wandb/sdk/internal/system/system_info.py", line 214, in _save_conda
    subprocess.call(
  File "/opt/conda/lib/python3.10/subprocess.py", line 347, in call
    return p.wait(timeout=timeout)
  File "/opt/conda/lib/python3.10/subprocess.py", line 1209, in wait
    return self._wait(timeout=timeout)
  File "/opt/conda/lib/python3.10/subprocess.py", line 1951, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['conda', 'env', 'export']' timed out after 15 seconds
2024-04-11 01:01:19,643 DEBUG   HandlerThread:138 [system_info.py:_save_conda():222] Saving conda packages done
2024-04-11 01:01:19,644 INFO    HandlerThread:138 [system_monitor.py:probe():229] Finished publishing system info
2024-04-11 01:01:19,652 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:19,652 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: keepalive
2024-04-11 01:01:19,652 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:19,653 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: keepalive
2024-04-11 01:01:19,653 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:19,653 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: keepalive
2024-04-11 01:01:19,653 DEBUG   SenderThread:138 [sender.py:send():379] send: files
2024-04-11 01:01:19,653 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-metadata.json with policy now
2024-04-11 01:01:19,933 INFO    wandb-upload_0:138 [upload_job.py:push():131] Uploaded file /tmp/tmpiqqv1dwfwandb/3d55vshp-wandb-metadata.json
2024-04-11 01:01:20,508 INFO    Thread-12 :138 [dir_watcher.py:_on_file_created():271] file/dir created: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/wandb-metadata.json
2024-04-11 01:01:20,599 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: python_packages
2024-04-11 01:01:20,599 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: python_packages
2024-04-11 01:01:20,602 DEBUG   SenderThread:138 [sender.py:send():379] send: telemetry
2024-04-11 01:01:20,613 DEBUG   SenderThread:138 [sender.py:send():379] send: config
2024-04-11 01:01:20,615 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:01:20,615 DEBUG   SenderThread:138 [sender.py:send():379] send: telemetry
2024-04-11 01:01:20,615 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:01:20,615 WARNING SenderThread:138 [sender.py:send_metric():1341] Seen metric with glob (shouldn't happen)
2024-04-11 01:01:20,616 DEBUG   SenderThread:138 [sender.py:send():379] send: telemetry
2024-04-11 01:01:20,616 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:01:20,616 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:01:20,617 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:01:21,508 INFO    Thread-12 :138 [dir_watcher.py:_on_file_created():271] file/dir created: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/requirements.txt
2024-04-11 01:01:21,509 INFO    Thread-12 :138 [dir_watcher.py:_on_file_created():271] file/dir created: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:01:23,509 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:01:25,510 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:01:25,513 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:27,511 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:01:30,790 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:35,602 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:01:35,603 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:01:35,604 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:01:36,648 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:37,515 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/config.yaml
2024-04-11 01:01:41,766 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:46,767 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:50,600 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:01:50,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:01:50,640 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:01:52,707 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:01:57,708 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:02,709 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:04,614 DEBUG   SystemMonitor:138 [system_monitor.py:_start():172] Starting system metrics aggregation loop
2024-04-11 01:02:04,616 DEBUG   SenderThread:138 [sender.py:send():379] send: stats
2024-04-11 01:02:05,600 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:02:05,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:02:05,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:02:08,651 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:13,651 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:18,652 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:20,600 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:02:20,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:02:20,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:02:23,671 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:28,672 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:33,673 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:34,617 DEBUG   SenderThread:138 [sender.py:send():379] send: stats
2024-04-11 01:02:35,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:02:35,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:02:35,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:02:39,669 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:44,669 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:49,670 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:50,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:02:50,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:02:50,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:02:54,683 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:02:59,684 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:04,618 DEBUG   SenderThread:138 [sender.py:send():379] send: stats
2024-04-11 01:03:05,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:03:05,601 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:03:05,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:03:05,668 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:10,669 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:15,670 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:20,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:03:20,602 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:03:20,641 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:03:21,658 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:26,659 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:31,660 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:34,619 DEBUG   SenderThread:138 [sender.py:send():379] send: stats
2024-04-11 01:03:35,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:03:35,602 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:03:35,642 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:03:36,661 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:37,017 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: partial_history
2024-04-11 01:03:37,019 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:03:37,019 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:03:37,019 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:03:37,020 DEBUG   SenderThread:138 [sender.py:send():379] send: metric
2024-04-11 01:03:37,020 DEBUG   SenderThread:138 [sender.py:send():379] send: history
2024-04-11 01:03:37,020 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:37,020 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:37,559 INFO    Thread-12 :138 [dir_watcher.py:_on_file_created():271] file/dir created: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/wandb-summary.json
2024-04-11 01:03:38,738 DEBUG   SenderThread:138 [sender.py:send():379] send: telemetry
2024-04-11 01:03:38,738 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,739 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: partial_history
2024-04-11 01:03:38,741 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:38,741 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,741 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:38,742 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,742 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:38,742 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,743 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:38,743 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,743 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:38,744 DEBUG   SenderThread:138 [sender.py:send():379] send: history
2024-04-11 01:03:38,744 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: summary_record
2024-04-11 01:03:38,744 INFO    SenderThread:138 [sender.py:_save_file():1390] saving file wandb-summary.json with policy end
2024-04-11 01:03:39,559 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/wandb-summary.json
2024-04-11 01:03:39,560 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:03:41,560 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/output.log
2024-04-11 01:03:41,871 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:42,561 INFO    Thread-12 :138 [dir_watcher.py:_on_file_modified():288] file/dir modified: /kaggle/working/wandb/run-20240411_010103-4b3fzolv/files/config.yaml
2024-04-11 01:03:46,983 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:50,601 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: stop_status
2024-04-11 01:03:50,602 DEBUG   SenderThread:138 [sender.py:send_request():406] send_request: stop_status
2024-04-11 01:03:50,604 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: internal_messages
2024-04-11 01:03:52,705 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report
2024-04-11 01:03:57,706 DEBUG   HandlerThread:138 [handler.py:handle_request():146] handle_request: status_report