2022-05-10 17:28:04 INFO Running runs: []
2022-05-10 17:28:04 INFO Agent received command: run
2022-05-10 17:28:04 INFO Agent starting run with config:
eval_split_name: test
eval_steps: 500
evaluation_strategy: steps
generation_max_length: 40
generation_num_beams: 1
gradient_accumulation_steps: 8
greater_is_better: True
hidden_dropout: 0.1
language: fr.en
learning_rate: 0.0003
logging_steps: 1
max_duration_in_seconds: 20
metric_for_best_model: bleu
model_name_or_path: ./
num_train_epochs: 10
output_dir: ./
per_device_eval_batch_size: 8
per_device_train_batch_size: 8
save_steps: 500
task: covost2
warmup_steps: 500
2022-05-10 17:28:04 INFO About to run command: python3 run_xtreme_s.py --overwrite_output_dir --freeze_feature_encoder --gradient_checkpointing --predict_with_generate --fp16 --group_by_length --do_train --do_eval --load_best_model_at_end --push_to_hub --use_auth_token --eval_split_name=test --eval_steps=500 --evaluation_strategy=steps --generation_max_length=40 --generation_num_beams=1 --gradient_accumulation_steps=8 --greater_is_better=True --hidden_dropout=0.1 --language=fr.en --learning_rate=0.0003 --logging_steps=1 --max_duration_in_seconds=20 --metric_for_best_model=bleu --model_name_or_path=./ --num_train_epochs=10 --output_dir=./ --per_device_eval_batch_size=8 --per_device_train_batch_size=8 --save_steps=500 --task=covost2 --warmup_steps=500
2022-05-10 17:28:09 INFO Running runs: ['vt9bt6sr']
2022-05-10 19:20:59 ERROR 500 response executing GraphQL.
2022-05-10 19:20:59 ERROR {"error":"context deadline exceeded"}
2022-05-11 09:19:32 ERROR 500 response executing GraphQL.
2022-05-11 09:19:32 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-11 09:20:28 ERROR 500 response executing GraphQL.
2022-05-11 09:20:28 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-11 21:23:22 ERROR 500 response executing GraphQL.
2022-05-11 21:23:22 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-11 21:23:58 ERROR 500 response executing GraphQL.
2022-05-11 21:23:58 ERROR {"error":"Error 1040: Too many connections"}
2022-05-11 21:24:14 ERROR 500 response executing GraphQL.
2022-05-11 21:24:14 ERROR {"error":"Error 1040: Too many connections"}
2022-05-11 21:24:14 ERROR Retry attempt failed:
Traceback (most recent call last):
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/sdk/lib/retry.py", line 102, in __call__
result = self._call_fn(*args, **kwargs)
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/sdk/internal/internal_api.py", line 146, in execute
six.reraise(*sys.exc_info())
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/six.py", line 719, in reraise
raise value
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/sdk/internal/internal_api.py", line 140, in execute
return self.client.execute(*args, **kwargs)
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 52, in execute
result = self._get_result(document, *args, **kwargs)
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/gql/client.py", line 60, in _get_result
return self.transport.execute(document, *args, **kwargs)
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/gql/transport/requests.py", line 39, in execute
request.raise_for_status()
File "/home/sanchit_huggingface_co/gcp/lib/python3.9/site-packages/requests/models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://api.wandb.ai/graphql
2022-05-11 21:24:19 ERROR 500 response executing GraphQL.
2022-05-11 21:24:19 ERROR {"error":"Error 1040: Too many connections"}
2022-05-11 21:24:50 ERROR 500 response executing GraphQL.
2022-05-11 21:24:50 ERROR {"error":"Error 1040: Too many connections"}
2022-05-12 08:27:55 ERROR 500 response executing GraphQL.
2022-05-12 08:27:55 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-12 08:28:51 ERROR 500 response executing GraphQL.
2022-05-12 08:28:51 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-12 14:38:08 ERROR 500 response executing GraphQL.
2022-05-12 14:38:08 ERROR {"errors":[{"message":"Error 1040: Too many connections","path":["agentHeartbeat"]}],"data":{"agentHeartbeat":null}}
2022-05-12 14:39:04 ERROR 500 response executing GraphQL.
2022-05-12 14:39:04 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-12 15:37:39 ERROR 502 response executing GraphQL.
2022-05-12 15:37:39 ERROR
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
2022-05-12 17:57:05 ERROR 500 response executing GraphQL.
2022-05-12 17:57:05 ERROR {"errors":[{"message":"context deadline exceeded"}]}
2022-05-13 07:26:18 ERROR 500 response executing GraphQL.
2022-05-13 07:26:18 ERROR {"errors":[{"message":"context deadline exceeded"}]}