We present the dev results on SQuAD 1.1/2.0 and MNLI tasks.

| Model                | SQuAD 1.1 (F1/EM) | SQuAD 2.0 (F1/EM) | MNLI-m (Acc) |
|----------------------|-------------------|-------------------|--------------|
| **DeBERTa-v3-large** | -/-               | 91.5/89.0         | **91.9**     |
| DeBERTa-v2-xxlarge   | 96.1/91.4         | **92.2/89.7**     | 91.7         |

#### Fine-tuning with HF transformers

```bash
#!/bin/bash

cd transformers/examples/pytorch/text-classification/

pip install datasets
export TASK_NAME=mnli

output_dir="ds_results"
num_gpus=8
batch_size=8

python -m torch.distributed.launch --nproc_per_node=${num_gpus} \
  run_glue.py \
  --model_name_or_path microsoft/deberta-v3-large \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --evaluation_strategy steps \
  --max_seq_length 256 \
  --warmup_steps 1000 \
  --per_device_train_batch_size ${batch_size} \
  --learning_rate 6e-6 \
  --num_train_epochs 2 \
  --output_dir $output_dir \
  --overwrite_output_dir \
  --logging_steps 1000 \
  --logging_dir $output_dir
```
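The launch command runs one `run_glue.py` process per GPU, so the effective global batch size is the per-device batch size times the GPU count. A quick sanity-check sketch of the numbers the script implies (the MNLI training-set size of 392,702 examples is an assumption here, not taken from this README):

```python
# Hyperparameters copied from the fine-tuning script above.
num_gpus = 8
per_device_batch_size = 8
num_train_epochs = 2
warmup_steps = 1000
mnli_train_examples = 392_702  # assumed approximate size of the MNLI training set

# Each of the 8 processes consumes a per-device batch of 8 per step.
effective_batch_size = num_gpus * per_device_batch_size  # 64

# Optimizer steps per epoch (drop the final partial batch) and in total.
steps_per_epoch = mnli_train_examples // effective_batch_size
total_steps = steps_per_epoch * num_train_epochs

# Fraction of training spent in linear LR warmup.
warmup_fraction = warmup_steps / total_steps

print(effective_batch_size)
print(total_steps)
print(round(warmup_fraction, 3))
```

With these settings the 1000 warmup steps cover roughly the first 8% of training, which is in the usual range for GLUE fine-tuning.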

### Citation

If you find DeBERTa useful for your work, please cite the following paper: