DeBERTa committed on
Commit
9a44a41
1 Parent(s): 32390a1

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -12,7 +12,7 @@ widget:
 
  ## DeBERTa: Decoding-enhanced BERT with Disentangled Attention
 
- [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa on a majority of NLU tasks with 80GB training data.
+ [DeBERTa](https://arxiv.org/abs/2006.03654) improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. It outperforms BERT and RoBERTa on majority of NLU tasks with 80GB training data.
 
  Please check the [official repository](https://github.com/microsoft/DeBERTa) for more details and updates.
 
@@ -40,8 +40,8 @@ We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.
  ```bash
  cd transformers/examples/text-classification/
  export TASK_NAME=mrpc
- python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \
- --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \
+ python -m torch.distributed.launch --nproc_per_node=8 run_glue.py --model_name_or_path microsoft/deberta-v2-xxlarge \\
+ --task_name $TASK_NAME --do_train --do_eval --max_seq_length 128 --per_device_train_batch_size 4 \\
  --learning_rate 3e-6 --num_train_epochs 3 --output_dir /tmp/$TASK_NAME/ --overwrite_output_dir --sharded_ddp --fp16
  ```
 