---
language:
- en
license: apache-2.0
tags:
- dialogue state tracking
- task-oriented dialog

---

# roberta-base-trippy-dst-multiwoz21

This is a TripPy model trained on [MultiWOZ 2.1](https://github.com/budzianowski/multiwoz) for use in [ConvLab-3](https://github.com/ConvLab/ConvLab-3).
This model predicts informable slots, requestable slots, general actions and domain indicator slots.
Expected joint goal accuracy for MultiWOZ 2.1 is in the range of 55-56%.
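
As a rough illustration of the metric quoted above, joint goal accuracy counts a turn as correct only when every slot in the predicted dialogue state matches the reference state. The sketch below is not the evaluation code from the TripPy repository; the function name `joint_goal_accuracy` and the toy states are hypothetical, and it assumes states are already normalized dictionaries mapping slot names to values.

```python
from typing import Dict, List

# A dialogue state for one turn: slot name -> value (assumed already normalized).
State = Dict[str, str]


def joint_goal_accuracy(predictions: List[State], references: List[State]) -> float:
    """Fraction of turns whose predicted state equals the reference state exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # A turn counts only if all slots match; a single wrong slot makes it incorrect.
    correct = sum(1 for pred, gold in zip(predictions, references) if pred == gold)
    return correct / len(references)


# Toy example with three turns (hypothetical values).
gold_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north", "hotel-stars": "4", "train-day": "friday"},
]
pred_states = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},  # one wrong slot value
    {"hotel-area": "north", "hotel-stars": "4", "train-day": "friday"},
]

jga = joint_goal_accuracy(pred_states, gold_states)
print(f"joint goal accuracy: {jga:.3f}")
```

A real evaluation would typically also handle value normalization and unfilled slots, which this sketch leaves out.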

For information about TripPy DST, refer to [TripPy: A Triple Copy Strategy for Value Independent Neural Dialog State Tracking](https://aclanthology.org/2020.sigdial-1.4/).

The training and evaluation code is available at the official [TripPy repository](https://gitlab.cs.uni-duesseldorf.de/general/dsml/trippy-public).

## Training procedure

The model was trained on MultiWOZ 2.1 data via supervised learning using the [TripPy codebase](https://gitlab.cs.uni-duesseldorf.de/general/dsml/trippy-public).
MultiWOZ 2.1 data was loaded via ConvLab-3's unified data format dataloader.
The pre-trained encoder is [RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta) (base).
The encoder was fine-tuned and the DST-specific classification heads were trained for 10 epochs.

### Training hyperparameters

```
python3 run_dst.py \
  --task_name="unified" \
  --model_type="roberta" \
  --model_name_or_path="roberta-base" \
  --dataset_config=dataset_config/unified_multiwoz21.json \
  --do_lower_case \
  --learning_rate=1e-4 \
  --num_train_epochs=10 \
  --max_seq_length=180 \
  --per_gpu_train_batch_size=24 \
  --per_gpu_eval_batch_size=32 \
  --output_dir=results \
  --save_epochs=2 \
  --eval_all_checkpoints \
  --warmup_proportion=0.1 \
  --adam_epsilon=1e-6 \
  --weight_decay=0.01 \
  --fp16 \
  --do_train \
  --predict_type=dummy \
  --seed=42
```
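
To give a feel for how `--learning_rate=1e-4` and `--warmup_proportion=0.1` might interact, the sketch below computes a learning rate per optimization step. It assumes a linear warmup followed by a linear decay to zero, which this card does not confirm; the function `lr_at_step` and the step count of 1000 are hypothetical and chosen only for illustration.

```python
def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 1e-4, warmup_proportion: float = 0.1) -> float:
    """Learning rate under an ASSUMED linear-warmup, linear-decay schedule."""
    warmup_steps = int(total_steps * warmup_proportion)
    if warmup_steps > 0 and step < warmup_steps:
        # Warmup phase: rise linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    decay_steps = max(total_steps - warmup_steps, 1)
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)


# Hypothetical run of 1000 optimization steps: warmup covers the first 100.
total = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at_step(step, total))
```

Under these assumptions the rate peaks at 1e-4 after the first 10% of steps and reaches zero at the end of training.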