Farnazgh committed on
Commit
ffcbf4c
1 Parent(s): b3af6f4

Update README.md

---
tags:
- question answering
- t5
- declarative
- text generation
widget:
- text: "q: do you have any hobbies that you like to do while you are at home ? a: watching online shows"
- text: "q: do you watch a lot of comedy ? a: yes it will helpful the mind relaxation"
---

# Transforming Question-Answer Pairs to Full Declarative Answer Form
 
Considering the question "Which drug did you take?" and the answer "Doliprane", the aim of this model is to derive the full declarative answer "I took Doliprane".
 
## Model training
We fine-tune T5 (Raffel et al., 2019), a pre-trained encoder-decoder model, on two datasets of (question, incomplete answer, full answer) triples, one for wh- questions and one for yes-no (YN) questions. For wh- questions, we use 3,300 entries of the dataset of (question, answer, declarative answer sentence) triples gathered by Demszky et al. (2018) using Amazon Mechanical Turk workers. For YN questions, we used the SAMSum corpus (Gliwa et al., 2019), which contains short dialogs in chit-chat format. We created 1,100 (question, answer, full answer) triples by automatically extracting YN (question, answer) pairs. The data was split into train and test sets (9:1), and the fine-tuned model achieved a 0.90 ROUGE-L score on the test set.
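As a rough illustration of the data format described above (a sketch, not the authors' actual preprocessing script), each triple can be mapped to a T5 source/target pair, with the source using the same `q: ... a: ...` format shown in the widget examples; `make_t5_pair` is a hypothetical helper name:

```python
# Sketch: map a (question, answer, full answer) triple to a T5
# source/target pair. The "q: ... a: ..." input format matches the
# widget examples in this model card.
def make_t5_pair(question, answer, full_answer):
    source = "q: " + question + " a: " + answer
    target = full_answer
    return source, target

src, tgt = make_t5_pair("Which drug did you take?", "Doliprane", "I took Doliprane.")
```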

## Model Details
The model was developed as one of the proposed modules for dialog transformation in the following paper:

> F. Ghassemi Toudeshki, A. Liednikova, P. Jolivet and C. Gardent, "Exploring the Influence of Dialog Input Format for Unsupervised Clinical Questionnaire Filling", Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (co-located with EMNLP 2022), Abu Dhabi, 7 December 2022.

It was used to transform question-answer pairs in information-seeking dialogs into declarative form, ultimately producing a declarative transform of the whole dialog.

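The dialog-level transform can be sketched as follows (an illustration under assumptions, not the paper's pipeline): transform each (question, answer) pair independently and concatenate the resulting sentences. Here `qa2d` stands in for the model call (e.g. a function like `transform_qa2d` below), and `fake_qa2d` is a trivial stub so the sketch is self-contained:

```python
# Sketch: declarative transform of a whole dialog by transforming each
# (question, answer) pair and joining the generated sentences.
def dialog_to_declarative(pairs, qa2d):
    # Transform each QA pair independently, then concatenate.
    return " ".join(qa2d(q, a) for q, a in pairs)

# Stub used for illustration only; the real model would generate the sentence.
def fake_qa2d(question, answer):
    return f"[{answer}]"

dialog = [("Which drug did you take?", "Doliprane"),
          ("do you watch a lot of comedy ?", "yes")]
result = dialog_to_declarative(dialog, fake_qa2d)
```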
## Test the model
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Farnazgh/QA2D")
model = AutoModelForSeq2SeqLM.from_pretrained("Farnazgh/QA2D")

def transform_qa2d(question, answer, max_length=150):
    # Format the input as the model expects: "q: <question> a: <answer>"
    text = "q: " + question + " a: " + answer
    input_ids = tokenizer.encode(text, return_tensors="pt", add_special_tokens=True)
    # Decode the declarative answer with beam search
    generated_ids = model.generate(input_ids=input_ids, num_beams=2, max_length=max_length, early_stopping=True)[0]
    preds = tokenizer.decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    return preds
```