Ramji committed on
Commit
80a54ed
1 Parent(s): 559fd43

Updated ReadMe

  1. README.md +20 -1
README.md CHANGED
@@ -18,9 +18,14 @@ This is a Text-Summarization model finetuned for medical-summary data using bart
18
  Instance Details:
19
  - Kaggle Notebook with TPUx2 instance
20
 
21
  Training Code - [Notebook](https://www.kaggle.com/code/ramjib/llm-finetuning-for-text-summarization?scriptVersionId=197771806)
22
 
23
- How to use:
24
 
25
  ```python
26
  from transformers import pipeline
@@ -49,3 +54,17 @@ print(summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False))
49
  >>> [{'summary_text': 'Liana Barrientos, 39, is charged with two counts of "offering a false instrument for filing in the first degree" In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men.'}]
50
  ```
51
 
18
  Instance Details:
19
  - Kaggle Notebook with TPUx2 instance
20
 
21
+ Why BART:
22
+
23
+ Transformer language models come in three types: encoder-only, encoder-decoder, and decoder-only. Each type suits different tasks.
24
+ Text summarization is a sequence-to-sequence (seq2seq) task, so it needs an encoder-decoder architecture. I chose BART because it is a top-scoring summarization model on Hugging Face.
25
+
26
  Training Code - [Notebook](https://www.kaggle.com/code/ramjib/llm-finetuning-for-text-summarization?scriptVersionId=197771806)
27
 
28
+ **How to use:**
29
 
30
  ```python
31
  from transformers import pipeline
 
54
  >>> [{'summary_text': 'Liana Barrientos, 39, is charged with two counts of "offering a false instrument for filing in the first degree" In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men.'}]
55
  ```
56
 
57
+ **Deployment**
58
+ - APP:
59
+ - [Streamlit](https://huggingface.co/spaces/Ramji/Bart-Medical-summary)
60
+ - [Flask APP](https://huggingface.co/spaces/Ramji/Bart-CNN-Medical-summary-Flask)
61
+ - Code Repo:
62
+ - [Streamlit](https://huggingface.co/spaces/Ramji/Bart-Medical-summary/tree/main)
63
+ - [Flask APP](https://huggingface.co/spaces/Ramji/Bart-CNN-Medical-summary-Flask/tree/main)
64
+
65
+ **Limitations**
66
+ - Deployed on 16 CPUs, so inference **time** is higher (GPU-based deployment preferred)
67
+ - Model needs to be trained for longer (currently trained for only 1 epoch)
68
+ - Summary features are not captured in the structure the data output expects (Subjective, Objective, Assessment, and Plan; SOAP style)
69
+ - The Instruction column from the data should be used for better accuracy and abstraction understanding
70
+ - Scalability needs to be handled
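
The SOAP limitation listed above could be checked with a small helper like the one below. This is only a sketch, not part of the commit: the function name `missing_soap_sections` and the assumption that each section appears as a `Heading:` line are hypothetical, since the README only says the expected output is "SOAP style".

```python
import re
from typing import List

# Assumed section headings; the README only names the target as "SOAP style".
SOAP_SECTIONS = ("Subjective", "Objective", "Assessment", "Plan")


def missing_soap_sections(summary: str) -> List[str]:
    """Return the SOAP headings that never appear as 'Heading:' at the start of a line."""
    missing = []
    for section in SOAP_SECTIONS:
        # Heading at line start, case-insensitive, followed by a colon.
        pattern = rf"^\s*{section}\s*:"
        if not re.search(pattern, summary, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(section)
    return missing


if __name__ == "__main__":
    # Hypothetical model output, used only to exercise the helper.
    example = "Subjective: patient reports cough.\nAssessment: likely viral.\nPlan: rest."
    print(missing_soap_sections(example))
```

Running such a check over generated summaries would give a rough count of how often the model drops a SOAP section, which may help decide whether longer training or using the Instruction column closes the gap.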