Updated ReadMe
README.md
CHANGED
This is a text-summarization model fine-tuned on medical-summary data using BART.
Instance Details:

- Kaggle Notebook with a TPUx2 instance

Why BART:

Transformer language models come in three types: encoder-only, encoder-decoder, and decoder-only. Each type is suited to different tasks.
Text summarization is a seq2seq task, so an encoder-decoder architecture is needed. BART has the top score for summarization on Hugging Face, so I chose it.
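The type-to-task reasoning above can be sketched as a simple lookup. The mapping and example models below are illustrative common knowledge, not part of this repo:

```python
# Illustrative mapping of transformer architecture types to typical tasks.
# The example models are well-known public checkpoints, not from this repo.
ARCHITECTURES = {
    "encoder-only": {"example_model": "BERT", "typical_task": "classification"},
    "encoder-decoder": {"example_model": "BART", "typical_task": "summarization"},
    "decoder-only": {"example_model": "GPT-2", "typical_task": "text generation"},
}

def architecture_for(task):
    """Return the architecture type whose typical task matches `task`."""
    for arch, info in ARCHITECTURES.items():
        if info["typical_task"] == task:
            return arch
    return None
```

`architecture_for("summarization")` returns `"encoder-decoder"`, which matches the choice of BART here.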
Training Code - [Notebook](https://www.kaggle.com/code/ramjib/llm-finetuning-for-text-summarization?scriptVersionId=197771806)

**How to use:**

```python
from transformers import pipeline

# ... (pipeline construction and the ARTICLE text are not shown in this diff)

print(summarizer(ARTICLE, max_length=130, min_length=30, do_sample=False))
>>> [{'summary_text': 'Liana Barrientos, 39, is charged with two counts of "offering a false instrument for filing in the first degree" In total, she has been married 10 times, with nine of her marriages occurring between 1999 and 2002. She is believed to still be married to four men.'}]
```
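BART accepts a limited input length (1,024 tokens for `bart-large` variants), so very long clinical notes get truncated. A minimal sketch of word-based chunking for such inputs (the helper name, chunk size, and overlap are assumptions; word count is only a rough proxy for tokens):

```python
def chunk_text(text, max_words=700, overlap=50):
    """Split text into overlapping word-based chunks so each chunk stays
    comfortably under the model's token limit (words approximate tokens)."""
    words = text.split()
    if len(words) <= max_words:
        return [text]
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk can then be passed through `summarizer(...)` separately and the partial summaries joined or re-summarized.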
**Deployment**

- APP:
  - [Streamlit](https://huggingface.co/spaces/Ramji/Bart-Medical-summary)
  - [Flask APP](https://huggingface.co/spaces/Ramji/Bart-CNN-Medical-summary-Flask)
- Code Repo:
  - [Streamlit](https://huggingface.co/spaces/Ramji/Bart-Medical-summary/tree/main)
  - [Flask APP](https://huggingface.co/spaces/Ramji/Bart-CNN-Medical-summary-Flask/tree/main)
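The Flask Space linked above serves the model behind an HTTP endpoint. A minimal sketch of such a route (the route name, JSON fields, and the placeholder `summarize` function are assumptions, not the Space's actual code):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def summarize(text):
    # Placeholder standing in for the BART summarization pipeline,
    # which the real app would load once at startup.
    return text[:130]

@app.route("/summarize", methods=["POST"])
def summarize_route():
    # Accept {"text": "..."} and return {"summary_text": "..."}.
    payload = request.get_json(force=True)
    return jsonify({"summary_text": summarize(payload.get("text", ""))})
```

Loading the pipeline once at module import (rather than per request) is what keeps per-request latency down to the model's inference time.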
**Limitations**

- Deployed on a 16-CPU instance, so inference **time** is high (GPU-based deployment preferred)
- The model needs longer training (currently trained for only 1 epoch)
- Summary features are not captured as the data output expects (Subjective, Objective, Assessment and Plan - SOAP style)
- The Instruction column from the data should be used for better accuracy and abstraction understanding
- Scalability needs to be handled