Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ inference: false
 <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
 </a>

-This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k
+This model is a fine-tuned version of [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b) on about 80k WhatsApp/text messages (mine). Please use responsibly :)

 Test it out on Google Colab by clicking the button above.

@@ -32,17 +32,23 @@ Test it out on Google Colab by clicking the button above.
 - Seems to do a lot better than GPT-Neo with similar training parameters
 - you can create your own digital clone and deploy it leveraging [this repository I am working on](https://github.com/pszemraj/ai-msgbot).

+### Sharded checkpoint
+
+As this model file is 10+ GB, it can impose some constraints with lower RAM runtimes and/or download speeds. To help with this issue, a sharded checkpoint of this model is available [here](https://huggingface.co/pszemraj/opt-peter-2.7B-sharded).
+
+The `pszemraj/opt-peter-2.7B-sharded` model can be used as a drop-in replacement for this one for all use cases.
+
 ## Intended uses & limitations

-> The base model has a custom license
+> The base model has a custom license that propagates to this one. **Most importantly, it cannot be used commercially**. Read more here: [facebook/opt-2.7b](https://huggingface.co/facebook/opt-2.7b)

-- the model is probably too large to use via API here. Use in Python with GPU RAM / CPU RAM > 12
+- the model is probably too large to use via the API here. Use it in Python with GPU RAM / CPU RAM > 12 GB; see the Colab notebook linked above.
 - alternatively, you can message [a bot on telegram](http://t.me/GPTPeter_bot) where I test LLMs for dialogue generation
 - **any statements or claims made by this model do not reflect actual claims/statements by me.** Keep in mind it is a _fine-tuned_ version of the model on my data, so things from pre-training are also present in outputs.

 ## Training and evaluation data

-WhatsApp & iMessage parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.
+WhatsApp & iMessage data were parsed using [ai-msgbot](https://github.com/pszemraj/ai-msgbot) and then fed as a text dataset to the HF trainer.

 ## Training procedure

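The new "Sharded checkpoint" section says `pszemraj/opt-peter-2.7B-sharded` is a drop-in replacement for this model. A minimal loading sketch with the standard `transformers` causal-LM API; the fp16 and low-memory options are illustrative choices, not prescribed by the card:

```python
# Sketch: load the sharded checkpoint referenced in the diff above.
# Assumes the standard Hugging Face transformers causal-LM API; dtype and
# memory options are illustrative, not taken from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pszemraj/opt-peter-2.7B-sharded"  # drop-in replacement per the card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # ~5.4 GB of weights in fp16 vs ~10.8 GB in fp32
    low_cpu_mem_usage=True,      # load shards incrementally to keep peak RAM down
)
```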
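For the dialogue-generation use case mentioned under intended uses, a text-generation pipeline call might look like the sketch below. The prompt formatting the fine-tune actually expects (ai-msgbot's speaker conventions) is not specified in this diff, so the plain prompt is only a placeholder:

```python
# Sketch: generic text-generation usage; the sampling settings and prompt
# format are assumptions, not taken from the model card or ai-msgbot.
from transformers import pipeline

generator = pipeline("text-generation", model="pszemraj/opt-peter-2.7B-sharded")

prompt = "How was your day?\n"  # placeholder; real prompts follow ai-msgbot's format
outputs = generator(
    prompt,
    max_new_tokens=64,   # cap the reply length
    do_sample=True,      # sample for varied, chat-like responses
    top_p=0.95,
    temperature=0.7,
)
print(outputs[0]["generated_text"])
```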