Commit 38d172e by ZennyKenny
Parent(s): 039e377

Update blog post link

Without a protocol, HF tries to build a relative link.
README.md
CHANGED
@@ -45,7 +45,7 @@ The following models are finetuned on MPT-7B:
 * [MPT-7B-StoryWriter-65k+](https://huggingface.co/mosaicml/mpt-7b-storywriter): a model designed to read and write fictional stories with super long context lengths.
 Built by finetuning MPT-7B with a context length of 65k tokens on a filtered fiction subset of the [books3 dataset](https://huggingface.co/datasets/the_pile_books3).
 At inference time, thanks to [ALiBi](https://arxiv.org/abs/2108.12409), MPT-7B-StoryWriter-65k+ can extrapolate even beyond 65k tokens.
-We demonstrate generations as long as 80k tokens on a single A100-80GB GPU in our [blogpost](www.mosaicml.com/blog/mpt-7b).
+We demonstrate generations as long as 80k tokens on a single A100-80GB GPU in our [blogpost](https://www.mosaicml.com/blog/mpt-7b).
 * License: Apache 2.0
 
 * [MPT-7B-Instruct](https://huggingface.co/mosaicml/mpt-7b-instruct): a model for short-form instruction following.
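
The commit message's point about relative links can be illustrated with standard URL resolution. The sketch below uses Python's `urllib.parse` to show how a link target without a scheme is resolved against the page that contains it. The base URL and the `resolve` helper are assumptions made up for illustration; how the Hugging Face renderer actually resolves links is not confirmed by this commit, only that the result was a relative link.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical URL of the page rendering the README (an assumption for illustration).
BASE = "https://huggingface.co/mosaicml/mpt-7b/blob/main/README.md"


def resolve(href: str, base_url: str = BASE) -> str:
    """Resolve a link target against the page URL, as generic URL resolution does."""
    return urljoin(base_url, href)


# Old link target: no scheme, so it parses as a plain relative path.
old_href = "www.mosaicml.com/blog/mpt-7b"
# New link target: the scheme makes it an absolute URL.
new_href = "https://www.mosaicml.com/blog/mpt-7b"

old_resolved = resolve(old_href)
new_resolved = resolve(new_href)

# The old target has neither scheme nor host, so it stays on the base host.
print(urlparse(old_href).scheme == "", urlparse(old_resolved).netloc)
# The new target is returned unchanged.
print(new_resolved)
```

With the scheme present, the target is left as-is; without it, the `www.mosaicml.com/...` text is treated as a path under the current site, which is the behavior the commit fixes.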