# Vicuna-13B-V1.1

Vicuna 13B model weights.

- 2023.04.16: Obtained the Vicuna weights by merging the LLaMA-13B model with the Vicuna v1.1 delta weights (sketched below) and uploaded them to the Hugging Face model repository: https://huggingface.co./uukuguy/vicuna-13b-v1.1
```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co./uukuguy/vicuna-13b-v1.1

# If you want to clone without large files (just their pointers),
# prepend your git clone with the following environment variable:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co./uukuguy/vicuna-13b-v1.1
```
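For reference, the merge that produced these weights is an element-wise addition of the published delta to the base LLaMA weights. Below is a minimal sketch, assuming a local LLaMA-13B checkpoint in Hugging Face format and the `lmsys/vicuna-13b-delta-v1.1` delta repository; the paths are placeholders, and FastChat's apply_delta tool performs the same operation with extra safeguards.

```python
# Minimal sketch of the delta-weight merge. Assumes a local LLaMA-13B
# checkpoint converted to Hugging Face format; paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "/path/to/llama-13b", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
delta = AutoModelForCausalLM.from_pretrained(
    "lmsys/vicuna-13b-delta-v1.1", torch_dtype=torch.float16, low_cpu_mem_usage=True
)

# The delta is defined as (fine-tuned weights - base weights), so adding
# it element-wise to the base parameters recovers the Vicuna weights.
base_state = base.state_dict()
for name, param in delta.state_dict().items():
    base_state[name] += param

base.save_pretrained("./vicuna-13b-v1.1")
# The tokenizer ships with the delta repository.
AutoTokenizer.from_pretrained("lmsys/vicuna-13b-delta-v1.1").save_pretrained(
    "./vicuna-13b-v1.1"
)
```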
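Once downloaded (or streamed directly from the Hub), the weights load as a standard causal language model with Hugging Face transformers. A minimal inference sketch follows; the prompt wording is illustrative (see the v1.1 format notes at the end of this card):

```python
# Minimal inference sketch; requires a GPU with enough memory for
# 13B fp16 weights (or adjust device_map / dtype accordingly).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "uukuguy/vicuna-13b-v1.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "USER: What are the main differences between Vicuna and LLaMA? ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```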
## Model Card

### Model details
Model type: Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model based on the transformer architecture.

Model date: The Vicuna-13B-V1.1 weights were merged in April 2023.

Organizations developing the model: The Vicuna team, with members from UC Berkeley, CMU, Stanford, and UC San Diego.

Paper or resources for more information: https://vicuna.lmsys.org/

License: Apache License 2.0

Where to send questions or comments about the model: https://github.com/uukuguy/Vicuna-LoRA/issues
### Intended use

Primary intended uses: The primary use of Vicuna is research on large language models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
### Major updates of weights v1.1

- Refactored the tokenization and separator. In Vicuna v1.1, the separator has been changed from "###" to the EOS token "</s>". This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries; the sketch below illustrates the new format.
- Fixed the supervised fine-tuning loss computation for better model quality.
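To make the separator change concrete, here is a sketch of building a v1.1-style prompt. The system prompt wording follows FastChat's default Vicuna v1.1 template, but treat the exact wording as an assumption:

```python
# Sketch of the Vicuna v1.1 conversation format: completed assistant
# turns are terminated with the EOS token "</s>" rather than "###".
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions."
)

def build_prompt(turns):
    """turns: list of (user_message, assistant_reply) pairs; the reply of
    the final turn may be None to request a new completion."""
    prompt = SYSTEM
    for user, assistant in turns:
        prompt += f" USER: {user} ASSISTANT:"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(build_prompt([
    ("Hello!", "Hello! How can I help you today?"),
    ("Name the planets of the solar system.", None),
]))
# Because "</s>" is the tokenizer's EOS token, generation can stop with the
# standard criterion, e.g. model.generate(..., eos_token_id=tokenizer.eos_token_id).
```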