Vicuna-13B-V1.1

Vicuna 13B model weights.

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/uukuguy/vicuna-13b-v1.1

# if you want to clone without large files – just their pointers
# prepend your git clone with the following env var:
GIT_LFS_SKIP_SMUDGE=1

Model Card

Model details

Model type: Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model, based on the transformer architecture.

Model date: Vicuna-13B-V1.1 weights were merged in April 2023.

Organizations developing the model: The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.

Paper or resources for more information: https://vicuna.lmsys.org/

License: Apache License 2.0

Where to send questions or comments about the model: https://github.com/uukuguy/Vicuna-LoRA/issues

Intended use

Primary intended uses: The primary use of Vicuna is research on large language models and chatbots.

Primary intended users: The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.

Major updates of weights v1.1

Refactor the tokenization and separator. In Vicuna v1.1, the separator has been changed from "###" to the EOS token "</s>". This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries.

Fix the supervised fine-tuning loss computation for better model quality.
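
As a rough illustration of why an EOS-token separator can make the stop criterion easier to determine, the sketch below truncates a decoded reply at each separator. It assumes the EOS token string is "</s>" (the LLaMA end-of-sequence token); the helper `truncate_at` and the sample text are hypothetical and not part of the Vicuna code base. One plausible problem with "###" is that it can also occur inside a legitimate reply, for example as a Markdown heading.

```python
EOS_TOKEN = "</s>"      # v1.1 separator (assumed LLaMA EOS token string)
OLD_SEPARATOR = "###"   # separator used before v1.1


def truncate_at(text: str, stop: str) -> str:
    """Return the part of `text` before the first occurrence of `stop`."""
    index = text.find(stop)
    return text if index == -1 else text[:index]


# Hypothetical decoded output that happens to contain a Markdown heading.
raw_output = (
    "Here is a summary:\n"
    "### Key points\n"
    "- Vicuna is fine-tuned from LLaMA</s>"
)

# Stopping on "###" cuts the reply short at the heading.
print(repr(truncate_at(raw_output, OLD_SEPARATOR)))

# Stopping on the EOS token keeps the whole reply and drops only the marker.
print(repr(truncate_at(raw_output, EOS_TOKEN)))
```

In practice a generation loop would more likely compare generated token ids against the tokenizer's EOS id than search the decoded string; the string version here is only meant to make the ambiguity visible.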