---
license: cc-by-nc-4.0
datasets:
- bertin-project/alpaca-spanish
language:
- es
inference: false
---

# Model Card for Llama-2-13b-alpaca-es

This model is Llama-2-13b-hf fine-tuned with an adapter on the Spanish Alpaca dataset.

## Model Details

### Model Description

This is a Spanish chat model fine-tuned on a Spanish instruction dataset.

The model expects a prompt containing the instruction, optionally followed by an input (see the examples below).
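
Concretely, prompts follow an Alpaca-style template; the inference code below builds the same strings. A minimal sketch of the two variants, using one of the example instructions from later in this card:

```py
instruction = "Encuentra la capital de España."  # example instruction from the samples below
input_text = None  # optional additional input, e.g. "fresa"

# Build the prompt exactly as the generate() helper below does.
if input_text is not None:
    prompt = "### Instruction:\n" + instruction + "\n\n### Input:\n" + input_text + "\n\n### Response:\n"
else:
    prompt = "### Instruction:\n" + instruction + "\n\n### Response:\n"
print(prompt)
```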

- **Developed by:** 4i Intelligent Insights
- **Model type:** Chat model
- **Language(s) (NLP):** Spanish
- **License:** cc-by-nc-4.0 (inherited from the alpaca-spanish dataset)
- **Finetuned from model:** Llama 2 13B ([license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/))

## Uses

The model is intended to be used directly, without further fine-tuning.

## Bias, Risks, and Limitations

This model inherits the bias, risks, and limitations of its base model, Llama 2, and of the dataset used for fine-tuning.

Note that the Spanish Alpaca dataset was obtained by translating the original Alpaca dataset. It contains translation errors that may have negatively impacted the fine-tuning of the model.

## How to Get Started with the Model

Use the code below to get started with the model for inference. The adapter was merged directly into the original Llama 2 model, so no separate adapter weights are needed.

The following code sample uses 4-bit quantization; you may load the model without it if you have enough VRAM. The generation hyperparameters shown are values we found to work well for this set of prompts.

```py
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, GenerationConfig
import torch

model_name = "4i-ai/Llama-2-13b-alpaca-es"

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

def create_and_prepare_model():
    # 4-bit NF4 quantization with double quantization; computations run in float16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
        bnb_4bit_use_double_quant=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=bnb_config, device_map={"": 0}
    )
    return model

model = create_and_prepare_model()

def generate(instruction, input_text=None):
    # Format the prompt to match the training data
    if input_text is not None:
        prompt = "### Instruction:\n" + instruction + "\n\n### Input:\n" + input_text + "\n\n### Response:\n"
    else:
        prompt = "### Instruction:\n" + instruction + "\n\n### Response:\n"

    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()

    generation_output = model.generate(
        input_ids=input_ids,
        repetition_penalty=1.5,
        # Generation hyperparameters that worked well for these prompts
        generation_config=GenerationConfig(temperature=0.1, top_p=0.75, top_k=40, num_beams=20),
        return_dict_in_generate=True,
        output_scores=True,
        # Maximum number of generated tokens; increase for longer answers
        # (up to 2048 minus the prompt length). Longer responses take longer to generate.
        max_new_tokens=150,
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        print(output.split("### Response:")[1].strip())

generate("Háblame de la superconductividad.")
print("-----------")
generate("Encuentra la capital de España.")
print("-----------")
generate("Encuentra la capital de Portugal.")
print("-----------")
generate("Organiza los números dados en orden ascendente.", "2, 3, 0, 8, 4, 10")
print("-----------")
generate("Compila una lista de 5 estados de EE. UU. ubicados en el Oeste.")
print("-----------")
generate("Compila una lista de 2 estados de EE. UU. ubicados en el Oeste.")
print("-----------")
generate("Compila una lista de 10 estados de EE. UU. ubicados en el Este.")
print("-----------")
generate("¿Cuál es el color de una fresa?")
print("-----------")
generate("¿Cuál es el color de la siguiente fruta?", "fresa")
print("-----------")
```

Expected output:

```
La superconductividad es un fenómeno físico en el que los materiales pueden conducir corrientes eléctricas a bajas temperaturas sin pérdida de energía debido a la resistencia. Los materiales superconductores son capaces de conducir corrientes eléctricas a temperaturas mucho más bajas que los materiales normales. Esto se debe a que los electrones en los materiales superconductores se comportan de manera cooperativa, lo que les permite conducir corrientes eléctricas sin pérdida de energía. Los materiales superconductores tienen muchas aplicaciones
-----------
La capital de España es Madrid.
-----------
La capital de Portugal es Lisboa.
-----------
0, 2, 3, 4, 8, 10
-----------
1. California
2. Oregón
3. Washington
4. Nevada
5. Arizona
-----------
California y Washington.
-----------
1. Maine
2. Nuevo Hampshire
3. Vermont
4. Massachusetts
5. Rhode Island
6. Connecticut
7. Nueva York
8. Nueva Jersey
9. Pensilvania
10. Delaware
-----------
El color de una fresa es rojo brillante.
-----------
El color de la fresa es rojo.
-----------
```
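
If you have enough VRAM, you can skip the 4-bit quantization as noted above. A minimal sketch of a full-precision (float16) load, otherwise identical to the setup above; the VRAM figure in the comment is a rough estimate for a 13B-parameter model in float16:

```py
import torch
from transformers import AutoModelForCausalLM

# Alternative to create_and_prepare_model(): no BitsAndBytesConfig.
# Weights are loaded in float16 on GPU 0 (roughly 26 GB of VRAM for 13B parameters).
model = AutoModelForCausalLM.from_pretrained(
    "4i-ai/Llama-2-13b-alpaca-es",
    torch_dtype=torch.float16,
    device_map={"": 0},
)
```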

## Contact Us

[4i.ai](https://4i.ai/) provides natural language processing solutions with dialog, vision and voice capabilities to deliver real-life multimodal human-machine conversations.

Please contact us at [email protected]