juliehunter committed · commit ad871e9 · verified · 1 parent: cc725bd

Update README.md

Files changed (1):
  1. README.md (+2 -0)
README.md CHANGED
@@ -40,6 +40,8 @@ Lucie-7B-Instruct-human-data is a fine-tuned version of [Lucie-7B](), an open-so
 
 Lucie-7B-Instruct-human-data is fine-tuned on human-produced instructions collected either from open annotation campaigns or by applying templates to extant datasets. The performance of Lucie-7B-Instruct-human-data falls below that of [Lucie-7B-Instruct](https://huggingface.co/OpenLLM-France/Lucie-7B-Instruct); the interest of the model is to show what can be done to fine-tune LLMs to follow instructions without appealing to third party LLMs.
 
+Note that both Lucie-7B-Instruct-human-data and Lucie-7B-Instruct are optimized for generation of French text. They have not been trained for code generation or optimized for math. Such capacities can be improved through further fine-tuning and alignment with methods such as DPO, RLHF, etc.
+
 While Lucie-7B-Instruct-human-data is trained on sequences of 4096 tokens, its base model, Lucie-7B has a context size of 32K tokens. Based on Needle-in-a-haystack evaluations, Lucie-7B-Instruct-human-data maintains the capacity of the base model to handle 32K-size context windows.
 
 ## Training details
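
For reference, a minimal usage sketch with the Transformers library is shown below. The repository id `OpenLLM-France/Lucie-7B-Instruct-human-data` and the availability of a chat template are assumptions based on the sibling Lucie-7B-Instruct model, not details confirmed by this commit.

```python
# Minimal sketch (assumptions: the model is hosted as
# "OpenLLM-France/Lucie-7B-Instruct-human-data" and ships a chat template,
# mirroring Lucie-7B-Instruct; adjust the repo id if it differs).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenLLM-France/Lucie-7B-Instruct-human-data"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The model is optimized for French, so prompt it in French.
messages = [{"role": "user", "content": "Explique brièvement ce qu'est un modèle de langage."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# This prompt stays well under the 4096-token fine-tuning sequence length;
# the base model itself supports contexts up to 32K tokens.
outputs = model.generate(inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```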