Commit 14cc9c3
Parent(s): 113d430
Update README.md

README.md CHANGED
@@ -1,37 +1,38 @@
 ---
-license:
-tags:
-- generated_from_trainer
+license: cc-by-nc-4.0
 datasets:
 - BramVanroy/alpaca-dolly-dutch
+language:
+- nl
+inference: false
 model-index:
-- name:
+- name: falcon-7b-ft-alpaca-cleaned-dutch
   results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# falcon-40b-ft-alpaca-dolly-dutch
 
-
+## Model description
 
 This model is a fine-tuned version of [tiiuae/falcon-40b](https://huggingface.co/tiiuae/falcon-40b) on the BramVanroy/alpaca-dolly-dutch dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.9154
 
-## Model description
+This model is a fine-tuned version of [tiiuae/falcon-40b](https://huggingface.co/tiiuae/falcon-40b) on the [BramVanroy/alpaca-dolly-dutch](https://huggingface.co/datasets/BramVanroy/alpaca-dolly-dutch) dataset.
+See the original [tiiuae/falcon-40b](https://huggingface.co/tiiuae/falcon-40b) for more information, intended use, and biases.
 
-More information needed
 
 ## Intended uses & limitations
 
-More information needed
+This model is intended as a (poor) baseline for Dutch generative LLMs. It by no means aims to provide SOTA performance and is specifically intended for research purposes and experimentation.
 
 ## Training and evaluation data
 
-More information needed
+Trained on the synthetic [BramVanroy/alpaca-dolly-dutch](https://huggingface.co/datasets/BramVanroy/alpaca-dolly-dutch) instruction dataset.
+Therefore, commercial use of this model is forbidden. The model is intended for research purposes only.
 
 ## Training procedure
 
+Trained with LoRA and merged before upload. The adapters are in the `adapters` branch.
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -47,7 +48,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 150
-- num_epochs: 5
+- num_epochs: 5 (but with early stopping)
 
 ### Training results
 
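Since the updated card sets `inference: false`, there is no hosted widget and the model has to be loaded locally. A minimal usage sketch, under assumptions not stated in the commit: the merged weights are assumed to live on the `main` branch of `BramVanroy/falcon-40b-ft-alpaca-dolly-dutch`, and the Dutch prompt is purely illustrative.

```python
# Minimal sketch of loading the merged model described in the card.
# The repo id is an assumption inferred from the card heading.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BramVanroy/falcon-40b-ft-alpaca-dolly-dutch"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 40B parameters are impractical in fp32
    device_map="auto",           # requires the `accelerate` package
    trust_remote_code=True,      # Falcon models shipped custom modelling code
)

# Dutch instruction, in the style of the alpaca-dolly-dutch training data.
prompt = "Schrijf een kort gedicht over de zee."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```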
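The card notes that training used LoRA, that the merged weights were uploaded, and that the adapters sit in the `adapters` branch. A hedged sketch of loading those adapters onto the base model with `peft`, assuming they were saved in the standard `peft` adapter format and using the same assumed repo id as above:

```python
# Sketch: apply the unmerged LoRA adapters from the `adapters` branch
# to the base model, instead of downloading the merged 40B checkpoint.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(
    base,
    "BramVanroy/falcon-40b-ft-alpaca-dolly-dutch",  # assumed repo id
    revision="adapters",  # the branch that holds the LoRA adapters
)
# merge_and_unload() folds the LoRA deltas back into the base weights,
# reproducing the merged model published on the main branch.
model = model.merge_and_unload()
```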
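The listed hyperparameters map directly onto `transformers`' `Trainer`, and the "(but with early stopping)" note corresponds to its `EarlyStoppingCallback`. A sketch of that mapping under loudly stated assumptions: only the values visible in this hunk are real; the model, datasets, evaluation cadence, and patience value are placeholders, and the learning rate and batch sizes appear in a part of the card not shown here.

```python
# Sketch of how the card's hyperparameter list maps onto the Trainer API.
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="falcon-40b-ft-alpaca-dolly-dutch",
    adam_beta1=0.9,                 # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # and epsilon=1e-08
    lr_scheduler_type="linear",     # lr_scheduler_type: linear
    warmup_steps=150,               # lr_scheduler_warmup_steps: 150
    num_train_epochs=5,             # num_epochs: 5
    evaluation_strategy="steps",    # early stopping needs periodic evaluation
    load_best_model_at_end=True,    # required by EarlyStoppingCallback
    metric_for_best_model="loss",
)

trainer = Trainer(
    model=model,                    # placeholder: model defined elsewhere
    args=args,
    train_dataset=train_dataset,    # placeholder: tokenized train split
    eval_dataset=eval_dataset,      # placeholder: tokenized eval split
    # "(but with early stopping)": stop once eval loss stops improving;
    # the patience value is illustrative, not taken from the card.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
```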