AuriAetherwiing commited on
Commit
39c9cb4
1 Parent(s): 5496a82

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -27,7 +27,7 @@ base_model:
27
  This version seems to be more or less optimal for the current data. It (again) started crashing on each checkpoint after some point, but it was less of a problem this time, as eval/loss already flatlined by that time. This is epoch 2.7 checkpoint.
28
  </p>
29
 
30
- <p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough (2x lighter than Mistral Nemo's), so using f16 for it shouldn't be problematic.</p>
31
 
32
  <p>
33
  <p>Prompt format is ChatML.</p><br>
 
27
  This version seems to be more or less optimal for the current data. It (again) started crashing on each checkpoint after some point, but it was less of a problem this time, as eval/loss already flatlined by that time. This is epoch 2.7 checkpoint.
28
  </p>
29
 
30
+ <p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough, so using f16 for it shouldn't be problematic.</p>
31
 
32
  <p>
33
  <p>Prompt format is ChatML.</p><br>