EVA-UNIT-01
/

EVA-Qwen2.5-14B-v0.1

AuriAetherwiing committed 13 days ago

Commit 39c9cb4 • 1 Parent(s): 5496a82

Update README.md

Files changed (1)

README.md +1 -1

@@ -27,7 +27,7 @@ base_model:
   This version seems to be more or less optimal for the current data. It (again) started crashing on each checkpoint after some point, but it was less of a problem this time, as eval/loss already flatlined by that time. This is epoch 2.7 checkpoint.
 </p>
-<p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough (2x lighter than Mistral Nemo's), so using f16 for it shouldn't be problematic.</p>
+<p>Note: using quantized KV cache with Qwen2.5 <b>is not recommended</b> and can lead to degraded output quality. On the other hand, Qwen's KV cache is already light enough, so using f16 for it shouldn't be problematic.</p>
 <p>
   <p>Prompt format is ChatML.</p><br>
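
For a rough sense of what "light enough" means for an f16 KV cache, the sketch below estimates cache size with the usual grouped-query attention formula (keys plus values, per layer, per KV head). The function name `kv_cache_bytes` is hypothetical, and the layer/head values in the usage lines are assumptions that should be checked against the model's `config.json`; nothing here comes from the commit itself.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_tokens: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size in bytes.

    The factor 2 accounts for storing both keys and values.
    bytes_per_elem=2 corresponds to an f16 (unquantized) cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem


# Assumed values for illustration only; verify against the model's config.json.
ASSUMED_LAYERS = 48      # assumption: number of transformer layers
ASSUMED_KV_HEADS = 8     # assumption: number of key/value heads (GQA)
ASSUMED_HEAD_DIM = 128   # assumption: per-head dimension

per_token = kv_cache_bytes(ASSUMED_LAYERS, ASSUMED_KV_HEADS, ASSUMED_HEAD_DIM, n_tokens=1)
at_32k = kv_cache_bytes(ASSUMED_LAYERS, ASSUMED_KV_HEADS, ASSUMED_HEAD_DIM, n_tokens=32768)

print(f"per token: {per_token / 1024:.0f} KiB")
print(f"32768 tokens: {at_32k / 1024**3:.1f} GiB")
```

Plugging in another model's layer count, KV head count, and head dimension gives a like-for-like comparison at the same context length.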