Update README.md
README.md
CHANGED
@@ -1,3 +1,56 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
base_model:
|
4 |
+
- speakleash/Bielik-11B-v2.3-Instruct
|
5 |
+
pipeline_tag: text-generation
|
6 |
+
tags:
|
7 |
+
- medit-merge
|
8 |
+
---
|
9 |
+
|
10 |
+
<div align="center">
|
11 |
+
<img src="https://i.ibb.co/YLfCzXR/imagine-image-c680e106-e404-45e5-98da-af700ffe41f4.png" alt="Marsh Harrier" style="border-radius: 10px; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19); max-width: 100%; height: auto;">
|
12 |
+
</div>
|
13 |
+
|
14 |
+
# Marsh Harrier
|
15 |
+
|
16 |
+
Marsh Harrier (MSH) is a language model developed by MedIT Solutions using a checkpoint merging technique. It merges the SpeakLeash Bielik 11B v2.3 and SpeakLeash Bielik 11B v2 models using our proprietary weight-merging methodology.
|
17 |
+
|
18 |
+
## Key Features:
|
19 |
+
- Built on a pioneering approach to neural network weight fusion
|
20 |
+
- Supports merging models of identical parameter counts while maintaining architecture flexibility
|
21 |
+
- Achieves a higher average score than Bielik v2.3 on the Open PL LLM Leaderboard (71.18% vs 69.33%)
|
22 |
+
- Optimized for Polish language understanding and generation
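
The merging method itself is proprietary and is not described here. Purely to illustrate what merging two checkpoints with identical parameter layouts means, the sketch below uses plain linear interpolation, a generic and well-known technique that is an assumption on our part and not the MedIT Solutions method. The `Checkpoint` type, the `linear_merge` helper, and the toy values are all hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for a real state dict: parameter name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def linear_merge(a: Checkpoint, b: Checkpoint, alpha: float = 0.5) -> Checkpoint:
    """Return alpha * a + (1 - alpha) * b for every parameter.

    Generic linear interpolation only; NOT the proprietary method from this card.
    """
    if a.keys() != b.keys():
        raise ValueError("checkpoints must expose the same parameter names")
    merged: Checkpoint = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Toy checkpoints with made-up values, only to show the mechanics.
ckpt_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
ckpt_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = linear_merge(ckpt_a, ckpt_b, alpha=0.5)
```

The name and size checks reflect the constraint stated above: element-wise merging of this kind only works when both checkpoints have the same parameters with the same sizes.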
|
23 |
+
|
24 |
+
## Performance:
|
25 |
+
The model shows significant improvements over its predecessors across multiple metrics in the Open PL LLM Leaderboard evaluation framework (0-shot and 5-shot), which is part of the SpeakLeash.org open-science initiative.
|
26 |
+
|
27 |
+
## Technical Details:
|
28 |
+
- Base Models: SpeakLeash Bielik 11B v2.3 and SpeakLeash Bielik 11B v2 (https://huggingface.co/speakleash/Bielik-11B-v2.3-Instruct)
|
29 |
+
- Architecture: Compatible with original Bielik architecture
|
30 |
+
- Parameter Count: 11 billion parameters
|
31 |
+
- Special Feature: Utilizes MedIT Solutions' proprietary checkpoint merging technology
|
32 |
+
|
33 |
+
This model represents a step forward in Polish language modeling, demonstrating how merging techniques can improve model performance while keeping the original architecture.
|
34 |
+
|
35 |
+
# Open PL LLM Leaderboard
|
36 |
+
|
37 |
+
Sentiment Analysis (PolEmo2):
|
38 |
+
- In-domain accuracy: Matches Bielik at 77.70%
|
39 |
+
- Out-of-domain accuracy: Improved performance at 79.76% (vs 79.35%)
|
40 |
+
|
41 |
+
Text Classification Tasks:
|
42 |
+
- 8tags classification: Significant improvement of ~3pp (76.14% vs 73.17%)
|
43 |
+
- Belebele benchmark: Matching performance at 88.56%
|
44 |
+
- CBD task: Substantial F1 score improvement of ~10pp (23.91% vs 13.73%)
|
45 |
+
|
46 |
+
Language Understanding:
|
47 |
+
- DYK ("Did you know..."): Improved F1 score (69.77% vs 69.14%)
|
48 |
+
- Named Entity Recognition (KLEJ NER): Notable improvement of ~8pp (45.53% vs 37.61%)
|
49 |
+
- PolQA reranking: Slight decrease (81.99% vs 83.21%)
|
50 |
+
- PPC: Enhanced accuracy (78.00% vs 77.20%)
|
51 |
+
- PSC: F1 score decrease of ~3pp (90.46% vs 93.63%)
|
52 |
+
|
53 |
+
Overall Performance:
|
54 |
+
MSH-v1 achieves a higher average score of 71.18% compared to Bielik v2.3's 69.33%, demonstrating the effectiveness of our checkpoint merging technique in improving model performance across diverse NLP tasks.
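
As a quick sanity check, the per-task differences and the two averages can be recomputed from the figures listed above. The sketch below assumes the reported average is the unweighted mean of the ten listed task scores. The task keys are our own labels, not official leaderboard identifiers.

```python
from statistics import mean

# (MSH-v1, Bielik v2.3) scores in percent, copied from the lists above.
# Keys are hypothetical labels chosen for this example.
scores = {
    "polemo2_in": (77.70, 77.70),
    "polemo2_out": (79.76, 79.35),
    "8tags": (76.14, 73.17),
    "belebele": (88.56, 88.56),
    "cbd": (23.91, 13.73),
    "dyk": (69.77, 69.14),
    "klej_ner": (45.53, 37.61),
    "polqa_reranking": (81.99, 83.21),
    "ppc": (78.00, 77.20),
    "psc": (90.46, 93.63),
}

# Difference in percentage points per task (positive means MSH-v1 is higher).
deltas = {task: round(msh - base, 2) for task, (msh, base) in scores.items()}

# Unweighted averages over the ten tasks.
msh_avg = round(mean(msh for msh, _ in scores.values()), 2)
base_avg = round(mean(base for _, base in scores.values()), 2)

for task, delta in deltas.items():
    print(f"{task:>16}: {delta:+.2f} pp")
print(f"MSH-v1 average: {msh_avg:.2f}%  Bielik v2.3 average: {base_avg:.2f}%")
```

Under that assumption, the recomputed averages agree with the 71.18% and 69.33% quoted above.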
|
55 |
+
|
56 |
+
All evaluations were conducted using the Open PL LLM Leaderboard framework (0-shot) as part of the SpeakLeash.org open-science initiative.
|