mkurman committed on
Commit
c13a8ce
1 Parent(s): 08bd8c6

Update README.md

Files changed (1)
  1. README.md +61 -3
README.md CHANGED
@@ -1,3 +1,61 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - speakleash/Bielik-11B-v2.3-Instruct
+ pipeline_tag: text-generation
+ tags:
+ - medit-merge
+ language:
+ - pl
+ - en
+ ---
+
+ <div align="center">
+ <img src="https://i.ibb.co/YLfCzXR/imagine-image-c680e106-e404-45e5-98da-af700ffe41f4.png" alt="Marsh Harrier" style="border-radius: 10px; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19); max-width: 100%; height: auto;">
+ </div>
+
+ # Marsh Harrier
+
+ Marsh Harrier (MSH) is a language model developed by MedIT Solutions. It merges the checkpoints of the SpeakLeash Bielik 11B v2.3 Instruct and SpeakLeash Bielik 11B v2 models using our proprietary weight-merging methodology.
+
+ ## Key Features:
+ - Built on a pioneering approach to neural network weight fusion
+ - Supports merging models with identical parameter counts while maintaining architecture flexibility
+ - Achieves a higher average score than Bielik v2.3 on the Open PL LLM Leaderboard (0-shot)
+ - Optimized for Polish language understanding and generation
+
+ ## Performance:
+ The model improves on Bielik v2.3 on six of the ten reported tasks in the Open PL LLM Leaderboard evaluation framework (0-shot), which is part of the SpeakLeash.org open-science initiative. It matches Bielik v2.3 on two tasks and scores lower on two.
+
+ ## Technical Details:
+ - Base Models: [Speakleash Bielik 11B v2.3 Instruct](https://huggingface.co/speakleash/Bielik-11B-v2.3-Instruct) and [Bielik 11B v2](https://huggingface.co/speakleash/Bielik-11B-v2)
+ - Architecture: Compatible with the original Bielik architecture
+ - Parameter Count: 11 billion
+ - Special Feature: Uses MedIT Solutions' proprietary checkpoint merging technology
+
+ This model represents a step forward in Polish language modeling, demonstrating how merging techniques can enhance model performance without changing the architecture or parameter count.
+
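The merging method used for Marsh Harrier is proprietary and is not described in this README. Purely to illustrate what checkpoint merging can look like, the sketch below uses plain linear interpolation of two checkpoints with identical parameter names and sizes. This is a well-known baseline swapped in for illustration, not MedIT's technique. The function name `merge_checkpoints`, the `alpha` parameter, and the toy weights are all hypothetical.

```python
from typing import Dict, List

# Parameter name -> flattened weights (a toy stand-in for real tensors).
Checkpoint = Dict[str, List[float]]


def merge_checkpoints(a: Checkpoint, b: Checkpoint, alpha: float = 0.5) -> Checkpoint:
    """Linearly interpolate two checkpoints: alpha * a + (1 - alpha) * b.

    Illustrative baseline only; NOT the proprietary MedIT method.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must contain the same parameter names")
    merged: Checkpoint = {}
    for name, weights_a in a.items():
        weights_b = b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1.0 - alpha) * y for x, y in zip(weights_a, weights_b)
        ]
    return merged


# Toy checkpoints with identical parameter names and sizes (made-up numbers).
instruct = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
base = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_checkpoints(instruct, base, alpha=0.5)
print(merged)
```

The key and size checks mirror the constraint stated above that the merged models share a parameter count and a compatible architecture. A real merge would operate on framework tensors rather than Python lists.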
+ # Open PL LLM Leaderboard
+
+ Sentiment Analysis (PolEmo2):
+ - In-domain accuracy: Matches Bielik at 77.70%
+ - Out-of-domain accuracy: Slight improvement of ~0.4pp (79.76% vs 79.35%)
+
+ Text Classification Tasks:
+ - 8tags classification: Improvement of ~3pp (76.14% vs 73.17%)
+ - Belebele benchmark: Matching performance at 88.56%
+ - CBD task: F1 score improvement of ~10pp (23.91% vs 13.73%)
+
+ Language Understanding:
+ - DYK ("Did you know..."): Improved F1 score (69.77% vs 69.14%)
+ - Named Entity Recognition (KLEJ NER): Notable improvement of ~8pp (45.53% vs 37.61%)
+ - PolQA reranking: Slight decrease of ~1pp (81.99% vs 83.21%)
+ - PPC: Improved accuracy (78.00% vs 77.20%)
+ - PSC: F1 score decrease of ~3pp (90.46% vs 93.63%)
+
+ Overall Performance:
+ MSH-v1 achieves an average score of 71.18%, compared to 69.33% for Bielik v2.3 (+1.85pp), demonstrating the effectiveness of our checkpoint merging technique in improving average performance across diverse NLP tasks.
+
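As a quick check of the arithmetic behind these figures, the sketch below recomputes the per-task differences in percentage points and the two averages from the scores listed in this README. The task labels are shorthand chosen here, and treating every comparison value as the Bielik v2.3 score is an assumption based on the summary sentence above.

```python
# Scores (%) copied from this README as (MSH-v1, Bielik v2.3) pairs.
# Assumption: every "vs" value refers to the same Bielik v2.3 baseline.
scores = {
    "PolEmo2 in-domain": (77.70, 77.70),
    "PolEmo2 out-of-domain": (79.76, 79.35),
    "8tags": (76.14, 73.17),
    "Belebele": (88.56, 88.56),
    "CBD": (23.91, 13.73),
    "DYK": (69.77, 69.14),
    "KLEJ NER": (45.53, 37.61),
    "PolQA reranking": (81.99, 83.21),
    "PPC": (78.00, 77.20),
    "PSC": (90.46, 93.63),
}

# Difference in percentage points (pp), rounded to two decimals.
deltas = {task: round(msh - bielik, 2) for task, (msh, bielik) in scores.items()}

# Unweighted mean over the ten listed tasks.
msh_avg = round(sum(msh for msh, _ in scores.values()) / len(scores), 2)
bielik_avg = round(sum(bielik for _, bielik in scores.values()) / len(scores), 2)

improved = [task for task, d in deltas.items() if d > 0]
matched = [task for task, d in deltas.items() if d == 0]
decreased = [task for task, d in deltas.items() if d < 0]

for task, d in deltas.items():
    print(f"{task}: {d:+.2f} pp")
print(f"MSH-v1 average: {msh_avg:.2f}%  Bielik v2.3 average: {bielik_avg:.2f}%")
print(f"improved={len(improved)} matched={len(matched)} decreased={len(decreased)}")
```

The unweighted mean of the ten listed scores reproduces the reported averages of 71.18% and 69.33%, which suggests the overall score is a simple average over these tasks.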
+ All evaluations were conducted using the Open PL LLM Leaderboard framework (0-shot) as part of the SpeakLeash.org open-science initiative.
+
+ Kudos to the **[SpeakLeash](https://speakleash.org)** project and **[ACK Cyfronet AGH](https://www.cyfronet.pl/)** for their extraordinary work.