mkurman committed on
Commit
42c2381
1 Parent(s): a3719ff

Update README.md

  1. README.md +56 -3
README.md CHANGED
@@ -1,3 +1,56 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - speakleash/Bielik-11B-v2.3-Instruct
+ pipeline_tag: text-generation
+ tags:
+ - medit-merge
+ ---
+
+ <div align="center">
+ <img src="https://i.ibb.co/YLfCzXR/imagine-image-c680e106-e404-45e5-98da-af700ffe41f4.png" alt="Marsh Harrier" style="border-radius: 10px; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19); max-width: 100%; height: auto;">
+ </div>
+
+ # Marsh Harrier
+
+ Marsh Harrier (MSH) is a language model developed by MedIT Solutions using a checkpoint merging technique. It fuses the SpeakLeash Bielik 11B v2.3 and SpeakLeash Bielik 11B v2 models using our proprietary weight-merging methodology.
+
+ ## Key Features:
+ - Built on a new approach to neural network weight fusion
+ - The merging technique supports models with identical parameter counts while maintaining architecture flexibility
+ - Achieves a higher average score than Bielik v2.3 on the Open PL LLM Leaderboard (71.18% vs 69.33%)
+ - Optimized for Polish language understanding and generation
+
+ ## Performance:
+ The model improves on its predecessors across multiple metrics in the Open PL LLM Leaderboard evaluation framework (0-shot and 5-shot), which is part of the SpeakLeash.org open-science initiative.
+
+ ## Technical Details:
+ - Base Models: SpeakLeash Bielik 11B v2.3 and Bielik 11B v2 (https://huggingface.co/speakleash/Bielik-11B-v2.3-Instruct)
+ - Architecture: Compatible with the original Bielik architecture
+ - Parameter Count: 11 billion parameters
+ - Special Feature: Uses MedIT Solutions' proprietary checkpoint merging technology
+
+ This model is a step forward for Polish language modeling, demonstrating how merging techniques can improve model performance while maintaining architectural efficiency.
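
The merging method used for this model is proprietary and is not described in this README. For orientation only, here is a minimal sketch of plain linear interpolation between two checkpoints, a common generic baseline. It is an assumption chosen for illustration and is not claimed to be the MedIT Solutions technique. The function name `linear_merge`, the `alpha` weight, and the toy tensors are all hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: tensor name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def linear_merge(a: Checkpoint, b: Checkpoint, alpha: float = 0.5) -> Checkpoint:
    """Return alpha * a + (1 - alpha) * b for every tensor (generic baseline)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if a.keys() != b.keys():
        raise ValueError("checkpoints must contain the same tensor names")
    merged: Checkpoint = {}
    for name, wa in a.items():
        wb = b[name]
        if len(wa) != len(wb):
            raise ValueError(f"size mismatch for tensor {name!r}")
        # Element-wise weighted average of the two weight vectors.
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(wa, wb)]
    return merged


# Hypothetical toy checkpoints with identical tensor names and sizes.
ckpt_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
ckpt_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = linear_merge(ckpt_a, ckpt_b, alpha=0.5)
print(merged)
```

The shape checks reflect the requirement stated above that merged models have identical parameter counts; a real merge would operate on framework tensors rather than Python lists.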
+
+ # Open PL LLM Leaderboard
+
+ Sentiment Analysis (PolEmo2):
+ - In-domain accuracy: Matches Bielik at 77.70%
+ - Out-of-domain accuracy: Improved performance at 79.76% (vs 79.35%)
+
+ Text Classification Tasks:
+ - 8tags classification: Improvement of ~3pp (76.14% vs 73.17%)
+ - Belebele benchmark: Matching performance at 88.56%
+ - CBD task: F1 score improvement of ~10pp (23.91% vs 13.73%)
+
+ Language Understanding:
+ - DYK ("Did you know..."): Improved F1 score (69.77% vs 69.14%)
+ - Named Entity Recognition (KLEJ NER): Improvement of ~8pp (45.53% vs 37.61%)
+ - PolQA reranking: Decrease of ~1pp (81.99% vs 83.21%)
+ - PPC: Improved accuracy (78.00% vs 77.20%)
+ - PSC: F1 score decrease of ~3pp (90.46% vs 93.63%)
+
+ Overall Performance:
+ MSH-v1 achieves a higher average score than Bielik v2.3 (71.18% vs 69.33%, an average gain of 1.85pp), which suggests that our checkpoint merging technique improves model performance on average across diverse NLP tasks.
+
+ All evaluations were conducted using the Open PL LLM Leaderboard framework (0-shot) as part of the SpeakLeash.org open-science initiative.
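
The percentage-point differences and averages quoted above can be rechecked from the listed scores. The short sketch below does this with the numbers copied from the README. The task labels used as dictionary keys and the variable names are illustrative, and the averages are assumed to be unweighted means over the ten listed tasks.

```python
# (MSH-v1, Bielik v2.3) scores in percent, copied from the lists above.
scores = {
    "PolEmo2 in-domain": (77.70, 77.70),
    "PolEmo2 out-of-domain": (79.76, 79.35),
    "8tags": (76.14, 73.17),
    "Belebele": (88.56, 88.56),
    "CBD": (23.91, 13.73),
    "DYK": (69.77, 69.14),
    "KLEJ NER": (45.53, 37.61),
    "PolQA reranking": (81.99, 83.21),
    "PPC": (78.00, 77.20),
    "PSC": (90.46, 93.63),
}

# Percentage-point difference per task (positive means MSH-v1 is higher).
deltas = {task: round(msh - bielik, 2) for task, (msh, bielik) in scores.items()}

# Unweighted mean over the ten tasks (assumed averaging scheme).
msh_avg = round(sum(m for m, _ in scores.values()) / len(scores), 2)
bielik_avg = round(sum(b for _, b in scores.values()) / len(scores), 2)

improved = sum(1 for d in deltas.values() if d > 0)
tied = sum(1 for d in deltas.values() if d == 0)
decreased = sum(1 for d in deltas.values() if d < 0)

for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f}pp")
print(f"MSH-v1 average: {msh_avg:.2f}%  Bielik v2.3 average: {bielik_avg:.2f}%")
print(f"improved={improved} tied={tied} decreased={decreased}")
```

Under the unweighted-mean assumption, the computed averages match the reported 71.18% and 69.33%, and the per-task differences show six improvements, two ties, and two decreases.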