library_name: transformers
license: apache-2.0
base_model:
- MaziyarPanahi/calme-3.2-instruct-78b
- Sakalti/ultiima-72B
tags:
- merge
- mergekit
ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1
This model has been produced by:
- ROBERGE Marial, engineering student at French Engineering School ECE
- ESCRIVA Mathis, engineering student at French Engineering School ECE
- LALAIN Youri, engineering student at French Engineering School ECE
- RAGE Lilian, engineering student at French Engineering School ECE
- HUVELLE Baptiste, engineering student at French Engineering School ECE
Under the supervision of:
- Andre-Louis Rochet, Lecturer at ECE & Co-Founder of TW3 Partners
- Paul Lemaistre, CTO of TW3 Partners
With the contribution of:
- ECE engineering school as sponsor and financial contributor
- François STEPHAN as director of ECE
- Gérard REUS as acting director of iLAB
- Matthieu JOLLARD, ECE alumnus
- Louis GARCIA, ECE alumnus
Supervisory structure
The iLab (intelligence Lab) is a structure created by ECE and dedicated to artificial intelligence.
About ECE
ECE, a multi-program, multi-campus, and multi-sector engineering school specializing in digital engineering, trains engineers and technology experts for the 21st century, capable of meeting the challenges of the dual digital and sustainable development revolutions.
ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1 is a merged language model created from Sakalti/ultiima-72B and MaziyarPanahi/calme-3.2-instruct-78b. Using the SLERP (Spherical Linear Interpolation) method, it aims to combine the strengths of both source models for strong performance on complex natural language processing (NLP) tasks.
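As an illustration of the interpolation named above, here is a minimal pure-Python sketch of SLERP between two flattened weight vectors. It is not the code used to produce this model: the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel or zero vectors are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (illustrative sketch)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Assumption: fall back to linear interpolation when a vector is (near) zero.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the two directions, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    # Assumption: near-parallel (or anti-parallel) vectors also use linear interpolation.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Standard SLERP weights: sin((1 - t) * theta) / sin(theta) and sin(t * theta) / sin(theta).
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Halfway between two orthogonal unit vectors; the result keeps unit length.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

With `t = 0` the sketch returns the first vector and with `t = 1` the second. Unlike a plain weighted average, intermediate values follow the arc between the two directions, so two unit-length orthogonal inputs give a unit-length result.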
Features
- Merge method: SLERP (Spherical Linear Interpolation).
- Source models:
  - MaziyarPanahi/calme-3.2-instruct-78b
  - Sakalti/ultiima-72B
- Strengths:
  - Improved performance on multi-domain and reasoning tasks.
  - Extended processing capacity from merging the critical layers.
  - bfloat16 precision for fast, efficient computation.
- Target applications:
  - Mathematical reasoning.
  - Contextual understanding.
  - Instruction following.
Configuration
slices:
  - sources:
      - model: MaziyarPanahi/calme-3.2-instruct-78b
        layer_range: [0, 80]  # Limited to 80 layers
      - model: Sakalti/ultiima-72B
        layer_range: [0, 80]  # Matches the range used for the 78B model
merge_method: slerp
base_model: MaziyarPanahi/calme-3.2-instruct-78b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.25, 0.5, 0.75, 1]
    - filter: mlp
      value: [1, 0.75, 0.5, 0.25, 0]
    - value: 0.5
dtype: bfloat16
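To make the `t` lists in this configuration easier to read, the sketch below expands a five-point list into one value per layer. It assumes the list is interpolated linearly over relative layer depth, which is how the mergekit documentation describes gradient parameters. The helper `gradient_value` and its exact layer-to-position mapping are hypothetical and may differ from mergekit's internals. Per the mergekit documentation, `t = 0` returns the `base_model` weights and `t = 1` returns the other model's.

```python
def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values at a layer's relative depth.

    Hypothetical helper for illustration; not part of mergekit's API.
    """
    if len(anchors) == 1 or num_layers <= 1:
        return float(anchors[0])
    # Relative depth in [0, 1]: 0 for the first layer, 1 for the last.
    pos = layer_idx / (num_layers - 1)
    # Map the depth onto the anchor index space and blend the two nearest anchors.
    scaled = pos * (len(anchors) - 1)
    lo = min(int(scaled), len(anchors) - 2)
    frac = scaled - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


# Values copied from the configuration above.
SELF_ATTN_T = [0, 0.25, 0.5, 0.75, 1]
MLP_T = [1, 0.75, 0.5, 0.25, 0]
NUM_LAYERS = 80  # from layer_range: [0, 80]

if __name__ == "__main__":
    for layer in (0, 20, 40, 60, 79):
        attn_t = gradient_value(SELF_ATTN_T, layer, NUM_LAYERS)
        mlp_t = gradient_value(MLP_T, layer, NUM_LAYERS)
        print(f"layer {layer:2d}: self_attn t={attn_t:.3f}, mlp t={mlp_t:.3f}")
```

Under these assumptions, the self-attention weights shift from the base model in the early layers toward Sakalti/ultiima-72B in the late layers, the MLP weights follow the opposite schedule, and all remaining tensors use the fixed value `t = 0.5`.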