---
library_name: transformers
license: apache-2.0
base_model:
- MaziyarPanahi/calme-3.2-instruct-78b
- Sakalti/ultiima-72B
tags:
- merge
- mergekit
---

# **ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1**

This model has been produced by:
- **ROBERGE Marial**, engineering student at the French engineering school ECE
- **ESCRIVA Mathis**, engineering student at the French engineering school ECE
- **LALAIN Youri**, engineering student at the French engineering school ECE
- **RAGE Lilian**, engineering student at the French engineering school ECE
- **HUVELLE Baptiste**, engineering student at the French engineering school ECE

Under the supervision of:
- **Andre-Louis Rochet**, Lecturer at ECE & Co-Founder of TW3 Partners
- **Paul Lemaistre**, CTO of TW3 Partners

With the contribution of:
- **ECE engineering school** as sponsor and financial contributor
- **François STEPHAN** as director of ECE
- **Gérard REUS** as acting director of the iLab
- **Matthieu JOLLARD**, ECE alumnus
- **Louis GARCIA**, ECE alumnus

### Supervisory structure

The iLab (Intelligence Lab) is a structure created by ECE and dedicated to artificial intelligence.

### About ECE

ECE, a multi-program, multi-campus, and multi-sector engineering school specializing in digital engineering, trains engineers and technology experts for the 21st century, capable of meeting the challenges of the dual digital and sustainable-development revolutions.

**ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1** is a merged language model built from **Sakalti/ultiima-72B** and **MaziyarPanahi/calme-3.2-instruct-78b**. Using **SLERP (Spherical Linear Interpolation)**, it combines the strengths of both architectures to deliver strong performance on complex natural language processing (NLP) tasks.

## **Characteristics**

- **Merge method:** SLERP (Spherical Linear Interpolation); a minimal interpolation sketch follows the configuration below.
- **Source models:**
  - [Sakalti/ultiima-72B](https://huggingface.co./Sakalti/ultiima-72B)
  - [MaziyarPanahi/calme-3.2-instruct-78b](https://huggingface.co./MaziyarPanahi/calme-3.2-instruct-78b)
- **Strengths:**
  - Improved performance on multi-domain and reasoning tasks.
  - Extended processing capacity thanks to the merging of critical layers.
  - **bfloat16** precision for fast, efficient computation (see the usage sketch at the end of this card).
- **Target applications:**
  - Mathematical reasoning.
  - Contextual understanding.
  - Instruction following.

## **Configuration**

```yaml
slices:
  - sources:
      - model: MaziyarPanahi/calme-3.2-instruct-78b
        layer_range: [0, 80]  # limited to 80 layers
      - model: Sakalti/ultiima-72B
        layer_range: [0, 80]  # aligned with the 78B
merge_method: slerp
base_model: MaziyarPanahi/calme-3.2-instruct-78b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.25, 0.5, 0.75, 1]
    - filter: mlp
      value: [1, 0.75, 0.5, 0.25, 0]
    - value: 0.5
dtype: bfloat16
```
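For context, SLERP interpolates between two weight tensors along the arc of a hypersphere rather than along a straight line, which tends to preserve the geometry of the source weights better than plain averaging. The snippet below is a minimal, generic sketch of that interpolation applied to a pair of flattened weight vectors; it is an illustration only, not mergekit's actual implementation, and the function name `slerp` and the linear-interpolation fallback are assumptions made for the example.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight vectors (illustrative sketch)."""
    # Angle between the two vectors, computed on normalized copies
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0_n, v1_n), -1.0, 1.0))
    omega = np.arccos(dot)
    # Near-parallel vectors: fall back to plain linear interpolation
    if abs(np.sin(omega)) < eps:
        return (1.0 - t) * v0 + t * v1
    # Interpolate along the great-circle arc: t=0 returns v0, t=1 returns v1
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)
```

In the configuration above, `t` ramps across layer groups for the `self_attn` and `mlp` tensors and defaults to 0.5 elsewhere, so different parts of the network are weighted more toward one source model than the other.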
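Below is a minimal usage sketch with the Transformers library, assuming the merged weights are published on the Hugging Face Hub; the repository id `your-org/ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1` is a placeholder, and the chat-template call assumes the tokenizer inherits a chat template from the source models.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repository id: replace with the actual Hub id of this merge.
model_id = "your-org/ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",           # shard the weights across available GPUs
)

messages = [
    {"role": "user", "content": "Explain spherical linear interpolation in one paragraph."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Loading in `torch.bfloat16` matches the `dtype` declared in the merge configuration; at this scale, several high-memory GPUs (or offloading via `device_map`) will typically be needed for inference.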