metadata

library_name: transformers
license: apache-2.0
base_model:
  - MaziyarPanahi/calme-3.2-instruct-78b
  - Sakalti/ultiima-72B
tags:
  - merge
  - mergekit

ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1

This model has been produced by:

ROBERGE Marial, engineering student at French Engineering School ECE
ESCRIVA Mathis, engineering student at French Engineering School ECE
LALAIN Youri, engineering student at French Engineering School ECE
RAGE Lilian, engineering student at French Engineering School ECE
HUVELLE Baptiste, engineering student at French Engineering School ECE

Under the supervision of:

Andre-Louis Rochet, Lecturer at ECE & Co-Founder of TW3 Partners
Paul Lemaistre, CTO of TW3 Partners

With the contribution of:

ECE engineering school as sponsor and financial contributor
François STEPHAN as director of ECE
Gérard REUS as acting director of iLAB
Matthieu JOLLARD ECE Alumni
Louis GARCIA ECE Alumni

Supervisory structure

The iLab (intelligence Lab) is a structure created by ECE and dedicated to artificial intelligence.

About ECE

ECE, a multi-program, multi-campus, and multi-sector engineering school specializing in digital engineering, trains engineers and technology experts for the 21st century, capable of meeting the challenges of the dual digital and sustainable development revolutions.

ECE-TRIOMPHANT-2.1-YL-72B-SLERP-V1 is a merged language model created from Sakalti/ultiima-72B and MaziyarPanahi/calme-3.2-instruct-78b. Using the SLERP (Spherical Linear Interpolation) method, it combines the strengths of both models to deliver strong performance on complex natural language processing (NLP) tasks.

Features

Merge method: SLERP (Spherical Linear Interpolation).
Source models:
- Sakalti/ultiima-72B
- MaziyarPanahi/calme-3.2-instruct-78b
Strengths:
- Improved performance on multi-domain and reasoning tasks.
- Extended processing capacity through the merging of critical layers.
- bfloat16 precision for fast, efficient computation.
Target applications:
- Mathematical reasoning.
- Contextual understanding.
- Instruction following.
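To make the merge method concrete, here is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It is an illustration only, not the mergekit implementation: the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel or zero vectors are assumptions made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Interpolate between v0 (t=0) and v1 (t=1) along the arc between them.

    Hypothetical helper for illustration; v0 and v1 are equal-length lists of floats.
    """
    lerp = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        # The angle is undefined for a zero vector: fall back to linear interpolation.
        return lerp

    # Cosine of the angle between the two vectors, clamped for numerical safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Vectors are (anti)parallel: the spherical weights degenerate, so use lerp.
        return lerp

    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a plain average, the midpoint here keeps unit length (both components equal sqrt(2)/2), which is the usual motivation for SLERP over linear averaging when merging weights.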

Configuration

slices:
  - sources:
      - model: MaziyarPanahi/calme-3.2-instruct-78b
        layer_range: [0, 80]  # Limited to 80 layers
      - model: Sakalti/ultiima-72B
        layer_range: [0, 80]  # Matches the range used for the 78B model
merge_method: slerp
base_model: MaziyarPanahi/calme-3.2-instruct-78b
parameters:
  t:
    - filter: self_attn
      value: [0, 0.25, 0.5, 0.75, 1]
    - filter: mlp
      value: [1, 0.75, 0.5, 0.25, 0]
    - value: 0.5
dtype: bfloat16
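The `t` lists in the configuration act as gradients over depth rather than as single values. The sketch below shows one plausible reading, assuming the five anchor values are spread evenly over the 80 layers and linearly interpolated in between; the exact scheme used by mergekit may differ, and `layer_t`, `ATTN_ANCHORS`, and `MLP_ANCHORS` are hypothetical names introduced for this example.

```python
# Anchor values copied from the configuration above.
ATTN_ANCHORS = [0, 0.25, 0.5, 0.75, 1]   # filter: self_attn
MLP_ANCHORS = [1, 0.75, 0.5, 0.25, 0]    # filter: mlp
NUM_LAYERS = 80                          # layer_range: [0, 80]


def layer_t(anchors, layer_idx, num_layers):
    """Linearly interpolate a list of anchor values at a given layer index.

    Assumption: anchors are evenly spaced from the first to the last layer.
    """
    if num_layers == 1 or len(anchors) == 1:
        return float(anchors[0])
    # Position of this layer on the anchor axis, from 0 to len(anchors) - 1.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[hi]


# Print the interpolation factor for a few layers of each tensor family.
for idx in (0, 20, 40, 60, 79):
    attn_t = layer_t(ATTN_ANCHORS, idx, NUM_LAYERS)
    mlp_t = layer_t(MLP_ANCHORS, idx, NUM_LAYERS)
    print(f"layer {idx:2d}: self_attn t={attn_t:.3f}  mlp t={mlp_t:.3f}")
```

Under this reading, the two families are mirrored: attention weights start at the base model (t = 0) in the first layer and move toward the second model with depth, while MLP weights do the opposite. Tensors matched by neither filter use the constant `value: 0.5`.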