Aether-12b

Aether-12b is a large language model fine-tuned from Arcanum-12b on the CleverBoi-Data-20k dataset.

Model Details πŸ“Š

Model Architecture πŸ—οΈ

  • Base model: Xclbr7/Arcanum-12b
  • Parameter count: ~12 billion
  • Architecture specifics: Transformer-based language model

Open LLM Leaderboard Evaluation Results

Coming soon!

Training & Fine-tuning πŸ”„

Aether-12b was fine-tuned on the following dataset:

  • Dataset: theprint/CleverBoi-Data-20k
  • Fine-tuning method: supervised fine-tuning with the TRL SFTTrainer, using the AdamW optimizer, a cosine-decay learning-rate schedule, and bfloat16 precision (a minimal sketch of this setup follows below).
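
The exact training script has not been published; the following is a minimal sketch of the setup described above using TRL's SFTTrainer. The hyperparameter values (learning rate, epochs, batch size) are illustrative assumptions, not the values actually used to train Aether-12b.

```python
# Minimal SFT sketch: AdamW + cosine decay + bfloat16, as described above.
# Hyperparameter values below are assumptions for illustration only.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("theprint/CleverBoi-Data-20k", split="train")

config = SFTConfig(
    output_dir="aether-12b",
    optim="adamw_torch",             # AdamW optimizer
    lr_scheduler_type="cosine",      # cosine-decay LR schedule
    bf16=True,                       # bfloat16 precision
    learning_rate=2e-5,              # assumed value
    num_train_epochs=1,              # assumed value
    per_device_train_batch_size=1,   # assumed value
)

trainer = SFTTrainer(
    model="Xclbr7/Arcanum-12b",      # base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```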

The CleverBoi-Data-20k dataset improved the model in the following ways:

  1. Enhanced reasoning and problem-solving capabilities
  2. Broader knowledge across various topics
  3. Improved performance on specific tasks like writing, analysis, and problem-solving
  4. Better contextual understanding and response generation

Intended Use 🎯

Aether-12b is intended for use as a general-purpose assistant or as a role-specific chat bot.
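
A minimal usage sketch with the Transformers library is shown below. It assumes the tokenizer ships with a chat template; the prompt format is otherwise an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aixonlab/Aether-12b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights
    device_map="auto",
)

# Assumes the tokenizer defines a chat template.
messages = [{"role": "user", "content": "Summarize the key ideas of reinforcement learning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```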

Ethical Considerations πŸ€”

As a fine-tuned model based on Arcanum-12b, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

Acknowledgments πŸ™

We acknowledge the contributions of:

  • theprint for the amazing CleverBoi-Data-20k dataset