
OpenCerebrum-1.0-7B-SFT

OpenCerebrum-1.0-7B-SFT is an open-source language model fine-tuned from the alpindale/Mistral-7B-v0.2-hf base model on a diverse dataset aimed at replicating capabilities of AetherResearch's proprietary Cerebrum model.

The model was fine-tuned on approximately 1.2 million examples across 14 datasets spanning coding, math, science, reasoning, and general instruction-following. The goal was to assemble public datasets that could help the model achieve strong performance on benchmarks where Cerebrum excels.

Model Details

  • Base Model: alpindale/Mistral-7B-v0.2-hf
  • Parameters: ~7 billion (7.24B)
  • Fine-Tuning Dataset Size: ~1,200,000 examples
  • Fine-Tuning Data: Amalgamation of 14 public datasets
  • Language: English
  • License: Apache 2.0

Intended Use

OpenCerebrum-1.0-7B-SFT is intended to be a powerful open-source model for coding, math, science, and general question-answering and text generation tasks. Its diverse fine-tuning data aims to equip it with broad knowledge and reasoning capabilities.

However, as an open-source replica trained only on public data, it may not match the original Cerebrum's full performance. Biases and limitations of the fine-tuning data may also be reflected in the model's outputs.
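For the text-generation use described above, a minimal loading sketch with the Hugging Face transformers library might look like the following. It assumes the repository ID Locutusque/OpenCerebrum-1.0-7b-SFT, BF16 weights, and a plain-text prompt; this card does not document a prompt template, so the prompt handling here is an assumption, and the generate_text helper and its defaults are hypothetical names chosen for illustration.

```python
# Hypothetical helper; the repo ID and BF16 dtype are taken from the model page,
# everything else (function name, defaults, plain-text prompting) is assumed.
MODEL_ID = "Locutusque/OpenCerebrum-1.0-7b-SFT"


def generate_text(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a completion for a plain-text prompt."""
    # Imported lazily so the module can be imported without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    ).to(device)

    # Assumption: no chat template is applied; the prompt is passed as raw text.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_text("Write a Python function that reverses a string."))
```

Running the script downloads the full checkpoint on first use, so it is best tried on a machine with enough disk space and accelerator memory.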

Limitations and Biases

  • The model may have biases and limitations inherited from its fine-tuning datasets. Thorough testing is needed to characterize these.
  • With 1.2 million training examples, the fine-tuning data is still limited compared to the proprietary Cerebrum data.
  • Because the model is built on a 7B-parameter base, it has computational and memory constraints compared to larger models.
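To make the memory side of the last point concrete, the sketch below estimates how much memory the weights alone occupy. It uses the 7.24B parameter count and BF16 tensor type reported on the model page; the function name and the dtype table are illustrative assumptions, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Rough weight-memory estimate; ignores activations, KV cache, and overhead.
NUM_PARAMS = 7.24e9  # parameter count reported on the model page

# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "bfloat16") -> float:
    """Return the memory footprint of the weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>9}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```

In BF16 this comes to roughly 13.5 GiB for the weights, which is why a 7B model typically needs a GPU with 16 GB or more unless it is quantized.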

Training Details

The model was fine-tuned on the 14 datasets listed in the Datasets section, totaling approximately 1.2 million examples. Default training hyperparameters were used. In the future, the fine-tuning dataset may be condensed to more closely match the 5,000-example dataset reputedly used for the original Cerebrum model.

Model size: 7.24B params (Safetensors). Tensor type: BF16.

Datasets used to train Locutusque/OpenCerebrum-1.0-7b-SFT
