🎵 NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

Paper    GitHub    Weights    Demo


📖 Overview

NotaGen is a symbolic music generation model that explores the potential of producing high-quality classical sheet music. Inspired by the success of Large Language Models (LLMs), NotaGen adopts a three-stage training paradigm:

  • 🧠 Pre-training on 1.6M musical pieces
  • 🎯 Fine-tuning on ~9K classical compositions with period-composer-instrumentation prompts (an illustrative prompt sketch follows this list)
  • 🚀 Reinforcement Learning using our novel CLaMP-DPO method (no human annotations or pre-defined rewards required)
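
The exact prompt format is defined in the GitHub repository; the sketch below is purely illustrative, assuming the three fields are joined into a plain-text prefix (the helper function, field separators, and example values are hypothetical):

# Illustrative only: see the NotaGen repository for the real prompt format.
def build_prompt(period: str, composer: str, instrumentation: str) -> str:
    # Hypothetical helper; field order follows "period-composer-instrumentation".
    return f"{period}\n{composer}\n{instrumentation}\n"

print(build_prompt("Romantic", "Chopin, Frederic", "Keyboard"))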

Check our demo page and enjoy music composed by NotaGen!

⚙️ Environment Setup

conda create --name notagen python=3.10
conda activate notagen
conda install pytorch==2.3.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install accelerate
pip install optimum
pip install -r requirements.txt
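
After installation, a quick sanity check (a minimal Python snippet run inside the notagen environment) confirms that the CUDA-enabled PyTorch build is active:

import torch

print(torch.__version__)           # expected: 2.3.0
print(torch.cuda.is_available())   # expected: True on a machine with a CUDA 11.8-compatible driver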

πŸ‹οΈ NotaGen Model Weights

Pre-training

We provide pre-trained weights of different scales:

| Models | Parameters | Patch-level Decoder Layers | Character-level Decoder Layers | Hidden Size | Patch Length (Context Length) |
|---|---|---|---|---|---|
| NotaGen-small | 110M | 12 | 3 | 768 | 2048 |
| NotaGen-medium | 244M | 16 | 3 | 1024 | 2048 |
| NotaGen-large | 516M | 20 | 6 | 1280 | 1024 |
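
Assuming the downloaded weights are standard PyTorch checkpoints, a minimal sketch for inspecting one (the filename is hypothetical; the actual model class and loading code live in the GitHub repository):

import torch

ckpt = torch.load("notagen_large_pretrain.pth", map_location="cpu")  # hypothetical filename
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt  # unwrap if nested
n_params = sum(t.numel() for t in state_dict.values() if torch.is_tensor(t))
print(f"{len(state_dict)} entries, ~{n_params / 1e6:.0f}M parameters")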

Fine-tuning

We fine-tuned NotaGen-large on a corpus of approximately 9k classical pieces. You can download the weights here.

Reinforcement Learning

After pre-training and fine-tuning, we optimized NotaGen-large with 3 iterations of CLaMP-DPO. You can download the weights here.
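
In each CLaMP-DPO iteration, the model's own generations are scored with CLaMP 2; higher-scoring pieces become the chosen samples and lower-scoring ones the rejected samples, and the model is then updated with the standard Direct Preference Optimization objective against a frozen reference model. A minimal PyTorch sketch of that objective (the beta value and batching are illustrative; the full training loop is in the GitHub repository):

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Standard DPO objective over sequence log-probabilities of (chosen, rejected) pairs;
    # in CLaMP-DPO the pairs are ranked by CLaMP scores rather than human preference labels.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()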

🌟 NotaGen-X

Inspired by DeepSeek-R1, we further optimized the training procedure of NotaGen and released an improved version, NotaGen-X. Compared to the version described in the paper, NotaGen-X incorporates the following improvements:

  • We introduced a post-training stage between pre-training and fine-tuning, refining the model with a classical-style subset of the pre-training dataset.
  • We removed key augmentation in the fine-tuning stage, making the instrument ranges of the generated compositions more reasonable.
  • After RL, we utilized the resulting checkpoint to gather a new set of post-training data. Starting from the pre-trained checkpoint, we conducted another round of post-training, fine-tuning, and reinforcement learning.

For the implementation of pre-training, fine-tuning, and reinforcement learning for NotaGen, please see our GitHub page.

📚 Citation

If you find NotaGen or CLaMP-DPO useful in your work, please cite our paper.

@misc{wang2025notagenadvancingmusicalitysymbolic,
      title={NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms}, 
      author={Yashan Wang and Shangda Wu and Jianhuai Hu and Xingjian Du and Yueqi Peng and Yongxin Huang and Shuai Fan and Xiaobing Li and Feng Yu and Maosong Sun},
      year={2025},
      eprint={2502.18008},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2502.18008}, 
}