toothacher17 committed
Commit 930af5a · verified · 1 Parent(s): fc14f0a

Update README.md

Files changed (1)
  1. README.md +8 -4
README.md CHANGED
@@ -143,9 +143,13 @@ Moonlight has the same architecture as DeepSeek-V3, which is supported by many p
  ## Citation
  If you find Moonlight is useful or want to use in your projects, please kindly cite our paper:
  ```
- @article{MoonshotAIMuon,
- author = {Jingyuan Liu and Jianlin Su and Xingcheng Yao and Zhejun Jiang and Guokun Lai and Yulun Du and Yidao Qin and Weixin Xu and Enzhe Lu and Junjie Yan and Yanru Chen and Huabin Zheng and Yibo Liu and Shaowei Liu and Bohong Yin and Weiran He and Han Zhu and Yuzhi Wang and Jianzhou Wang and Mengnan Dong and Zheng Zhang and Yongsheng Kang and Hao Zhang and Xinran Xu and Yutao Zhang and Yuxin Wu and Xinyu Zhou and Zhilin Yang},
- title = {Muon is Scalable For LLM Training},
- year = {2025},
+ @misc{liu2025muonscalablellmtraining,
+ title={Muon is Scalable for LLM Training},
+ author={Jingyuan Liu and Jianlin Su and Xingcheng Yao and Zhejun Jiang and Guokun Lai and Yulun Du and Yidao Qin and Weixin Xu and Enzhe Lu and Junjie Yan and Yanru Chen and Huabin Zheng and Yibo Liu and Shaowei Liu and Bohong Yin and Weiran He and Han Zhu and Yuzhi Wang and Jianzhou Wang and Mengnan Dong and Zheng Zhang and Yongsheng Kang and Hao Zhang and Xinran Xu and Yutao Zhang and Yuxin Wu and Xinyu Zhou and Zhilin Yang},
+ year={2025},
+ eprint={2502.16982},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2502.16982},
  }
  ```