Update README.md
README.md
CHANGED
@@ -10,6 +10,11 @@ library_name: transformers

<img src="https://dscache.tencent-cloud.cn/upload/uploader/hunyuan-64b418fd052c033b228e04bc77bbc4b54fd7f5bc.png" width="400"/> <br>
</p><p></p>
<p align="center">
🫣 <a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large"><b>Hugging Face</b></a> | 🖥️ <a href="https://llm.hunyuan.tencent.com/" style="color: red;"><b>official website</b></a> | 🕖 <a href="https://cloud.tencent.com/product/hunyuan"><b>HunyuanAPI</b></a>
</p><p align="center">
<a href="https://arxiv.org/abs/2411.02265" style="color: red;"><b>Technical Report</b></a> | <a href="https://huggingface.co/spaces/tencent/Hunyuan-Large"><b>Demo</b></a> | <a href="https://cloud.tencent.com/document/product/851/112032" style="color: red;"><b>Tencent Cloud TI</b></a> </p>

### Model Introduction

With the rapid development of artificial intelligence technology, large language models (LLMs) have made significant progress in fields such as natural language processing, computer vision, and scientific tasks. However, as the scale of these models increases, optimizing resource consumption while maintaining high performance has become a key challenge. To address this challenge, we have explored Mixture of Experts (MoE) models. The newly unveiled Hunyuan-Large (Hunyuan-MoE-A52B) model is currently the largest open-source Transformer-based MoE model in the industry, featuring a total of 389 billion parameters and 52 billion active parameters.

@@ -92,6 +97,16 @@ Remarkably, this leap in accuracy is achieved with only 52 billion activated parameters

| AlpacaEval-2.0 | 39.3 | 34.3 | 30.9 | 50.5 | **51.8** |

## Quick Start

You can quickly get started by referring to the content in the <a href="https://github.com/Tencent/Tencent-Hunyuan-Large/tree/main/examples">Quick Start Guide</a>.
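
Before working through the guide, the snippet below is a minimal sketch of loading the model with Hugging Face `transformers`. It is not the official example: the repo id, dtype handling, and chat-template call are assumptions on our part, and the Quick Start Guide above remains the authoritative reference.

```python
# Minimal sketch, not the official quick-start script.
# Assumes the checkpoint at this repo id loads via AutoModelForCausalLM with
# trust_remote_code=True and that enough GPU memory is available for the
# 389B-total / 52B-active parameter MoE.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tencent/Tencent-Hunyuan-Large"  # assumed repo id; adjust to the actual checkpoint path

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # shard across all visible GPUs
    torch_dtype="auto",
    trust_remote_code=True,  # the checkpoint ships custom modeling code
)

messages = [{"role": "user", "content": "Introduce Hunyuan-Large in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
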
## Inference and Deployment

HunyuanLLM supports deployment with both TRT-LLM and vLLM. We are open-sourcing the vLLM deployment first (see "Inference with vLLM"); the TRT-LLM deployment (see "Inference with TRT-LLM") will be available in the near future.
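
For orientation, here is a minimal, non-authoritative sketch of offline inference with vLLM. It assumes a vLLM build that already supports the Hunyuan-Large architecture (the official guide may require a patched vLLM), and the repo id and parallelism settings are placeholders; follow the repository's vLLM instructions for the supported setup.

```python
# Minimal sketch, not the official deployment script.
# Assumes a vLLM build with Hunyuan-Large support and a multi-GPU node large
# enough to host the model; repo id and tensor_parallel_size are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="tencent/Tencent-Hunyuan-Large",  # assumed checkpoint path
    trust_remote_code=True,
    tensor_parallel_size=8,                 # adjust to your GPU count
)

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(["Briefly explain what a Mixture of Experts model is."], sampling)
print(outputs[0].outputs[0].text)
```

The same checkpoint can alternatively be exposed through vLLM's OpenAI-compatible server once the architecture is supported, which is typically the more convenient path for production serving.
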
Learn more at <a href="https://github.com/Tencent/Tencent-Hunyuan-Large">Tencent-Hunyuan-Large</a>.

### Citation
|