mhiroaki-pfn committed on
Commit 1548fd0 · verified · 1 Parent(s): 51f1bcc

Correct README about architecture

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -13,7 +13,7 @@ library_name: transformers
 ## Model Description
 PLaMo 2 1B is a 1B model pre-trained on English and Japanese datasets, developed by Preferred Elements, Inc.
 
- PLaMo 2 models adapt the [Samba](https://arxiv.org/abs/2406.07522) architecture rather than the Transformer architecture. Samba integrates [Mamba](https://arxiv.org/abs/2312.00752), a selective State Space Model (SSM), with sliding window attention, combining their strengths for improved efficiency and performance.
+ PLaMo 2 models adopt a hybrid architecture similar to [Samba](https://arxiv.org/abs/2406.07522), rather than the standard Transformer architecture. Samba integrates [Mamba](https://arxiv.org/abs/2312.00752), a selective State Space Model (SSM), with sliding window attention, combining their strengths for improved efficiency and performance. The major differences between Samba and PLaMo 2 are 1) additional normalization layers to improve training stability, and 2) use of the Mamba2 kernel for computational efficiency.
 
 PLaMo 2 1B is released under Apache License version 2.0.
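
The hybrid design described above — Mamba (SSM) blocks interleaved with sliding window attention — can be sketched structurally. This is a minimal illustrative sketch only: the `build_hybrid_stack` function, layer names, and the 1:1 interleaving ratio are hypothetical and do not reflect PLaMo 2's actual layer configuration.

```python
def build_hybrid_stack(num_layers: int, swa_every: int = 2) -> list[str]:
    """Return the per-layer block type for a toy Samba-like hybrid model.

    Every `swa_every`-th layer is a sliding-window-attention (SWA) block;
    the remaining layers are selective SSM (Mamba) blocks. The ratio here
    is an assumption for illustration, not PLaMo 2's real architecture.
    """
    layers = []
    for i in range(num_layers):
        kind = "swa" if (i + 1) % swa_every == 0 else "mamba"
        layers.append(kind)
    return layers

# An 8-layer toy stack alternates Mamba and SWA blocks.
stack = build_hybrid_stack(8)
print(stack)
```

The point of the interleaving is that the SSM blocks handle long-range sequence mixing in linear time, while the attention blocks recover precise local token interactions within the window.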