---
license: apache-2.0
---

# XGen-7B-8K-Base

Official research release for the family of **XGen** models (`7B`) by Salesforce AI Research:

*Title*: [Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length](https://blog.salesforceairesearch.com/xgen/)

## Models

### Base models
* [XGen-7B-4K-Base](https://huggingface.co/Salesforce/xgen-7b-4k-base): XGen-7B model pre-trained with a 4K sequence length.
  * License: Apache-2.0
* [XGen-7B-8K-Base](https://huggingface.co/Salesforce/xgen-7b-8k-base): XGen-7B model pre-trained with an 8K sequence length.
  * License: Apache-2.0

### Instruction-finetuned models

Model finetuned with supervision on public-domain instructional data. Released for ***research purposes*** only.

* [XGen-7B-8K-Inst](https://huggingface.co/Salesforce/xgen-7b-8k-inst)

## How to run

The training data for the models is tokenized with the OpenAI Tiktoken library.
To use this model, install the package via `pip`:

```sh
pip install tiktoken
```
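Before loading the tokenizer, it can help to confirm that the package is importable in the active environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and is not part of XGen or Tiktoken.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be imported in the current environment."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    if not is_installed("tiktoken"):
        print("tiktoken is missing; run `pip install tiktoken` first.")
```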

The models can be used as auto-regressive samplers as follows:

```python
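import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The tokenizer ships as custom code on the Hub, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
# Load the weights in bfloat16 to reduce memory use relative to float32.
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
# Tokenize the prompt; max_length caps prompt plus generated tokens at 128.
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
# Decode the first (and only) generated sequence back to text.
print(tokenizer.decode(sample[0]))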
```
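In `generate`, `max_length` counts the prompt tokens together with the newly generated ones, so long prompts leave less room for output. The helper below is a minimal sketch of budgeting new tokens against the context window. It assumes that "8K" means 8192 tokens, and the name `new_token_budget` is hypothetical rather than part of `transformers` or XGen.

```python
def new_token_budget(prompt_tokens: int, requested: int, context_len: int = 8192) -> int:
    """Clamp the requested number of new tokens to what still fits in the context.

    context_len defaults to 8192, an assumed value for the 8K sequence length.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    # Room left after the prompt; never negative, never more than requested.
    return max(0, min(requested, context_len - prompt_tokens))
```

With the objects from the example above, the result could be passed as `max_new_tokens`, for instance `model.generate(**inputs, max_new_tokens=new_token_budget(inputs["input_ids"].shape[1], 128))`.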

## Citation

```bibtex
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Hiroaki Hayashi and Tian Xie and Congying Xia and Bo Pang and Rui Meng and Wojciech Kryscinski and Lifu Tu and Meghana Bhat and Semih Yavuz and Chen Xing and Jesse Vig and Lidiya Murakhovs'ka and Jason Wu and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={Salesforce AI Research Blog},
  year={2023},
  url={https://blog.salesforceairesearch.com/xgen}
}
```