nbroad committed
Commit 5479a1e · verified · 1 parent: f28e641

remove part about long context modifications


This was probably copied from an old model card: this model has a default context length of 131k without YaRN.
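That claim is easy to check against the shipped config. A minimal sketch with transformers, assuming the usual `config.json` layout; the repo id below is a placeholder, since the commit page does not name the model:

```python
from transformers import AutoConfig

# Placeholder repo id -- substitute the repository this README belongs to.
config = AutoConfig.from_pretrained("Qwen/QwQ-32B")

# With no "rope_scaling" entry in config.json, the native window is whatever
# max_position_embeddings declares (131072, i.e. the "131k" mentioned above,
# if the commit description is right).
print(config.max_position_embeddings)
print(getattr(config, "rope_scaling", None))  # expected: None
```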

Files changed (1)
README.md: +0 -13
README.md CHANGED
@@ -106,19 +106,6 @@ To achieve optimal performance, we recommend the following settings:
  - **Math Problems**: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
  - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: "Please show your choice in the `answer` field with only the choice letter, e.g., `\"answer\": \"C\"`."
 
- 5. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable [YaRN](https://arxiv.org/abs/2309.00071) to improve the model's ability to capture long-sequence information effectively.
-
- For supported frameworks, you could add the following to `config.json` to enable YaRN:
- ```json
- {
-     ...,
-     "rope_scaling": {
-         "factor": 4.0,
-         "original_max_position_embeddings": 32768,
-         "type": "yarn"
-     }
- }
- ```
 
  For deployment, we recommend using vLLM. Please refer to our [Documentation](https://qwen.readthedocs.io/en/latest/deployment/vllm.html) for usage if you are not familiar with vLLM.
  Presently, vLLM only supports static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**.
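For reference, the static YaRN override that the removed snippet wrote into `config.json` can also be supplied at launch time. Below is a minimal sketch using vLLM's `rope_scaling` engine argument; the repo id is a placeholder and the exact `rope_scaling` key names assume a recent vLLM/transformers convention, neither of which is taken from this commit:

```python
from vllm import LLM

# Sketch of the static YaRN setting discussed above, passed as an engine
# argument instead of being edited into config.json. Repo id is a placeholder.
llm = LLM(
    model="Qwen/QwQ-32B",
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
    max_model_len=131072,  # 4.0 x 32768
)
```

Because this is the static variant described in the note above, the scaling factor applies even to short prompts, which is exactly the caveat about performance on shorter texts.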
 