remove part about long context modifications
This was probably copied from an old model card; this model has a default context length of 131k without YaRN.
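The 131k figure is straightforward to verify against the repo's `config.json`. Below is a minimal sketch assuming the standard transformers config layout; the repo id is a placeholder, not taken from this PR:

```python
# Sketch: check the model's default context window and RoPE scaling setting.
# "Qwen/model-name" is a placeholder; substitute the actual model repository.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Qwen/model-name")

# A value of 131072 corresponds to the "131k default" noted above.
print(config.max_position_embeddings)

# None here means no YaRN/rope scaling is applied out of the box.
print(getattr(config, "rope_scaling", None))
```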
README.md
CHANGED
@@ -106,19 +106,6 @@ To achieve optimal performance, we recommend the following settings:
 - **Math Problems**: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
 - **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: "Please show your choice in the `answer` field with only the choice letter, e.g., `\"answer\": \"C\"`." in the prompt.
 
-5. **Handle Long Inputs**: For inputs exceeding 32,768 tokens, enable [YaRN](https://arxiv.org/abs/2309.00071) to improve the model's ability to capture long-sequence information effectively.
-
-For supported frameworks, you could add the following to `config.json` to enable YaRN:
-```json
-{
-    ...,
-    "rope_scaling": {
-        "factor": 4.0,
-        "original_max_position_embeddings": 32768,
-        "type": "yarn"
-    }
-}
-```
 
 For deployment, we recommend using vLLM. Please refer to our [Documentation](https://qwen.readthedocs.io/en/latest/deployment/vllm.html) for usage if you are not familiar with vLLM.
 Presently, vLLM only supports static YaRN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**.
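Note that the removed snippet's scaling works out to 4.0 × 32768 = 131072 tokens, i.e. exactly the 131k window the model already ships with, which is why the YaRN block is redundant here. For the vLLM deployment path the README still recommends, here is a minimal offline-inference sketch using vLLM's Python API; the repo id is again a placeholder, and `max_model_len` only caps KV-cache allocation:

```python
# Sketch: offline inference with vLLM; no rope_scaling override is needed
# when the full 131k window is already the model's default.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/model-name", max_model_len=131072)  # placeholder id

params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=512)
outputs = llm.generate(["Summarize the YaRN paper in two sentences."], params)
print(outputs[0].outputs[0].text)
```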
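The prompt-formatting guidance kept in the hunk's context lines also translates directly into a request. A sketch for the math case, assuming an OpenAI-compatible endpoint such as one launched with `vllm serve`; the base URL and model name are placeholders:

```python
# Sketch: apply the README's recommended math-prompt suffix through an
# OpenAI-compatible endpoint; base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

suffix = "Please reason step by step, and put your final answer within \\boxed{}."
resp = client.chat.completions.create(
    model="Qwen/model-name",
    messages=[{"role": "user", "content": f"What is 12 * 34? {suffix}"}],
)
print(resp.choices[0].message.content)
```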