TheBloke committed
Commit 3cc4ec2 · 1 Parent(s): 7c00623

Upload README.md

Files changed (1):
  README.md +13 -6
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 base_model: Nexusflow/NexusRaven-V2-13B
 inference: false
-license: llama2
+license: other
 model-index:
 - name: NexusRaven-13B
   results: []
@@ -98,8 +98,15 @@ User Query: {prompt}<human_end>
 ```
 
 <!-- prompt-template end -->
+<!-- licensing start -->
+## Licensing
 
+The creator of the source model has listed its license as `other`, and this quantization has therefore used that same license.
 
+As this model is based on Llama 2, it is also subject to the Meta Llama 2 license terms, and the license files for that are additionally included. It should therefore be considered as being claimed to be licensed under both licenses. I contacted Hugging Face for clarification on dual licensing, but they do not yet have an official position. Should this change, or should Meta provide any feedback on this situation, I will update this section accordingly.
+
+In the meantime, any questions regarding licensing, and in particular how these two licenses might interact, should be directed to the original model repository: [Nexusflow's NexusRaven V2 13B](https://huggingface.co/Nexusflow/NexusRaven-V2-13B).
+<!-- licensing end -->
 <!-- compatibility_gguf start -->
 ## Compatibility
 
@@ -212,12 +219,12 @@ Windows Command Line users: You can set the environment variable by running `set
 Make sure you are using `llama.cpp` from commit [d0cee0d](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 35 -m nexusraven-v2-13b.Q4_K_M.gguf --color -c 16384 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Function:\ndef function_here(arg1):\n """\n Comments explaining the function here\n\n Args:\n list args\n\n Returns:\n list returns\n """\n\nFunction:\ndef another_function_here(arg1):\n ...\n\nUser Query: {prompt}<human_end>"
+./main -ngl 35 -m nexusraven-v2-13b.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Function:\ndef function_here(arg1):\n """\n Comments explaining the function here\n\n Args:\n list args\n\n Returns:\n list returns\n """\n\nFunction:\ndef another_function_here(arg1):\n ...\n\nUser Query: {prompt}<human_end>"
 ```
 
 Change `-ngl 35` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
 
-Change `-c 16384` to the desired sequence length. For extended sequence models - eg 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. Note that longer sequence lengths require much more resources, so you may need to reduce this value.
+Change `-c 2048` to the desired sequence length. For extended sequence models - eg 8K, 16K, 32K - the necessary RoPE scaling parameters are read from the GGUF file and set by llama.cpp automatically. Note that longer sequence lengths require much more resources, so you may need to reduce this value.
 
 If you want to have a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
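The `-p` string above is the NexusRaven-V2 function-calling template collapsed onto one line. As a minimal illustrative sketch (the `get_weather` function, the `build_prompt` helper, and the query are hypothetical, not part of this card), the same template can be assembled programmatically:

```python
# Hypothetical helper: build the "Function: ... User Query: ...<human_end>"
# prompt that the -p argument above encodes, from Python function definitions.
import inspect

def get_weather(city: str):
    """Returns the current weather for the given city."""

def build_prompt(user_query, functions):
    parts = []
    for fn in functions:
        signature = f"def {fn.__name__}{inspect.signature(fn)}:"
        docstring = inspect.getdoc(fn) or ""
        parts.append(f'Function:\n{signature}\n    """\n    {docstring}\n    """\n')
    # Per the template, the user query is terminated with <human_end>
    return "\n".join(parts) + f"\nUser Query: {user_query}<human_end>"

prompt = build_prompt("What's the weather like in Paris?", [get_weather])
print(prompt)
```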
@@ -266,7 +273,7 @@ from llama_cpp import Llama
 # Set gpu_layers to the number of layers to offload to GPU. Set to 0 if no GPU acceleration is available on your system.
 llm = Llama(
   model_path="./nexusraven-v2-13b.Q4_K_M.gguf",  # Download the model file first
-  n_ctx=16384,  # The max sequence length to use - note that longer sequence lengths require much more resources
+  n_ctx=2048,  # The max sequence length to use - note that longer sequence lengths require much more resources
   n_threads=8,  # The number of CPU threads to use, tailor to your system and the resulting performance
   n_gpu_layers=35  # The number of layers to offload to GPU, if you have GPU acceleration available
 )
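As a usage sketch for the `llm` object constructed above (the call pattern is standard llama-cpp-python; the function definition and query are illustrative placeholders):

```python
# Illustrative completion call; the prompt follows the template shown earlier.
prompt = (
    "Function:\n"
    "def get_weather(city: str):\n"
    '    """\n'
    "    Returns the current weather for the given city.\n"
    '    """\n'
    "\n"
    "User Query: What's the weather like in Paris?<human_end>"
)

output = llm(
    prompt,
    max_tokens=512,      # cap on the number of tokens to generate
    temperature=0.001,   # very low temperature, per the prompting guide below
    echo=False,          # return only the completion, not the prompt
)
print(output["choices"][0]["text"])
```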
@@ -377,7 +384,7 @@ NexusRaven-V2 is capable of generating deeply nested function calls, parallel fu
 
 ### Quick Start Prompting Guide
 
-Please refer to our notebook, [How-To-Prompt.ipynb](How-To-Prompt.ipynb), for more advanced tutorials on using NexusRaven-V2!
+Please refer to our notebook, [How-To-Prompt.ipynb](https://colab.research.google.com/drive/19JYixRPPlanmW5q49WYi_tU8rhHeCEKW?usp=sharing), for more advanced tutorials on using NexusRaven-V2!
 
 1. We strongly recommend setting sampling to False when prompting NexusRaven-V2.
 2. We strongly recommend a very low temperature (~0.001).
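For the unquantized upstream checkpoint, recommendations 1 and 2 above amount to greedy decoding. A hedged sketch with the Transformers pipeline (these exact settings are an assumption, not taken from this card; with sampling disabled, the very low temperature of recommendation 2 is effectively implied):

```python
# Sketch: "sampling to False" expressed with transformers, against the
# unquantized upstream model (for GGUF files, use llama.cpp as shown above).
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Nexusflow/NexusRaven-V2-13B",
    device_map="auto",
)

prompt = "Function:\ndef get_weather(city: str):\n    ...\n\nUser Query: What's the weather like in Paris?<human_end>"

result = pipe(
    prompt,
    max_new_tokens=512,
    do_sample=False,         # recommendation 1: greedy decoding, no sampling
    return_full_text=False,  # return only the generated function call
)
print(result[0]["generated_text"])
```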
@@ -468,7 +475,7 @@ For a deeper dive into the results, please see our [Github README](https://githu
 3. The explanations generated by NexusRaven-V2 might be incorrect. Please ensure proper guardrails are present to capture errant behavior.
 
 ## License
-This model was trained on commercially viable data and is licensed under the [Llama 2 community license](https://huggingface.co/codellama/CodeLlama-13b-hf/blob/main/LICENSE) following the original [CodeLlama-13b-hf](https://huggingface.co/codellama/CodeLlama-13b-hf/) model.
+This model was trained on commercially viable data and is licensed under the [Nexusflow community license](https://huggingface.co/Nexusflow/NexusRaven-V2-13B/blob/main/LICENSE.txt).
 
 
 ## References
 