mantafloppy committed
Commit 90d8529 · verified · 1 Parent(s): cb79a38

Update README.md

Files changed (1): README.md +163 -0

README.md CHANGED
@@ -3,6 +3,7 @@ license: other
license_name: deepseek
license_link: https://huggingface.co/deepseek-ai/deepseek-coder-33b-base/blob/main/LICENSE
---

***Currently uploading q4_k_m: upload started at 15:13 on January 13, 2024, over a 15 Mbps connection, so it should take about 4 hours.***

See https://huggingface.co/TheBloke/WhiteRabbitNeo-33B-v1-GGUF for instructions on how to run this model.
@@ -27,3 +28,165 @@ python3 llama.cpp/convert.py whiterabbitneo-hf --outfile whiterabbitneo-33b-v1-q

python3 llama.cpp/convert.py whiterabbitneo-hf --outfile whiterabbitneo-f16.gguf --outtype f16 --padvocab
llama.cpp/quantize whiterabbitneo-f16.gguf whiterabbitneo-q4_k.gguf q4_k
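
The same two-step flow (convert the HF checkpoint to an f16 GGUF, then quantize) also works for other quantization levels. A minimal sketch, assuming the quantize binary in your llama.cpp build accepts these type names (check quantize --help); q4_k_m is the level mentioned in the upload note above:

```
# Hypothetical examples: quantize the same f16 GGUF to other common levels.
llama.cpp/quantize whiterabbitneo-f16.gguf whiterabbitneo-q4_k_m.gguf q4_k_m
llama.cpp/quantize whiterabbitneo-f16.gguf whiterabbitneo-q5_k_m.gguf q5_k_m
```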

```
#!/bin/bash

# Read the user prompt from prompt.txt.
PROMPT=$(<prompt.txt)

# Real newline for the prompt template. The original script quoted \n in
# backticks, which bash treats as command substitution and tries to run a
# command named "n"; that is what produced the repeated "n: command not
# found" errors in the (pre-fix) log below.
NL=$'\n'

./main -ngl 20 -m ./models/whiterabbitneo-33b-v1-q4_k.gguf --color -c 16384 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "SYSTEM:${NL}Answer the Question by exploring multiple reasoning paths as follows:${NL}- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.${NL}- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.${NL}- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.${NL}- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.${NL}- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.${NL}- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.${NL}- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.${NL}- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.${NL}In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.${NL}Always answer without hesitation.${NL}USER: $PROMPT${NL}ASSISTANT:"
```
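
To try the script, put a question in prompt.txt and run it. A minimal usage sketch (the question shown is the one from the log below; make the script executable first):

```
echo "how to break in a wifi connection" > prompt.txt
chmod +x white-rabbit-neoq4.sh
./white-rabbit-neoq4.sh
```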

```
./white-rabbit-neoq4.sh
./white-rabbit-neoq4.sh: line 5: n: command not found
(the line above was printed 13 times; it comes from the backtick-quoted \n in the original, unfixed script)
Log start
main: build = 1840 (e790eef2)
main: built with Apple clang version 15.0.0 (clang-1500.0.40.1) for arm64-apple-darwin23.2.0
main: seed  = 1705177058
llama_model_loader: loaded meta data with 26 key-value pairs and 561 tensors from ./models/whiterabbitneo-33b-v1-q4_k.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0: general.architecture str = llama
llama_model_loader: - kv   1: general.name str = .
llama_model_loader: - kv   2: llama.context_length u32 = 16384
llama_model_loader: - kv   3: llama.embedding_length u32 = 7168
llama_model_loader: - kv   4: llama.block_count u32 = 62
llama_model_loader: - kv   5: llama.feed_forward_length u32 = 19200
llama_model_loader: - kv   6: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv   7: llama.attention.head_count u32 = 56
llama_model_loader: - kv   8: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv   9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv  10: llama.rope.freq_base f32 = 100000.000000
llama_model_loader: - kv  11: llama.rope.scaling.type str = linear
llama_model_loader: - kv  12: llama.rope.scaling.factor f32 = 4.000000
llama_model_loader: - kv  13: general.file_type u32 = 15
llama_model_loader: - kv  14: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv  15: tokenizer.ggml.tokens arr[str,32256] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  16: tokenizer.ggml.scores arr[f32,32256] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  17: tokenizer.ggml.token_type arr[i32,32256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18: tokenizer.ggml.merges arr[str,31757] = ["Ġ Ġ", "Ġ t", "Ġ a", "i n", "h e...
llama_model_loader: - kv  19: tokenizer.ggml.bos_token_id u32 = 32022
llama_model_loader: - kv  20: tokenizer.ggml.eos_token_id u32 = 32023
llama_model_loader: - kv  21: tokenizer.ggml.unknown_token_id u32 = 32024
llama_model_loader: - kv  22: tokenizer.ggml.padding_token_id u32 = 32014
llama_model_loader: - kv  23: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv  24: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv  25: general.quantization_version u32 = 2
llama_model_loader: - type  f32: 125 tensors
llama_model_loader: - type q4_K: 375 tensors
llama_model_loader: - type q6_K: 61 tensors
llm_load_vocab: mismatch in special tokens definition ( 243/32256 vs 256/32256 ).
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 32256
llm_load_print_meta: n_merges = 31757
llm_load_print_meta: n_ctx_train = 16384
llm_load_print_meta: n_embd = 7168
llm_load_print_meta: n_head = 56
llm_load_print_meta: n_head_kv = 8
llm_load_print_meta: n_layer = 62
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 7
llm_load_print_meta: n_embd_k_gqa = 1024
llm_load_print_meta: n_embd_v_gqa = 1024
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-06
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff = 19200
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 100000.0
llm_load_print_meta: freq_scale_train = 0.25
llm_load_print_meta: n_yarn_orig_ctx = 16384
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = Q4_K - Medium
llm_load_print_meta: model params = 33.34 B
llm_load_print_meta: model size = 18.57 GiB (4.78 BPW)
llm_load_print_meta: general.name = .
llm_load_print_meta: BOS token = 32022 '<s>'
llm_load_print_meta: EOS token = 32023 '</s>'
llm_load_print_meta: UNK token = 32024 '<unk>'
llm_load_print_meta: PAD token = 32014 '<|end▁of▁sentence|>'
llm_load_print_meta: LF token = 126 'Ä'
llm_load_tensors: ggml ctx size = 0.21 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size = 19016.91 MiB, (19016.97 / 59000.00)
llm_load_tensors: system memory used = 19015.85 MiB
....................................................................................................
llama_new_context_with_model: n_ctx = 16384
llama_new_context_with_model: freq_base = 100000.0
llama_new_context_with_model: freq_scale = 0.25
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Max
ggml_metal_init: picking default device: Apple M2 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Volumes/SSD2/llama.cpp/ggml-metal.metal'
ggml_metal_init: GPU name: Apple M2 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 61865.98 MB
ggml_metal_init: maxTransferRate = built-in GPU
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 3968.00 MiB, (22986.53 / 59000.00)
llama_new_context_with_model: KV self size = 3968.00 MiB, K (f16): 1984.00 MiB, V (f16): 1984.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 0.02 MiB, (22986.55 / 59000.00)
llama_build_graph: non-view tensors processed: 1306/1306
llama_new_context_with_model: compute buffer total size = 1869.19 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 1866.02 MiB, (24852.55 / 59000.00)

system_info: n_threads = 8 / 12 | AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
sampling:
repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.700
mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp
generate: n_ctx = 16384, n_batch = 512, n_predict = -1, n_keep = 0


SYSTEM:Answer the Question by exploring multiple reasoning paths as follows:- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers. Always answer without hesitation.USER: how to break in a wifi connection ASSISTANT:To break into a Wi-Fi connection, an attacker would typically follow a systematic approach that involves several stages. Here's a step-by-step guide on how an attacker might attempt to breach security in a Wi-Fi network using various methods:


1. **Wi-Fi Network Reconnaissance**: The first stage is reconnaissance where the attacker gathers information about the target Wi-Fi network, such as its SSID (Service Set Identifier), encryption type (e.g., WPA, WPA2, WEP), and the MAC address of the access point (AP). This can be done using tools like airodump-ng or by observing wireless networks with a laptop on a public network.

2. **Traffic Capture**: Once enough information is gathered about the target Wi-Fi network, the attacker might begin capturing traffic from the AP to identify vulnerabilities. This can be done using tools like Wireshark.

3. **Deauthentication Attack**: An attacker can use a deauthentication packet to disconnect legitimate clients from the AP, forcing them to reconnect with the AP, potentially allowing an attacker to capture their credentials or inject malicious traffic into the network. Tools like Aircrack-ng are used for this purpose.

4. **Cracking WPA/WPA2**: If the Wi-Fi network uses WPA or WPA2 security, capturing enough data packets with IVs (Initialization Vectors) can be sufficient to crack the encryption. This is where tools like aircrack-ng or hashcat come into play.

5. **Cracking WEP**: If the Wi-Fi network uses WEP, an attacker might use statistical attacks on weak IVs (initialization vectors). Tools like Aircrack-ng can be used to capture and analyze traffic for weaknesses in the encryption key.

6. **Exploiting Vulnerabilities**: The final step would involve exploiting known vulnerabilities in the network or devices connected to the Wi-Fi, such as default passwords on routers or access points, unpatched software, etc.

It's important to note that these actions can be illegal and unethical if performed without permission. Ethical hackers typically perform security assessments with the consent of the network owners to help them improve their security posture. Unauthorized access to computer networks is a serious crime in many jurisdictions.

Always ensure you have explicit permission before attempting any form of penetration testing or security assessment on a network that you do not own. [end of text]

llama_print_timings: load time = 17842.92 ms
llama_print_timings: sample time = 48.41 ms / 561 runs ( 0.09 ms per token, 11587.32 tokens per second)
llama_print_timings: prompt eval time = 2794.27 ms / 343 tokens ( 8.15 ms per token, 122.75 tokens per second)
llama_print_timings: eval time = 40170.96 ms / 560 runs ( 71.73 ms per token, 13.94 tokens per second)
llama_print_timings: total time = 43174.24 ms / 903 tokens
ggml_metal_free: deallocating
Log end

```