Fine-Tuned model for threat and intrusion detection rules generation
This model is a fine-tune of Mistral-7B-Instruct-v0.2, via Knowledge Distillation of 0dAI-7.5B. The fine-tuning was conducted using a curated corpus of 950 cybersecurity rules from SIGMA, YARA, and Suricata repositories for threat and intrusion detection.
Instruct the model to craft a SIGMA rule for detecting potentially malicious commands such as msfvenom
and netcat
in Audit system logs, or a Suricata rule to spot SSH brute-force attacks, or even a YARA rule to identify obfuscated strings in files — and watch the magic happen! Automate the creation of rules in your cybersecurity systems with this model.
For an in-depth understanding of how this model has been fine-tuned, refer to the associated paper here: [available soon].
Key Features
- Fine-tuned on a corpus of cybersecurity threat and intrusion detection rules.
- Expert in generating YARA, Suricata, and SIGMA rules.
- Based on Mistral-7B-Instruct-v0.2, with a 32K context window.
Quantization
You can easily quantize your model for local use on your computer with the help of the llama.cpp
or ollama
libraries. This process converts your model into a format that is optimized for performance, particularly useful for deployment on devices with limited computational resources.
To perform this quantization using the llama.cpp
library (link to llama.cpp), follow the steps below:
Step 1: Convert Vocabulary
First, convert your model's vocabulary to a format suitable for quantization. Use the following command, replacing /path/to/
with the actual path to your model files:
python convert.py /path/to/Mistral-7B-cybersecurity-rules \
--vocab-only \
--outfile /path/to/Mistral-7B-cybersecurity-rules/tokenizer.model \
--vocab-type bpe
This command extracts and converts the vocabulary using the byte pair encoding (BPE) method, saving it to a new file.
Step 2: Prepare Model for Quantization
Next, prepare the model for quantization by converting it to a half-precision floating-point format (FP16). This step reduces the model size and prepares it for the final quantization to 8-bit integers. Execute the following command:
python convert.py \
--outtype f16 \
--vocab-type bpe \ # Add this line only if you encounter issues with the vocabulary type
--outfile /path/to/Mistral-7B-cybersecurity-rules/ggml-model-f16.gguf
This command outputs a file that has been converted to FP16, which is an intermediary step before applying 8-bit quantization.
Step 3: Quantize to 8-bits
Finally, apply 8-bit quantization to the FP16 model file. This step significantly reduces the model's memory footprint, making it suitable for deployment in resource-constrained environments:
quantize /path/to/Mistral-7B-cybersecurity-rules/ggml-model-f16.gguf \
/path/to/Mistral-7B-cybersecurity-rules/mistral-7b-rules-q8_0.gguf \
q8_0
Here, the quantize
command converts the FP16 model into an 8-bit quantized model, further compressing the model while retaining its capability to perform its tasks effectively.
License
This repository is licensed under the Apache License, Version 2.0. You can obtain a copy of the license at Apache License 2.0.
Warranty Disclaimer
This software is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
Changes
This model has been fine-tuned based on the original Mistral-7B-Instruct-v0.2. Significant modifications were made to train it on a cybersecurity corpus for threat and intrusion detection.
- Downloads last month
- 51