Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms for LLMs
Overview
This project implements an Evolution Learning Network (ELN) to fine-tune transformer-based models like LLaMA using a combination of Quantized Low-Rank Adaptation (QLoRA) and Genetic Algorithms (GA). The primary objective is to evolve a population of models across multiple generations to optimize for performance (fitness) and specialization, while maintaining diversity.
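The generational loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's training code: `evolve`, `toy_fitness`, and `toy_mutate` are hypothetical names, individuals are short lists of floats standing in for LoRA adapter weights, and the per-generation fine-tuning step is omitted. Selection keeps the fitter half of the population and refills it with mutated copies.

```python
import random


def evolve(population, fitness_fn, mutate_fn, generations=10, rng=None):
    """Toy generational loop: score, keep the fitter half, refill with mutants."""
    rng = rng or random.Random(0)
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        scored = sorted(population, key=fitness_fn, reverse=True)
        history.append(fitness_fn(scored[0]))
        survivors = scored[: max(1, len(scored) // 2)]
        children = [
            mutate_fn(rng.choice(survivors), rng)
            for _ in range(len(population) - len(survivors))
        ]
        population = survivors + children
    return population, history


def toy_fitness(individual):
    # Hypothetical stand-in for model evaluation: closer to zero is fitter.
    return -sum(x * x for x in individual)


def toy_mutate(individual, rng):
    # Hypothetical stand-in for weight mutation: small Gaussian perturbation.
    return [x + rng.gauss(0.0, 0.02) for x in individual]


seed_rng = random.Random(0)
initial = [[seed_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
final_population, history = evolve(initial, toy_fitness, toy_mutate, generations=10)
```

Because the survivors are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next.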
Key Features
- Efficient model fine-tuning using QLoRA.
- Evolutionary strategies, including random mutations and fitness-based selection.
- Hardware-efficient training with 4-bit quantization.
- Comprehensive experiment tracking with WandB.
- Diversity maintenance through LoRA weight fingerprinting.
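The card does not define how LoRA weights are fingerprinted, so the following is only one plausible reading. It assumes a fingerprint is the mean and standard deviation of each adapter tensor, and that population diversity is the mean pairwise Euclidean distance between fingerprints. The function names and the dict-of-lists weight format are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean, pstdev


def fingerprint(weights):
    """Summarise each named weight list by its mean and population std."""
    fp = []
    for name in sorted(weights):  # sorted so every member uses the same order
        values = weights[name]
        fp.extend([fmean(values), pstdev(values)])
    return fp


def population_diversity(fingerprints):
    """Mean pairwise Euclidean distance; 0.0 for fewer than two members."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return fmean(math.dist(a, b) for a, b in pairs)


member_a = {"q_proj": [0.10, 0.20, 0.30], "v_proj": [0.00, 0.10, 0.20]}
member_b = {"q_proj": [0.10, 0.20, 0.31], "v_proj": [0.00, 0.10, 0.20]}
diversity = population_diversity([fingerprint(member_a), fingerprint(member_b)])
```

A metric of this kind stays small when members differ only by light mutation noise, which would be consistent with diversity values around 0.001.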
Model Details
Base Model
- Name: meta-llama/Llama-3.2-1B (can be replaced with any Hugging Face model).
- Architecture: Transformer-based causal language model.
Quantization Configuration
- Quantization Type: 4-bit using bitsandbytes (bnb_4bit).
- Parameters:
  - Compute Type: torch.float16
  - Quantization Type: "nf4" (4-bit NormalFloat).
  - Double (nested) Quantization: Enabled.
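These settings map onto a `BitsAndBytesConfig` from Transformers roughly as follows. This is a sketch of the usual QLoRA setup, not the project's exact script. The `load_quantized_base` helper is a hypothetical name, `device_map="auto"` is an assumption that requires the `accelerate` package, and calling the helper needs a CUDA GPU, `bitsandbytes`, and access to the gated Llama repository.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with float16 compute and double (nested) quantization.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)


def load_quantized_base(model_name="meta-llama/Llama-3.2-1B"):
    """Hypothetical helper: load the base model with the 4-bit config above."""
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",  # assumption: let accelerate place the layers
    )
```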
LoRA (Low-Rank Adaptation)
- Dimensions (r): 8
- Alpha (Scaling): 16
- Target Modules: Query and Value projections (q_proj, v_proj).
- Dropout: 0.05
- Task Type: Causal Language Modeling (CAUSAL_LM).
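Expressed with PEFT, the adapter settings above would look roughly like this. It is a configuration sketch only. With `r=8` and `lora_alpha=16`, the low-rank update is scaled by `lora_alpha / r = 2.0`.

```python
from peft import LoraConfig

# LoRA adapter on the attention query and value projections.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling numerator: alpha / r = 2.0
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```

A typical next step, assuming a quantized base model is already loaded, is `peft.prepare_model_for_kbit_training(model)` followed by `peft.get_peft_model(model, lora_config)`.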
Training Strategy
- Optimizer: paged_adamw_8bit for memory-efficient updates.
- Precision: Mixed precision (fp16) for faster training.
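With the Hugging Face `Trainer`, these two choices are usually set through `TrainingArguments`. The sketch below assumes the `Trainer` API is used, which the card does not state. `make_training_args` and the `output_dir` value are hypothetical, and `fp16=True` is only accepted when a CUDA device is available.

```python
from transformers import TrainingArguments


def make_training_args(output_dir="eln-member"):
    """Hypothetical helper: paged 8-bit AdamW with fp16 mixed precision."""
    return TrainingArguments(
        output_dir=output_dir,       # hypothetical path for checkpoints
        optim="paged_adamw_8bit",    # requires bitsandbytes
        fp16=True,                   # requires a CUDA device
    )
```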
Hyperparameters
General Parameters
- Generations: 10
- Population Size: 4
- Dataset Size: 2000 samples per split (adjustable for larger datasets).
Training
- Batch Size: 8
- Gradient Accumulation: 16 steps.
- Learning Rate: 2e-4
- Epochs per Model: 2
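To make these numbers concrete, the arithmetic below works out the effective batch size and an approximate optimizer step count for one model. It assumes a single device and the 2000-sample training split listed under General Parameters. The exact step count depends on how the training loop treats the final partial accumulation window, so the ceiling used here is an approximation.

```python
import math


def optimizer_steps(num_samples, batch_size, grad_accum, epochs, num_devices=1):
    """Return (effective batch size, approximate total optimizer steps)."""
    effective_batch = batch_size * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return effective_batch, steps_per_epoch * epochs


# Batch size 8 with 16 accumulation steps, 2 epochs over 2000 samples.
effective_batch, total_steps = optimizer_steps(2000, 8, 16, 2)
print(effective_batch, total_steps)
```

Under these assumptions each model sees an effective batch of 128 sequences and takes about 32 optimizer steps per generation.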
Mutations
- Mutation Rate: 10% (probability per parameter).
- Mutation Scale: Noise added with a standard deviation of 0.02.
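One way to read these two settings: each parameter is perturbed independently with probability 0.10, and a perturbed parameter receives zero-mean Gaussian noise with standard deviation 0.02. The sketch below applies that rule to a flat list of floats standing in for a LoRA tensor. The `mutate` function is hypothetical, and the zero mean is an assumption.

```python
import random


def mutate(weights, rate=0.10, scale=0.02, rng=None):
    """Return a copy where each weight gets Gaussian noise with probability `rate`."""
    rng = rng or random.Random()
    return [
        w + rng.gauss(0.0, scale) if rng.random() < rate else w
        for w in weights
    ]


rng = random.Random(0)
parent = [0.0] * 10_000
child = mutate(parent, rng=rng)
changed_fraction = sum(c != p for c, p in zip(child, parent)) / len(parent)
```

On real adapters the same rule would typically be applied tensor by tensor, for example with a Bernoulli mask multiplied by Gaussian noise.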
Dataset Details
Source
- Name: WikiText (wikitext-2-raw-v1 for larger datasets).
- Splits:
  - train → Model training.
  - validation → General task evaluation.
  - test → Specific task evaluation.
Tokenization
- Tokenizer: Hugging Face AutoTokenizer.
- Max Token Length: 128 tokens.
- Padding: Fixed to "max_length".
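The effect of fixed-length padding can be shown without loading a tokenizer. The sketch below mimics what a call such as `tokenizer(text, padding="max_length", truncation=True, max_length=128)` produces for a single sequence. Truncation is an assumption, since the card only states the maximum length, and `pad_or_truncate` with `pad_id=0` is a hypothetical stand-in. Llama tokenizers often ship without a pad token, so a common workaround is to reuse the end-of-sequence token for padding.

```python
def pad_or_truncate(token_ids, max_length=128, pad_id=0):
    """Cut or right-pad a token id list to exactly `max_length` entries."""
    ids = list(token_ids[:max_length])
    n_pad = max_length - len(ids)
    attention_mask = [1] * len(ids) + [0] * n_pad  # 0 marks padding positions
    return ids + [pad_id] * n_pad, attention_mask


short_ids, short_mask = pad_or_truncate([5, 6, 7])
long_ids, long_mask = pad_or_truncate(list(range(1, 301)))
```

Every example then has the same shape, which simplifies batching at the cost of computing over padding positions for short texts.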
Results
Summary
- Total Generations: 10
- Best Fitness Achieved: 0.4772
- Final Population Diversity: 0.0011
Evolution History (Highlights)
Generation | Best Fitness | Avg Fitness | Diversity | Best Specialization |
---|---|---|---|---|
1 | 0.4096 | 0.4023 | 0.00097 | 0.9967 |
5 | 0.4727 | 0.4722 | 0.00099 | 0.9968 |
10 | 0.4772 | 0.4768 | 0.00106 | 0.9972 |
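A quick calculation on the highlighted rows helps put the numbers in context. The values below are copied from the table. The relative gain and the gap between best and average fitness are derived quantities, not metrics reported by the project.

```python
# Rows copied from the Evolution History table: generation -> (best, average).
history = {1: (0.4096, 0.4023), 5: (0.4727, 0.4722), 10: (0.4772, 0.4768)}


def relative_gain(start, end):
    """Fractional improvement from `start` to `end`."""
    return (end - start) / start


best_gain = relative_gain(history[1][0], history[10][0])
gaps = {gen: best - avg for gen, (best, avg) in history.items()}

print(f"best fitness gain, generation 1 to 10: {best_gain:.1%}")
for gen, gap in gaps.items():
    print(f"generation {gen}: best minus average = {gap:.4f}")
```

Best fitness rises by roughly 16.5% over the run, with most of the gain arriving by generation 5. The gap between best and average fitness narrows from about 0.007 to under 0.001, which suggests the population members converge toward similar quality.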
Hardware & Framework
Hardware
- Multi-GPU support with torch.nn.parallel.DistributedDataParallel or Accelerator.
- Logs GPU/CPU usage with psutil and torch.cuda.
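A resource snapshot along these lines could be collected as follows. This is a minimal sketch. The `resource_snapshot` function and its key names are hypothetical, and the card does not say which metrics the project actually records.

```python
import psutil
import torch


def resource_snapshot():
    """Collect CPU, RAM, and (if available) per-GPU memory usage in a dict."""
    snapshot = {
        "cpu_percent": psutil.cpu_percent(interval=None),
        "ram_percent": psutil.virtual_memory().percent,
    }
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            # Bytes converted to MiB for readability.
            snapshot[f"gpu{i}_allocated_mb"] = torch.cuda.memory_allocated(i) / 2**20
            snapshot[f"gpu{i}_reserved_mb"] = torch.cuda.memory_reserved(i) / 2**20
    return snapshot


snapshot = resource_snapshot()
```

The resulting dict can be passed to `wandb.log` alongside the fitness metrics for each generation.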
Frameworks & Libraries
- Transformers: Hugging Face model and tokenizer handling.
- Datasets: Data loading and processing.
- WandB: Experiment tracking and visualization.
- BitsAndBytes: 4-bit quantization.
- PEFT: LoRA-based fine-tuning.
Future Work
- Explore larger population sizes and more generations for enhanced diversity.
- Experiment with other datasets to generalize findings.
- Integrate additional mutation strategies for broader exploration.
Citation
Remaining
Code to run locally
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the ELN LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-AOC-CAIN")
Framework versions
- PEFT 0.14.0