---
base_model: unsloth/gemma-2-27b-it-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- trl
- sft
---

# Model Specifications

- **Max Sequence Length**: Trained at 16384 tokens (via RoPE scaling)
- **Data Type**: Auto detection, with options for Float16 and Bfloat16
- **Quantization**: 4-bit, to reduce memory usage

## Training Data

Fine-tuned on a private dataset of hundreds of technical tutorials paired with their summaries.

## Implementation Highlights

- **Efficiency**: 4-bit quantization reduces memory usage and download size.
- **Adaptability**: Auto detection of data types, plus support for advanced configuration options such as RoPE scaling, LoRA, and gradient checkpointing.

# Uploaded Model

- **Developed by:** ndebuhr
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-2-27b-it-bnb-4bit

# Configuration and Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

input_text = ""  # paste the raw tutorial transcript here

# Set device based on CUDA availability
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and tokenizer
model_name = "ndebuhr/Gemma-2-27B-Technical-Tutorial-Summarization-QLoRA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # auto-detect Float16/Bfloat16
).to(device)

instruction = "Clarify and summarize this tutorial transcript"
prompt = """{}

### Raw Transcript:
{}

### Summary:
"""

# Tokenize the input text, truncating to the 16384-token training context
inputs = tokenizer(
    prompt.format(instruction, input_text),
    return_tensors="pt",
    truncation=True,
    max_length=16384
).to(device)

# Generate the summary; max_new_tokens caps the summary length so a long
# prompt cannot exhaust the generation budget (adjust as needed)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    num_return_sequences=1,
    use_cache=True
)

# Decode the generated text (batch_decode returns a list; take the first item)
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
```

## Compute Infrastructure

* Fine-tuning: 1x A100 (40 GB)
* Inference: 1x L4 (24 GB) recommended

This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
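
## Fine-Tuning Sketch

As an illustration of the Unsloth-based QLoRA setup described above, the sketch below shows how a comparable fine-tune can be configured. The LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`) are illustrative assumptions, not the exact values used to train this model.

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model; dtype=None auto-detects Float16/Bfloat16
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-27b-it-bnb-4bit",
    max_seq_length=16384,  # extended context via RoPE scaling
    dtype=None,
    load_in_4bit=True,
)

# Attach LoRA adapters (hyperparameters here are illustrative only)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",  # trade compute for activation memory
)
```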
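
## Quantized Inference Sketch

To fit inference on the recommended 1x L4 (24 GB), the model can be loaded with an explicit 4-bit quantization config. The NF4/bfloat16 settings below are common defaults and an assumption, not a documented requirement of this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "ndebuhr/Gemma-2-27B-Technical-Tutorial-Summarization-QLoRA"

# Illustrative 4-bit loading config: NF4 quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s) automatically
)
```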