---
library_name: transformers
tags: []
---

<div align="center">
<a href="https://youtube.com/@devsdocode"><img alt="YouTube" src="https://img.shields.io/badge/YouTube-FF0000?style=for-the-badge&logo=youtube&logoColor=white"></a>
<a href="https://t.me/devsdocode"><img alt="Telegram" src="https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white"></a>
<a href="https://www.instagram.com/sree.shades_/"><img alt="Instagram" src="https://img.shields.io/badge/Instagram-E4405F?style=for-the-badge&logo=instagram&logoColor=white"></a>
<a href="https://www.linkedin.com/in/developer-sreejan/"><img alt="LinkedIn" src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white"></a>
<a href="https://buymeacoffee.com/devsdocode"><img alt="Buy Me A Coffee" src="https://img.shields.io/badge/Buy%20Me%20A%20Coffee-FFDD00?style=for-the-badge&logo=buymeacoffee&logoColor=black"></a>
</div>

## Crafted with ❤️ by Devs Do Code (Sree) & OEVortex (Abhay)

# Usage Code

## WebScout Local (Low RAM Usage)

```
import os

import dotenv
from webscout.Local import formats
from webscout.Local.model import Model
from webscout.Local.samplers import SamplerSettings
from webscout.Local.thread import Thread
from webscout.Local.utils import download_model

# Load variables from a local .env file, if one exists
dotenv.load_dotenv()

REPO_ID = "Vortex4ai/arif"
FILENAME = "arif-q8_0.gguf"
# Read the token from the environment (the HF_TOKEN variable name is an assumption),
# or replace the placeholder with your Hugging Face read token
HF_TOKEN = os.getenv("HF_TOKEN", "YOUR HUGGING-FACE API READ TOKEN")


def download_and_load_model() -> Model:
    """Download the model and load it into memory"""
    model_path = download_model(REPO_ID, FILENAME, HF_TOKEN)
    return Model(model_path, n_gpu_layers=20)


def create_custom_chatml_format(system_prompt: str) -> dict:
    """Create a custom ChatML format with the system prompt"""
    custom_chatml = formats.chatml.copy()
    custom_chatml['system_content'] = system_prompt
    return custom_chatml


def create_sampler_settings() -> SamplerSettings:
    """Create a sampler settings object (temperature 0.7, top_p 0.9)"""
    return SamplerSettings(temp=0.7, top_p=0.9)


def create_thread(model: Model, custom_chatml: dict, sampler: SamplerSettings) -> Thread:
    """Create a new thread with the custom format and sampler"""
    return Thread(model, custom_chatml, sampler=sampler)


def interact_with_model(thread: Thread) -> None:
    """Start interacting with the model"""
    thread.interact(header="🌟 Welcome to the Jarvis-3B Prototype by Sree and OEvortex 🚀", color=True)
    # response = thread.send("Initiate system startup")


if __name__ == "__main__":
    model = download_and_load_model()
    system_prompt = "You are Jarvis a helpful AI that will always follow user i.e. **Hatim**"
    custom_chatml = create_custom_chatml_format(system_prompt)
    sampler = create_sampler_settings()
    thread = create_thread(model, custom_chatml, sampler)
    interact_with_model(thread)
```

## Transformers (High RAM Usage)

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Let's bring in the big guns! Load the model in half precision and move it to the GPU
model = AutoModelForCausalLM.from_pretrained("Vortex4ai/Hatim", trust_remote_code=True, torch_dtype=torch.float16).to("cuda")

# We also need the matching tokenizer to translate our chats into tokens
# (torch_dtype is a model argument, so it is not passed here)
tokenizer = AutoTokenizer.from_pretrained("Vortex4ai/Hatim", trust_remote_code=True)

# TextStreamer prints tokens as they are generated, for a smooth conversation flow
streamer = TextStreamer(tokenizer)

# Now, here comes the magic! ✨ This is the basic template for our chat
prompt = """
<|im_start|>system: {system}
<|im_end|>
<|im_start|>user: {insaan}
<|im_end|>
<|im_start|>assistant:
"""

system = "You are HelpingAI a emotional AI always answer my question in HelpingAI style"

# And the insaan is curious (like you!). "insaan" means "human" in Hindi
insaan = "My best friend recently lost their parent to cancer after a long battle. They are understandably devastated and struggling with grief. What would be a caring and supportive way to respond to help them through this difficult time?"

# Combine the system and user messages into the template
prompt = prompt.format(system=system, insaan=insaan)

# Time to chat! Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to("cuda")

# Here comes the fun part! Generate a response, streaming it to stdout as it is produced
generated_text = model.generate(**inputs, max_length=3084, top_p=0.95, do_sample=True, temperature=0.6, use_cache=True, streamer=streamer)
```
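
The prompt template used in the Transformers example can be checked without loading the model. The sketch below is only an illustration: `CHAT_TEMPLATE` and `build_prompt` are hypothetical names that are not part of this repository or of any library, and the template text is copied from the example above.

```python
# Chat template copied from the Transformers example above.
CHAT_TEMPLATE = """
<|im_start|>system: {system}
<|im_end|>
<|im_start|>user: {insaan}
<|im_end|>
<|im_start|>assistant:
"""


def build_prompt(system: str, insaan: str) -> str:
    """Fill the template with a system prompt and a single user message.

    build_prompt is a hypothetical helper introduced for illustration only.
    """
    return CHAT_TEMPLATE.format(system=system, insaan=insaan)


if __name__ == "__main__":
    # Print the final prompt text to check it before tokenizing.
    print(build_prompt("You are a helpful assistant.", "Hello!"))
```

Printing the filled-in prompt is a cheap way to confirm that the special tokens and line breaks are where you expect them before spending time on generation.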

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- **Developed by:** Devs Do Code & Vortex
- **Funded by [optional]:** Devs Do Code & Vortex
- **Shared by [optional]:** Devs Do Code & Vortex
- **Model type:** GGUF
- **Language(s) (NLP):** English
- **Finetuned from model [optional]:** Jarvis Base Model (Secret)

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

#### GGUF Technical Specifications

GGUF is a model file format that builds on the earlier GGJT format, with changes aimed at extensibility and ease of use. Its main features are:

**Single-file Deployment:** A model is distributed and loaded as a single file. No external files are needed for supplementary information.

**Extensibility:** New features can be added to GGML-based executors, and new information can be added to GGUF files, without breaking compatibility with existing models.

**mmap Compatibility:** Models can be loaded with mmap, which makes loading and saving fast.

**User-Friendly:** Models can be loaded and saved with a small amount of code in any programming language, without depending on external libraries.

**Full Information:** Everything needed to load the model is stored in the file, so users do not have to supply additional data.

The key difference between GGJT and GGUF is that hyperparameters (now called metadata) are stored as key-value pairs instead of an untyped list of values. New metadata can therefore be added without breaking compatibility with existing models, and a model can carry extra information that is useful for inference or for identifying the model.
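
To make the single-file, self-describing layout more concrete, here is a minimal sketch that parses the fixed-size header at the start of a GGUF file. It assumes GGUF version 2 or later, where the header is the 4-byte magic `GGUF` followed by a little-endian `uint32` version, a `uint64` tensor count, and a `uint64` metadata key-value count. `read_gguf_header` is a hypothetical helper written for this example; it does not parse the metadata entries themselves.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 + two uint64 values


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (assumes version 2+, little-endian).

    read_gguf_header is a hypothetical helper introduced for illustration.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need at least {HEADER_SIZE} bytes, got {len(data)}")
    magic = data[:4]
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # uint32 version, uint64 tensor count, uint64 metadata key-value count
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Synthetic header for demonstration; for a real file, pass the first
    # 24 bytes read from the .gguf file opened in binary mode.
    fake_header = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
    print(read_gguf_header(fake_header))
```

The metadata key-value pairs follow this header in the file, which is how a loader learns how many entries to read without consulting any external file.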

**Quantization methods:**

| Method | Quantization | Advantages | Trade-offs |
|---|---|---|---|
| q2_k | 2-bit k-quant | Smallest file size | Largest accuracy loss of the methods listed |
| q3_k_l | 3-bit k-quant (large variant) | Best accuracy of the 3-bit variants | Largest of the 3-bit files; substantial accuracy loss |
| q3_k_m | 3-bit k-quant (medium variant) | Middle ground among the 3-bit variants | High accuracy loss |
| q3_k_s | 3-bit k-quant (small variant) | Smallest of the 3-bit files | Highest accuracy loss of the 3-bit variants |
| q4_0 | 4-bit integers (legacy format) | Small file, simple per-block scale | More accuracy loss than the 4-bit k-quants |
| q4_1 | 4-bit integers (legacy format) | Better accuracy than q4_0 | Slightly larger than q4_0 |
| q4_k_m | 4-bit k-quant (medium variant) | Good balance of size and accuracy | Larger than q4_k_s |
| q4_k_s | 4-bit k-quant (small variant) | Smaller than q4_k_m | More accuracy loss than q4_k_m |
| q5_0 | 5-bit integers (legacy format) | Better accuracy than the 4-bit legacy formats | Larger than the 4-bit files |
| q5_1 | 5-bit integers (legacy format) | Better accuracy than q5_0 | Slightly larger than q5_0 |
| q5_k_m | 5-bit k-quant (medium variant) | Very low accuracy loss | Larger than q5_k_s |
| q5_k_s | 5-bit k-quant (small variant) | Low accuracy loss, smaller than q5_k_m | Slightly more accuracy loss than q5_k_m |
| q6_k | 6-bit k-quant | Accuracy close to q8_0 in a smaller file | Large file |
| q8_0 | 8-bit integers | Smallest accuracy loss of the methods listed | Largest file of the methods listed |
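
The idea behind block quantization can be sketched in a few lines. This simplified example follows the general shape of q8_0: weights are grouped into blocks of 32, and each block is stored as signed 8-bit integers plus one scale. It is an illustration under stated assumptions, not the ggml implementation; real q8_0 stores the scale as a 16-bit float, and the function names here are hypothetical.

```python
BLOCK_SIZE = 32  # q8_0 groups weights into blocks of 32


def quantize_block(values):
    """Quantize one block to signed 8-bit integers plus a single scale.

    Simplified sketch: the scale stays a Python float here, whereas
    real q8_0 stores it as a 16-bit float.
    """
    amax = max(abs(v) for v in values)
    # Map the largest magnitude in the block to 127; avoid dividing by zero.
    scale = amax / 127 if amax > 0 else 1.0
    quants = [max(-127, min(127, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale, quants):
    """Recover approximate weights from the scale and the integers."""
    return [scale * q for q in quants]


if __name__ == "__main__":
    block = [((i * 37) % 64 - 32) / 10 for i in range(BLOCK_SIZE)]
    scale, quants = quantize_block(block)
    restored = dequantize_block(scale, quants)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"scale={scale:.6f}, worst absolute error={worst:.6f}")
```

Because each value is rounded to the nearest multiple of the scale, the error per weight is at most half the scale. Lower-bit methods use the same idea with fewer integer levels per block, which is why their files are smaller and their accuracy loss is larger.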