Model Information

EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT

License Version

This is a state-of-the-art language model optimized for neutrality, STEM proficiency, and ethical alignment. Fine-tuned Deepseek-R1-distill-llama-8b-unsloth-bnb-4bit for science, chemistry, and mathematics with reduced cultural/political bias. This large language model is open source. This is supervised fine tuned with medical chain of thought


Table of Contents


Features

  • Neutral Worldview: Minimizes political/cultural bias via globally diverse training data and human feedback.
  • STEM Specialization: Enhanced performance in:
    • Chemistry: Reaction mechanisms, periodic trends, spectroscopy.
    • Mathematics: Equation solving, proofs, calculus.
    • General Science: Hypothesis generation, research summarization.
  • Ethical Guardrails: Filters sensitive content and flags uncertain outputs.

Installation

pip install transformers torch
pip install accelerate
pip install -U transformers

Basic Inference


from transformers import AutoTokenizer, AutoModelForCausalLM  

tokenizer = AutoTokenizer.from_pretrained("EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT")  
model = AutoModelForCausalLM.from_pretrained("EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT")  

prompt = "Calculate the molar mass of sulfuric acid (H₂SO₄)."  
inputs = tokenizer(prompt, return_tensors="pt")  
outputs = model.generate(**inputs, max_length=200)  
print(tokenizer.decode(outputs[0], skip_special_tokens=True))


##advance inference
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT")

# Load the model in 8-bit precision using bitsandbytes (requires a CUDA GPU)
model = AutoModelForCausalLM.from_pretrained(
    "EpistemeAI/Fireball-R1-Llama-3.1-8B",
    load_in_8bit=True,      # Enable 8-bit loading to reduce memory usage
    device_map="auto"       # Automatically map model layers to the available device(s)
)

# Define the system prompt and the user prompt
system_prompt = "You are a highly knowledgeable assistant with expertise in chemistry and physics. <think>"
user_prompt = "Calculate the molar mass of sulfuric acid (H₂SO₄)."

# Combine the system prompt with the user prompt. The format here follows a common convention for chat-like interactions.
full_prompt = f"System: {system_prompt}\nUser: {user_prompt}\nAssistant:"

# Tokenize the combined prompt and move the inputs to the GPU
inputs = tokenizer(full_prompt, return_tensors="pt").to("cuda")

# Generate output text from the model
outputs = model.generate(**inputs, max_length=12200)

# Decode and print the result, skipping special tokens
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)

Uploaded model

  • Developed by: EpistemeAI
  • License: apache-2.0
  • Finetuned from model : unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit

Ethical Considerations

Do Not Use For:

  • legal advice without expert oversight.
  • Generating partisan or culturally insensitive content.

Limitations:

  • May occasionally produce plausible but incorrect scientific explanations.
  • Not fully immune to subtle biases.

Thank you

We appreciate the companies as following: Unsloth, Meta and Deepseek.

License

This model is licensed under [apache-2.0] - see LICENSE for details.

Uploaded model

  • Developed by: EpistemeAI
  • License: apache-2.0
  • Finetuned from model : EpistemeAI/Fireball-R1-Llama-3.1-8B

This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month
73
Safetensors
Model size
8.03B params
Tensor type
FP16
·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT

Finetuned
(3)
this model
Quantizations
9 models

Dataset used to train EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT

Space using EpistemeAI/Fireball-R1-Llama-3.1-8B-Medical-COT 1