metadata

license: apache-2.0
datasets:
  - GainEnergy/SMoE-Training
  - GainEnergy/reasoner
  - GainEnergy/ogai-8x7B
  - GainEnergy/oilandgas-engineering-dataset
  - GainEnergy/ogdataset
  - GainEnergy/upstrimacentral
  - open-r1/OpenR1-Math-220k
  - unsloth/LaTeX_OCR
base_model: mistralai/Mathstral-7B-v0.1
tags:
  - oil-gas
  - drilling-engineering
  - mathstral-7b
  - lora
  - fine-tuned
  - energy-ai
  - pragmatic-ai
  - gguf
  - text-generation-inference
  - text-generation
model-index:
  - name: OGAI-STEM-7B
    results:
      - task:
          type: text-generation
          name: Engineering AI for Oil & Gas
        dataset:
          name: GainEnergy Oil & Gas Corpus
          type: custom
        metrics:
          - name: Engineering Calculations Accuracy
            type: accuracy
            value: 94.5
          - name: Scientific Computation Precision
            type: precision
            value: 92.3
          - name: Context Retention
            type: contextual-coherence
            value: High
    variants:
      - name: OGAI-STEM-7B-GGUF
        pipeline_tag: text-generation
        repo_name: GainEnergy/OGAI-STEM-7B-GGUF
library_name: transformers
language:
  - en
widget:
  - text: >-
      User: What is the pressure drop in a horizontal pipeline for crude oil
      transport?

      AI:
    example_title: Pipeline Pressure Drop Calculation
  - text: >-
      User: Explain the differences between gas lift and electric submersible
      pumps in artificial lift.

      AI:
    example_title: Artificial Lift Methods
  - text: |-
      User: How do you calculate mud weight for deepwater drilling?
      AI:
    example_title: Mud Weight Calculation
  - text: >-
      User: Describe the steps to optimize wellbore stability in unconventional
      reservoirs.

      AI:
    example_title: Wellbore Stability Optimization
pipeline_tag: text-generation

OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations

Model Description

OGAI-STEM-7B is a LoRA fine-tuned Mathstral-7B model, designed specifically for oil and gas engineering, scientific computing, and technical problem-solving. It is optimized for numerical accuracy, complex engineering calculations, and technical document understanding.

The model is an integral part of GainEnergy's Upstrima AI Platform, enhancing workflows with pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis.

Technical Architecture

Base Model Specifications

Architecture: Mathstral-7B (Mistral fine-tuned for advanced math reasoning)
Parameters: 7B
Context Length: 32,768 tokens for long-form scientific queries
Mathematical Precision: Enhanced for oil & gas engineering computations

Fine-tuning Approach

Method: Low-Rank Adaptation (LoRA) with rank 64
Training Dataset: 3.2M datapoints from specialized oil & gas engineering sources
Hardware: Trained on 8x NVIDIA A100 80GB GPUs
Training Time: 2,200 GPU hours
Special Features: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations
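
As a rough illustration of what rank 64 means for the trainable-parameter count, the sketch below counts the weights LoRA adds to a set of projection matrices. The layer dimensions are those of the published Mistral-7B architecture (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). The choice of target modules (`q_proj` and `v_proj`) is an assumption made for illustration only, since this card does not state which modules were adapted.

```python
# Count the extra trainable parameters that LoRA adds at a given rank.
# Assumption: adapters on q_proj and v_proj only; the card does not list target modules.

HIDDEN_SIZE = 4096   # Mistral-7B hidden size
NUM_LAYERS = 32      # Mistral-7B transformer layers
KV_DIM = 8 * 128     # 8 key/value heads x head dimension 128 (grouped-query attention)

# (input features, output features) of each adapted projection (hypothetical selection)
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
}


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """LoRA learns B @ A in place of a full update: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


def lora_total_params(rank: int) -> int:
    """Sum the adapter weights over all target matrices and all layers."""
    per_layer = sum(
        lora_params_per_matrix(d_in, d_out, rank)
        for d_in, d_out in TARGET_SHAPES.values()
    )
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    total = lora_total_params(rank=64)
    print(f"Trainable LoRA parameters at rank 64: {total:,}")
```

Under these assumptions the adapters amount to well under 1% of the 7B base weights. Adapting more modules (for example the MLP projections) would raise the count proportionally.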

Performance Optimizations

Quantization: 4-bit and 8-bit versions optimized for low-memory inference
Inference Speed: Tuned KV cache management for real-time engineering computations
Memory Footprint: Runs efficiently on 12GB VRAM with 4-bit quantization
Reduced Hallucinations: Domain-specific fine-tuning is intended to reduce incorrect scientific results
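
The 12GB figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight memory at several bit widths and the fp16 KV cache size at a given context length. It assumes a nominal 7 billion parameters and the Mistral-7B attention layout (32 layers, 8 key/value heads, head dimension 128). It ignores activations, quantization metadata, and framework overhead, so real usage will be somewhat higher.

```python
# Rough memory estimate for a 7B model: quantized weights plus fp16 KV cache.
# All constants are assumptions based on the public Mistral-7B architecture.

GIB = 1024 ** 3

NUM_PARAMS = 7_000_000_000  # nominal "7B"; the exact count is slightly higher
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128


def weight_bytes(bits_per_param: int, num_params: int = NUM_PARAMS) -> float:
    """Memory for the weights alone at the given precision."""
    return num_params * bits_per_param / 8


def kv_cache_bytes(num_tokens: int, bytes_per_value: int = 2) -> int:
    """KV cache size for one sequence; the factor 2 covers keys and values."""
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_value * num_tokens


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_bytes(bits) / GIB:.1f} GiB")
    print(f"KV cache at 32,768 tokens (fp16): {kv_cache_bytes(32_768) / GIB:.1f} GiB")
```

With these numbers, 4-bit weights plus a full 32,768-token cache for a single sequence stay below 12 GiB, which is consistent with the stated footprint.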

Deployment-Optimized Versions

Version	Memory Requirement	Performance
OGAI-STEM-7B-GGUF	CPU optimized	Suitable for edge computing

Local Deployment with vLLM

python -m vllm.entrypoints.openai.api_server \
  --model GainEnergy/ogai-stem-7b \
  --tensor-parallel-size 2
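
Once the server is running, it exposes an OpenAI-compatible HTTP API. The sketch below sends a completion request using only the Python standard library. It assumes vLLM's default address (`http://localhost:8000`) and the `/v1/completions` route; the helper names `build_completion_request` and `complete` are hypothetical, and the sampling defaults are illustrative rather than recommended settings.

```python
import json
import urllib.request

# Assumption: the server listens on vLLM's default host and port.
BASE_URL = "http://localhost:8000/v1"
MODEL = "GainEnergy/ogai-stem-7b"


def build_completion_request(
    prompt: str, max_tokens: int = 256, temperature: float = 0.0
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Example call (requires the server to be running):
# print(complete("User: How do you calculate mud weight for deepwater drilling?\nAI:"))
```

Separating request construction from the network call makes the payload easy to inspect or log before anything is sent.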

How to Use

Run Inference in Python

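from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; device_map="auto" (requires the accelerate package)
# places the weights on whatever device(s) are available.
model_name = "GainEnergy/ogai-stem-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate."
# Move the inputs to the device the model was placed on instead of hard-coding "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# The decoded text includes the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))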
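
The widget examples in this card's metadata wrap each question in a `User: ... AI:` template. Whether the model was fine-tuned on exactly this template is not stated; assuming it was, a small helper keeps prompts consistent and strips the echoed prompt from the decoded output. The function names `build_prompt` and `extract_answer` are hypothetical.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the User/AI template used by the widget examples."""
    return f"User: {question.strip()}\nAI:"


def extract_answer(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt from decoded generate() output, if present."""
    if decoded.startswith(prompt):
        decoded = decoded[len(prompt):]
    return decoded.strip()


if __name__ == "__main__":
    prompt = build_prompt("How do you calculate mud weight for deepwater drilling?")
    print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of a raw question, and `extract_answer` can be applied to the result of `tokenizer.decode`.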

Citing OGAI-STEM-7B

@article{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil \& Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}