---
license: apache-2.0
datasets:
- GainEnergy/SMoE-Training
- GainEnergy/reasoner
- GainEnergy/ogai-8x7B
- GainEnergy/oilandgas-engineering-dataset
- GainEnergy/ogdataset
- GainEnergy/upstrimacentral
- open-r1/OpenR1-Math-220k
- unsloth/LaTeX_OCR
base_model: mistralai/Mathstral-7B-v0.1
tags:
- oil-gas
- drilling-engineering
- mathstral-7b
- lora
- fine-tuned
- energy-ai
- pragmatic-ai
- gguf
- text-generation-inference
- text-generation
model-index:
- name: OGAI-STEM-7B
  results:
  - task:
      type: text-generation
      name: Engineering AI for Oil & Gas
    dataset:
      name: GainEnergy Oil & Gas Corpus
      type: custom
    metrics:
    - name: Engineering Calculations Accuracy
      type: accuracy
      value: 94.5
    - name: Scientific Computation Precision
      type: precision
      value: 92.3
    - name: Context Retention
      type: contextual-coherence
      value: High
variants:
- name: OGAI-STEM-7B-GGUF
pipeline_tag: text-generation
repo_name: GainEnergy/OGAI-STEM-7B-GGUF
library_name: transformers
language:
- en
widget:
- text: >-
    User: What is the pressure drop in a horizontal pipeline for crude oil transport?
    AI:
  example_title: Pipeline Pressure Drop Calculation
- text: >-
    User: Explain the differences between gas lift and electric submersible pumps in artificial lift.
    AI:
  example_title: Artificial Lift Methods
- text: >-
    User: How do you calculate mud weight for deepwater drilling?
    AI:
  example_title: Mud Weight Calculation
- text: >-
    User: Describe the steps to optimize wellbore stability in unconventional reservoirs.
    AI:
  example_title: Wellbore Stability Optimization
pipeline_tag: text-generation
---
# OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations
![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--STEM--7B-blue)
[![License](https://img.shields.io/github/license/huggingface/transformers.svg)](LICENSE)
## Model Description
**OGAI-STEM-7B** is Mathstral-7B fine-tuned with LoRA for **oil and gas engineering, scientific computing, and technical problem-solving**. The fine-tuning targets numerical accuracy, multi-step engineering calculations, and technical document understanding.
The model is part of **GainEnergy's Upstrima AI Platform**, where it supports **pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis**.
## Technical Architecture
### Base Model Specifications
- **Architecture**: Mathstral-7B (Mistral fine-tuned for advanced math reasoning)
- **Parameters**: 7B
- **Context Length**: 32,768 tokens for long-form scientific queries
- **Mathematical Precision**: Enhanced for oil & gas engineering computations
### Fine-tuning Approach
- **Method**: Low-Rank Adaptation (LoRA) with rank 64
- **Training Dataset**: 3.2M datapoints from specialized oil & gas engineering sources
- **Hardware**: Trained on 8x NVIDIA A100 80GB GPUs
- **Training Time**: 2,200 GPU hours
- **Special Features**: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations
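To give a sense of what rank 64 means in practice, the sketch below counts the trainable parameters that LoRA adapters of that rank would add. The card does not say which modules were adapted, so the choice of the four attention projections, and the Mistral-7B layer shapes used here, are assumptions made for illustration only.

```python
# Rough count of trainable parameters added by rank-64 LoRA adapters.
# ASSUMPTIONS (not stated in this card): adapters sit on the four attention
# projections only, and the layer shapes match the standard Mistral-7B config.

RANK = 64          # LoRA rank, taken from this model card
NUM_LAYERS = 32    # assumed: number of transformer layers
HIDDEN = 4096      # assumed: hidden size
KV_DIM = 1024      # assumed: 8 KV heads x 128 head dim (grouped-query attention)

# (input_features, output_features) of each adapted projection (assumed targets)
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA learns B @ A with A of shape (rank, d_in) and B of shape (d_out, rank)."""
    return rank * (d_in + d_out)


def total_lora_params(rank: int = RANK) -> int:
    """Sum the adapter parameters over all assumed target modules and layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


trainable = total_lora_params()
print(f"{trainable:,} trainable LoRA parameters")  # 54,525,952
print(f"{trainable / 7e9:.2%} of a 7B-parameter base model")
```

Under these assumptions the adapters hold roughly 55M parameters, well under 1% of the base model. Adapting the MLP projections as well would raise that figure.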
### Performance Optimizations
- **Quantization**: 4-bit and 8-bit versions optimized for low-memory inference
- **Inference Speed**: Tuned KV cache management for real-time engineering computations
- **Memory Footprint**: Runs efficiently on **12GB VRAM** with 4-bit quantization
- **Reduced Hallucinations**: Domain-specific fine-tuning is intended to reduce incorrect scientific results; verify safety-critical calculations independently
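The 12GB figure can be sanity-checked with a back-of-the-envelope estimate. The sketch below adds the size of 4-bit weights to the size of a 16-bit KV cache at the full 32,768-token context. The layer count, KV head count, and head dimension are assumed to match the standard Mistral-7B configuration, and the estimate ignores quantization metadata, activations, and framework overhead.

```python
# Back-of-the-envelope VRAM estimate for 4-bit inference.
# ASSUMPTIONS: Mistral-7B shapes (32 layers, 8 KV heads, head dim 128) and a
# 16-bit KV cache. Quantization scales, activations and runtime overhead
# are not counted, so real usage will be higher.

GIB = 1024 ** 3


def weight_bytes(num_params: float, bits_per_weight: int) -> float:
    """Storage needed for the model weights at a given bit width."""
    return num_params * bits_per_weight / 8


def kv_cache_bytes(seq_len: int, num_layers: int = 32, num_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """KV cache size for one sequence; the factor 2 covers keys and values."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * seq_len


weights_4bit = weight_bytes(7e9, 4)
kv_full_context = kv_cache_bytes(32_768)

print(f"4-bit weights:             {weights_4bit / GIB:.2f} GiB")
print(f"KV cache at 32,768 tokens: {kv_full_context / GIB:.2f} GiB")
print(f"Estimated total:           {(weights_4bit + kv_full_context) / GIB:.2f} GiB")
```

With these numbers the weights take about 3.3 GiB and a full-context cache about 4 GiB, which leaves headroom on a 12GB card for the overhead the estimate leaves out.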
## Deployment-Optimized Versions
| **Version** | **Target Hardware** | **Intended Use** |
|------------|----------------------|----------------|
| [OGAI-STEM-7B-GGUF](https://huggingface.co/GainEnergy/OGAI-STEM-7B-GGUF) | CPU optimized | Suitable for edge computing |
### Local Deployment with vLLM
```bash
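# Start an OpenAI-compatible server (listens on port 8000 by default).
# --tensor-parallel-size 2 shards the model across two GPUs; omit it on a single GPU.
python -m vllm.entrypoints.openai.api_server \
    --model GainEnergy/ogai-stem-7b \
    --tensor-parallel-size 2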
```
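Once the server is running, it can be queried over its OpenAI-compatible completions endpoint. The following is a minimal client sketch using only the standard library. It assumes the server is reachable at `http://localhost:8000` (the vLLM default) and that it was started with the model name shown above; the helper names `build_completion_request` and `complete` are illustrative, not part of any library.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"   # assumed: vLLM default host and port
MODEL = "GainEnergy/ogai-stem-7b"       # must match the --model value of the server


def build_completion_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding, a sensible default for calculations
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 256) -> str:
    """Send the prompt to the running server and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt, max_tokens)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Calling `complete("How do you calculate mud weight for deepwater drilling?")` returns the generated text as a string. The call raises `urllib.error.URLError` if the server is not running.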
## How to Use
### Run Inference in Python
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GainEnergy/ogai-stem-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" needs the accelerate package and picks the device(s) for you
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate."
# Send inputs to the model's device instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
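The widget examples in this card's metadata phrase prompts as `User: ...` followed by `AI:`. The helpers below are a small sketch for building prompts in that layout and for trimming the echoed prompt from the decoded output. Whether the model was trained on exactly this template is not stated here, and the newline separator is an assumption; `format_prompt` and `extract_answer` are hypothetical names.

```python
def format_prompt(question: str) -> str:
    """Wrap a question in the 'User: ... AI:' layout used by the widget examples.

    The newline between the two turns is an assumption, not a documented template.
    """
    return f"User: {question.strip()}\nAI:"


def extract_answer(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt from decoded output, which normally starts with it."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Fall back to the full text if decoding changed the prompt's whitespace
    return generated_text.strip()


prompt = format_prompt("How do you calculate mud weight for deepwater drilling?")
print(prompt)
```

Pass `prompt` to the tokenizer in place of a raw question, then call `extract_answer(decoded_text, prompt)` on the decoded string.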
## Citing OGAI-STEM-7B
```
@misc{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil \& Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```