---
license: apache-2.0
datasets:
- GainEnergy/SMoE-Training
- GainEnergy/reasoner
- GainEnergy/ogai-8x7B
- GainEnergy/oilandgas-engineering-dataset
- GainEnergy/ogdataset
- GainEnergy/upstrimacentral
- open-r1/OpenR1-Math-220k
- unsloth/LaTeX_OCR
base_model: mistralai/Mathstral-7B-v0.1
tags:
- oil-gas
- drilling-engineering
- mathstral-7b
- lora
- fine-tuned
- energy-ai
- pragmatic-ai
- gguf
- text-generation-inference
- text-generation
model-index:
- name: OGAI-STEM-7B
  results:
  - task:
      type: text-generation
      name: Engineering AI for Oil & Gas
    dataset:
      name: GainEnergy Oil & Gas Corpus
      type: custom
    metrics:
    - name: Engineering Calculations Accuracy
      type: accuracy
      value: 94.5
    - name: Scientific Computation Precision
      type: precision
      value: 92.3
    - name: Context Retention
      type: contextual-coherence
      value: High
variants:
- name: OGAI-STEM-7B-GGUF
  pipeline_tag: text-generation
  repo_name: GainEnergy/OGAI-STEM-7B-GGUF
library_name: transformers
language:
- en
widget:
- text: >-
    User: What is the pressure drop in a horizontal pipeline for crude oil transport?

    AI:
  example_title: Pipeline Pressure Drop Calculation
- text: >-
    User: Explain the differences between gas lift and electric submersible pumps in artificial lift.

    AI:
  example_title: Artificial Lift Methods
- text: >-
    User: How do you calculate mud weight for deepwater drilling?

    AI:
  example_title: Mud Weight Calculation
- text: >-
    User: Describe the steps to optimize wellbore stability in unconventional reservoirs.

    AI:
  example_title: Wellbore Stability Optimization
pipeline_tag: text-generation
---

# OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations

![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--STEM--7B-orange)
[![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)

## Model Description

**OGAI-STEM-7B** is a **LoRA fine-tuned Mathstral-7B model** designed specifically for **oil and gas engineering, scientific computing, and technical problem-solving**. It is optimized for numerical accuracy, complex engineering calculations, and technical document understanding.

The model is an integral part of **GainEnergy's Upstrima AI Platform**, enhancing workflows with **pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis**.

## Technical Architecture

### Base Model Specifications
- **Architecture**: Mathstral-7B (Mistral 7B fine-tuned for advanced mathematical reasoning)
- **Parameters**: 7B
- **Context Length**: 32,768 tokens for long-form scientific queries
- **Mathematical Precision**: Enhanced for oil & gas engineering computations

### Fine-tuning Approach
- **Method**: Low-Rank Adaptation (LoRA) with rank 64
- **Training Dataset**: 3.2M datapoints from specialized oil & gas engineering sources
- **Hardware**: Trained on 8x NVIDIA A100 80GB GPUs
- **Training Time**: 2,200 GPU hours
- **Special Features**: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations

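
To give a sense of what rank 64 means in practice, the rough sketch below counts the trainable parameters LoRA would add. The layer dimensions are those of the public Mistral 7B architecture (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). Adapting only the four attention projections is an assumption; this card does not state which modules were adapted.

```python
# Rough LoRA parameter count. Dimensions follow the public Mistral 7B config;
# the target modules below are an assumption, not a documented training setting.
RANK = 64          # LoRA rank stated in this card
NUM_LAYERS = 32    # Mistral 7B decoder layers
HIDDEN = 4096      # hidden size
KV_DIM = 8 * 128   # 8 key/value heads x head dimension 128

# (input features, output features) of each adapted projection
TARGET_MODULES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int = RANK) -> int:
    """LoRA adds A (rank x in) and B (out x rank) next to each frozen weight."""
    return rank * (in_features + out_features)


def total_lora_params(rank: int = RANK) -> int:
    """Sum the adapter parameters over all target modules and layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in TARGET_MODULES.values())
    return per_layer * NUM_LAYERS


trainable = total_lora_params()
print(f"Trainable LoRA parameters: {trainable:,}")
print(f"Share of a 7B base model: {trainable / 7e9:.2%}")
```

Under these assumptions the adapter stays below one percent of the base model's parameter count, which is why LoRA keeps fine-tuning memory requirements modest.
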
### Performance Optimizations
- **Quantization**: 4-bit and 8-bit versions optimized for low-memory inference
- **Inference Speed**: Tuned KV cache management for real-time engineering computations
- **Memory Footprint**: Runs efficiently on **12GB VRAM** with 4-bit quantization
- **Reduced Hallucinations**: Domain-specific fine-tuning is intended to reduce incorrect scientific results

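
The memory figures above can be sanity-checked with a back-of-the-envelope estimate. The sketch below is only that: it assumes the public Mistral 7B layout (32 layers, 8 key/value heads, head dimension 128), an fp16 KV cache, and full attention over the whole context, and it ignores activation and framework overhead.

```python
# Back-of-the-envelope memory estimate. Architecture values follow the public
# Mistral 7B config and are assumptions here; real usage adds activation and
# framework overhead on top of these figures.
NUM_PARAMS = 7e9     # "7B" as stated in this card
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128
GIB = 1024 ** 3


def weight_bytes(num_params: float, bits: int) -> float:
    """Memory needed to store the weights at a given bit width."""
    return num_params * bits / 8


def kv_cache_bytes(num_tokens: int, bytes_per_value: int = 2) -> int:
    """Keys and values (factor 2) for every layer, KV head, and token."""
    return 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_value * num_tokens


for bits in (4, 8, 16):
    print(f"{bits:>2}-bit weights: {weight_bytes(NUM_PARAMS, bits) / GIB:.1f} GiB")

print(f"fp16 KV cache at 32,768 tokens: {kv_cache_bytes(32_768) / GIB:.1f} GiB")
```

With these inputs, 4-bit weights plus a full-context cache come to a little over 7 GiB, which is consistent with the 12GB VRAM figure once runtime overhead is added.
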
## Deployment-Optimized Versions

| **Version** | **Memory Requirement** | **Performance** |
|------------|----------------------|----------------|
| [OGAI-STEM-7B-GGUF](https://huggingface.co/GainEnergy/OGAI-STEM-7B-GGUF) | CPU optimized | Suitable for edge computing |

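### Local Deployment with vLLM
```bash
# Start vLLM's OpenAI-compatible API server.
# --tensor-parallel-size 2 shards the model across two GPUs; adjust it to your hardware.
python -m vllm.entrypoints.openai.api_server \
    --model GainEnergy/ogai-stem-7b \
    --tensor-parallel-size 2
```
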
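
Once the server is running, it can be queried over HTTP. The following is a minimal client sketch using only the standard library. It assumes vLLM's default address (`localhost:8000`) and the OpenAI-style `/v1/completions` endpoint; the helper names `build_completion_payload` and `complete` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumed default: the server listens on localhost:8000 unless --host/--port are set.
BASE_URL = "http://localhost:8000/v1"
MODEL = "GainEnergy/ogai-stem-7b"  # must match the --model value passed to the server


def build_completion_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Request body for the OpenAI-style /v1/completions endpoint."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding for repeatable numeric answers
    }


def complete(prompt: str, base_url: str = BASE_URL) -> str:
    """POST the prompt to the running server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(build_completion_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server up, `complete("How do you calculate mud weight for deepwater drilling?")` should return the generated answer as a string.
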
## How to Use

### Run Inference in Python
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GainEnergy/ogai-stem-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate."
# Move the inputs to wherever the model was placed instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

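
The widget examples in this card's metadata use a `User: ... AI:` layout. If that layout matters for answer quality (an assumption; the card does not say so), a small helper can build prompts in that shape and strip the echoed prompt from the decoded output. The function names and the blank-line separator are illustrative.

```python
def build_prompt(question: str) -> str:
    """Format a question in the 'User: ... AI:' layout used by the widget examples."""
    return f"User: {question.strip()}\n\nAI:"


def extract_answer(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt that a causal LM returns ahead of its completion."""
    if decoded.startswith(prompt):
        decoded = decoded[len(prompt):]
    return decoded.strip()


prompt = build_prompt("How do you calculate mud weight for deepwater drilling?")
print(prompt)
```

The string returned by `build_prompt` can be passed to the tokenizer in place of a bare question, and `extract_answer` applied to the decoded output.
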
## Citing OGAI-STEM-7B
```
@article{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil \& Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```