405M_TIES-merge_pile_300B_into_german_200B_from_pile_replay25_density-0.05
405M_TIES-merge_pile_300B_into_german_200B_from_pile_replay25_density-0.05 is a merge of the following models using mergekit:
🧩 Configuration
```yaml
models:
  - model: btherien/JOB-3312838_410M_it-86245_tr-german-replay-25_scratch
    # no parameters necessary for base model
  - model: btherien/JOB-3150994_410M_it-132366_tr-pile-train_scratch
    parameters:
      density: 0.05
      weight: 1.0
merge_method: ties
base_model: btherien/JOB-3312838_410M_it-86245_tr-german-replay-25_scratch
parameters:
  normalize: true
dtype: float16
```
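To illustrate what `density: 0.05`, `weight: 1.0`, and `normalize: true` do in a TIES merge, here is a minimal pure-Python sketch on flat lists of floats. It is not mergekit's implementation; the function names `trim` and `ties_merge` and the toy values are hypothetical, and tie-breaking and rounding details are assumptions made for the example.

```python
def trim(delta, density):
    """Keep the top `density` fraction of entries by magnitude; zero the rest."""
    k = max(1, round(len(delta) * density))  # assumption: keep at least one entry
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, models, density, weights, normalize=True):
    """Toy TIES merge: trim task vectors, elect a sign, sum agreeing deltas."""
    # Task vector = model - base, trimmed to `density`, then scaled by its weight.
    deltas = []
    for model, w in zip(models, weights):
        task_vector = [m - b for m, b in zip(model, base)]
        deltas.append([w * d for d in trim(task_vector, density)])

    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        # Elect the sign from the summed weighted deltas at this position.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Keep only the contributions that agree with the elected sign.
        agreeing = [(c, w) for c, w in zip(column, weights) if c * sign > 0]
        update = sum(c for c, _ in agreeing)
        if normalize and agreeing:
            update /= sum(w for _, w in agreeing)
        merged.append(b + update)
    return merged


# Toy stand-ins for the base (German) and the merged-in (Pile) parameters.
base = [0.0] * 20
other = [0.1 * i for i in range(20)]

# With density 0.05, only the single largest delta (the last entry) survives.
merged = ties_merge(base, [other], density=0.05, weights=[1.0])
```

With a single non-base model, as in the configuration above, sign election is trivial and the merge reduces to adding the sparsified task vector to the base model.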