The tokenizer is different from Cohere's, and the chat template is ChatML. Fully fine-tuned at 128K+ context on a synthetic dataset of ~30M entries (web-crawl input, GPT-4-32k / GPT-3.5-16k output), for 2 epochs.
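
As a quick illustration of the ChatML format, here is a minimal sketch that builds a prompt with transformers' `apply_chat_template`, assuming the repo's tokenizer config ships the ChatML template described above:

```python
# Minimal sketch: formatting a ChatML prompt for this model.
# Assumes the tokenizer config in the repo carries the ChatML chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CausalLM/35b-beta2ep")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the training setup of this model."},
]

# ChatML-style output:
# <|im_start|>system ... <|im_end|>
# <|im_start|>user ... <|im_end|>
# <|im_start|>assistant
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```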

For another candidate version, trained for 1 epoch, see https://huggingface.co./CausalLM/35b-beta - somehow it shows less overfitting?

No LoRAs, no quants, no tricks.

This one is not "very 128K"; use https://huggingface.co./CausalLM/35b-beta-long for long-context work. It is, however, better at general tasks, knowledge, coding, and so on.

And, merge them if you want!
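
If you want to try a merge, here is a minimal sketch of a naive 50/50 linear parameter average in plain PyTorch, assuming both checkpoints share the same architecture and parameter names; the 0.5 weight is an arbitrary illustrative choice, and a dedicated tool such as mergekit is the more robust route in practice:

```python
# Minimal sketch: naive linear merge of the 2-epoch and 1-epoch checkpoints.
# Note: loading two 35B models in bf16 needs a very large amount of RAM.
import torch
from transformers import AutoModelForCausalLM

a = AutoModelForCausalLM.from_pretrained("CausalLM/35b-beta2ep", torch_dtype=torch.bfloat16)
b = AutoModelForCausalLM.from_pretrained("CausalLM/35b-beta", torch_dtype=torch.bfloat16)

# Overwrite model A's weights in place with the 50/50 average.
# Assumes identical architectures, so named_parameters() align one-to-one.
with torch.no_grad():
    for (_, p_a), (_, p_b) in zip(a.named_parameters(), b.named_parameters()):
        p_a.mul_(0.5).add_(p_b, alpha=0.5)  # p_a = 0.5 * p_a + 0.5 * p_b

a.save_pretrained("35b-beta-merged")
```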
