
	---
	language:
	- en
	library_name: transformers
	pipeline_tag: text-generation
	license: cc-by-nc-sa-4.0
	tags:
	- moe
	- merge
	- MoE
	---
	The license is `cc-by-nc-sa-4.0`.

	# 🐻‍❄️SOLARC-MOE-10.7Bx6🐻‍❄️
	![img](https://drive.google.com/uc?export=view&id=1_Qa2TfLMw3WeJ23dHkrP1Xln_RNt1jqG)


	## Model Details

	Model Developers: Seungyoo Lee (DopeorNope)

	I am in charge of Large Language Models (LLMs) on the Markr AI team in South Korea.

	Input: The model accepts text only.

	Output: The model generates text only.

	Model Architecture: SOLARC-MOE-10.7Bx6 is an auto-regressive language model based on the SOLAR architecture.

	---

	## Base Model

	[kyujinpy/Sakura-SOLAR-Instruct](https://huggingface.co/kyujinpy/Sakura-SOLAR-Instruct)

	[Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct](https://huggingface.co/Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct)

	[VAGOsolutions/SauerkrautLM-SOLAR-Instruct](https://huggingface.co/VAGOsolutions/SauerkrautLM-SOLAR-Instruct)

	[fblgit/UNA-SOLAR-10.7B-Instruct-v1.0](https://huggingface.co/fblgit/UNA-SOLAR-10.7B-Instruct-v1.0)

	[jeonsworld/CarbonVillain-en-10.7B-v1](https://huggingface.co/jeonsworld/CarbonVillain-en-10.7B-v1)


	## Implemented Method

	I built this model with the Mixture of Experts (MoE) approach, using each of the models listed above as a base.

	I wanted to test whether an MoE model can be built with a number of experts that is not a power of 2, in this case 6.
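The card does not say how the router is configured, so the following is only a minimal sketch of why a count such as 6 poses no problem in principle. It assumes Mixtral-style top-2 softmax gating. The names `top_k_route` and `moe_forward`, the toy experts, and the router scores are all hypothetical and are not taken from this model.

```python
import math


def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k largest logits; any expert count >= k works, power of 2 or not.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    max_logit = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - max_logit) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the k selected experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Six toy "experts" (assumption): each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)]
# Hypothetical router scores for a single token, one score per expert.
router_logits = [0.1, 2.0, -1.0, 1.0, 0.5, -0.3]

routes = top_k_route(router_logits)
output = moe_forward(1.0, experts, router_logits)
print(routes, output)
```

Selection depends only on the ordering of the scores, and the weights are normalized over the chosen experts, so nothing in this routing step relies on the expert count being a power of 2.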

	---

	# Implementation Code


	## Load model
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	repo = "DopeorNope/SOLARC-MOE-10.7Bx6"

	# device_map='auto' spreads the weights across the available devices
	# (this option requires the accelerate package).
	model = AutoModelForCausalLM.from_pretrained(
	    repo,
	    return_dict=True,
	    torch_dtype=torch.float32,
	    device_map='auto',
	)
	tokenizer = AutoTokenizer.from_pretrained(repo)
	```
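Once the model and tokenizer are loaded, text can be generated along the following lines. This is a hedged sketch rather than the author's own inference code: the helper name `generate_text` and the default `max_new_tokens` value are assumptions, and the card does not specify a prompt template, so the prompt is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Greedy-decode a continuation of `prompt` with an already-loaded model."""
    # Tokenize and move the input tensors to the device reported by the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # do_sample=False requests deterministic greedy decoding.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    # generate() returns prompt + continuation; keep only the new tokens.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(
        output_ids[0][prompt_length:], skip_special_tokens=True
    )
```

Call it with the two objects returned by the `from_pretrained` calls in the loading snippet, followed by a prompt string. If the base models expect a specific instruction format, apply it to the prompt before calling the helper.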


	---