	---
	license: llama3
	language:
	- tr
	- en
	base_model: meta-llama/Meta-Llama-3-8B-Instruct
	model-index:
	- name: MARS
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge TR v0.2
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc
	      value: 46.08
	      name: accuracy
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU TR v0.2
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 47.02
	      name: accuracy
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA TR v0.2
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: acc
	      name: accuracy
	      value: 49.38
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande TR v0.2
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 53.71
	      name: accuracy
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k TR v0.2
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 53.08
	      name: accuracy
	pipeline_tag: text-generation
	---


	<img src="MARS-1.0.png" alt="Curiosity MARS model logo" style="border-radius: 1rem; width: 100%">


	<div style="display: flex; justify-content: center; align-items: center; flex-direction: column">
	<h1 style="font-size: 5em; margin-bottom: 0; padding-bottom: 0;">MARS</h1>
	<aside>by <a href="https://curiosity.tech">Curiosity Technology</a></aside>
	</div>

	MARS is the first iteration of Curiosity Technology models, based on Llama 3 8B.

	We trained MARS on an in-house Turkish dataset, as well as on several open-source datasets and their Turkish
	translations.
	We intend to release the Turkish translations in the near future so that the community can experiment with them.

	MARS was trained for 3 days on 4x A100 GPUs.

	## Model Details

	- Base Model: Meta Llama 3 8B Instruct
	- Training Dataset: In-house & Translated Open Source Turkish Datasets
	- Training Method: LoRA Fine Tuning
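The card does not state the LoRA rank or which layers were adapted, so the following is only a minimal sketch of why LoRA fine-tuning is cheap: instead of updating a full `d_out x d_in` weight matrix, LoRA trains two low-rank factors of rank `r`. The rank of 16 and the 4096 x 4096 projection size below are illustrative assumptions, not the settings used for MARS.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical values for illustration only; the model card does not give them.
d_in, d_out, rank = 4096, 4096, 16

lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

Under these assumed sizes, the trainable factors are well under 1% of the matrix they adapt.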


	## How to use

	You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.

	### Transformers pipeline

	```python
	import transformers
	import torch

	model_id = "curiositytech/MARS"

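	# Load the model in bfloat16 and let Accelerate place it on the available devices.
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model_id,
	    model_kwargs={"torch_dtype": torch.bfloat16},
	    device_map="auto",
	)

	messages = [
	    # System prompt: "You are a pirate chatbot who talks like a pirate!"
	    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
	    # User prompt: "Who are you?"
	    {"role": "user", "content": "Sen kimsin?"},
	]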

	# Stop on either the regular EOS token or Llama 3's end-of-turn token.
	terminators = [
	    pipeline.tokenizer.eos_token_id,
	    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
	]

	outputs = pipeline(
	    messages,
	    max_new_tokens=256,
	    eos_token_id=terminators,
	    do_sample=True,
	    temperature=0.6,
	    top_p=0.9,
	)
	# generated_text holds the full conversation; the last entry is the model's reply.
	print(outputs[0]["generated_text"][-1])
	```
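When the pipeline is called with a list of chat messages, `generated_text` holds the whole conversation and the final entry is the reply message, as the `[-1]` index in the example above suggests. Below is a minimal sketch of pulling out only the reply text. It uses a hand-written stand-in for `outputs` rather than real model output, and the helper name `last_reply` is hypothetical.

```python
from typing import Any


def last_reply(outputs: list[dict[str, Any]]) -> str:
    """Return the text of the final message in the first generated conversation."""
    conversation = outputs[0]["generated_text"]
    final_message = conversation[-1]
    if final_message["role"] != "assistant":
        raise ValueError("the last message is not an assistant reply")
    return final_message["content"]


# Stand-in with the same shape as the pipeline output; not real model output.
sample_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
            {"role": "user", "content": "Sen kimsin?"},
            {"role": "assistant", "content": "(model reply goes here)"},
        ]
    }
]

print(last_reply(sample_outputs))
```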

	### Transformers AutoModelForCausalLM

	```python
	from transformers import AutoTokenizer, AutoModelForCausalLM
	import torch

	model_id = "curiositytech/MARS"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	)

	messages = [
	    # System prompt: "You are a pirate chatbot who talks like a pirate!"
	    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
	    # User prompt: "Who are you?"
	    {"role": "user", "content": "Sen kimsin?"},
	]

	# Render the messages with the model's chat template and open an assistant turn.
	input_ids = tokenizer.apply_chat_template(
	    messages,
	    add_generation_prompt=True,
	    return_tensors="pt",
	).to(model.device)

	# Stop on either the regular EOS token or Llama 3's end-of-turn token.
	terminators = [
	    tokenizer.eos_token_id,
	    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
	]

	outputs = model.generate(
	    input_ids,
	    max_new_tokens=256,
	    eos_token_id=terminators,
	    do_sample=True,
	    temperature=0.6,
	    top_p=0.9,
	)
	# Drop the prompt tokens so only the newly generated reply is decoded.
	response = outputs[0][input_ids.shape[-1]:]
	print(tokenizer.decode(response, skip_special_tokens=True))
	```
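Both examples stop on `<|eot_id|>` because the Llama 3 Instruct prompt format closes every turn with that token. The sketch below builds the prompt string by hand to show the layout. It assumes MARS keeps the chat template of its base model, which this card does not state explicitly, and `build_llama3_prompt` is a hypothetical helper written for illustration.

```python
def build_llama3_prompt(messages: list[dict[str, str]]) -> str:
    """Format chat messages in the Llama 3 Instruct layout (illustration only)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the content, then <|eot_id|>.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model writes the reply next;
    # this mirrors add_generation_prompt=True.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


messages = [
    {"role": "system", "content": "Sen korsan gibi konuşan bir korsan chatbotsun!"},
    {"role": "user", "content": "Sen kimsin?"},
]

print(build_llama3_prompt(messages))
```

In real use, prefer `tokenizer.apply_chat_template`, which reads the template shipped with the model instead of relying on a hand-written copy.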