---
datasets:
- ola13/small-the_pile
language:
- en
metrics:
- accuracy
- code_eval
pipeline_tag: text-generation
tags:
- causal_lm
---
# Model Card for GPT-6B_Tuned_small_pile

<!-- Provide a quick summary of what the model is/does. -->
GPT-6B_Tuned_small_pile is a GPT-J-6B model fine-tuned on 0.1 million examples from the Pile dataset.
Architecture:
- n_embd: 4096
- n_layer: 28
- n_positions: 2048

Tuning parameters:
- val_split_percent: 20
- momentum: 0.9
- train_batch_size (effective): 32
- train_micro_batch: 16
- gradient_accumulation_steps: 2
- gradient_clipping: 0.5
- learning_rate: 0.00001
- weight_decay: 0.01
- lr_scheduler: cosine
- lr_warmup_steps: 1000
- lr_decay: 0.1
- lr_decay_step: 2000
- mixed_precision: bf16
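As a rough sketch of how the parameters above relate (the helper names here are illustrative, not part of this repository or its training code): the effective batch size is the micro-batch times the gradient accumulation steps, and a warmup-then-cosine schedule decaying to `lr_decay * learning_rate` can be written in plain Python:

```python
import math

def effective_batch_size(micro_batch: int, grad_accum_steps: int) -> int:
    # Effective batch size = per-device micro-batch * gradient accumulation steps.
    return micro_batch * grad_accum_steps

def cosine_lr(step: int, base_lr: float = 1e-5, warmup_steps: int = 1000,
              decay_steps: int = 2000, min_lr_ratio: float = 0.1) -> float:
    # Linear warmup to base_lr, then cosine decay toward min_lr_ratio * base_lr.
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = min((step - warmup_steps) / decay_steps, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)

print(effective_batch_size(16, 2))  # 32, matching train_batch_size (effective) above
```

With the defaults above, the learning rate ramps from 0 to 1e-5 over the first 1000 steps, then decays over 2000 steps to 1e-6.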
![image.png](https://s3.amazonaws.com/moonup/production/uploads/642bb1915df44ff245471fca/Ke-ShGT0sBVGEjrShxped.png)

## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model:** EleutherAI/gpt-j-6b