meelu
/

DA-MORPH-CEREBRAS-TOKEN

Morphological Tokenization



	---
	library_name: tokenizers
	tags: [Danish, Morphological Tokenization, CerebrasGPT]
	---
	```
	 _______       ___      .___  ___.   ______   .______      .______    __    __
	|       \     /   \     |   \/   |  /  __  \  |   _  \     |   _  \  |  |  |  |
	|  .--.  |   /  ^  \    |  \  /  | |  |  |  | |  |_)  |    |  |_)  | |  |__|  |
	|  |  |  |  /  /_\  \   |  |\/|  | |  |  |  | |      /     |   ___/  |   __   |
	|  '--'  | /  _____  \  |  |  |  | |  `--'  | |  |\  \----.|  |      |  |  |  |
	|_______/ /__/     \__\ |__|  |__|  \______/  | _| `._____|| _|      |__|  |__|

	```
	### DA-MORPH-CEREBRAS-TOKEN

	This morphological tokenizer is designed for the CerebrasGPT architecture. It segments Danish text according to linguistic (morphological) principles, so that the resulting subword tokens correspond more closely to meaningful units of the language.
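
A minimal usage sketch, not taken from the original card: it assumes the repository ships a `tokenizer.json` that the `tokenizers` library can load (suggested by `library_name: tokenizers`, but not confirmed here). The helper names `load_tokenizer` and `segment` are hypothetical and exist only for illustration.

```python
# Repo id as shown in the page header (author / model name).
REPO_ID = "meelu/DA-MORPH-CEREBRAS-TOKEN"


def load_tokenizer(repo_id=REPO_ID):
    """Download and load the tokenizer from the Hugging Face Hub.

    Assumption: the repo contains a `tokenizer.json` file.
    """
    # Imported lazily so the helpers below work without the dependency.
    # Install with: pip install tokenizers
    from tokenizers import Tokenizer

    return Tokenizer.from_pretrained(repo_id)


def segment(tokenizer, text):
    """Return a list of (token, id) pairs for `text`."""
    encoding = tokenizer.encode(text)
    return list(zip(encoding.tokens, encoding.ids))
```

Calling `segment(load_tokenizer(), "Børnene legede i haven.")` should return the subword pieces and their ids for a Danish sentence. The exact segmentation depends on the published vocabulary, so no expected output is given here.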