	---
	license: apache-2.0
	language:
	- zh
	tags:
	- NER
	- TCM
	- Traditional Chinese Medicine
	- medical
	widget:
	- text: "化滞汤,出处：《证治汇补》卷八。。组成：青皮20g，陈皮20g，厚朴20g，枳实20g，黄芩20g，黄连20g，当归20g，芍药20g，木香5g，槟榔8g，滑石3g，甘草4g。。主治：下痢因于食积气滞者。"
	example_title: "Example 1"
	---
	# TCMNER

	[About Author](https://github.com/huangxinping).
	[Our Products](https://zhongyigen.com)

	# Model description

	TCMNER is a fine-tuned BERT-family model that is ready to use for Named Entity Recognition (NER) on Traditional Chinese Medicine (TCM) text, and it is reported to achieve state-of-the-art performance on this task. It has been trained to recognize six types of entities: prescription (方剂), herb (本草), source (来源), disease (病名), symptom (症状) and syndrome (证型).

	Specifically, the base model is TCMRoBERTa, a RoBERTa model adapted to Traditional Chinese Medicine. TCMNER was obtained by fine-tuning TCMRoBERTa on the Chinese version of the [Haiwei AI Lab](https://www.haiweikexin.com/)'s Named Entity Recognition dataset.

	TCMRoBERTa is currently a closed-source model used only within my own company; it will be open-sourced in the future.


	# How to use

	You can use this model with the Transformers `pipeline` for NER.

	```python
	from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

	# Load the tokenizer and the fine-tuned token-classification model from the Hub.
	tokenizer = AutoTokenizer.from_pretrained("Monor/TCMNER")
	model = AutoModelForTokenClassification.from_pretrained("Monor/TCMNER")

	# Build the NER pipeline; by default it returns one prediction per non-"O" token.
	nlp = pipeline("ner", model=model, tokenizer=tokenizer)
	example = "化滞汤,出处：《证治汇补》卷八。。组成：青皮20g，陈皮20g，厚朴20g，枳实20g，黄芩20g，黄连20g，当归20g，芍药20g，木香5g，槟榔8g，滑石3g，甘草4g。。主治：下痢因于食积气滞者。"

	ner_results = nlp(example)
	print(ner_results)
	```
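
The pipeline call above returns one dictionary per token, so a herb such as 青皮 comes back as two separate predictions. The following is a minimal sketch of merging those token-level predictions into entity spans. It assumes the usual Transformers token-classification output keys (`entity`, `start`, `end`); the `group_entities` helper is hypothetical, and the sample predictions are hand-written for illustration, not real output of this model.

```python
from typing import Dict, List


def group_entities(text: str, token_results: List[Dict]) -> List[Dict]:
    """Merge per-token NER predictions into entity spans using character offsets."""
    spans: List[Dict] = []
    for tok in token_results:
        # "B-本草" -> prefix "B", label "本草"; a bare "O" has no label part.
        prefix, _, label = tok["entity"].partition("-")
        if not label:
            continue  # skip "O" tokens if they are present
        last = spans[-1] if spans else None
        # Extend the previous span only for an adjacent I- tag of the same type.
        if prefix == "I" and last and last["label"] == label and last["end"] == tok["start"]:
            last["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written, hypothetical token predictions (not real model output).
text = "组成：青皮20g，陈皮20g"
tokens = [
    {"entity": "B-本草", "start": 3, "end": 4},
    {"entity": "I-本草", "start": 4, "end": 5},
    {"entity": "B-本草", "start": 9, "end": 10},
    {"entity": "I-本草", "start": 10, "end": 11},
]
spans = group_entities(text, tokens)
print(spans)
```

As an alternative to a hand-rolled helper, the Transformers pipeline accepts `aggregation_strategy="simple"`, which groups tokens into entities and reports them under an `entity_group` key.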


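	# Training data

	This model was fine-tuned on my own dataset, the Haiwei AI Lab NER dataset mentioned in the model description. Each token is tagged with one of the following labels: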

	| Abbreviation | Description |
	| - | - |
	| O | Outside of a named entity |
	| B-方剂 | Beginning of a prescription entity right after another prescription entity |
	| I-方剂 | Prescription entity |
	| B-本草 | Beginning of a herb entity right after another herb entity |
	| I-本草 | Herb entity |
	| B-来源 | Beginning of a source of prescription right after another source of prescription |
	| I-来源 | Source entity |
	| B-病名 | Beginning of a disease's name right after another disease's name |
	| I-病名 | Disease's name |
	| B-症状 | Beginning of a symptom right after another symptom |
	| I-症状 | Symptom |
	| B-证型 | Beginning of a syndrome right after another syndrome |
	| I-证型 | Syndrome |
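
The tag set in the table can be derived from the six entity types: one `O` tag plus a `B-` and an `I-` tag per type, 13 tags in total. Below is a small sketch that builds this tag list and translates a tag into English. The ordering of `labels` and the `describe` helper are assumptions made for illustration; the authoritative id-to-label mapping is the one stored in the model's configuration (`model.config.id2label`).

```python
# Entity types and their English names, as listed in the model description.
ENGLISH_NAMES = {
    "方剂": "prescription",
    "本草": "herb",
    "来源": "source",
    "病名": "disease",
    "症状": "symptom",
    "证型": "syndrome",
}

# One "O" tag plus B-/I- tags per entity type (ordering is an assumption).
labels = ["O"] + [f"{prefix}-{zh}" for zh in ENGLISH_NAMES for prefix in ("B", "I")]
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for label, i in label2id.items()}


def describe(tag: str) -> str:
    """Translate a tag such as 'B-本草' into an English form such as 'B-herb'."""
    if tag == "O":
        return "O"
    prefix, _, zh = tag.partition("-")
    return f"{prefix}-{ENGLISH_NAMES[zh]}"


print(len(labels))
print(describe("B-本草"))
```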

	# Eval results

	![Evaluation results](images/iShot_2024-06-07_18.03.00.png "Evaluation results")


	# Notices

	1. The model is free for commercial use.
	2. I do not plan to write a paper about this model. If you use any details of it in your own paper, please mention this model. Thanks.

	---

	# Bonus

	All of our TCM domain models will be open-sourced soon, including:
	1. A series of pre-trained models
	2. Named entity recognition for TCM
	3. Text localization in ancient images
	4. OCR for ancient images

	And more.