	---
	base_model: harheem/bge-m3-nvidia-ko-v1
	language:
	- en
	library_name: sentence-transformers
	license: apache-2.0
	metrics:
	- cosine_accuracy@1
	- cosine_accuracy@3
	- cosine_accuracy@5
	- cosine_accuracy@10
	- cosine_precision@1
	- cosine_precision@3
	- cosine_precision@5
	- cosine_precision@10
	- cosine_recall@1
	- cosine_recall@3
	- cosine_recall@5
	- cosine_recall@10
	- cosine_ndcg@10
	- cosine_mrr@10
	- cosine_map@100
	pipeline_tag: sentence-similarity
	tags:
	- sentence-transformers
	- sentence-similarity
	- feature-extraction
	- dataset_size:1K<n<10K
	- loss:MatryoshkaLoss
	- loss:MultipleNegativesRankingLoss
	- llama-cpp
	- gguf-my-repo
	widget:
	- source_sentence: 하이브리다이저란 무엇인가요?
	  sentences:
	  - 하이퍼바이저는 보안에서 어떤 역할을 합니까?
	  - 지난 몇 년간 CUDA 생태계는 어떻게 발전해 왔나요?
	  - 로컬 메모리 액세스 성능을 결정하는 요소는 무엇입니까?
	- source_sentence: 임시 구독의 용도는 무엇입니까?
	  sentences:
	  - 메모리 액세스 최적화에서 프리패치의 역할은 무엇입니까?
	  - CUDA 인식 MPI는 확장 측면에서 어떻게 작동합니까?
	  - CUDA 8이 해결하는 계산상의 과제에는 어떤 것이 있습니까?
	- source_sentence: '''saxpy''는 무엇을 뜻하나요?'
	  sentences:
	  - CUDA C/C++의 맥락에서 SAXPY는 무엇입니까?
	  - Numba는 다른 GPU 가속 방법과 어떻게 다른가요?
	  - 장치 LTO는 CUDA 애플리케이션에 어떤 이점을 제공합니까?
	- source_sentence: USD/Hydra란 무엇인가요?
	  sentences:
	  - 쿠다란 무엇인가요?
	  - y 미분 계산에 사용되는 접근 방식의 단점은 무엇입니까?
	  - Pascal 아키텍처는 통합 메모리를 어떻게 개선합니까?
	- source_sentence: CUDAcast란 무엇인가요?
	  sentences:
	  - CUDACast 시리즈에서는 어떤 주제를 다룰 예정인가요?
	  - 이 게시물에 기여한 것으로 인정받은 사람은 누구입니까?
	  - WSL 2에서 NVML의 목적은 무엇입니까?
	model-index:
	- name: BGE base Financial Matryoshka
	  results:
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: dim 768
	      type: dim_768
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.5443037974683544
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.7749648382559775
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.8523206751054853
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9409282700421941
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.5443037974683544
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.2583216127519925
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.17046413502109703
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.09409282700421939
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.5443037974683544
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.7749648382559775
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.8523206751054853
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.9409282700421941
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.7411108924386547
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.677065054807671
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.6802131506478553
	      name: Cosine Map@100
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: dim 512
	      type: dim_512
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.5386779184247539
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.7749648382559775
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.8593530239099859
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9451476793248945
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.5386779184247539
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.2583216127519925
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.17187060478199717
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.09451476793248943
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.5386779184247539
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.7749648382559775
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.8593530239099859
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.9451476793248945
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.7413571133247474
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.6759917844306029
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.678939165210132
	      name: Cosine Map@100
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: dim 256
	      type: dim_256
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.540084388185654
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.7791842475386779
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.8621659634317862
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9423347398030942
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.540084388185654
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.25972808251289264
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.1724331926863572
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.09423347398030943
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.540084388185654
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.7791842475386779
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.8621659634317862
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.9423347398030942
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.7403981257690416
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.6756379344986938
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.6787046866761269
	      name: Cosine Map@100
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: dim 128
	      type: dim_128
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.5218002812939522
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.7679324894514767
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.8635724331926864
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9367088607594937
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.5218002812939522
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.2559774964838256
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.17271448663853725
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.09367088607594935
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.5218002812939522
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.7679324894514767
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.8635724331926864
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.9367088607594937
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.7305864977688176
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.6641673922264634
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.6671648971944116
	      name: Cosine Map@100
	  - task:
	      type: information-retrieval
	      name: Information Retrieval
	    dataset:
	      name: dim 64
	      type: dim_64
	    metrics:
	    - type: cosine_accuracy@1
	      value: 0.509142053445851
	      name: Cosine Accuracy@1
	    - type: cosine_accuracy@3
	      value: 0.7426160337552743
	      name: Cosine Accuracy@3
	    - type: cosine_accuracy@5
	      value: 0.8284106891701828
	      name: Cosine Accuracy@5
	    - type: cosine_accuracy@10
	      value: 0.9310829817158931
	      name: Cosine Accuracy@10
	    - type: cosine_precision@1
	      value: 0.509142053445851
	      name: Cosine Precision@1
	    - type: cosine_precision@3
	      value: 0.24753867791842477
	      name: Cosine Precision@3
	    - type: cosine_precision@5
	      value: 0.16568213783403654
	      name: Cosine Precision@5
	    - type: cosine_precision@10
	      value: 0.09310829817158929
	      name: Cosine Precision@10
	    - type: cosine_recall@1
	      value: 0.509142053445851
	      name: Cosine Recall@1
	    - type: cosine_recall@3
	      value: 0.7426160337552743
	      name: Cosine Recall@3
	    - type: cosine_recall@5
	      value: 0.8284106891701828
	      name: Cosine Recall@5
	    - type: cosine_recall@10
	      value: 0.9310829817158931
	      name: Cosine Recall@10
	    - type: cosine_ndcg@10
	      value: 0.7135661304090457
	      name: Cosine Ndcg@10
	    - type: cosine_mrr@10
	      value: 0.6444829549259928
	      name: Cosine Mrr@10
	    - type: cosine_map@100
	      value: 0.6474431148702396
	      name: Cosine Map@100
	---

	# hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF
	This model was converted to GGUF format from [`harheem/bge-m3-nvidia-ko-v1`](https://huggingface.co/harheem/bge-m3-nvidia-ko-v1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/harheem/bge-m3-nvidia-ko-v1) for more details on the model.

	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux):

	```bash
	brew install llama.cpp
	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF --hf-file bge-m3-nvidia-ko-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
	```
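The front matter tags this model as `sentence-similarity`, so you will likely want embedding vectors rather than generated text. The following is a minimal sketch, assuming your llama.cpp install includes the `llama-embedding` example binary and that it accepts the same `--hf-repo` / `--hf-file` options as `llama-cli`. The wrapper name `embed_text` is hypothetical and not part of llama.cpp.

```bash
# Hypothetical helper: print the embedding for one piece of text.
# Assumes the llama-embedding binary from llama.cpp is on PATH.
embed_text() {
  llama-embedding \
    --hf-repo hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF \
    --hf-file bge-m3-nvidia-ko-v1-q4_k_m.gguf \
    -p "$1"
}

# Usage (downloads the GGUF file on first run):
#   embed_text "CUDA란 무엇인가요?"
```

The function only wraps the command, so nothing is downloaded or executed until you call it.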

	### Server:
	```bash
	llama-server --hf-repo hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF --hf-file bge-m3-nvidia-ko-v1-q4_k_m.gguf -c 2048
	```
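Once the server is running, embeddings can be requested over HTTP. This sketch assumes the server was started with embeddings enabled (for example, by adding the `--embedding` flag to the command above), that it listens on the default `localhost:8080`, and that the OpenAI-compatible `/v1/embeddings` endpoint is available in your llama.cpp version. The helper names `embedding_payload` and `embed` are hypothetical.

```bash
# Assumed server address; override with SERVER_URL if yours differs.
SERVER_URL="${SERVER_URL:-http://localhost:8080}"

# Build the JSON request body for a single input text.
embedding_payload() {
  local text=$1
  # Escape backslashes and double quotes so the text is valid inside a JSON string.
  text=${text//\\/\\\\}
  text=${text//\"/\\\"}
  printf '{"input": "%s"}' "$text"
}

# POST the text to the server and print the raw JSON response.
embed() {
  curl -s "$SERVER_URL/v1/embeddings" \
    -H "Content-Type: application/json" \
    -d "$(embedding_payload "$1")"
}

# Usage (requires a running server):
#   embed "CUDA란 무엇인가요?"
```

The payload helper handles only backslashes and double quotes; text containing newlines or other control characters would need a real JSON encoder such as `jq`.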

	Note: You can also use this checkpoint directly by following the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

	Step 1: Clone llama.cpp from GitHub.
	```bash
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
	```bash
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the `llama-cli` binary.
	```bash
	./llama-cli --hf-repo hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF --hf-file bge-m3-nvidia-ko-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
	```
	or start the server:
	```bash
	./llama-server --hf-repo hongkeon/bge-m3-nvidia-ko-v1-Q4_K_M-GGUF --hf-file bge-m3-nvidia-ko-v1-q4_k_m.gguf -c 2048
	```