ko-ctc-kenlm-spelling-only-wiki

Table of Contents

Model Details

  • Model Description
    • An n-gram language model for the acoustic model, built word-based over jamo-decomposed text and trained with KenLM. Use this model together with ko-spelling-wav2vec2-conformer-del-1s.
    • Packaged so that it can be loaded and used in the HuggingFace Transformers style.
    • It can also be used directly via the pyctcdecode lib (see the sketch after this list).
    • The training data is Korean Wikipedia.
      Sentences containing anything outside the spelling vocab were removed entirely, minimizing the chance that the LM itself introduces outliers.
      This model was trained on spelling-transcription data (numbers and English follow their respective written forms).
  • Developed by: TADev (@lIlBrother)
  • Language(s): Korean
  • License: apache-2.0
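
For direct use with pyctcdecode (without the Transformers processor), a minimal sketch is shown below. The label list is taken from the matching acoustic model's tokenizer, and the local KenLM file path is a placeholder assumption, not a confirmed file name in this repo:

from pyctcdecode import build_ctcdecoder
from transformers import AutoTokenizer

# Jamo-level labels from the matching acoustic model's tokenizer, sorted by
# token id so they line up with the columns of the CTC logits.
tokenizer = AutoTokenizer.from_pretrained("42MARU/ko-spelling-wav2vec2-conformer-del-1s")
labels = [tok for tok, _ in sorted(tokenizer.get_vocab().items(), key=lambda kv: kv[1])]

# Build a beam-search CTC decoder backed by the downloaded KenLM file.
decoder = build_ctcdecoder(
    labels=labels,
    kenlm_model_path="path/to/downloaded_kenlm.arpa",  # placeholder path (assumption)
)

# logits: a (time, vocab) numpy array of CTC outputs from the acoustic model.
# text = decoder.decode(logits, beam_width=100)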

How to Get Started With the Model

import unicodedata

import librosa
from pyctcdecode import build_ctcdecoder
from transformers import (
    AutoConfig,
    AutoFeatureExtractor,
    AutoModelForCTC,
    AutoTokenizer,
    Wav2Vec2ProcessorWithLM,
)
from transformers.pipelines import AutomaticSpeechRecognitionPipeline

audio_path = ""

# Load the model, tokenizer, and the other modules needed for prediction.
model = AutoModelForCTC.from_pretrained("42MARU/ko-spelling-wav2vec2-conformer-del-1s")
feature_extractor = AutoFeatureExtractor.from_pretrained("42MARU/ko-spelling-wav2vec2-conformer-del-1s")
tokenizer = AutoTokenizer.from_pretrained("42MARU/ko-spelling-wav2vec2-conformer-del-1s")
processor = Wav2Vec2ProcessorWithLM.from_pretrained("42MARU/ko-ctc-kenlm-spelling-only-wiki")

# Plug the loaded modules into the pipeline used for actual prediction.
asr_pipeline = AutomaticSpeechRecognitionPipeline(
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    decoder=processor.decoder,
    device=-1,
)

# Load the audio file and run prediction with a specific beam-search width.
raw_data, _ = librosa.load(audio_path, sr=16000)
kwargs = {"decoder_kwargs": {"beam_width": 100}}
pred = asr_pipeline(inputs=raw_data, **kwargs)["text"]
# The model outputs jamo-decomposed Unicode text, so it must be converted back to a regular string.
result = unicodedata.normalize("NFC", pred)
print(result)
# 안녕하세요 123 테스트입니다. ("Hello, 123, this is a test.")
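
The NFC normalization step matters because the decoder emits decomposed jamo rather than precomposed Hangul syllables. A minimal illustration using only Python's standard unicodedata module (the sample string is illustrative, not taken from the model):

import unicodedata

decomposed = unicodedata.normalize("NFD", "안녕하세요")  # split into conjoining jamo
recomposed = unicodedata.normalize("NFC", decomposed)   # recombined into syllable blocks
print(len(decomposed), len(recomposed))  # 12 5 -- same text, different code-point counts
print(recomposed == "안녕하세요")  # True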