Lauler committed 0e77c88 (verified) · 1 parent: 5c3a7b4

Create README.md

Files changed (1)
  1. README.md +41 -0
README.md ADDED
@@ -0,0 +1,41 @@
+ ---
+ language:
+ - sv
+ pipeline_tag: automatic-speech-recognition
+ ---
+ ## KB-Whisper Large (Beta)
+
+ A preliminary checkpoint of the National Library of Sweden's new Whisper models for Swedish. This version is for testing only: it has completed its first stage of continued pre-training. We will do additional post-training to reduce hallucinations before releasing the final version of the model.
+
+ ### Usage
+
+ ```python
+ import torch
+ from datasets import load_dataset
+ from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
+
+ device = "cuda:0" if torch.cuda.is_available() else "cpu"
+ torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+ model_id = "KBLab/kb-whisper-large-beta"
+
+ model = AutoModelForSpeechSeq2Seq.from_pretrained(
+     model_id, torch_dtype=torch_dtype, use_safetensors=True, cache_dir="cache"
+ )
+ model.to(device)
+ processor = AutoProcessor.from_pretrained(model_id)
+
+ pipe = pipeline(
+     "automatic-speech-recognition",
+     model=model,
+     tokenizer=processor.tokenizer,
+     feature_extractor=processor.feature_extractor,
+     torch_dtype=torch_dtype,
+     device=device,
+ )
+
+ generate_kwargs = {"task": "transcribe", "language": "sv"}
+ # Add return_timestamps=True for output with timestamps
+ res = pipe("audio.mp3",
+            chunk_length_s=30,
+            generate_kwargs=generate_kwargs)
+ ```
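The usage snippet mentions `return_timestamps=True` but does not show how to read the timestamped output. The sketch below is one possible way to do that, assuming the result has the usual shape returned by the `transformers` speech recognition pipeline: a dict with a `"text"` string and a `"chunks"` list whose entries carry a `"timestamp"` pair and a `"text"` string. The helper name `format_chunks` and the `example` data are hypothetical and are not part of the README.

```python
def format_chunks(result):
    """Render a timestamped pipeline result as "[start - end] text" lines."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # The end timestamp of the last chunk may be None if the audio is cut off.
        end_str = f"{end:.2f}s" if end is not None else "end"
        lines.append(f"[{start:.2f}s - {end_str}] {chunk['text'].strip()}")
    return lines


# Hypothetical data shaped like the output of:
# pipe("audio.mp3", chunk_length_s=30, return_timestamps=True,
#      generate_kwargs={"task": "transcribe", "language": "sv"})
example = {
    "text": " Hej och välkommen. Tack.",
    "chunks": [
        {"timestamp": (0.0, 2.5), "text": " Hej och välkommen."},
        {"timestamp": (2.5, None), "text": " Tack."},
    ],
}

for line in format_chunks(example):
    print(line)
```

With a real model, `format_chunks(res)` would be called on the `res` value from the usage snippet once `return_timestamps=True` has been added to the `pipe(...)` call. Without that flag the result has no `"chunks"` key and the helper returns an empty list.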