Commit 44733bf by kamilakesbi · verified · 1 parent: f03975e

Update README.md

Files changed (1)
  1. README.md +47 -1
README.md CHANGED
@@ -27,7 +27,53 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
- More information needed
+
+ This segmentation model has been fine-tuned on the Japanese subset of the Callhome dataset using [diarizers](https://github.com/huggingface/diarizers/tree/main).
+ It can be loaded with two lines of code:
+
+ ```python
+ from diarizers import SegmentationModel
+
+ segmentation_model = SegmentationModel().from_pretrained('diarizers-community/speaker-segmentation-fine-tuned-callhome-jpn')
+ ```
+
+ To use it within a pyannote speaker diarization pipeline, load the [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) pipeline and convert the model to a pyannote-compatible format:
+
+ ```python
+
+ from pyannote.audio import Pipeline
+ import torch
+
+ device = torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu")
+
+ # load the pre-trained pyannote pipeline
+ pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1")
+ pipeline.to(device)
+
+ # replace the segmentation model with your fine-tuned one
+ segmentation_model = segmentation_model.to_pyannote_model()
+ pipeline._segmentation.model = segmentation_model.to(device)
+ ```
+
+ You can now use the pipeline on audio examples:
+
+ ```python
+ from datasets import load_dataset
+ # load dataset example
+ dataset = load_dataset("diarizers-community/callhome", "jpn", split="data")
+ sample = dataset[0]["audio"]
+
+ # pre-process inputs: add a channel dimension and match the model's device and dtype
+ sample["waveform"] = torch.from_numpy(sample.pop("array")[None, :]).to(device, dtype=segmentation_model.dtype)
+ sample["sample_rate"] = sample.pop("sampling_rate")
+
+ # perform inference
+ diarization = pipeline(sample)
+
+ # dump the diarization output to disk using RTTM format
+ with open("audio.rttm", "w") as rttm:
+     diarization.write_rttm(rttm)
+ ```
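The RTTM file written in the last step can be read back with plain Python for downstream processing. The following is a minimal sketch and not part of diarizers or pyannote: `Segment` and `parse_rttm` are hypothetical helper names, the two example records are made up for illustration rather than real model output, and the parser assumes the usual space-separated RTTM `SPEAKER` layout (file ID in field 2, onset and duration in fields 4 and 5, speaker label in field 8).

```python
from typing import Iterable, List, NamedTuple


class Segment(NamedTuple):
    """One speaker turn read from an RTTM record (hypothetical helper type)."""
    file_id: str
    start: float  # seconds
    end: float    # seconds
    speaker: str


def parse_rttm(lines: Iterable[str]) -> List[Segment]:
    """Parse RTTM SPEAKER records into segments, skipping anything else."""
    segments = []
    for line in lines:
        fields = line.split()
        # Ignore blank lines and non-SPEAKER record types.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        onset, duration = float(fields[3]), float(fields[4])
        segments.append(Segment(fields[1], onset, onset + duration, fields[7]))
    return segments


# Illustrative records only; real content comes from the file, e.g.:
#   with open("audio.rttm") as f:
#       segments = parse_rttm(f)
example = [
    "SPEAKER audio 1 0.500 2.250 <NA> <NA> SPEAKER_00 <NA> <NA>",
    "SPEAKER audio 1 3.000 1.500 <NA> <NA> SPEAKER_01 <NA> <NA>",
]
for seg in parse_rttm(example):
    print(f"{seg.speaker}: {seg.start:.2f}s -> {seg.end:.2f}s")
```

Keeping the parser independent of pyannote means the saved output can be inspected on a machine where the pipeline itself is not installed.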
 
 ## Intended uses & limitations