RMSnow WelkinFang committed on
Commit d9043cd · 1 Parent(s): 408ce0f

Update README.md of svc (#2)

- add guidance of vocoder and contentvec (8f4b4eb9cba11c6a3c615487aef7c823a9a58efc)


Co-authored-by: Zihao Fang <[email protected]>

Files changed (1)
  1. README.md +19 -7
README.md CHANGED
@@ -27,27 +27,38 @@ We provide a [DiffWaveNetSVC](https://github.com/open-mmlab/Amphion/tree/main/eg
 
 To make these singers sing the songs you want to listen to, just run the following commands:
 
-### Step1: Download the checkpoint
+### Step1: Download the acoustics model checkpoint
 ```bash
 git lfs install
 git clone https://huggingface.co/amphion/singing_voice_conversion
 ```
 
-### Step2: Clone the Amphion's Source Code of GitHub
+### Step2: Download the vocoder checkpoint
+```bash
+git clone https://huggingface.co/amphion/BigVGAN_singing_bigdata
+```
+
+### Step3: Clone the Amphion's Source Code of GitHub
 ```bash
 git clone https://github.com/open-mmlab/Amphion.git
 ```
 
-### Step3: Specify the checkpoint's path
-Use the soft link to specify the downloaded checkpoint in first step:
+### Step4: Download ContentVec Checkpoint
+You could download **ContentVec** Checkpoint from [this repo](https://github.com/auspicious3000/contentvec). In this pretrained model, we used `checkpoint_best_legacy_500.pt`, which is the legacy ContentVec with 500 classes.
+
+### Step5: Specify the checkpoints' path
+Use the soft link to specify the downloaded checkpoints:
 
 ```bash
 cd Amphion
-mkdir ckpts/svc
-ln -s ../singing_voice_conversion/vocalist_l1_contentvec+whisper ckpts/svc/vocalist_l1_contentvec+whisper
+mkdir -p ckpts/svc
+ln -s "$(realpath ../singing_voice_conversion/vocalist_l1_contentvec+whisper)" ckpts/svc/vocalist_l1_contentvec+whisper
+ln -s "$(realpath ../BigVGAN_singing_bigdata/bigvgan_singing)" pretrained/bigvgan_singing
 ```
 
-### Step4: Conversion
+Also, you need to move `checkpoint_best_legacy_500.pt` you downloaded at **Step4** into `Amphion/pretrained/contentvec`.
+
+### Step6: Conversion
 
 You can follow [this recipe](https://github.com/open-mmlab/Amphion/tree/main/egs/svc/MultipleContentsSVC#4-inferenceconversion) to conduct the conversion. For example, if you want to make Taylor Swift sing the songs in the `[Your Audios Folder]`, just run:
 
@@ -57,6 +68,7 @@ sh egs/svc/MultipleContentsSVC/run.sh --stage 3 --gpu "0" \
 --infer_expt_dir "ckpts/svc/vocalist_l1_contentvec+whisper" \
 --infer_output_dir "ckpts/svc/vocalist_l1_contentvec+whisper/result" \
 --infer_source_audio_dir [Your Audios Folder] \
+--infer_vocoder_dir "pretrained/bigvgan_singing" \
 --infer_target_speaker "vocalist_l1_TaylorSwift" \
 --infer_key_shift "autoshift"
 ```
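
The updated README now depends on three paths inside the Amphion checkout: the acoustic model link, the vocoder link, and the ContentVec checkpoint file. A small pre-flight check could catch a missing or dangling link before the conversion is started. The sketch below is only an illustration: `check_svc_layout` is a hypothetical helper that is not part of Amphion, and it assumes the paths are exactly the ones named in the steps above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Amphion): report which of the paths
# named in the README steps are missing under a given Amphion checkout.
check_svc_layout() {
  local root="${1:-.}"   # path to the Amphion checkout; defaults to the current directory
  local missing=0
  local path

  # [ -e ] follows symlinks, so a dangling soft link is reported as missing.
  for path in \
    "ckpts/svc/vocalist_l1_contentvec+whisper" \
    "pretrained/bigvgan_singing" \
    "pretrained/contentvec/checkpoint_best_legacy_500.pt"
  do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done

  # Exit status 0 means every expected path exists, 1 means at least one is absent.
  return "$missing"
}
```

After sourcing the function, running `check_svc_layout .` from inside `Amphion` prints one `missing:` line per absent path and returns a non-zero status if anything is missing, so it can be chained in front of the `run.sh` call with `&&`.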