Text-to-Speech
PyTorch
ONNX
Catalan
matcha-tts
acoustic modelling
speech
multispeaker
AlexK-PL committed on
Commit
e3c6df5
1 Parent(s): 4522849

Update README.md

Files changed (1)
  1. README.md +103 -1
README.md CHANGED
@@ -1,3 +1,105 @@
  ---
- license: apache-2.0
+ language:
+ - ca
+ license:
+ - apache-2.0
+ tags:
+ - matcha TTS
+ - speech
+ - text-to-speech
+ - catalan
+ pipeline_tag: text-to-speech
+ datasets:
+ - projecte-aina/CATalog
  ---
+
+ # Matcha TTS Catalan
+
+ ## Table of Contents
+ <details>
+ <summary>Click to expand</summary>
+
+ - [Model description](#model-description)
+ - [Intended uses and limitations](#intended-uses-and-limitations)
+ - [How to use](#how-to-use)
+ - [Limitations and bias](#limitations-and-bias)
+ - [Training](#training)
+ - [Evaluation](#evaluation)
+ - [Additional information](#additional-information)
+
+ </details>
+
+ ## Model description
+
+ ## Intended uses and limitations
+
+ ## How to use
+ ```python
+ import torch
+ from transformers import pipeline, AutoTokenizer
+
+ input_text = "Sovint em trobo pensant en tot allò que"
+
+ model_id = "projecte-aina/FLOR-6.3B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ generator = pipeline(
+     "text-generation",
+     model=model_id,
+     tokenizer=tokenizer,
+     torch_dtype=torch.bfloat16,
+     trust_remote_code=True,
+     device_map="auto",
+ )
+ generation = generator(
+     input_text,
+     do_sample=True,
+     top_k=10,
+     eos_token_id=tokenizer.eos_token_id,
+ )
+
+ print(f"Result: {generation[0]['generated_text']}")
+ ```
+
+ ## Limitations and bias
+ At the time of submission, no measures have been taken to estimate the bias and toxicity embedded in the model.
+ However, we are well aware that our models may be biased since the corpora have been collected using crawling techniques
+ on multiple web sources. We intend to conduct research in these areas in the future and will update this model card once that work is completed.
+
+
+ ## Training
+
+ ### Adaptation
+
+
+ ### Training data
+
+ ### Languages
+
+ The data comes from two datasets: festcat and openslr69.
+
+ ### Framework
+
+
+ ## Evaluation
+
+ ### Results
+
+
+ ## Additional information
+
+ ### Author
+ The Language Technologies Unit of the Barcelona Supercomputing Center.
+
+ ### Contact
+ For further information, please send an email to <[email protected]>.
+
+ ### Copyright
+ Copyright (c) 2023 by the Language Technologies Unit, Barcelona Supercomputing Center.
+
+ ### License
+ [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ### Funding
+ This work was funded by the [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
+
+ ### Disclaimer