commit 8bff617
Author: patrickvonplaten
Parent(s): c03697f

Update README.md

README.md CHANGED
@@ -58,7 +58,7 @@ print("Reference:", test_dataset["sentence"][:2])
 ```
 The above code leads to the following prediction for the first two samples:
 * Prediction: ["nel ler ket dont abenn eus netra la vez ser mirc'hid evel sij", 'an eil hag egile']
-* Reference: ['"N
+* Reference: ['"N\\'haller ket dont a-benn eus netra pa vezer nec\\'het evel-se."', 'An eil hag egile.']
 
 The model can be evaluated as follows on the {language} test data of Common Voice.
 ```python
@@ -76,7 +76,7 @@ model = Wav2Vec2ForCTC.from_pretrained('Marxav/wav2vec2-large-xlsr-53-breton')
 model.to("cuda")
 
 
-chars_to_ignore_regex =
+chars_to_ignore_regex = """[\\\\,\\\\?\\\\.\\\\!\\\\-\\\\;\\\\:\\\\"\\\\“\\\\%\\\\‘\\\\”\\\\�\\\\'\\\\(\\\\)\\\\/\\\\«\\\\»\\\\½\\\\…]"""
 
 def remove_special_characters(batch):
     sentence = re.sub(chars_to_ignore_regex, '', batch["sentence"]).lower() + " "
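
The effect of the character-stripping step in the second hunk can be illustrated with a minimal sketch. It assumes a simplified character class (`CHARS_TO_IGNORE` is a hypothetical stand-in, not the pattern from the README, whose escaping cannot be fully recovered from this page) and it assumes the cleaned text is written back into the batch, which the hunk does not show. Unlike the README pattern, this simplified class keeps apostrophes.

```python
import re

# Assumption: a simplified character class for illustration only.
# The README's pattern lists more symbols and also strips apostrophes.
CHARS_TO_IGNORE = r'[,?.!\-;:"“”‘%()/«»…]'

def remove_special_characters(batch):
    # Strip punctuation, lowercase, and append a trailing space,
    # mirroring the re.sub(...).lower() + " " line shown in the diff.
    # Writing the result back into the batch is an assumption.
    batch["sentence"] = re.sub(CHARS_TO_IGNORE, "", batch["sentence"]).lower() + " "
    return batch

example = {"sentence": "An eil hag egile."}
normalized = remove_special_characters(example)["sentence"]
print(repr(normalized))  # 'an eil hag egile '
```

Normalizing the reference this way puts it in the same form as the model's prediction (lowercase, no punctuation), so the two strings can be compared directly.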