abdiharyadi committed on
Commit
6a64b93
1 Parent(s): 4b1a072

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ The dataset consists of 388 Indonesian fable stories.
 These stories was gathered from [dongengceritarakyat.com](https://dongengceritarakyat.com/) at January 8, 2024.
 The duplicated stories without any paraphrashing was removed, based on the value of cosine similarity of TF-IDF trigram words.
 Furthermore, the remaining stories were cleaned manually for removing non-fable stories, incomplete stories (e.g. synopsis), some misused punctuations, and some typos.
-This cleaning were continued until now. If a mistake is found, the dataset will be modified as soon as possible.
+If a mistake is found, the dataset will be modified as soon as possible.
 
 The cleaned stories was splitted with 80:10:10 ratio, giving
 - 310 stories for training,
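
Reviewer note: the context lines above mention removing duplicated stories by the cosine similarity of TF-IDF word trigrams, but the README does not show how this was done. The following is a minimal sketch of that idea in plain Python, not the author's actual script. The function names, the whitespace tokenizer, the smoothed IDF formula, and the `threshold=0.9` cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_trigrams(text):
    """Return the word-level trigrams of a lowercased, whitespace-split text."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def tfidf_vectors(texts):
    """Build one sparse TF-IDF vector (a dict) per text over word trigrams."""
    counts = [Counter(word_trigrams(t)) for t in texts]
    n = len(texts)
    # Document frequency: in how many texts each trigram appears.
    doc_freq = Counter(g for c in counts for g in c)
    # Smoothed IDF (an assumption): trigrams found in every text keep a nonzero weight.
    idf = {g: math.log((1 + n) / (1 + df)) + 1 for g, df in doc_freq.items()}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either one is empty."""
    dot = sum(w * v[g] for g, w in u.items() if g in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def drop_duplicates(texts, threshold=0.9):
    """Keep the first copy of each story; drop later ones that are too similar.

    The threshold is hypothetical; the README does not state the value used.
    """
    vectors = tfidf_vectors(texts)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [texts[i] for i in kept]
```

Called on a list of story strings, `drop_duplicates` returns the list with later near-verbatim copies removed. Because trigrams are sensitive to word order and wording, paraphrased retellings score low and are kept, which matches the README's "without any paraphrashing" wording.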