シサム語による説明
アイヌ語と日本語の双方向機械翻訳モデルです。 民話や叙事詩のコーパスばかり用いたので、それらに出てきそうな単語ばかり使っているなら、翻訳できます。
何かあれば、so-miyagawa at ninjal.ac.jpまでご連絡ください。
まあまあ良いSacreBLEUスコアです。chrFスコアなども計ってみたいです。詳しい人教えてください。
論文はこちらです。
So Miyagawa. 2023. Machine Translation for Highly Low-Resource Language: A Case Study of Ainu, a Critically Endangered Indigenous Language in Northern Japan. In Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages, pages 120–124, Tokyo, Japan. Association for Computational Linguistics. https://aclanthology.org/2023.nlp4dh-1.14/
右のInference APIで日本語やアイヌ語を入力して試してみてください。もちろん、完璧ではないので、初・中級者の方は結果はそのまま使わず、必ず専門家やアイヌ語上級者に見てもらってください。
Description in English
This is a bidirectional machine translation model between Ainu and Japanese. It was trained only on a corpus of folk tales and epic poems, so it works best on input whose vocabulary is likely to appear in those genres.
Please contact me at so-miyagawa at ninjal.ac.jp if you have any questions.
The SacreBLEU score is fairly good. I would also like to measure chrF and other metrics; if you are familiar with them, please let me know.
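For readers unfamiliar with chrF, the following is a minimal sketch of the idea: a character n-gram F-score computed for a single hypothesis/reference pair. It is a simplified illustration and not the SacreBLEU implementation; the function names, the default orders 1 to 6, and `beta = 2` are assumptions made for this example, and the sample strings are placeholders, not output of this model. Scores for reporting should come from an established tool such as SacreBLEU itself.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n after removing whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str,
         max_order: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF on a 0-100 scale (illustrative only)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side has fewer than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)  # average precision over orders
    r = sum(recalls) / len(recalls)        # average recall over orders
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# Placeholder strings for illustration only
print(chrf("irankarapte", "irankarapte"))
print(chrf("irankarapte", "iyairaykere"))
```

Because chrF works on characters rather than word tokens, it does not depend on a tokenizer, which can make it a useful complement to BLEU for short outputs like those reported under Validation Metrics.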
Here is the paper.
Miyagawa, So. 2023. Machine Translation for Highly Low-Resource Language: A Case Study of Ainu, a Critically Endangered Indigenous Language in Northern Japan. In Proceedings of the Joint 3rd International Conference on Natural Language Processing for Digital Humanities and 8th International Workshop on Computational Linguistics for Uralic Languages, pages 120–124, Tokyo, Japan. Association for Computational Linguistics. https://aclanthology.org/2023.nlp4dh-1.14/
You can try the model by entering Japanese or Ainu text in the Inference API widget on the right. The output is not perfect, so beginner and intermediate learners should not use the results as they are; always have an expert or an advanced Ainu speaker review them.
Validation Metrics
- Loss: 1.216
- SacreBLEU: 29.910
- Gen len: 10.022