File size: 2,638 Bytes
5daf0c8
 
 
0a496d5
 
 
 
 
 
 
 
 
 
82c1063
5daf0c8
0a496d5
 
cc15887
0a496d5
cc15887
0a496d5
 
 
5daf0c8
cc15887
 
 
 
 
1240a55
cc15887
 
 
 
1240a55
 
 
49ab1c5
b8f325d
49ab1c5
 
b8f325d
82c1063
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
---
base_model: vilm/vinallama-7b-chat
library_name: peft
license: llama2
datasets:
- nluai/dataset_dhnl_qna_v2
language:
- vi
tags:
- vietnamese
- academic
- regulations
- nlu
pipeline_tag: text-generation
---
# chatbot-dhnl-v3
## Introduction 🎉

<!-- <p style="text-align: justify;">
Large Language Models (LLMs) are increasingly demonstrating their importance in addressing complex natural language processing tasks. However, they are still limited in generating text related to personalized datasets. Building a support system for answering questions in Vietnamese for students at Nong Lam University, Ho Chi Minh City, based on academic regulations, is a critical and practical task.
This study focuses on researching methods for preprocessing Vietnamese data and fine-tuning Large Language Models (LLMs) to align with the specific language characteristics and content of the university's academic regulations.
Additionally, the research team has constructed a dataset of the university's academic regulations and developed a Vietnamese text generation service to answer questions related to this dataset, which has been integrated into a chat website utilizing this service.
</p> -->

Large Language Models (LLMs) are increasingly demonstrating their importance in addressing complex natural language processing tasks. However, they are still limited in generating text related to personalized datasets. Building a support system for answering questions in Vietnamese for students at Nong Lam University, Ho Chi Minh City, based on academic regulations, is a critical and practical task.
This study focuses on researching methods for preprocessing Vietnamese data and fine-tuning Large Language Models (LLMs) to align with the specific language characteristics and content of the university's academic regulations.
Additionally, the research team has constructed a dataset of the university's academic regulations and developed a Vietnamese text generation service to answer questions related to this dataset, which has been integrated into a chat website utilizing this service.

- **Developed by:**
  - [Nguyễn Đăng Phước](https://www.linkedin.com/in/phuoc-nguyen-dang/)
  - Vũ Ngọc Thanh Trúc
 
- **Model type:** Multimodal Transformer with over 7B parameters
- **Languages (NLP):** Primarily Vietnamese with multilingual capabilities
- **Fine-tuned from:** [nluai/dataset_dhnl_qna_v2](hhttps://huggingface.co./datasets/nluai/dataset_dhnl_qna_v2)

## Examples 🧩
<div align="left">
  <img src="assets/demo_chat1.png" width="2000"/>
</div>
<div align="left">
  <img src="assets/demo_chat2.png" width="2000"/>
</div>