Post
"A Closer Look at the Limitations of Instruction Tuning" is a new paper that explores the efficacy and limitations of Instruction Tuning (IT) in Large Language Models (LLMs) for conversational agents. The authors conduct a series of experiments using both LoRA fine-tuning (LFT) and standard full-parameter fine-tuning (SFT) across various LLMs and IT datasets.
The key findings are:
* LoRA fine-tuning (LFT) preserves the pre-training token distribution while SFT doesn't. This indicates that, even after LoRA fine-tuning, the model still relies heavily on its pre-training knowledge and acquires little new information (see the sketch after this list).
* Dataset scaling is ineffective for LFT - experiments show that scaling the dataset size 52x or even 326x doesn't improve performance.
* LoRA fine-tuning mainly enhances response initiation and style without substantial knowledge enhancement.
* Full-parameter fine-tuning tends to degrade the LLM's knowledge base and increase hallucination occurrences.
* Other popular methods and adjustments fail to significantly outperform simple LoRA fine-tuned models in terms of conversational quality and accuracy.
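To make the token-distribution point concrete, here is a minimal sketch of how one could compare the next-token distributions of a base model and a fine-tuned checkpoint on the same prompt, in the spirit of the paper's analysis. The model names and the prompt are placeholders, not the authors' exact setup.

```python
# Hedged sketch: measure how far a fine-tuned model's next-token distribution
# drifts from the base model's. Small KL values suggest the tuned model stays
# close to the pre-training distribution (as the paper reports for LFT).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-2-7b-hf"            # assumed base checkpoint
TUNED = "your-org/llama-2-7b-lora-merged"    # hypothetical fine-tuned checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
tuned = AutoModelForCausalLM.from_pretrained(TUNED, torch_dtype=torch.float16)

prompt = "Explain why the sky is blue."
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Log-probabilities over the vocabulary at the last position of the prompt.
    p = F.log_softmax(base(**inputs).logits[0, -1].float(), dim=-1)   # base model
    q = F.log_softmax(tuned(**inputs).logits[0, -1].float(), dim=-1)  # tuned model

# KL(base || tuned) at the next-token position.
kl = F.kl_div(q, p, log_target=True, reduction="sum")
print(f"KL(base || tuned): {kl.item():.4f}")
```

Running this for an LFT checkpoint versus a full-parameter SFT checkpoint would be one simple way to see the distribution shift the paper describes.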
Congrats to the authors @Sreyan88 and others for their work!
Paper: A Closer Look at the Limitations of Instruction Tuning (2402.05119)