Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
davidberenstein1957 
posted an update Jul 25
Post
1409
The Meta Llama-3.1 model series can be used for distilling and fine-tuning but this requires annotated preference data so I created a Human Feedback Collector based on Gradio that directly logs data to the Hugging Face Hub.

- Model meta-llama/Meta-Llama-3.1-8B-Instruct
- Data SFT, KTO and DPO data
- Runs on free Zero GPUs in Hugging Face Spaces
- Might need some human curation in Argilla
- Or provide some AI feedback with distilabel

https://huggingface.co./collections/davidberenstein1957/chatinterface-llm-human-feedback-collectors-66a22859c9e703d2af7500c1