Oliver Guhr

oliverguhr

AI & ML interests

Voice Interfaces, Robotics, Deep Learning

Recent Activity

liked a model 2 days ago
microsoft/phi-4
liked a model 21 days ago
ssary/XLM-RoBERTa-German-sentiment
liked a model about 2 months ago
openGPT-X/Teuken-7B-instruct-research-v0.4

Organizations

Impact Labs GmbH

oliverguhr's activity

reacted to MoritzLaurer's post with 🔥👍 4 months ago
Why would you fine-tune a model if you can just prompt an LLM? The new paper "What is the Role of Small Models in the LLM Era: A Survey" provides a nice pro/con overview. My go-to approach combines both:

1. Start testing an idea by prompting an LLM/VLM behind an API. It's fast and easy, and I avoid wasting time tuning a model on a task that might not make it into production anyway.

2. The LLM/VLM then needs to be manually validated. Anyone seriously considering putting AI into production has to do at least some manual validation. Setting up a good validation pipeline with a tool like Argilla is crucial and it can be reused for any future experiments. Note: you can use LLM-as-a-judge to automate some evals, but you always also need to validate the judge!

3. Based on this validation I can then either (a) just continue using the prompted LLM if it is accurate enough and it makes sense financially given my load; or (b) if the LLM is not accurate enough or too expensive to run in the long run, reuse the existing validation pipeline to annotate some additional data for fine-tuning a smaller model. This can be sped up by reusing and correcting synthetic data from the LLM (or by pure distillation).
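
The decision in step 3 can be sketched in a few lines of Python. This is only an illustration, not code from the post or from Argilla: the `ValidatedExample` record, the function names, the accuracy threshold and the cost figures are all hypothetical placeholders for whatever your own validation pipeline exports.

```python
from dataclasses import dataclass


@dataclass
class ValidatedExample:
    """One manually validated item (hypothetical export format)."""
    text: str
    llm_label: str    # label produced by the prompted LLM
    human_label: str  # label assigned during manual validation


def agreement(predicted: list[str], reference: list[str]) -> float:
    """Fraction of positions where two label lists agree.

    The same helper works for checking an LLM-as-a-judge against
    human labels, since the judge needs validating too.
    """
    if not reference or len(predicted) != len(reference):
        raise ValueError("label lists must be non-empty and equally long")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def next_step(
    examples: list[ValidatedExample],
    min_accuracy: float,      # assumed quality bar for production
    monthly_llm_cost: float,  # assumed API cost at the expected load
    monthly_budget: float,    # assumed budget for this feature
) -> str:
    """Return option (a) or (b) from step 3, based on accuracy and cost."""
    acc = agreement(
        [e.llm_label for e in examples],
        [e.human_label for e in examples],
    )
    if acc >= min_accuracy and monthly_llm_cost <= monthly_budget:
        return "keep_prompted_llm"
    return "fine_tune_small_model"


def fine_tuning_seed(examples: list[ValidatedExample]) -> list[tuple[str, str]]:
    """Reuse the validation work: human-corrected labels become training pairs."""
    return [(e.text, e.human_label) for e in examples]


# Toy data: the LLM is right on three of four validated items.
validated = [
    ValidatedExample("great product", "positive", "positive"),
    ValidatedExample("never again", "negative", "negative"),
    ValidatedExample("it was okay", "positive", "neutral"),
    ValidatedExample("love it", "positive", "positive"),
]
decision = next_step(validated, min_accuracy=0.9,
                     monthly_llm_cost=200.0, monthly_budget=500.0)
```

With these toy numbers the LLM falls below the assumed accuracy bar, so `decision` points to fine-tuning, and `fine_tuning_seed(validated)` yields the corrected pairs to start from.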

Paper: https://arxiv.org/pdf/2409.06857
Argilla docs: https://docs.argilla.io/latest/
Argilla is also very easy to deploy with Hugging Face Spaces (or locally): https://huggingface.co/new-space?template=argilla%2Fargilla-template-space