Make arena great again
#4, opened by recoilme
I really appreciate you spending time on this service, thank you!
But it's quite difficult to enjoy comparing models in the arena because it's boring. There are lots of very bad, very old models, and new models don't get enough samples. Sampling looks random? Most of the prompts are monotonous and boring.
Some suggestions:
- Use any well-known algorithm for the exploration-exploitation dilemma, for example https://en.wikipedia.org/wiki/Thompson_sampling or UCB1
- Use/add less boring prompts (there are tons of datasets on HF)
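
To make the first suggestion concrete, here is a rough sketch of Thompson sampling for picking which two models to show. It is not how the arena works today, just an illustration: the class name `ThompsonArena`, its methods, and the choice to treat each vote as a win for one model and a loss for the other are all my assumptions. A model with few votes has a wide posterior, so it still gets drawn often enough to collect samples, while models that keep losing fade out.

```python
import random


class ThompsonArena:
    """Beta-Bernoulli Thompson sampling over arena models (hypothetical sketch)."""

    def __init__(self, model_names, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior: every model starts with one pseudo-win and one pseudo-loss.
        self.wins = {name: 1 for name in model_names}
        self.losses = {name: 1 for name in model_names}

    def pick_pair(self):
        # Draw one plausible win rate per model from its posterior,
        # then show the two models with the highest draws.
        draws = {
            name: self.rng.betavariate(self.wins[name], self.losses[name])
            for name in self.wins
        }
        first, second = sorted(draws, key=draws.get, reverse=True)[:2]
        return first, second

    def record_vote(self, winner, loser):
        # Assumption: a user vote counts as a win for one model and a loss for the other.
        self.wins[winner] += 1
        self.losses[loser] += 1


if __name__ == "__main__":
    arena = ThompsonArena(["model_a", "model_b", "model_c"], seed=0)
    left, right = arena.pick_pair()
    arena.record_vote(winner=left, loser=right)
    print(left, right, arena.wins, arena.losses)
```

UCB1 would slot into `pick_pair` the same way, with a deterministic score instead of a random draw.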
Feel free to ask if you need details on how to implement A/B tests.
Also, please clear the colorfulxl cache. I updated the model on HF at the same location, sorry about that:
https://huggingface.co/recoilme/colorfulxl
@isidentical any news?