One user.. dozens of evals.. unintentional abuse?
A user should have at most 1 running-eval slot, regardless of the length of the evaluation queue.. I understand this is not intentional abuse.. but it does seem unreasonable for 1 user to push 30 untested evals and clog the entire pipeline..
I personally disagree with using the HF eval pipeline to push garbage for evaluation.. I prefer not to submit anything for eval that I haven't tested locally first.. it feels more civic..
@clefourrier ^^ IMHO tbh.. 1 user, 1 run at a time: many queues, but just one slot consumed per HF user handle?
I feel the same way. Currently, FINGU's LLM models are a bottleneck for evaluation.
I appreciate that you stepped forward and approached the problem in a civic manner. I strongly believe this is not intentional abuse, it's just how things are.. a reasonable QoS/fair-queueing policy could be put in place to "share is care" the evaluation pipeline that HF sponsors.
Hi!
Yep, the voting system is supposed to partially mitigate this, but there also used to be a "spam blocking" system (a maximum number of submissions per org/user within a time frame) which I think we forgot to port from the original leaderboard - it's indeed an issue, thanks for flagging.
(cc @tfrere)
@clefourrier If I may suggest a better queuing system:
- Users can vote on only one model for each author
- The system should process in this order:
  - user1/most_voted
  - user2/most_voted
  - user3/most_voted
  - user1/2nd_most_voted
  - user2/2nd_most_voted
  - ... and so on
You may also want to allow users to choose which of their models is evaluated first (e.g. user1 wants 3rd_most_voted to be done first regardless of the votes). I don't know how that affects the queueing, though. A rough sketch of the interleaving idea is below.
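To make the suggested ordering concrete, here is a minimal sketch of the interleaving: each user's pending submissions are sorted by votes, then the scheduler round-robins across users so one user can only take one "turn" per pass. The names (`Submission`, `build_schedule`) are purely illustrative and not part of the leaderboard's actual code.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Iterable


@dataclass
class Submission:
    user: str
    model: str
    votes: int


def build_schedule(pending: Iterable[Submission]) -> list:
    """Interleave submissions: user1/most_voted, user2/most_voted, ...,
    then user1/2nd_most_voted, user2/2nd_most_voted, and so on."""
    per_user = {}
    for sub in pending:
        per_user.setdefault(sub.user, []).append(sub)
    # Within each user, highest-voted first.
    for subs in per_user.values():
        subs.sort(key=lambda s: s.votes, reverse=True)
    # Round-robin across users, skipping users who have run out of submissions.
    schedule = []
    for round_ in zip_longest(*per_user.values()):
        schedule.extend(s for s in round_ if s is not None)
    return schedule


if __name__ == "__main__":
    queue = [
        Submission("user1", "model-a", votes=5),
        Submission("user1", "model-b", votes=2),
        Submission("user2", "model-c", votes=9),
        Submission("user3", "model-d", votes=1),
    ]
    for sub in build_schedule(queue):
        print(f"{sub.user}/{sub.model}")
    # -> user1/model-a, user2/model-c, user3/model-d, user1/model-b
```

This keeps one slot per user per pass while still respecting each user's own vote ordering; letting a user pin a preferred model would just mean reordering their per-user list before the round-robin step.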
I just added a system that prevents a user/org from submitting more than 10 models per week. I’m closing the ticket for now—feel free to reopen it if this solution isn’t sufficient!
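For reference, a minimal sketch of how such a weekly cap could work, assuming submissions are timestamped per user/org; the 10-per-week limit matches the number above, but the function and in-memory storage here are purely illustrative, not the actual implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from typing import Optional

WEEKLY_LIMIT = 10
_submission_log = defaultdict(list)  # user/org -> list of submission timestamps


def can_submit(user_or_org: str, now: Optional[datetime] = None) -> bool:
    """Return True (and record the submission) if the user/org has submitted
    fewer than WEEKLY_LIMIT models in the past 7 days."""
    now = now or datetime.now(timezone.utc)
    window_start = now - timedelta(days=7)
    # Keep only submissions inside the rolling 7-day window.
    recent = [t for t in _submission_log[user_or_org] if t >= window_start]
    _submission_log[user_or_org] = recent
    if len(recent) >= WEEKLY_LIMIT:
        return False
    _submission_log[user_or_org].append(now)
    return True
```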