ScandEval benchmarking error

#1
by sarthu - opened

I tried benchmarking the model on the Dutch-language benchmarks using the ScandEval library, and I got the following error:
Benchmarking library: https://github.com/ScandEval/ScandEval
Screenshot 2024-10-03 at 14.38.42.png

Language Technologies Unit @ Barcelona Supercomputing Center org

Hello,

I'm trying to work out exactly what the problem is, so I need to reproduce the error. Which command are you running exactly? I tried running

 scandeval -m path/to/salamandra-2b-instruct/

But I get the following error:

Traceback (most recent call last):
  File "/venv/bin/scandeval", line 8, in <module>
    sys.exit(benchmark())
  File "python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "venv/lib/python3.11/site-packages/scandeval/cli.py", line 342, in benchmark
    benchmarker(model=models)
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 781, in __call__
    return self.benchmark(*args, **kwargs)
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 597, in benchmark
    benchmark_output = self._benchmark_single(
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 730, in _benchmark_single
    dataset = dataset_factory.build_dataset(dataset_config)
  File "venv/lib/python3.11/site-packages/scandeval/dataset_factory.py", line 57, in build_dataset
    raise ValueError(
ValueError: Could not find a benchmark class for any of the following potential names: swerec, sentiment-classification, sequence-classification.

The command I used was

scandeval --model <model-id> --language nl

Also, the error happens in the outlines library, but the odd part is that all other models are fine; it only happens with this model.

Language Technologies Unit @ Barcelona Supercomputing Center org

Hello again,

I'm trying this, but now I get a similar error in the same place:

ValueError: Could not find a benchmark class for any of the following potential names: dutch-social, sentiment-classification, sequence-classification.

Do you know what could be the problem?

Meanwhile, I found this discussion on a similar issue:
https://github.com/dottxt-ai/outlines/issues/820

Can you try doing what the solution there suggests on your end and see what happens?

Which version of scandeval are you using?
Mine is 13.0.0.

Language Technologies Unit @ Barcelona Supercomputing Center org

I'm using 13.0.0 as well.

Language Technologies Unit @ Barcelona Supercomputing Center org

Hi, I seem to have fixed the import error manually. I'm running the tests now.

Language Technologies Unit @ Barcelona Supercomputing Center org

Hello, sorry for the delay.

I have experienced the same error on the conll-nl benchmark.

Please try running

pip install outlines==0.0.36

before executing the evaluation. I have done so and the evaluation seems to be running fine.
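
To confirm that the pin actually took effect in the environment you run the evaluation from, something like the following may help. It is only a minimal sketch using the standard library's `importlib.metadata`; the helper names `installed_version` and `matches_pin` are made up for illustration, and the version numbers are simply the ones mentioned in this thread (outlines 0.0.36, scandeval 13.0.0).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """Return True only if `package` is installed at exactly the `pinned` version."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    # Versions taken from this thread; adjust them if your setup differs.
    pins = {"outlines": "0.0.36", "scandeval": "13.0.0"}
    for name, pinned in pins.items():
        found = installed_version(name)
        status = "OK" if matches_pin(name, pinned) else "MISMATCH"
        print(f"{name}: installed={found}, expected={pinned} -> {status}")
```

If `outlines` shows a mismatch, the `pip install` above probably ran against a different environment than the one `scandeval` is launched from.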

Thanks a lot, it will be amazing to put the results on the benchmark table. Nice work!

Language Technologies Unit @ Barcelona Supercomputing Center org

You're welcome. I'm closing this discussion for now. Good luck!

ferran-espuna changed discussion status to closed
