How to download the table as a CSV file?
#18, opened by zhiminy
The given Gradio sample code does not work...
from gradio_client import Client

client = Client("https://optimum-llm-perf-leaderboard.hf.space/")
result = client.predict(
    "Howdy!",     # str in 'Model' Textbox component
    ["pytorch"],  # List[str] in 'Backends' Checkboxgroup component
    ["float32"],  # List[str] in 'Load Dtypes' Checkboxgroup component
    ["None"],     # List[str] in 'Optimizations' Checkboxgroup component
    ["None"],     # List[str] in 'Quantizations' Checkboxgroup component
    0,            # int | float (numeric value between 0 and 100) in 'Open LLM Score' Slider component
    0,            # int | float (numeric value between 0 and 81920) in 'Peak Memory (MB)' Slider component
    "Howdy!",     # str in 'parameter_4' Textbox component
    fn_index=0,
)
print(result)
+1, I would also love to get a way to download the benchmark data programmatically.
I just made both the dataset and the backend (the workflow that populates it) public:
dataset: https://huggingface.co/datasets/optimum/llm-perf-dataset
backend: https://github.com/IlyasMoutawwakil/llm-perf-backend
This is super useful, many thanks @IlyasMoutawwakil!
Does the following code work for you, @IlyasMoutawwakil?
from datasets import load_dataset
dataset = load_dataset("optimum/llm-perf-dataset")
I got the following error trying it out:
ValueError: Couldn't cast
forward.latency(s): double
forward.throughput(samples/s): double
forward.peak_memory(MB): int64
forward.max_memory_used(MB): int64
forward.max_memory_allocated(MB): int64
forward.max_memory_reserved(MB): int64
forward.energy_consumption(kWh/sample): double
forward.carbon_emissions(kgCO2eq/sample): double
generate.latency(s): double
generate.throughput(tokens/s): double
generate.peak_memory(MB): int64
generate.max_memory_used(MB): int64
generate.max_memory_allocated(MB): int64
generate.max_memory_reserved(MB): int64
generate.energy_consumption(kWh/token): double
generate.carbon_emissions(kgCO2eq/token): double
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, "' + 2804
to
{'timestamp': Value(dtype='string', id=None), 'project_name': Value(dtype='string', id=None), 'run_id': Value(dtype='string', id=None), 'duration': Value(dtype='float64', id=None), 'emissions': Value(dtype='float64', id=None), 'emissions_rate': Value(dtype='float64', id=None), 'cpu_power': Value(dtype='float64', id=None), 'gpu_power': Value(dtype='float64', id=None), 'ram_power': Value(dtype='float64', id=None), 'cpu_energy': Value(dtype='float64', id=None), 'gpu_energy': Value(dtype='float64', id=None), 'ram_energy': Value(dtype='float64', id=None), 'energy_consumed': Value(dtype='float64', id=None), 'country_name': Value(dtype='string', id=None), 'country_iso_code': Value(dtype='string', id=None), 'region': Value(dtype='string', id=None), 'cloud_provider': Value(dtype='float64', id=None), 'cloud_region': Value(dtype='float64', id=None), 'os': Value(dtype='string', id=None), 'python_version': Value(dtype='string', id=None), 'codecarbon_version': Value(dtype='string', id=None), 'cpu_count': Value(dtype='int64', id=None), 'cpu_model': Value(dtype='string', id=None), 'gpu_count': Value(dtype='int64', id=None), 'gpu_model': Value(dtype='string', id=None), 'longitude': Value(dtype='float64', id=None), 'latitude': Value(dtype='float64', id=None), 'ram_total_size': Value(dtype='float64', id=None), 'tracking_mode': Value(dtype='string', id=None), 'on_cloud': Value(dtype='string', id=None), 'pue': Value(dtype='float64', id=None)}
Let me know if I am accessing the dataset the wrong way or if I am doing something silly.
I suggest going directly for the CSV file in the dataset instead; there's a CSV in every machine folder.
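For anyone landing here later, here is a minimal sketch of that suggestion, assuming `huggingface_hub` and `pandas` are installed. The helper names (`select_csv_files`, `list_dataset_csvs`, `load_csv`) are made up for illustration, and the actual file names inside the repo are not assumed: they are discovered by listing the repo. The cast error above shows two different column sets, so it probably comes from `load_dataset` trying to merge CSVs with different schemas; loading one file at a time sidesteps that.

```python
from pathlib import PurePosixPath

# Dataset repo named earlier in this thread.
REPO_ID = "optimum/llm-perf-dataset"


def select_csv_files(repo_files):
    """Return only the CSV paths from a list of repo file paths, sorted."""
    return sorted(p for p in repo_files if PurePosixPath(p).suffix.lower() == ".csv")


def list_dataset_csvs(repo_id=REPO_ID):
    """List every CSV file in the dataset repo (needs network access)."""
    # Imported lazily so the pure helper above works without the dependency.
    from huggingface_hub import list_repo_files

    return select_csv_files(list_repo_files(repo_id, repo_type="dataset"))


def load_csv(filename, repo_id=REPO_ID):
    """Download a single CSV from the dataset repo and load it with pandas."""
    import pandas as pd
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=repo_id, filename=filename, repo_type="dataset")
    return pd.read_csv(local_path)


# Example usage (hits the network, so it is left commented out):
# csv_paths = list_dataset_csvs()
# print(csv_paths)               # inspect which machine folders and files exist
# df = load_csv(csv_paths[0])    # pick whichever CSV you actually need
# df.to_csv("llm_perf.csv", index=False)
```

If you would rather stay with `datasets`, passing a single file through the `data_files` argument of `load_dataset` should avoid mixing schemas as well, though I have not checked it against this repo.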
IlyasMoutawwakil changed discussion status to closed