hf_name,request_rate,num_prompts,throughput,avg_input,avg_output,output_throughput,avg_e2e_latency,hardware
espressor/google.gemma-2-2b-it_W8A8_int8,2.0,500,1.9640842181083509,405.77,925.756,588.8156571953091,1511.3084695767611,NVIDIA H100 NVL
espressor/google.gemma-2b-it_W8A8_int8,2.0,500,2.0172286349781228,436.904,880.084,573.7978817020318,1304.3176122009754,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.2-3B-Instruct_W8A8_FP8,2.0,500,2.120392686097345,454.602,838.156,574.03741996402,1444.7603891603649,NVIDIA H100 NVL
espressor/meta-llama.Llama-2-7b-chat-hf_W8A8_FP8,2.0,500,1.969808576028168,449.136,993.768,634.738238969637,2471.232087118551,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.2-1B-Instruct_W4A16,2.0,500,2.0329525020114785,454.602,837.122,549.6864549059602,930.8922665659338,NVIDIA H100 NVL
espressor/meta-llama.Meta-Llama-3-8B-Instruct_W4A16,2.0,500,1.8958854271808279,413.724,836.95,512.8511015769211,2179.943430237472,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.1-8B-Instruct_W4A16,2.0,500,2.0962791489826045,454.602,838.45,567.7084148787031,2233.4370345342904,NVIDIA H100 NVL
google/gemma-2-2b-it,2.0,500,2.0498850766804746,405.77,803.426,533.3325672335127,1667.373894015327,NVIDIA H100 NVL
espressor/google.gemma-2-2b-it_W4A16,2.0,500,2.0031635898325852,405.77,804.866,522.1108374009694,1711.2753931432962,NVIDIA H100 NVL
google/gemma-2b-it,2.0,500,2.0252192503159003,436.904,749.898,490.85580652016586,1366.5225361473858,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.2-1B-Instruct_W8A8_int8,2.0,500,2.0167589035048956,454.602,837.376,545.4733538699339,844.4496539887041,NVIDIA H100 NVL
espressor/google.gemma-2b-it_W8A8_FP8,2.0,500,2.068527934303381,436.904,713.428,476.9701832948262,1399.403179064393,NVIDIA H100 NVL
espressor/meta-llama.Llama-2-7b-chat-hf_W4A16,2.0,500,2.029198163689896,449.136,993.766,653.8742355179809,2769.8905616998672,NVIDIA H100 NVL
espressor/google.gemma-7b-it_W8A8_FP8,2.0,500,2.050673497214249,436.904,820.396,543.7505928961154,2592.220456339419,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.2-3B-Instruct_W4A16,2.0,500,1.9562196881490193,454.602,837.68,529.2913786720512,1604.609966976568,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.1-8B-Instruct_W8A8_int8,2.0,500,2.0689264000398224,454.602,838.528,560.3529445647908,2066.567000001669,NVIDIA H100 NVL
google/gemma-7b-it,2.0,500,1.9043993891283246,436.904,840.76,517.4992987729574,3303.9836273528636,NVIDIA H100 NVL
google/gemma-2-9b-it,2.0,500,1.9699641300032775,405.77,759.328,484.40703462018416,3724.076159996912,NVIDIA H100 NVL
espressor/meta-llama.Meta-Llama-3-8B-Instruct_W8A8_FP8,2.0,500,2.007851595825549,413.724,836.868,543.0855686151698,2019.4484647363424,NVIDIA H100 NVL
espressor/meta-llama.Llama-2-7b-chat-hf_W8A8_int8,2.0,500,1.9710735996556337,449.136,993.388,634.9030029230579,2437.2015884146094,NVIDIA H100 NVL
espressor/google.gemma-2-9b-it_W8A8_FP8,2.0,500,1.9974011339359488,405.77,752.778,486.91697888666954,3197.9047160129994,NVIDIA H100 NVL
espressor/google.gemma-2-2b-it_W8A8_FP8,2.0,500,1.9571385134002302,405.77,801.082,507.71646197917204,1492.8587670437992,NVIDIA H100 NVL
espressor/google.gemma-2b-it_W4A16,2.0,500,1.9964711404938675,436.904,719.24,464.1053339007141,1273.8769329153001,NVIDIA H100 NVL
espressor/meta-llama.Meta-Llama-3-8B-Instruct_W8A8_int8,2.0,500,1.939103885164283,413.724,836.716,524.3953608206588,2055.081178434193,NVIDIA H100 NVL
espressor/meta-llama.Llama-3.1-8B-Instruct_W8A8_FP8,2.0,500,1.8945874597227732,454.602,838.55,513.1480343509469,2013.8781589921564,NVIDIA H100 NVL
meta-llama/Llama-3.2-1B-Instruct,2.0,500,1.9664043731682552,454.602,837.212,531.7497861979784,899.324984755367,NVIDIA H100 NVL