Minseok Bae committed
Commit 818ee3d (1 parent: 5bcc476)

modified about.py

Files changed (1): src/display/about.py (+6 -6)
src/display/about.py CHANGED
@@ -19,7 +19,7 @@ class Tasks(Enum):
 
 
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">Hughes Hallucination Evaluation (H2EM) Model leaderboard</h1>"""
+TITLE = """<h1 align="center" id="space-title">Hughes Hallucination Evaluation Model (HHEM) leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
@@ -32,15 +32,15 @@ This leaderboard evaluates how often an LLM introduces hallucinations when summa
 LLM_BENCHMARKS_TEXT = """
 ## Introduction
 
-The Hughes Hallucination Evaluation Model (H2EM) Leaderboard is dedicated to assessing the frequency of hallucinations in document summaries generated by Large Language Models (LLMs).
+The Hughes Hallucination Evaluation Model (HHEM) Leaderboard is dedicated to assessing the frequency of hallucinations in document summaries generated by Large Language Models (LLMs).
 
 Hallucinations refer to instances where a model introduces factually incorrect or unrelated content in its summaries.
 
 ## How it works
 
-Using Vectara's H2EM, we measure the occurrence of hallucinations in generated summaries.
-Given a source document and a summary generated by an LLM, H2EM outputs a hallucination score between 0 and 1, with 0 indicating complete hallucination and 1 representing perfect factual consistency.
-The model card for H2EM can be found [here](https://huggingface.co/vectara/hallucination_evaluation_model).
+Using Vectara's HHEM, we measure the occurrence of hallucinations in generated summaries.
+Given a source document and a summary generated by an LLM, HHEM outputs a hallucination score between 0 and 1, with 0 indicating complete hallucination and 1 representing perfect factual consistency.
+The model card for HHEM can be found [here](https://huggingface.co/vectara/hallucination_evaluation_model).
 
 ## Evaluation Dataset
 
@@ -54,7 +54,7 @@ We generate summaries for each of these documents using submitted LLMs and compu
 - Average Summary Length: The average word count of generated summaries
 
 ## Note on non-Hugging Face models
-On H2EM leaderboard, There are currently models such as GPT variants that are not available on the Hugging Face model hub. We ran the evaluations for these models on our own and uploaded the results to the leaderboard.
+On HHEM leaderboard, There are currently models such as GPT variants that are not available on the Hugging Face model hub. We ran the evaluations for these models on our own and uploaded the results to the leaderboard.
 If you would like to submit your model that is not available on the Hugging Face model hub, please contact us at [email protected].
 
 ## Model Submissions and Reproducibility
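
For context on the scoring step described in the "How it works" text above, here is a minimal sketch of how a (source document, summary) pair could be scored with HHEM. It assumes the cross-encoder usage described on the linked model card; the loading interface and the example texts are illustrative assumptions, not part of this commit.

```python
# Minimal sketch (assumption): scoring a (source document, summary) pair with HHEM.
# The CrossEncoder loading path follows the usage described on the model card
# (https://huggingface.co/vectara/hallucination_evaluation_model); the exact
# interface may differ between HHEM releases.
from sentence_transformers import CrossEncoder

# Load the hallucination evaluation model from the Hugging Face Hub.
model = CrossEncoder("vectara/hallucination_evaluation_model")

# Hypothetical source document and an LLM-generated summary of it.
source_document = "A man walks into a bar and buys a drink before leaving."
summary = "A man walks into a bar, buys a drink, and starts a fight."

# predict() returns one score per (document, summary) pair: values near 1 indicate
# factual consistency, values near 0 indicate hallucinated content.
score = model.predict([[source_document, summary]])[0]
print(f"HHEM factual consistency score: {score:.3f}")
```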