Minseok Bae committed
Commit 818ee3d (1 parent: 5bcc476)

modified about.py

Files changed (1): src/display/about.py (+6 -6)
src/display/about.py CHANGED
@@ -19,7 +19,7 @@ class Tasks(Enum):
 
 
 # Your leaderboard name
-TITLE = """<h1 align="center" id="space-title">Hughes Hallucination Evaluation (H2EM) Model leaderboard</h1>"""
+TITLE = """<h1 align="center" id="space-title">Hughes Hallucination Evaluation Model (HHEM) leaderboard</h1>"""
 
 # What does your leaderboard evaluate?
 INTRODUCTION_TEXT = """
@@ -32,15 +32,15 @@ This leaderboard evaluates how often an LLM introduces hallucinations when summa
 LLM_BENCHMARKS_TEXT = """
 ## Introduction
 
-The Hughes Hallucination Evaluation Model (H2EM) Leaderboard is dedicated to assessing the frequency of hallucinations in document summaries generated by Large Language Models (LLMs).
+The Hughes Hallucination Evaluation Model (HHEM) Leaderboard is dedicated to assessing the frequency of hallucinations in document summaries generated by Large Language Models (LLMs).
 
 Hallucinations refer to instances where a model introduces factually incorrect or unrelated content in its summaries.
 
 ## How it works
 
-Using Vectara's H2EM, we measure the occurrence of hallucinations in generated summaries.
-Given a source document and a summary generated by an LLM, H2EM outputs a hallucination score between 0 and 1, with 0 indicating complete hallucination and 1 representing perfect factual consistency.
-The model card for H2EM can be found [here](https://huggingface.co/vectara/hallucination_evaluation_model).
+Using Vectara's HHEM, we measure the occurrence of hallucinations in generated summaries.
+Given a source document and a summary generated by an LLM, HHEM outputs a hallucination score between 0 and 1, with 0 indicating complete hallucination and 1 representing perfect factual consistency.
+The model card for HHEM can be found [here](https://huggingface.co/vectara/hallucination_evaluation_model).
 
 ## Evaluation Dataset
 
@@ -54,7 +54,7 @@ We generate summaries for each of these documents using submitted LLMs and compu
 - Average Summary Length: The average word count of generated summaries
 
 ## Note on non-Hugging Face models
-On H2EM leaderboard, There are currently models such as GPT variants that are not available on the Hugging Face model hub. We ran the evaluations for these models on our own and uploaded the results to the leaderboard.
+On HHEM leaderboard, There are currently models such as GPT variants that are not available on the Hugging Face model hub. We ran the evaluations for these models on our own and uploaded the results to the leaderboard.
 If you would like to submit your model that is not available on the Hugging Face model hub, please contact us at [email protected].
 
 ## Model Submissions and Reproducibility
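
For context on the scoring step described in the "How it works" text above, here is a minimal sketch of how a (source document, summary) pair could be scored with HHEM. It assumes the cross-encoder usage described on the linked model card; the loading interface and the example texts are illustrative assumptions, not part of this commit.

```python
# Minimal sketch (assumption): scoring a (source document, summary) pair with HHEM.
# The CrossEncoder loading path follows the usage described on the model card
# (https://huggingface.co/vectara/hallucination_evaluation_model); the exact
# interface may differ between HHEM releases.
from sentence_transformers import CrossEncoder

# Load the hallucination evaluation model from the Hugging Face Hub.
model = CrossEncoder("vectara/hallucination_evaluation_model")

# Hypothetical source document and an LLM-generated summary of it.
source_document = "A man walks into a bar and buys a drink before leaving."
summary = "A man walks into a bar, buys a drink, and starts a fight."

# predict() returns one score per (document, summary) pair: values near 1 indicate
# factual consistency, values near 0 indicate hallucinated content.
score = model.predict([[source_document, summary]])[0]
print(f"HHEM factual consistency score: {score:.3f}")
```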