Spaces: m42-health/MEDIC-Benchmark

tathagataraha committed on Jan 14

Commit 23fd02c (1 parent: ba515db)

[MODIFY] Column descriptions for the cross examination framework

Files changed (1)

src/about.py +10 -1

@@ -58,7 +58,7 @@ class MedSafetyColumns(Enum):
     med_safety_column7 = MedSafetyColumn("Physician's Freedom of Choice", "score", "Physician's Freedom of Choice")
     med_safety_column8 = MedSafetyColumn("Professionalism and Honesty", "score", "Professionalism and Honesty")
     med_safety_column9 = MedSafetyColumn("Responsibility to Patient", "score", "Responsibility to Patient")
-    med_safety_column10 = MedSafetyColumn("Law and Responsibility to Society", "score", "Law and Responsibility to Society")
+    med_safety_column8 = MedSafetyColumn("Law and Responsibility to Society", "score", "Law and Responsibility to Society")
 @dataclass
 class MedicalSummarizationColumn:
@@ -208,12 +208,21 @@ Select this option if your model uses a chat template. The chat template will be
 Upon successful submission of your request, your model's result would be updated on the leaderboard within 5 working days!
 """
+NOTE_GENERATION_METRICS = """
+- **Coverage**: Measures how thoroughly the summary covers the original document. A higher score means the summary includes more details from the original.
+- **Conformity**: Also called the non-contradiction score, this checks if the summary avoids contradicting the original document. A higher score means the summary aligns better with the original.
+- **Consistency**: Measures the level of non-hallucination, or how much the summary sticks to the facts in the document. A higher score means the summary is more factual and accurate.
+- **Overall Score**: The average of the above three scores.
+"""
 CROSS_EVALUATION_METRICS = """
 - **Coverage**: Measures how thoroughly the summary covers the original document. A higher score means the summary includes more details from the original.
 - **Conformity**: Also called the non-contradiction score, this checks if the summary avoids contradicting the original document. A higher score means the summary aligns better with the original.
 - **Consistency**: Measures the level of non-hallucination, or how much the summary sticks to the facts in the document. A higher score means the summary is more factual and accurate.
 - **Conciseness**: Measures how brief the summary is. A higher score means the summary is more concise. A negative score means the summary is longer than the original document.
+- **Overall Score**: The average of coverage, conformity, consistency, and the harmonic mean of coverage and conciseness (if both are positive, otherwise 0).
 """
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
 CITATION_BUTTON_TEXT = r"""
 @misc{kanithi2024mediccomprehensiveframeworkevaluating,
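
The two "Overall Score" descriptions added in this commit can be read as the arithmetic below. This is only a sketch of one reading of the wording, not code from the repository: the function names `harmonic_mean_or_zero`, `note_generation_overall`, and `cross_examination_overall` are hypothetical, and it assumes the four-way average uses the harmonic-mean term in place of raw conciseness.

```python
def harmonic_mean_or_zero(a: float, b: float) -> float:
    """Harmonic mean of a and b if both are positive, otherwise 0."""
    if a > 0 and b > 0:
        return 2 * a * b / (a + b)
    return 0.0


def note_generation_overall(coverage: float, conformity: float,
                            consistency: float) -> float:
    """Average of the three note-generation scores (hypothetical helper)."""
    return (coverage + conformity + consistency) / 3


def cross_examination_overall(coverage: float, conformity: float,
                              consistency: float, conciseness: float) -> float:
    """Average of coverage, conformity, consistency, and the harmonic mean
    of coverage and conciseness (assumed reading of the description)."""
    blended = harmonic_mean_or_zero(coverage, conciseness)
    return (coverage + conformity + consistency + blended) / 4


if __name__ == "__main__":
    # A summary longer than its source has negative conciseness,
    # so the harmonic-mean term contributes 0 to the average.
    print(cross_examination_overall(0.8, 0.9, 0.7, -0.2))
    print(cross_examination_overall(0.8, 0.9, 0.7, 0.8))
    print(note_generation_overall(0.6, 0.9, 0.9))
```

Under this reading, a verbose summary is penalized once through the zeroed harmonic-mean term rather than by adding a negative number to the average.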