sunitha-ravi committed on
Commit
00de9f9
1 Parent(s): 7883729

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -98,8 +98,8 @@ The model was evaluated on [PatronusAI/HaluBench](https://huggingface.co/dataset
 
  | Model | HaluEval | RAGTruth | FinanceBench | DROP | CovidQA | PubmedQA | Overall
  | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
- | GPT-4o | <ins>87.9%</ins> | 84.3% | 85.3% | 84.3% | 95.0% | 82.1% | 86.5% |
- | GPT-4-Turbo | 86.0% | 85.0% | 82.2% | 84.8% | 90.6% | 83.5% | 85.0% |
+ | GPT-4o | <ins>87.9%</ins> | 84.3% | <ins>85.3%</ins> | 84.3% | 95.0% | 82.1% | <ins>86.5%</ins> |
+ | GPT-4-Turbo | 86.0% | <ins>85.0%</ins> | 82.2% | <ins>84.8%</ins> | 90.6% | 83.5% | 85.0% |
  | GPT-3.5-Turbo | 62.2% | 50.7% | 60.9% | 57.2% | 56.7% | 62.8% | 58.7% |
  | Claude-3.5-Sonnet | 84.5% | 79.1% | 69.3% | 69.7% | 70.8% |84.8% |83.7%|
  | RAGAS Faithfulness | 70.6% | 75.8% | 59.5% | 59.6% | 75.0% | 67.7% | 66.9% |
@@ -107,7 +107,7 @@ The model was evaluated on [PatronusAI/HaluBench](https://huggingface.co/dataset
  | Llama-3-Instruct-8B | 83.1% | 80.0% | 55.0% | 58.2% | 75.2% | 70.7% | 70.4% |
  | Llama-3-Instruct-70B | 87.0% | **83.8%** | 72.7% | 69.4% | 85.0% | 82.6% | 80.1% |
  | Lynx (8B) | 85.7% | 80.0% | 72.5% | **77.8%** | 96.3% | 85.2% | 82.9% |
- | Lynx v1.1 (8B) | **87.3%** | 79.9% | **75.6%** | 77.5% | **96.9%** | **88.9%** | **84.3%** |
+ | Lynx v1.1 (8B) | **87.3%** | 79.9% | **75.6%** | 77.5% | <ins>**96.9%**</ins> |<ins> **88.9%**</ins> | **84.3%** |
 
  ## Citation
  If you are using the model, cite using