Update README.md
--- a/README.md
+++ b/README.md
@@ -14,7 +14,11 @@ datasets:
 
 DSE-Phi3-Docmatix-V1 is a bi-encoder model designed to encode document screenshots into dense vectors for document retrieval. The Document Screenshot Embedding ([DSE](https://arxiv.org/abs/2406.11251)) approach captures documents in their original visual format, preserving all information such as text, images, and layout, thus avoiding tedious parsing and potential information loss.
 
-The model, `Tevatron/dse-phi3-docmatix-v1`, is trained using the `Tevatron/docmatix-ir` dataset, a variant of `HuggingFaceM4/Docmatix` specifically adapted for training PDF retrievers with Vision Language Models in open-domain question answering scenarios. For more information on dataset filtering and hard negative mining, refer to the [docmatix-ir](https://huggingface.co/datasets/Tevatron/docmatix-ir/blob/main/README.md) dataset page.
+The model, `Tevatron/dse-phi3-docmatix-v1`, is trained using 1/10 of the `Tevatron/docmatix-ir` dataset, a variant of `HuggingFaceM4/Docmatix` specifically adapted for training PDF retrievers with Vision Language Models in open-domain question answering scenarios. For more information on dataset filtering and hard negative mining, refer to the [docmatix-ir](https://huggingface.co/datasets/Tevatron/docmatix-ir/blob/main/README.md) dataset page.
+
+DSE has strong zero-shot effectiveness for document retrieval with both visual and text input.
+For example, DSE-Phi3-Docmatix-V1 achieves 74.1 nDCG@5 on the [ViDoRe](https://huggingface.co/spaces/vidore/vidore-leaderboard) leaderboard in a **zero-shot setting** (without fine-tuning on ViDoRe training data).
+
 
 ## How to Use the Model
 
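
The nDCG@5 figure quoted in the added README lines can be made concrete with a small sketch. This is not the leaderboard's evaluation code. It assumes the common linear-gain form of nDCG, that the ideal ranking is built from the same list of judged relevances, and that the reported 74.1 is the mean per-query score scaled to 0-100. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances: Sequence[float], k: int = 5) -> float:
    """nDCG@k for one query.

    ranked_relevances holds the relevance label of each retrieved document,
    in the order the retriever returned them. Assumption: every judged
    document for the query appears in this list, so sorting it gives the
    ideal ranking.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant document for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# One relevant page per query, retrieved at rank 1 and at rank 2.
per_query = [ndcg_at_k([1, 0, 0, 0, 0]), ndcg_at_k([0, 1, 0, 0, 0])]
mean_ndcg = 100 * sum(per_query) / len(per_query)
```

With a single relevant page per query, the score depends only on the rank at which that page is retrieved, which is why a relevant page at rank 2 scores lower than one at rank 1.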