Commit 58b17d0 by MrLight
1 parent: c6695ce

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -10,11 +10,11 @@ datasets:
  - HuggingFaceM4/Docmatix
  ---

- # DSE-Phi3-Docmatix-V1.0
+ # DSE-Phi3-Docmatix-V1

- DSE-Phi3-Docmatix-V1.0 is a bi-encoder model designed to encode document screenshots into dense vectors for document retrieval. The Document Screenshot Embedding ([DSE](https://arxiv.org/abs/2406.11251)) approach captures documents in their original visual format, preserving all information such as text, images, and layout, thus avoiding tedious parsing and potential information loss.
+ DSE-Phi3-Docmatix-V1 is a bi-encoder model designed to encode document screenshots into dense vectors for document retrieval. The Document Screenshot Embedding ([DSE](https://arxiv.org/abs/2406.11251)) approach captures documents in their original visual format, preserving all information such as text, images, and layout, thus avoiding tedious parsing and potential information loss.

- The model, `Tevatron/dse-phi3-docmatix-v1.0`, is trained using the `Tevatron/docmatix-ir` dataset, a variant of `HuggingFaceM4/Docmatix` specifically adapted for training PDF retrievers with Vision Language Models in open-domain question answering scenarios. For more information on dataset filtering and hard negative mining, refer to the [docmatix-ir](https://huggingface.co/datasets/Tevatron/docmatix-ir/blob/main/README.md) dataset page.
+ The model, `Tevatron/dse-phi3-docmatix-v1`, is trained using the `Tevatron/docmatix-ir` dataset, a variant of `HuggingFaceM4/Docmatix` specifically adapted for training PDF retrievers with Vision Language Models in open-domain question answering scenarios. For more information on dataset filtering and hard negative mining, refer to the [docmatix-ir](https://huggingface.co/datasets/Tevatron/docmatix-ir/blob/main/README.md) dataset page.

  ## How to Use the Model

@@ -24,8 +24,8 @@ The model, `Tevatron/dse-phi3-docmatix-v1.0`, is trained using the `Tevatron/doc
  import torch
  from transformers import AutoProcessor, AutoModelForCausalLM

- processor = AutoProcessor.from_pretrained('Tevatron/dse-phi3-docmatix-v1.0', trust_remote_code=True)
- model = AutoModelForCausalLM.from_pretrained('Tevatron/dse-phi3-docmatix-v1.0', trust_remote_code=True, attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16, use_cache=False).to('cuda:0')
+ processor = AutoProcessor.from_pretrained('Tevatron/dse-phi3-docmatix-v1', trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained('Tevatron/dse-phi3-docmatix-v1', trust_remote_code=True, attn_implementation="flash_attention_2", torch_dtype=torch.bfloat16, use_cache=False).to('cuda:0')

  def get_embedding(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
      sequence_lengths = attention_mask.sum(dim=1) - 1
@@ -53,8 +53,8 @@ import requests
  from io import BytesIO

  # URLs of the images
- url1 = "https://huggingface.co/Tevatron/dse-phi3-docmatix-v1.0/resolve/main/animal-llama.png"
- url2 = "https://huggingface.co/Tevatron/dse-phi3-docmatix-v1.0/resolve/main/meta-llama.png"
+ url1 = "https://huggingface.co/Tevatron/dse-phi3-docmatix-v1/resolve/main/animal-llama.png"
+ url2 = "https://huggingface.co/Tevatron/dse-phi3-docmatix-v1/resolve/main/meta-llama.png"

  # Download and open images
  response1 = requests.get(url1)
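
The diff shows only the signature and first line of `get_embedding`, so the rest of its body is not visible here. As a rough illustration of what that first line sets up, the sketch below does last-token pooling on a toy tensor. The function name `last_token_pool`, the L2 normalization step, and the assumption of right-padded batches are illustrative choices, not something this commit confirms.

```python
import torch
import torch.nn.functional as F


def last_token_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: take the hidden state of the last non-padded token."""
    # Same first step as in the diff: index of the last real token in each row.
    # This only points at the right position if padding is on the right.
    sequence_lengths = attention_mask.sum(dim=1) - 1
    batch_indices = torch.arange(last_hidden_state.shape[0], device=last_hidden_state.device)
    pooled = last_hidden_state[batch_indices, sequence_lengths]
    # Assumption: unit-normalize so that a dot product equals cosine similarity.
    return F.normalize(pooled, p=2, dim=-1)


# Toy batch standing in for model output: 2 sequences, 4 positions, hidden size 3.
hidden = torch.randn(2, 4, 3)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 1, 1]])
embeddings = last_token_pool(hidden, mask)
print(embeddings.shape)
```

With unit-length vectors, query-to-document scores can be computed as a plain matrix product of the two embedding batches. Whether the model card does exactly this should be checked against the full README.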