# Model Card for DAB-DETR

## Table of Contents

1. [Model Details](#model-details)
2. [Model Sources](#model-sources)
3. [How to Get Started with the Model](#how-to-get-started-with-the-model)
4. [Training Details](#training-details)
5. [Evaluation](#evaluation)
6. [Model Architecture and Objective](#model-architecture-and-objective)
7. [Citation](#citation)

## Model Details

We present in this paper a novel query formulation using dynamic anchor boxes for DETR (DEtection TRansformer) and offer a deeper understanding of the role of queries in DETR. This new formulation directly uses box coordinates as queries in Transformer decoders and dynamically updates them layer by layer. Using box coordinates not only leverages explicit positional priors to improve the query-to-feature similarity and eliminate the slow training convergence issue in DETR, but also allows us to modulate the positional attention map using the box width and height information. Such a design makes it clear that queries in DETR can be implemented as performing soft ROI pooling layer by layer in a cascade manner. As a result, it leads to the best performance on the MS-COCO benchmark among the DETR-like detection models under the same setting, e.g., AP 45.7% using ResNet50-DC5 as backbone trained in 50 epochs. We also conducted extensive experiments to confirm our analysis and verify the effectiveness of our methods.
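The layer-by-layer refinement described above, with anchors treated as (x, y, w, h) queries that each decoder layer nudges, can be sketched in a few lines. This is an illustrative simplification under assumed names (`update_anchor`, hand-picked deltas), not DAB-DETR's actual code: in the real model the per-layer offsets are predicted by an MLP head over decoder features.

```python
import math

def inverse_sigmoid(p, eps=1e-5):
    # Map a normalized coordinate in (0, 1) back to logit space.
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def update_anchor(anchor, deltas):
    """Refine one (x, y, w, h) anchor, in [0, 1] normalized image
    coordinates, by adding predicted offsets in logit space."""
    return tuple(sigmoid(inverse_sigmoid(a) + d) for a, d in zip(anchor, deltas))

# One anchor refined across a stack of decoder layers; the per-layer
# deltas below are hypothetical stand-ins for the decoder's predictions.
anchor = (0.5, 0.5, 0.2, 0.2)
for layer_deltas in [(0.4, -0.2, 0.1, 0.0), (0.1, 0.05, 0.0, -0.1)]:
    anchor = update_anchor(anchor, layer_deltas)

print([round(a, 3) for a in anchor])
```

The sigmoid/inverse-sigmoid round trip keeps every coordinate inside (0, 1), so each layer can contribute an unbounded logit-space offset while the box stays valid.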
### Model Description

<!-- Provide a longer summary of what this model is. -->

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Shilong Liu, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang
- **Funded by:** IDEA-Research
- **Shared by:** David Hajdu
- **Model type:** DAB-DETR
- **License:** Apache-2.0

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/IDEA-Research/DAB-DETR
- **Paper:** https://arxiv.org/abs/2201.12329

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
import requests

from PIL import Image
from transformers import AutoModelForObjectDetection, AutoImageProcessor

# Load a sample COCO validation image.
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("davidhajdu/dab-detr-resnet-50")
model = AutoModelForObjectDetection.from_pretrained("davidhajdu/dab-detr-resnet-50")

inputs = image_processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Convert raw outputs to (score, label, box) detections in pixel
# coordinates, keeping predictions above a 0.3 confidence threshold.
results = image_processor.post_process_object_detection(
    outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.3
)

for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        print(f"{model.config.id2label[label]}: {score:.2f} {box}")
```

This should output:

```
cat: 0.87 [14.7, 49.39, 320.52, 469.28]
remote: 0.86 [41.08, 72.37, 173.39, 117.2]
cat: 0.86 [344.45, 19.43, 639.85, 367.86]
remote: 0.61 [334.27, 75.93, 367.92, 188.81]
couch: 0.59 [-0.04, 1.34, 639.9, 477.09]
```

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The DAB-DETR model was trained on [COCO 2017 object detection](https://cocodataset.org/#download), a dataset consisting of 118k/5k annotated images for training/validation respectively.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing

Images are resized/rescaled so that the shortest side is at least 480 and at most 800 pixels and the longest side is at most 1333 pixels, then normalized across the RGB channels with the ImageNet mean (0.485, 0.456, 0.406) and standard deviation (0.229, 0.224, 0.225).
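The resize rule above can be sketched as follows. `get_resize_shape` is a hypothetical helper for illustration, not the actual `transformers` image-processor code, shown with the 800-pixel shortest-edge setting commonly used at inference:

```python
def get_resize_shape(height, width, shortest=800, longest=1333):
    """Scale so the shorter side becomes `shortest`, unless that would
    push the longer side past `longest`, in which case the longer side
    is capped at `longest` instead."""
    short, long = min(height, width), max(height, width)
    scale = shortest / short
    if long * scale > longest:
        scale = longest / long
    return round(height * scale), round(width * scale)

# A 480x640 image: scaling the short side to 800 keeps the long side
# at ~1067 <= 1333, so the short-side rule applies.
print(get_resize_shape(480, 640))   # (800, 1067)

# A very wide 500x2000 image: the long side would reach 3200, so it
# is capped at 1333 instead.
print(get_resize_shape(500, 2000))  # (333, 1333)
```

The long-side cap bounds the largest tensor a single elongated image can produce, which keeps batch memory predictable.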

#### Training Hyperparameters

[More Information Needed]

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

## Model Card Authors
[David Hajdu](https://huggingface.co/davidhajdu)