Model Zoo and Results
Environment and Settings
4 NVIDIA V100 GPUs for training and 1 for evaluation.
Auto-mixed precision was enabled in training but disabled in evaluation.
Test-time augmentations were not used.
The inference resolution was 480p for DAVIS and 1.3x480p for YouTube-VOS, following CFBI.
Fully online inference: each frame was passed through all the modules before moving on to the next frame.
Multi-object FPS was recorded rather than single-object FPS.
Pre-trained Models
Stages:
To run inference with one of our pre-trained models, a simple way is to set `--model` and `--ckpt_path` to your downloaded checkpoint's model type and file path when running `eval.py`.
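As a minimal sketch of the invocation described above, the helper below only assembles the command line. The flags `--model` and `--ckpt_path` and the script name `eval.py` come from this page; the model identifier `aott`, the checkpoint path, and the helper name `build_eval_command` are hypothetical placeholders, and the script may need further arguments or live at a different path in your checkout.

```python
import shlex


def build_eval_command(model, ckpt_path, script="eval.py"):
    """Return the argv list for an evaluation run.

    Only --model and --ckpt_path are taken from the documentation;
    any other options the script expects are not covered here.
    """
    return ["python", script, "--model", model, "--ckpt_path", ckpt_path]


# Hypothetical values: substitute your checkpoint's model type and file path.
cmd = build_eval_command("aott", "pretrain_models/AOTT_PRE_YTB_DAV.pth")

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

The resulting list can also be handed to `subprocess.run(cmd, check=True)` to launch the evaluation from Python.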
YouTube-VOS 2018 val
All-F: all frames. The default evaluation setting of YouTube-VOS is 6 fps, but the dataset organizers also supply 30 fps sequences (all the frames). Since many VOS methods prefer to evaluate on the 30 fps videos, we report those results here as well. Denser video sequences can significantly improve VOS performance for models that use the memory reading strategy (such as AOTL, R50-AOTL, and SwinB-AOTL), but at the cost of speed, because more memorized frames are stored for object matching.
| Model | Stage | FPS | All-F | Mean | J Seen | F Seen | J Unseen | F Unseen | Predictions |
|---|---|---|---|---|---|---|---|---|---|
| AOTT | PRE_YTB_DAV | 41.0 | | 80.2 | 80.4 | 85.0 | 73.6 | 81.7 | gdrive |
| AOTT | PRE_YTB_DAV | 41.0 | √ | 80.9 | 80.0 | 84.7 | 75.2 | 83.5 | gdrive |
| DeAOTT | PRE_YTB_DAV | 53.4 | | 82.0 | 81.6 | 86.3 | 75.8 | 84.2 | - |
| AOTS | PRE_YTB_DAV | 27.1 | | 82.9 | 82.3 | 87.0 | 77.1 | 85.1 | gdrive |
| AOTS | PRE_YTB_DAV | 27.1 | √ | 83.0 | 82.2 | 87.0 | 77.3 | 85.7 | gdrive |
| DeAOTS | PRE_YTB_DAV | 38.7 | | 84.0 | 83.3 | 88.3 | 77.9 | 86.6 | - |
| AOTB | PRE_YTB_DAV | 20.5 | | 84.0 | 83.2 | 88.1 | 78.0 | 86.5 | gdrive |
| AOTB | PRE_YTB_DAV | 20.5 | √ | 84.1 | 83.6 | 88.5 | 78.0 | 86.5 | gdrive |
| DeAOTB | PRE_YTB_DAV | 30.4 | | 84.6 | 83.9 | 88.9 | 78.5 | 87.0 | - |
| AOTL | PRE_YTB_DAV | 16.0 | | 84.1 | 83.2 | 88.2 | 78.2 | 86.8 | gdrive |
| AOTL | PRE_YTB_DAV | 6.5 | √ | 84.5 | 83.7 | 88.8 | 78.4 | 87.1 | gdrive |
| DeAOTL | PRE_YTB_DAV | 24.7 | | 84.8 | 84.2 | 89.4 | 78.6 | 87.0 | - |
| R50-AOTL | PRE_YTB_DAV | 14.9 | | 84.6 | 83.7 | 88.5 | 78.8 | 87.3 | gdrive |
| R50-AOTL | PRE_YTB_DAV | 6.4 | √ | 85.5 | 84.5 | 89.5 | 79.6 | 88.2 | gdrive |
| R50-DeAOTL | PRE_YTB_DAV | 22.4 | | 86.0 | 84.9 | 89.9 | 80.4 | 88.7 | - |
| SwinB-AOTL | PRE_YTB_DAV | 9.3 | | 84.7 | 84.5 | 89.5 | 78.1 | 86.7 | gdrive |
| SwinB-AOTL | PRE_YTB_DAV | 5.2 | √ | 85.1 | 85.1 | 90.1 | 78.4 | 86.9 | gdrive |
| SwinB-DeAOTL | PRE_YTB_DAV | 11.9 | | 86.2 | 85.6 | 90.6 | 80.0 | 88.4 | - |
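To make the speed/accuracy trade-off of all-frame evaluation concrete, the sketch below works through the arithmetic for the three memory-reading models, using only the FPS and Mean values reported for YouTube-VOS 2018 val. The function name `all_frames_tradeoff` and the data layout are illustrative choices, not part of the released code.

```python
# (FPS at default 6 fps, FPS with all frames, Mean at default, Mean with all frames)
# Values copied from the YouTube-VOS 2018 val results.
results = {
    "AOTL": (16.0, 6.5, 84.1, 84.5),
    "R50-AOTL": (14.9, 6.4, 84.6, 85.5),
    "SwinB-AOTL": (9.3, 5.2, 84.7, 85.1),
}


def all_frames_tradeoff(fps_default, fps_all, mean_default, mean_all):
    """Return (slowdown factor, Mean gain in points) for all-frame evaluation."""
    slowdown = fps_default / fps_all
    gain = mean_all - mean_default
    return slowdown, gain


for name, row in results.items():
    slowdown, gain = all_frames_tradeoff(*row)
    print(f"{name}: {slowdown:.2f}x slower, {gain:+.1f} Mean")
```

By contrast, AOTT, AOTS, and AOTB are listed with the same FPS in both settings.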
YouTube-VOS 2019 val
| Model | Stage | FPS | All-F | Mean | J Seen | F Seen | J Unseen | F Unseen | Predictions |
|---|---|---|---|---|---|---|---|---|---|
| AOTT | PRE_YTB_DAV | 41.0 | | 80.0 | 79.8 | 84.2 | 74.1 | 82.1 | gdrive |
| AOTT | PRE_YTB_DAV | 41.0 | √ | 80.9 | 79.9 | 84.4 | 75.6 | 83.8 | gdrive |
| DeAOTT | PRE_YTB_DAV | 53.4 | | 82.0 | 81.2 | 85.6 | 76.4 | 84.7 | - |
| AOTS | PRE_YTB_DAV | 27.1 | | 82.7 | 81.9 | 86.5 | 77.3 | 85.2 | gdrive |
| AOTS | PRE_YTB_DAV | 27.1 | √ | 82.8 | 81.9 | 86.5 | 77.3 | 85.6 | gdrive |
| DeAOTS | PRE_YTB_DAV | 38.7 | | 83.8 | 82.8 | 87.5 | 78.1 | 86.8 | - |
| AOTB | PRE_YTB_DAV | 20.5 | | 84.0 | 83.1 | 87.7 | 78.5 | 86.8 | gdrive |
| AOTB | PRE_YTB_DAV | 20.5 | √ | 84.1 | 83.3 | 88.0 | 78.2 | 86.7 | gdrive |
| DeAOTB | PRE_YTB_DAV | 30.4 | | 84.6 | 83.5 | 88.3 | 79.1 | 87.5 | - |
| AOTL | PRE_YTB_DAV | 16.0 | | 84.0 | 82.8 | 87.6 | 78.6 | 87.1 | gdrive |
| AOTL | PRE_YTB_DAV | 6.5 | √ | 84.2 | 83.0 | 87.8 | 78.7 | 87.3 | gdrive |
| DeAOTL | PRE_YTB_DAV | 24.7 | | 84.7 | 83.8 | 88.8 | 79.0 | 87.2 | - |
| R50-AOTL | PRE_YTB_DAV | 14.9 | | 84.4 | 83.4 | 88.1 | 78.7 | 87.2 | gdrive |
| R50-AOTL | PRE_YTB_DAV | 6.4 | √ | 85.3 | 83.9 | 88.8 | 79.9 | 88.5 | gdrive |
| R50-DeAOTL | PRE_YTB_DAV | 22.4 | | 85.9 | 84.6 | 89.4 | 80.8 | 88.9 | - |
| SwinB-AOTL | PRE_YTB_DAV | 9.3 | | 84.7 | 84.0 | 88.8 | 78.7 | 87.1 | gdrive |
| SwinB-AOTL | PRE_YTB_DAV | 5.2 | √ | 85.3 | 84.6 | 89.5 | 79.3 | 87.7 | gdrive |
| SwinB-DeAOTL | PRE_YTB_DAV | 11.9 | | 86.1 | 85.3 | 90.2 | 80.4 | 88.6 | - |
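The Mean column in the YouTube-VOS results is consistent with the unweighted average of the four per-category scores (J Seen, F Seen, J Unseen, F Unseen). The sketch below checks this for a few rows of the 2019 val results. It assumes the published Mean was computed from unrounded scores, so recomputing from the rounded columns may differ by up to about 0.05; the helper name `ytvos_mean` is illustrative.

```python
def ytvos_mean(j_seen, f_seen, j_unseen, f_unseen):
    """Unweighted average of the four YouTube-VOS category scores."""
    return (j_seen + f_seen + j_unseen + f_unseen) / 4.0


# (J Seen, F Seen, J Unseen, F Unseen, reported Mean), YouTube-VOS 2019 val.
rows = {
    "AOTT": (79.8, 84.2, 74.1, 82.1, 80.0),
    "R50-DeAOTL": (84.6, 89.4, 80.8, 88.9, 85.9),
    "SwinB-DeAOTL": (85.3, 90.2, 80.4, 88.6, 86.1),
}

for name, (js, fs, ju, fu, reported) in rows.items():
    recomputed = ytvos_mean(js, fs, ju, fu)
    # Allow for rounding of the inputs to one decimal place.
    agrees = abs(recomputed - reported) < 0.06
    print(f"{name}: recomputed {recomputed:.3f}, reported {reported}, agrees={agrees}")
```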
DAVIS-2017 test
| Model | Stage | FPS | Mean | J Score | F Score | Predictions |
|---|---|---|---|---|---|---|
| AOTT | PRE_YTB_DAV | 51.4 | 73.7 | 70.0 | 77.3 | gdrive |
| AOTS | PRE_YTB_DAV | 40.0 | 75.2 | 71.4 | 78.9 | gdrive |
| AOTB | PRE_YTB_DAV | 29.6 | 77.4 | 73.7 | 81.1 | gdrive |
| AOTL | PRE_YTB_DAV | 18.7 | 79.3 | 75.5 | 83.2 | gdrive |
| R50-AOTL | PRE_YTB_DAV | 18.0 | 79.5 | 76.0 | 83.0 | gdrive |
| SwinB-AOTL | PRE_YTB_DAV | 12.1 | 82.1 | 78.2 | 85.9 | gdrive |
DAVIS-2017 val
| Model | Stage | FPS | Mean | J Score | F Score | Predictions |
|---|---|---|---|---|---|---|
| AOTT | PRE_YTB_DAV | 51.4 | 79.2 | 76.5 | 81.9 | gdrive |
| AOTS | PRE_YTB_DAV | 40.0 | 82.1 | 79.3 | 84.8 | gdrive |
| AOTB | PRE_YTB_DAV | 29.6 | 83.3 | 80.6 | 85.9 | gdrive |
| AOTL | PRE_YTB_DAV | 18.7 | 83.6 | 80.8 | 86.3 | gdrive |
| R50-AOTL | PRE_YTB_DAV | 18.0 | 85.2 | 82.5 | 87.9 | gdrive |
| SwinB-AOTL | PRE_YTB_DAV | 12.1 | 85.9 | 82.9 | 88.9 | gdrive |
DAVIS-2016 val
| Model | Stage | FPS | Mean | J Score | F Score | Predictions |
|---|---|---|---|---|---|---|
| AOTT | PRE_YTB_DAV | 51.4 | 87.5 | 86.5 | 88.4 | gdrive |
| AOTS | PRE_YTB_DAV | 40.0 | 89.6 | 88.6 | 90.5 | gdrive |
| AOTB | PRE_YTB_DAV | 29.6 | 90.9 | 89.6 | 92.1 | gdrive |
| AOTL | PRE_YTB_DAV | 18.7 | 91.1 | 89.5 | 92.7 | gdrive |
| R50-AOTL | PRE_YTB_DAV | 18.0 | 91.7 | 90.4 | 93.0 | gdrive |
| SwinB-AOTL | PRE_YTB_DAV | 12.1 | 92.2 | 90.6 | 93.8 | gdrive |
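For the DAVIS results, the Mean column is consistent with the average of the J Score and F Score. A short check on one row per DAVIS split, again assuming the published Mean was computed before rounding (the helper name `davis_mean` is illustrative):

```python
def davis_mean(j_score, f_score):
    """Average of region similarity (J) and contour accuracy (F)."""
    return (j_score + f_score) / 2.0


# (J Score, F Score, reported Mean), one row from each DAVIS split.
davis_rows = {
    "DAVIS-2017 test, SwinB-AOTL": (78.2, 85.9, 82.1),
    "DAVIS-2017 val, R50-AOTL": (82.5, 87.9, 85.2),
    "DAVIS-2016 val, AOTL": (89.5, 92.7, 91.1),
}

for name, (j, f, reported) in davis_rows.items():
    print(f"{name}: recomputed {davis_mean(j, f):.2f}, reported {reported}")
```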