Data Preparation
We have successfully pre-trained and fine-tuned our SIGMA on Kinetics400, Something-Something-V2, UCF101 and HMDB51 with this codebase.
The pre-processing of Something-Something-V2 can be summarized into 3 steps:
Download the dataset from the official website.
Preprocess the dataset by converting the videos from webm to .mp4 while keeping the original height of 240px.
Generate the annotations needed for the dataloader (each line of an annotation file holds a video path followed by its label). The annotations usually include train.csv, val.csv and test.csv (here test.csv is the same as val.csv). We share our annotation files (train.csv, val.csv, test.csv) via Google Drive. The format of a *.csv file is like:
dataset_root/video_1.mp4 label_1
dataset_root/video_2.mp4 label_2
dataset_root/video_3.mp4 label_3
...
dataset_root/video_N.mp4 label_N
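The annotation step above can be sketched as follows. This is a minimal illustration, assuming the labels are already available as (video path, label) pairs and that path and label are separated by a single space. The `write_annotations` helper and the sample entries are hypothetical and not part of this codebase.

```python
from pathlib import Path


def write_annotations(samples, csv_path):
    """Write one 'video_path label' line per sample, separated by a single space.

    `samples` is an iterable of (video_path, label) pairs. This helper is
    illustrative only (an assumption, not a function from the codebase).
    """
    lines = [f"{video_path} {label}\n" for video_path, label in samples]
    Path(csv_path).write_text("".join(lines))
    return len(lines)


if __name__ == "__main__":
    # Hypothetical samples; real paths and labels come from the dataset release.
    demo = [("dataset_root/video_1.mp4", 0), ("dataset_root/video_2.mp4", 3)]
    write_annotations(demo, "train.csv")
```

Since test.csv is the same as val.csv here, the same list of samples could be written to both files.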
The pre-processing of Kinetics400 can be summarized into 3 steps:
Download the dataset from the official website.
Preprocess the dataset by resizing the short edge of each video to 320px. You can refer to MMAction2 Data Benchmark for TSN and SlowOnly.
Generate the annotations needed for the dataloader (each line of an annotation file holds a video path followed by its label). The annotations usually include train.csv, val.csv and test.csv (here test.csv is the same as val.csv). The format of a *.csv file is like:
dataset_root/video_1.mp4 label_1
dataset_root/video_2.mp4 label_2
dataset_root/video_3.mp4 label_3
...
dataset_root/video_N.mp4 label_N
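To make the short-edge resizing step concrete, here is a rough sketch of the size computation. It assumes ffmpeg is used for the resizing, which the text above does not state. The `short_edge_size` and `ffmpeg_resize_command` helpers are hypothetical names introduced for illustration. The command is only built, not executed.

```python
def short_edge_size(width, height, short_edge=320):
    """Return (new_width, new_height) with the shorter side scaled to `short_edge`.

    The aspect ratio is preserved and the longer side is rounded to an even
    number, since common H.264 encoders expect even dimensions.
    `short_edge` itself is assumed to be even.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= height:
        new_w = short_edge
        new_h = 2 * round(height * short_edge / width / 2)
    else:
        new_h = short_edge
        new_w = 2 * round(width * short_edge / height / 2)
    return new_w, new_h


def ffmpeg_resize_command(src, dst, width, height, short_edge=320):
    """Build (but do not run) an ffmpeg command that rescales `src` into `dst`."""
    new_w, new_h = short_edge_size(width, height, short_edge)
    return ["ffmpeg", "-i", src, "-vf", f"scale={new_w}:{new_h}", dst]


if __name__ == "__main__":
    # A 1280x720 source is landscape, so its height becomes the 320px short edge.
    print(ffmpeg_resize_command("input.mp4", "output.mp4", 1280, 720))
```

The returned list can be passed to `subprocess.run` once the source width and height are known, for example from `ffprobe`.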
Note:
We use decord to decode the videos on the fly during both pre-training and fine-tuning phases.
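As an illustration of on-the-fly decoding, the sketch below reads only the sampled frames of a video with decord's `VideoReader`. The sampling scheme (a centered clip with a fixed stride), the default values of `clip_len` and `stride`, and both helper names are assumptions made for this example. They are not the sampling strategy used by this codebase.

```python
def sample_frame_indices(num_frames, clip_len=16, stride=4):
    """Pick `clip_len` frame indices spaced `stride` apart, centered in the video.

    Indices are clamped to the last frame, so short videos repeat it.
    """
    if num_frames <= 0:
        raise ValueError("video has no frames")
    span = (clip_len - 1) * stride + 1
    start = max((num_frames - span) // 2, 0)
    return [min(start + i * stride, num_frames - 1) for i in range(clip_len)]


def load_clip(video_path, clip_len=16, stride=4):
    """Decode only the sampled frames; returns a (T, H, W, 3) uint8 array."""
    # Imported lazily so the index sampling above works without decord installed.
    from decord import VideoReader, cpu

    reader = VideoReader(video_path, ctx=cpu(0))
    indices = sample_frame_indices(len(reader), clip_len, stride)
    return reader.get_batch(indices).asnumpy()
```

Because frames are fetched by index, there is no need to extract the videos into image folders beforehand.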