Recently we have received several questions about our data annotations. This one concerns mismatched video durations.
Dear organizer, I have a question about the raw video segmentation in your Kinetics-GEBD. Your dataset is from Kinetics-400 and you provide your training and val files for us. However, I find that the durations of some videos in your files do not match with the durations in Kinetics gt file (I downloaded them from https://storage.googleapis.com/deepmind-media/Datasets/kinetics400.tar.gz). For example, the duration of video zumba/ODsFDvcEXh0_000002_000012.mp4 is 0.5 in your file while its duration in Kinetics gt file is from 2s to 12s. It means I do not know the positions of video segments you used and it will lead to failed localization.
We checked the original video 'ODsFDvcEXh0' on YouTube and found that it is actually only 2.5s long, which conflicts with the Kinetics-400 annotation of 2s to 12s. It is therefore not possible to cut a clip from 2s to 12s of the original video: when one uses ffmpeg to trim it at those timestamps, the result is a clip that is only 0.5s long (from 2s to the end of the video at 2.5s). In other words, this is an issue in the original Kinetics-400 dataset.
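To make the arithmetic concrete, here is a minimal sketch of how the 0.5s figure comes about. It assumes the trimming tool simply stops at the end of the source video when the requested end time lies beyond it; the function name `trimmed_duration` is hypothetical and is not part of our release or of ffmpeg.

```python
def trimmed_duration(source_duration, start, end):
    """Length in seconds of the clip obtained by cutting [start, end]
    out of a source video that is source_duration seconds long.

    Assumption: the cut is clamped to the end of the source video.
    """
    clip_start = min(start, source_duration)
    clip_end = min(end, source_duration)
    return max(0.0, clip_end - clip_start)


# Kinetics-400 annotates ODsFDvcEXh0 as 2s to 12s, but the source is 2.5s long.
print(trimmed_duration(2.5, 2.0, 12.0))  # 0.5
```

For a source video that really does cover the annotated range, the same function returns the full 10s, so only videos shorter than their annotated end time are affected.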
It is fine to include these videos in your training data: we trimmed the original Kinetics-400 videos according to their annotations, and our annotators worked on these trimmed videos. In this example, everyone who trims the video according to the Kinetics-400 annotation will get the same 0.5s clip, so this should not be a problem.
Many thanks to our participants.
Stan
Posted by: leiwx52 @ April 11, 2021, 1:06 a.m.