If you have any questions, please email email@example.com.
Natural language-based video and image search has been a long-standing topic of research in the information retrieval, multimedia, and computer vision communities. Several existing online platforms (e.g. YouTube) rely on massive human curation efforts and manually assigned tags. However, as the amount of unlabeled video content grows with the advent of inexpensive mobile recording devices (e.g. smartphones), the focus is rapidly shifting to automated understanding, tagging, and search. In this challenge, we would like to explore a variety of joint language-visual learning models for the video annotation and retrieval tasks. The challenge is based on a unified version of the recently published large-scale movie datasets (M-VAD and MPII-MD). More information about the datasets and challenge can be found here.
Movie Retrieval: We compute Recall@1, Recall@5, Recall@10, and Median Rank (MedR) for video retrieval (given a caption, rank the videos). The evaluation is performed only on 1000 samples of the public test set.
To participate, you should first create an account on CodaLab. To submit your results, please perform these steps:
Note that we allow up to 10 submissions per day. The maximum total number of submissions per team is 100.
The evaluation is based on Recall@1, Recall@5, Recall@10, and MedR, computed on a subset of 1000 samples of the public test set, which is provided on the challenge website for movie retrieval here. Recall@k is the percentage of captions whose ground-truth video appears among the first k ranked videos, and MedR is the median rank of the ground-truth videos.
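As an illustration of these metrics, here is a minimal Python sketch. It is not the official evaluation script. It assumes a hypothetical score matrix in which scores[i][j] is the similarity a model assigns between caption i and video j, and that video i is the ground truth for caption i. The function names are illustrative, and ties are resolved in favor of the ground-truth video, which the official evaluation may handle differently.

```python
from statistics import median


def ground_truth_ranks(scores):
    """Return the 1-based rank of the ground-truth video for each caption.

    Assumption: scores[i][j] is the similarity between caption i and
    video j, and video i is the ground truth for caption i.
    """
    ranks = []
    for i, row in enumerate(scores):
        gt_score = row[i]
        # Rank is 1 plus the number of videos scored strictly higher
        # than the ground truth (ties favor the ground truth).
        ranks.append(1 + sum(1 for s in row if s > gt_score))
    return ranks


def recall_at_k(ranks, k):
    """Percentage of captions whose ground-truth video is in the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def median_rank(ranks):
    """Median rank (MedR) of the ground-truth videos."""
    return median(ranks)


# Toy example with 4 captions and 4 videos (made-up scores).
scores = [
    [0.9, 0.1, 0.3, 0.2],
    [0.8, 0.5, 0.1, 0.2],
    [0.4, 0.6, 0.5, 0.7],
    [0.1, 0.2, 0.3, 0.9],
]
ranks = ground_truth_ranks(scores)
print("ranks:", ranks)
print("Recall@1:", recall_at_k(ranks, 1))
print("Recall@2:", recall_at_k(ranks, 2))
print("MedR:", median_rank(ranks))
```

In the challenge setting the matrix would have 1000 rows and 1000 columns, and k would take the values 1, 5, and 10. Higher Recall@k and lower MedR indicate better retrieval.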
Winners will be selected based on the highest Recall@10.
Start: Aug. 25, 2016, midnight