The Large Scale Movie Description Challenge (LSMDC) v2: Fill-in the Characters

Organized by arohrbach


Aug. 1, 2019, midnight UTC


Competition Ends


Automatically describing open-domain videos using rich natural sentences is among the most challenging tasks of computer vision, natural language processing and machine learning. When describing sequences of events, it is important to distinguish "who is who" in order to provide a coherent and informative narrative. In this challenge track we focus on locally identifying characters, given the rest of a description.

Predicting local character IDs means that characters need not be recognized globally (across an entire movie), only locally (within a set of 5 clips). In practice, submissions should predict character IDs that are consistent within a given set of 5 clips; in other words, they should perform local character re-identification.

Note that while the provided annotations contain global character IDs for completeness, generating such global IDs is not required; only consistent IDs within each set of 5 clips need to be predicted. The annotations are segmented into sets of 5 clips simply by taking them sequentially.
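
The sequential segmentation described above can be sketched as follows. This is only an illustration, not the organizers' code: the function name consecutive_sets is hypothetical, and it assumes that a final group with fewer than 5 clips is kept as its own set, which the description does not specify.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def consecutive_sets(clips: Sequence[T], size: int = 5) -> Iterator[List[T]]:
    """Yield consecutive, non-overlapping groups of `size` clips, in input order.

    Assumption: a trailing group shorter than `size` is still yielded.
    """
    for start in range(0, len(clips), size):
        yield list(clips[start:start + size])


# Usage: 12 clips are split into sets of 5, 5 and 2.
groups = list(consecutive_sets([f"clip_{i}" for i in range(12)]))
```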


To participate, first create an account on CodaLab. To submit your results, please perform these steps:

      • All predicted IDs should be submitted in one file in the following format (tab-separated):

        1020_01.34.22.115- [1020_PERSON1]
        1020_01.34.25.586- [1020_PERSON2]
        1020_01.34.27.729- [1020_PERSON1]
        1020_01.34.32.867- [1020_PERSON2]
        1020_01.34.35.540- [1020_PERSON1],[1020_PERSON3]

      • Multiple IDs for one clip are separated with ",".

      • The specific choice of ID names is not important; we only check whether IDs are identical or distinct within consecutive sets of 5 clips.
      • Name your file test_[your_algorithm_name]_results.csv and zip it in an archive.

      • Go to the "Participate" tab, click "Submit / View Results" and select the respective challenge phase.
      • Fill in the form (specify any external training data used by your algorithm in the "Method description" field) and upload your ZIP archive.
      • Click "Refresh Status" to see how your submission is being processed. In case of errors, please check and correct your submission.
      • Once the submission has been processed successfully, you can view your scores via "View scoring output log" and click "Post to leaderboard" to make your results publicly available. You can access the detailed evaluation output via "Download evaluation output from scoring step".
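
As a rough illustration of the file preparation steps above, the following sketch writes one tab-separated line per clip and zips the result. The helper names format_line and write_submission are hypothetical, the IDs are assumed to be passed in already bracketed (as in the example lines), and naming the archive after the results file is an assumption, since only the name of the results file is prescribed.

```python
import zipfile
from pathlib import Path
from typing import Iterable, Sequence, Tuple


def format_line(clip_id: str, person_ids: Sequence[str]) -> str:
    """Return 'clip_id<TAB>id1,id2,...' for one clip."""
    return clip_id + "\t" + ",".join(person_ids)


def write_submission(
    predictions: Iterable[Tuple[str, Sequence[str]]],
    algorithm_name: str,
    out_dir: str = ".",
) -> Path:
    """Write test_<algorithm_name>_results.csv and zip it; return the ZIP path."""
    csv_path = Path(out_dir) / f"test_{algorithm_name}_results.csv"
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        for clip_id, person_ids in predictions:
            f.write(format_line(clip_id, person_ids) + "\n")
    # Assumption: the archive shares the results file's base name.
    zip_path = csv_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=csv_path.name)
    return zip_path
```

For example, write_submission([("1020_01.34.35.540-", ["[1020_PERSON1]", "[1020_PERSON3]"])], "mymethod") would produce test_mymethod_results.zip in the current directory, ready for upload.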

Note that we allow up to 5 submissions per day and 100 in total.

For consecutive sets of 5 clips (in the test set) we construct lists of all occurring ground-truth IDs, e.g.: [1020_PERSON11], [1020_PERSON5], [1020_PERSON11], [1020_PERSON6],[1020_PERSON5]. If a set of clips contains no ID or only a single ID, we skip it, as such sets are trivial. For sets with 2 or more IDs we construct an upper triangular matrix of pairwise comparisons between IDs, where 1 is a "match" and 0 is "not a match", skipping the diagonal (which always consists of 1s). The same is done for the submitted predicted IDs. The two upper triangular matrices are then compared to each other, and the accuracy is obtained as the ratio of correct correspondences to the total number of elements. The final accuracy is averaged over all considered sets of clips.

The evaluation script is provided here for your convenience to enable offline evaluation on the validation set.
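
The official script remains the reference; purely as an illustration of the metric described above, a minimal sketch could look like this. The function names are hypothetical, and two points are assumptions: a set is treated as trivial when its ground-truth list has fewer than two entries, and the predicted list is expected to have the same length as the ground-truth list.

```python
from itertools import combinations
from typing import List, Optional, Sequence


def pairwise_matches(ids: Sequence[str]) -> List[int]:
    """Flattened upper triangle (diagonal skipped): 1 if two IDs match, else 0."""
    return [int(a == b) for a, b in combinations(ids, 2)]


def set_accuracy(gt_ids: Sequence[str], pred_ids: Sequence[str]) -> Optional[float]:
    """Pairwise accuracy for one set of clips, or None for a trivial set."""
    if len(gt_ids) < 2:  # assumption: fewer than two entries means "trivial"
        return None
    if len(pred_ids) != len(gt_ids):  # assumption: one prediction per ground-truth ID
        raise ValueError("prediction and ground truth differ in length")
    gt = pairwise_matches(gt_ids)
    pred = pairwise_matches(pred_ids)
    return sum(g == p for g, p in zip(gt, pred)) / len(gt)


def mean_accuracy(gt_sets: Sequence[Sequence[str]],
                  pred_sets: Sequence[Sequence[str]]) -> float:
    """Average the per-set accuracy over all non-trivial sets."""
    scores = [s for s in map(set_accuracy, gt_sets, pred_sets) if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


gt = ["1020_PERSON11", "1020_PERSON5", "1020_PERSON11", "1020_PERSON6", "1020_PERSON5"]
pred = ["A", "B", "A", "B", "B"]
score = set_accuracy(gt, pred)  # 8 of the 10 pairwise comparisons agree
```

Because only pairwise equality is compared, the predicted names themselves never need to coincide with the ground-truth names, which is why the choice of ID naming does not matter.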


Start: Aug. 1, 2019, midnight

Description: Test set



# Username Score
1 JiwanChung 0.673
2 YASA 0.648