The Large Scale Movie Description Challenge (LSMDC) v2: Fill-in the Characters

Organized by arohrbach



Automatically describing open-domain videos using rich natural sentences is among the most challenging tasks of computer vision, natural language processing and machine learning. When describing sequences of events, it is important to distinguish "who is who" in order to provide a coherent and informative narrative. In this challenge track we focus on locally identifying characters, given the rest of a description.

Predicting local character IDs means that characters need not be recognized globally (across an entire movie), but only locally (within a set of 5 clips). Practically, this means that submissions should predict character IDs that are consistent within a given set of 5 clips, i.e., perform local character re-identification.

Note that while the provided annotations do contain global character IDs for completeness, it is not required to generate such global IDs; you only need to predict consistent IDs within each set of 5 clips. The segmentation of annotations into sets of 5 clips is performed sequentially.
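The sequential segmentation described above can be sketched as a simple chunking of the ordered clip list; the function and clip names below are illustrative, not part of the official tooling:

```python
def segment_into_sets(clip_ids, set_size=5):
    """Split an ordered list of clip IDs into consecutive,
    non-overlapping sets of `set_size` clips (the last set may be shorter)."""
    return [clip_ids[i:i + set_size] for i in range(0, len(clip_ids), set_size)]

# Illustrative usage with placeholder clip names:
clips = [f"clip_{i}" for i in range(12)]
sets = segment_into_sets(clips)  # three sets: 5 + 5 + 2 clips
```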


To participate, you should first create an account on CodaLab. To submit your results, perform these steps:

      • All the predicted IDs should be submitted in a single tab-separated file in the following format:

        1020_01.34.22.115- [1020_PERSON1]
        1020_01.34.25.586- [1020_PERSON2]
        1020_01.34.27.729- [1020_PERSON1]
        1020_01.34.32.867- [1020_PERSON2]
        1020_01.34.35.540- [1020_PERSON1],[1020_PERSON3]

      • Multiple IDs for one clip are separated with ",".

      • The specific choice of ID names is not important; we only check whether IDs are identical or distinct within each consecutive set of 5 clips.
      • Name your results file test_[your_algorithm_name]_results.csv and zip it into an archive.

      • Go to "Participate" tab, click "Submit / View Results" and select the respective challenge phase.
      • Fill in the form (specify any external training data used by your algorithm in the "Method description" field) and upload your ZIP archive.
      • Click "Refresh Status" to see how your submission is being processed. In case of errors, please, check and correct your submission.
      • Once the submission is successfully processed, you can view your scores via "View scoring output log" and click "Post to leaderboard" to make your results publicly available. You can access the detailed evaluation output via "Download evaluation output from scoring step".
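The file-preparation steps above might be scripted as follows; the predictions dictionary, algorithm name, and file paths are all illustrative placeholders (a sketch, not the official submission tooling):

```python
import zipfile

# Hypothetical predictions: clip ID -> list of character IDs for that clip.
# Clip and character names below mirror the example format in the instructions.
predictions = {
    "1020_01.34.22.115-": ["1020_PERSON1"],
    "1020_01.34.35.540-": ["1020_PERSON1", "1020_PERSON3"],
}

out_name = "test_myalgorithm_results.csv"  # illustrative algorithm name

# Write one tab-separated line per clip: clip ID, then bracketed IDs
# joined with commas.
with open(out_name, "w") as f:
    for clip_id, char_ids in predictions.items():
        ids = ",".join(f"[{c}]" for c in char_ids)
        f.write(f"{clip_id}\t{ids}\n")

# Zip the results file into an archive for upload.
with zipfile.ZipFile(out_name + ".zip", "w") as zf:
    zf.write(out_name)
```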

Note that we allow up to 5 submissions per day and 100 in total.

For consecutive sets of 5 clips (over the entire test set) we construct lists of all occurring ground-truth IDs, e.g. [1020_PERSON11], [1020_PERSON5], [1020_PERSON11], [1020_PERSON6], [1020_PERSON5]. We transform this list into "local" IDs: 1, 2, 1, 3, 2. Similarly, we transform the predicted IDs, obtaining another list of local IDs. Both lists are compared and an accuracy is computed. The accuracy is then averaged over all sets of 5 clips, and we report the final (average) accuracy.
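The local-ID transformation above can be sketched as follows. This is a minimal illustration, assuming the two local-ID lists are compared element-wise (the exact comparison used by the official scorer is not spelled out here); function names are illustrative:

```python
def to_local_ids(id_list):
    """Map character IDs to local IDs (1, 2, ...) in order of first occurrence,
    so e.g. [P11, P5, P11, P6, P5] -> [1, 2, 1, 3, 2]."""
    mapping = {}
    local = []
    for cid in id_list:
        if cid not in mapping:
            mapping[cid] = len(mapping) + 1
        local.append(mapping[cid])
    return local


def set_accuracy(gt_ids, pred_ids):
    """Accuracy for one set of clips: fraction of positions where the
    local ground-truth and local predicted IDs agree (assumed element-wise)."""
    gt_local = to_local_ids(gt_ids)
    pred_local = to_local_ids(pred_ids)
    correct = sum(g == p for g, p in zip(gt_local, pred_local))
    return correct / len(gt_local)


# The example from the text: local IDs become 1, 2, 1, 3, 2.
gt = ["1020_PERSON11", "1020_PERSON5", "1020_PERSON11",
      "1020_PERSON6", "1020_PERSON5"]
```

Note that because both lists are canonicalized by first occurrence, the naming of predicted IDs is irrelevant; only the identical/distinct pattern within the set matters.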

Public Test

Start: Aug. 1, 2019, midnight

Description: Public Test set

Competition Ends
