TVQA Test Public Evaluation (w/ timestamp at inference) Beta

Organized by TVQA


Test-Public (w/ ts)
Nov. 16, 2018, midnight UTC


Competition Ends

Note: this portal is only for models that use ground-truth timestamp annotations ('ts') at inference.

TVQA is a large-scale video QA dataset based on 6 popular TV shows (Friends, The Big Bang Theory, How I Met Your Mother, House M.D., Grey's Anatomy, Castle). It consists of 152.5K QA pairs from 21.8K video clips, spanning over 460 hours of video. The questions are designed to be compositional, requiring systems to jointly localize relevant moments within a clip, comprehend subtitle-based dialogue, and recognize relevant visual concepts.



Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg
UNC Chapel Hill


Following a major crash of Codalab (in July 2019), some user data could not be restored.

Contact us

Send emails to

Submissions are evaluated using classification accuracy: #(correctly predicted QAs) / #(all QAs).
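
As a minimal sketch of this metric, assume predictions and ground-truth labels are both dictionaries mapping question IDs to answer indices. The function name `accuracy` and the toy data are illustrative and not taken from the official evaluation script; counting unanswered questions as incorrect is also an assumption.

```python
def accuracy(predictions, ground_truth):
    """Return #(correctly predicted QAs) / #(all QAs).

    Assumption: a question missing from `predictions` counts as incorrect.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    correct = sum(
        1 for qid, answer in ground_truth.items()
        if predictions.get(qid) == answer
    )
    return correct / len(ground_truth)


# Toy example with made-up labels (not real TVQA annotations).
gt = {"1108": 2, "1006": 0, "1200": 4, "1300": 1}
pred = {"1108": 2, "1006": 3, "1200": 4}
print(accuracy(pred, gt))  # 2 correct out of 4 questions
```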

Submission format

A valid submission file is a .zip file containing the following 3 json files (no additional enclosing folder):

  • prediction_val.json: model predictions for each question in validation set
  • prediction_test_public.json: model predictions for each question in test_public set
  • meta.json: description of the submission

prediction_val.json and prediction_test_public.json are organized as {QID: ANSWER_IDX, ...}, where ANSWER_IDX is an integer in the range [0, 4]. For example:

    {
        "1108": 2,
        "1006": 0,
        ...
    }
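
Before uploading, it may help to check a prediction file against this format locally. The sketch below is one way to do so; the helper name `check_predictions` is hypothetical, and the server may apply additional checks that are not described here.

```python
import json

NUM_CHOICES = 5  # ANSWER_IDX is an integer in the range [0, 4]


def check_predictions(predictions):
    """Return a list of problems found in a {QID: ANSWER_IDX} mapping."""
    if not isinstance(predictions, dict):
        return ["top-level JSON value must be an object"]
    problems = []
    for qid, idx in predictions.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(idx, bool) or not isinstance(idx, int):
            problems.append(f"{qid}: answer index must be an integer, got {idx!r}")
        elif not 0 <= idx < NUM_CHOICES:
            problems.append(f"{qid}: answer index {idx} is outside [0, 4]")
    return problems


good = json.loads('{"1108": 2, "1006": 0}')
bad = json.loads('{"1108": 5, "1006": "0"}')
print(check_predictions(good))  # no problems expected
print(check_predictions(bad))   # one out-of-range index, one non-integer
```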

The meta.json file contains the following entries:

Key                  Type  Description
model_name           str   Name of your model, which will be shown on the leaderboard
is_ensemble          bool  false for a single model, true for an ensemble
with_ts              bool  Is timestamp annotation used?
show_on_leaderboard  bool  Do you want to show your results on the TVQA leaderboard?
author               str   Name of the author(s), separated by commas
institution          str   Name of your institution(s)
description          str   Brief description of your model
paper_link           str   Link to your paper
code_link            str   Link to your code


    {
        "model_name": "multi-stream model",
        "is_ensemble": false,
        "with_ts": true,
        "show_on_leaderboard": true,
        "author": "Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg",
        "institution": "UNC Chapel Hill",
        "description": "We introduce a multi-stream end-to-end trainable neural network ...",
        "paper_link": "",
        "code_link": ""
    }
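
The entry names and types listed for meta.json can also be checked programmatically. The following is a sketch only: the helper `check_meta` is hypothetical, and treating every entry as required is an assumption, since the page does not say which entries are optional.

```python
# Expected type of each meta.json entry, as listed above.
META_FIELDS = {
    "model_name": str,
    "is_ensemble": bool,
    "with_ts": bool,
    "show_on_leaderboard": bool,
    "author": str,
    "institution": str,
    "description": str,
    "paper_link": str,
    "code_link": str,
}


def check_meta(meta):
    """Return a list of problems; assumes every listed entry is required."""
    problems = []
    for key, expected in META_FIELDS.items():
        if key not in meta:
            problems.append(f"missing entry: {key}")
        elif not isinstance(meta[key], expected):
            problems.append(f"{key}: expected {expected.__name__}")
    return problems


example_meta = {
    "model_name": "multi-stream model",
    "is_ensemble": False,
    "with_ts": True,
    "show_on_leaderboard": True,
    "author": "Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg",
    "institution": "UNC Chapel Hill",
    "description": "We introduce a multi-stream end-to-end trainable neural network ...",
    "paper_link": "",
    "code_link": "",
}
print(check_meta(example_meta))  # no problems expected
```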

We suggest using an online JSON editor or validator, such as JSONLint, to validate your JSON files before submitting.
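
The packaging requirement under Submission format (three JSON files in a .zip with no enclosing folder) can be sketched with Python's standard library. The function name `build_submission` and the toy inputs are illustrative assumptions. Serializing with `json.dumps` also ensures the files are syntactically valid JSON.

```python
import io
import json
import zipfile


def build_submission(val_preds, test_preds, meta):
    """Return the bytes of a .zip holding the three files at the top level."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        # Archive names carry no directory prefix, so there is no enclosing folder.
        zf.writestr("prediction_val.json", json.dumps(val_preds))
        zf.writestr("prediction_test_public.json", json.dumps(test_preds))
        zf.writestr("meta.json", json.dumps(meta, indent=4))
    return buffer.getvalue()


# Toy inputs; a real submission covers every question in each split.
data = build_submission({"1108": 2}, {"1006": 0}, {"model_name": "demo"})

# Re-open the archive to confirm its layout and contents.
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = sorted(zf.namelist())
    roundtrip = json.loads(zf.read("prediction_val.json"))
print(names)
```

To produce an uploadable file, the returned bytes can be written to disk in binary mode, for example to a file named submission.zip (the file name is an arbitrary choice here).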


  • Registration is required to participate in this challenge. Please allow at least 3 days for us to approve your registration. We advise registering early to avoid missing deadlines. There is no need to email us; the system notifies us when you register.
  • The number of submissions to the server is limited to 5 per user to prevent overfitting. Creating multiple accounts for a single project to circumvent this limit is not acceptable. However, if your group has multiple papers describing unrelated methods, you may submit results from all of them.

This page enumerates the terms and conditions of the competition.

Test-Public (w/ ts)

Start: Nov. 16, 2018, midnight

Description: val and test_public evaluation for models that used ground-truth 'ts' at inference

Competition Ends

