TVQAplus Evaluation

Organized by TVQA

Current

test
Jan. 17, 2020, midnight UTC

End

Competition Ends
Never

Submissions are evaluated against the metrics described in our paper; see the paper for details.

Submission format

A valid submission file is a .zip file containing the following two JSON files (with no additional enclosing folder):

  • tvqa_plus_val_submission.json: val set predictions
  • tvqa_plus_test_submission.json: test set predictions

The two files share the same format. A simple example of what a file may look like is given below; for more details, read the sample here. You can also see how this file is generated by looking at the baseline code for TVQAplus.

{
    "ts_answer": {
        "141290": [[12.3, 16.4], 2],  # [[st, ed], pred_ans_idx]
        ...
    },
    "raw_bbox": [
        {"0": [{"word": 5297, # word id, as specified here.
                "pred": [0.3094744086265564, 0.33220770955085754],  # prediction scores.
                "img_idx": 11,  # image id from TVQA dataset.
                "bbox": [[160.25, 54.09375, 501.75, 359.25], [208.0, 5.48046875, 474.5, 359.25]],  # predicted boxes associated with the word.
                "qid": 141862,
                "vid_name": "s01e02_seg02_clip_03"
                },
                ...
               ],
         "1": [...],
         ...,
         "4": [...]
        },  # the keys are question-answer indices, each containing the box predictions associated with the question and answer i. The program only evaluates the one associated with the ground-truth (GT) answer.
        ...
    ]
}
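
As a rough illustration of the layout above, the following Python sketch checks a loaded prediction dictionary for the fields shown in the example. The function name `check_submission`, the `BOX_FIELDS` set, and the specific checks are assumptions made for illustration (the score and box values in `example` are shortened from the ones shown above); this is not the official evaluation script, which remains the authoritative check.

```python
# Fields shown in the example entry under "raw_bbox".
# Treating all of them as required is an assumption of this sketch.
BOX_FIELDS = {"word", "pred", "img_idx", "bbox", "qid", "vid_name"}


def check_submission(sub):
    """Return a list of problems found in a prediction dict (empty if none)."""
    problems = []
    for key in ("ts_answer", "raw_bbox"):
        if key not in sub:
            problems.append(f"missing top-level key: {key}")
    for qid, value in sub.get("ts_answer", {}).items():
        # Expected shape: [[st, ed], pred_ans_idx]
        ok = (
            isinstance(value, list) and len(value) == 2
            and isinstance(value[0], list) and len(value[0]) == 2
            and isinstance(value[1], int)
        )
        if not ok:
            problems.append(f"ts_answer[{qid}] is not [[st, ed], pred_ans_idx]")
    for i, item in enumerate(sub.get("raw_bbox", [])):
        # Each item maps an answer index ("0".."4") to a list of box entries.
        for ans_idx, entries in item.items():
            for entry in entries:
                missing = BOX_FIELDS - set(entry)
                if missing:
                    problems.append(f"raw_bbox[{i}][{ans_idx}] lacks {sorted(missing)}")
    return problems


example = {
    "ts_answer": {"141290": [[12.3, 16.4], 2]},
    "raw_bbox": [{"0": [{"word": 5297, "pred": [0.31, 0.33], "img_idx": 11,
                         "bbox": [[160.25, 54.09, 501.75, 359.25],
                                  [208.0, 5.48, 474.5, 359.25]],
                         "qid": 141862, "vid_name": "s01e02_seg02_clip_03"}]}],
}
print(check_submission(example))
```

A check like this only catches structural slips, such as a missing key or a malformed `ts_answer` entry, before the file is passed to the evaluation script.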

Before submitting, please make sure you are able to evaluate your tvqa_plus_val_submission.json using the evaluation script here.
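
The packaging requirement (a .zip with the two JSON files at the top level and no enclosing folder) can be sketched in Python as follows. The helper names `make_submission_zip` and `list_zip_members` and the default output name `submission.zip` are hypothetical; only the two JSON file names come from the format description above.

```python
import json
import zipfile

VAL_NAME = "tvqa_plus_val_submission.json"
TEST_NAME = "tvqa_plus_test_submission.json"


def make_submission_zip(val_preds, test_preds, zip_path="submission.zip"):
    """Write both prediction dicts into a zip with no enclosing folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # writestr stores each file under its bare name, i.e. at the archive root.
        zf.writestr(VAL_NAME, json.dumps(val_preds))
        zf.writestr(TEST_NAME, json.dumps(test_preds))
    return zip_path


def list_zip_members(zip_path):
    """Return the sorted member names, to confirm nothing is nested in a folder."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(zf.namelist())
```

Calling `list_zip_members` on the resulting archive should list exactly the two JSON file names, with no directory prefix on either.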

Note

  • Registration is required to participate in this challenge; please allow at least 3 days for us to approve your registration. We advise you to register early to avoid missing deadlines. There is no need to email us; the system notifies us after you register.
  • The number of submissions to the server is limited to 5 per user to prevent overfitting. It is not acceptable to create multiple accounts for a single project to circumvent this limit. However, if your group has multiple papers describing unrelated methods, you are allowed to submit results from all of them.

This page enumerates the terms and conditions of the competition.

test

Start: Jan. 17, 2020, midnight

Description: test evaluation

