EPIC-KITCHENS-100 Action Anticipation

Organized by antoninofurnari

Current phase: CVPR 2021 Challenge, started Aug. 5, 2020, midnight UTC

Competition ends: May 28, 2021, 11:59 p.m. UTC

EPIC-KITCHENS-100 Action Anticipation Challenge

Welcome to the EPIC-KITCHENS-100 Action Anticipation Challenge.

Description

The challenge requires the anticipation of a future action from the observation of a preceding video segment. The challenge will be carried out on the EPIC-KITCHENS-100 dataset. More information on the dataset & downloads can be found at https://epic-kitchens.github.io/2020-100.

Goal

Let T_a be the "anticipation time", i.e. how far in advance to anticipate the action, and T_o be the "observation time", i.e. the length of the observed video segment preceding the action. Given an action video segment A_i = [ts_i, te_i], the goal is to predict the verb/noun/action class of A_i by observing the video segment that ends T_a before the action start time ts_i, that is [ts_i - (T_a + T_o), ts_i - T_a]. The anticipation time is set to T_a = 1 second for this challenge. Participants are free to set the observation time T_o to whatever they find convenient. Please keep in mind that the developed algorithms are not allowed to observe any visual content temporally located after time ts_i - T_a.
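The observed window defined above can be sketched in a few lines of Python. This is only an illustration: the function name, the default observation time of 2 seconds, and the clamping of the window start to the beginning of the video are assumptions, not part of the challenge definition.

```python
def observation_window(action_start, anticipation_time=1.0, observation_time=2.0):
    """Return (start, end), in seconds, of the segment a model may observe.

    action_start is the start time of the action to anticipate.
    anticipation_time is fixed to 1 second in this challenge.
    observation_time is chosen freely by the participant (2.0 is a hypothetical default).
    """
    # No visual content after this timestamp may be used.
    end = action_start - anticipation_time
    # Assumption: clamp to 0 when the action starts very early in the video.
    start = max(0.0, end - observation_time)
    return start, end


# An action starting at 10.0 s, observed for 3.5 s, one second in advance.
print(observation_window(10.0, anticipation_time=1.0, observation_time=3.5))
```

Any frame with a timestamp greater than the returned end value lies inside the anticipation gap or the action itself and should not be fed to the model.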

Please refer to Section 4.4 of [1] for more details on baselines and results, and to Section 4.3 of [2] for more details about the challenge definition.

Dataset details

EPIC-KITCHENS-100 is an unscripted egocentric action dataset collected in 45 kitchens across 4 cities around the world.

  • 100 hours of video
  • 20M frames
  • Full HD
  • 90k action segments
  • 20k unique narrations
  • 97 verb classes, 300 noun classes

References

[1] Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Antonino Furnari, Evangelos Kazakos, Davide Moltisanti, Jonathan Munro, Will Price, Michael Wray. Rescaling Egocentric Vision. ArXiv, 2020. [arXiv]
[2] Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Sanja Fidler, Antonino Furnari, Evangelos Kazakos, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray. The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2020. [arXiv]

Evaluation Criteria

Submissions are evaluated on the test set. We report Mean Top-5 Recall (MT5R) on the following subsets of the test set:

  • Overall: All instances in the test set.
  • Unseen Participants: Instances coming from participants that are not in the training set.
  • Tail Classes: Instances labelled with tail classes only. Tail classes are defined as the set of smallest classes (i.e. those with fewest instances) whose total number of instances accounts for 20% of the training data. We define a tail action class as one where either the verb or noun is a tail class.
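The tail-class definition can be read as a cumulative count over classes sorted by size. The sketch below is one possible interpretation, not the organizers' code: it assumes that classes are added from smallest to largest as long as their cumulative count stays within 20% of the training instances, and all function names are hypothetical.

```python
from collections import Counter


def tail_classes(train_labels, fraction=0.2):
    """Return the set of smallest classes covering at most `fraction` of the data."""
    counts = Counter(train_labels)
    budget = fraction * len(train_labels)
    tail, covered = set(), 0
    # Visit classes from fewest to most instances (ties broken by class id).
    for cls, n in sorted(counts.items(), key=lambda kv: (kv[1], kv[0])):
        if covered + n > budget:
            break  # assumption: stop before exceeding the 20% budget
        tail.add(cls)
        covered += n
    return tail


def is_tail_action(verb, noun, tail_verbs, tail_nouns):
    """An action is a tail action if either its verb or its noun is a tail class."""
    return verb in tail_verbs or noun in tail_nouns


# Toy training labels: class 0 dominates, classes 2 and 3 are rare.
labels = [0] * 6 + [1] * 2 + [2] + [3]
print(tail_classes(labels))
```

In practice the function would be applied once to the training verb labels and once to the training noun labels, and the two resulting sets passed to `is_tail_action`.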

For a definition of Top-5 Recall, see Section 3.2 of [1]. Mean Top-5 Recall is obtained by averaging Top-5 Recall values computed for each class appearing in the test set.
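As a rough illustration of the averaging described here, the sketch below computes a class-mean top-k recall in plain Python. It is a simplified reading of the metric, not the official evaluation code: the function name and the input layout (one dictionary of class scores per test instance) are assumptions.

```python
from collections import defaultdict


def mean_topk_recall(scores, labels, k=5):
    """Average, over the classes present in `labels`, of the per-class top-k recall.

    scores: list of dicts mapping class id -> score, one dict per instance.
    labels: list of ground-truth class ids, aligned with `scores`.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for instance_scores, label in zip(scores, labels):
        # Class ids of the k highest-scoring predictions for this instance.
        top_k = sorted(instance_scores, key=instance_scores.get, reverse=True)[:k]
        totals[label] += 1
        hits[label] += int(label in top_k)
    # Recall is computed per class first, then averaged across classes.
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example with 3 classes and k=1 so the arithmetic is easy to follow.
toy_scores = [
    {0: 0.9, 1: 0.1, 2: 0.0},
    {0: 0.8, 1: 0.2, 2: 0.0},
    {0: 0.1, 1: 0.7, 2: 0.2},
    {0: 0.6, 1: 0.3, 2: 0.1},
]
print(mean_topk_recall(toy_scores, [0, 0, 1, 1], k=1))
```

Because each class contributes equally to the mean, rare classes weigh as much as frequent ones, which is what distinguishes this measure from plain top-5 accuracy.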

References

[1] Antonino Furnari, Sebastiano Battiato, Giovanni Maria Farinella. Leveraging Uncertainty to Rethink Loss Functions and Evaluation Measures for Egocentric Action Anticipation. In International Workshop on Egocentric Perception, Interaction and Computing (EPIC) in conjunction with ECCV, 2018. [pdf]

Terms and Conditions

  • You agree to us storing your submission results for evaluation purposes.
  • You agree that if you place in the top-10 at the end of the challenge you will submit your code so that we can verify that you have not cheated.
  • You agree not to distribute the EPIC-KITCHENS-100 dataset without prior written permission.

Submissions

To submit your results to the leaderboard, you must construct a submission zip file containing a single file, test.json, with the model’s results on the test set. This file should follow the format detailed in the next section.

JSON Submission Format

The JSON submission format is composed of a single JSON object containing entries for every action in the test set. Specifically, the JSON file should contain:

  • a 'version' property, set to '0.2';
  • a 'challenge' property, which can take one of the following values, depending on the challenge: ['action_recognition', 'action_anticipation'];
  • a set of SLS properties (see the Supervision Levels Scale (SLS) page for more details):
    • sls_pt: SLS Pretraining level.
    • sls_tl: SLS Training Labels level.
    • sls_td: SLS Training Data level.
  • a 'results' object containing entries for every action in the test set (e.g. 'P01_101_0' is the first narration ID in the test set).

Each action segment entry is a nested object with two entries: 'verb', specifying the score for every verb class, and 'noun', specifying the score for every noun class. Action scores are computed automatically by applying softmax to the verb and noun scores and computing the probability of each possible action.

{
  "version": "0.2",
  "challenge": "action_anticipation",
  "sls_pt": -1,
  "sls_tl": -1,
  "sls_td": -1,
  "results": {
    "P01_101_0": {
      "verb": {
        "0": 1.223,
        "1": 4.278,
        ...
        "96": 0.023
      },
      "noun": {
        "0": 0.804,
        "1": 1.870,
        ...
        "299": 0.023
      }
    },
    "P01_101_1": { ... },
    ...
  }
}
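A submission object with this layout could be assembled in Python roughly as follows. This is a minimal sketch: the helper names and the layout of the `predictions` argument are assumptions, and the SLS values passed in the example are the same -1 placeholders used above (the actual values should be taken from the SLS page).

```python
import json

NUM_VERBS, NUM_NOUNS = 97, 300  # class counts listed in the dataset details


def build_submission(predictions, sls_pt, sls_tl, sls_td):
    """Build the submission object.

    predictions: dict mapping narration id -> (verb_scores, noun_scores),
    where each element is a sequence of floats indexed by class id.
    """
    results = {}
    for narration_id, (verb_scores, noun_scores) in predictions.items():
        if len(verb_scores) != NUM_VERBS or len(noun_scores) != NUM_NOUNS:
            raise ValueError(f"wrong number of scores for {narration_id}")
        results[narration_id] = {
            # JSON object keys must be strings, so class ids are stringified.
            "verb": {str(i): float(s) for i, s in enumerate(verb_scores)},
            "noun": {str(i): float(s) for i, s in enumerate(noun_scores)},
        }
    return {
        "version": "0.2",
        "challenge": "action_anticipation",
        "sls_pt": sls_pt,
        "sls_tl": sls_tl,
        "sls_td": sls_td,
        "results": results,
    }


def write_submission(submission, path="test.json"):
    """Serialize the submission object to the file expected in the archive."""
    with open(path, "w") as f:
        json.dump(submission, f)


# Dummy scores for a single segment, only to show the call pattern.
example = build_submission({"P01_101_0": ([0.0] * 97, [0.0] * 300)}, -1, -1, -1)
print(sorted(example))
```

Calling `write_submission(example)` would then produce the test.json file to be zipped.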

If you wish to compute your own action scores, you can augment each segment entry with exactly 100 action scores under the key 'action':

{
  ...
  "results": {
    "P01_101_0": {
      "verb": {
        "0": 1.223,
        "1": 4.278,
        ...
        "96": 0.023
      },
      "noun": {
        "0": 0.804,
        "1": 1.870,
        ...
        "299": 0.023
      },
      "action": {
        "0,1": 1.083,
        ...
        "96,299": 0.002
      }
    },
    "P01_101_1": { ... },
    ...
  }
}

The keys of the action object are of the form <verb_class>,<noun_class>.

You can provide scores in any float format that numpy is capable of reading (i.e. you do not need to stick to 3 decimal places).

If you do not provide your own action scores, we will compute them by:

  1. Obtaining softmax probabilities from your verb and noun scores.
  2. Finding the top 100 action probabilities, where p(a = (v, n)) = p(v) * p(n).
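The two steps above can be sketched in plain Python as follows. This is an illustration of the described procedure rather than the organizers' implementation: the function names are hypothetical, and details such as tie-breaking between equal probabilities are assumptions.

```python
import heapq
import math


def softmax(scores):
    """Numerically stable softmax over a sequence of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_actions(verb_scores, noun_scores, k=100):
    """Return the k most probable actions as {"<verb_class>,<noun_class>": probability}."""
    p_verb = softmax(verb_scores)
    p_noun = softmax(noun_scores)
    # p(a = (v, n)) = p(v) * p(n) for every verb/noun pair.
    candidates = (
        (pv * pn, f"{v},{n}")
        for v, pv in enumerate(p_verb)
        for n, pn in enumerate(p_noun)
    )
    best = heapq.nlargest(k, candidates)
    return {key: prob for prob, key in best}


# Tiny example with 2 verbs and 2 nouns, keeping the 2 most probable actions.
print(top_actions([0.0, math.log(3)], [0.0, 0.0], k=2))
```

With the full 97 verb and 300 noun classes there are 29,100 candidate pairs, so keeping only the top 100 with `heapq.nlargest` avoids sorting the whole list.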

Submission archive

To upload your results to CodaLab you have to zip the test file into a flat zip archive (it can’t be inside a folder within the archive).

You can create a flat archive with the following command, provided the JSON file is in your current directory:

$ zip -j my-submission.zip test.json
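If the zip command is not available, a flat archive can also be produced with Python's standard zipfile module. The sketch below is an alternative under that assumption, with a hypothetical helper name; it stores the file under its base name so that it sits at the root of the archive.

```python
import os
import zipfile


def make_flat_archive(json_path, zip_path="my-submission.zip"):
    """Zip json_path so that it is stored at the root of the archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any directory components, like `zip -j` does.
        archive.write(json_path, arcname=os.path.basename(json_path))
    return zip_path
```

For example, `make_flat_archive("outputs/test.json")` would create my-submission.zip containing only test.json at the top level.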
