HACS Temporal Action Localization Challenge - Supervised Learning Track

Organized by zhaohang0124

First phase: April 13, 2020, midnight UTC

Competition ends: June 3, 2020, 11:59 p.m. UTC

Challenge Overview

The goal of this challenge is to temporally localize actions in untrimmed videos. We will host the HACS Temporal Action Localization Challenge at the CVPR'20 International Challenge on Activity Recognition Workshop.
More information can be found at the HACS Challenge 2020 site.

Supervised Learning Track

For this track, participants will use HACS Segments, a video dataset carefully annotated with a complete set of temporal action segments for the temporal action localization task. Each video can contain multiple action segments. The task is to localize these action segments by predicting the start time, end time, and action label of each one. Participants are allowed to leverage multiple modalities (e.g. audio and video). External datasets may be used for pre-training, but their use must be clearly documented. Training and testing will be performed on the following dataset:

HACS Segments ONLY

  • Temporal annotations on action segment type, start time, end time.
  • 200 action classes, nearly 140K action segments annotated in nearly 50K videos.
  • 37.6K training videos, 6K validation videos, 6K testing videos.
  • The HACS Clips dataset is NOT permitted in this track.

Important Dates

  • March 1, 2020: Challenge is announced, Train/Val/Test sets are made available.
  • April 13, 2020: Evaluation server opened.
  • May 29, 2020: Evaluation server closed.
  • June 1, 2020: Deadline for submitting the report.
  • June 14, 2020: Full-day challenge workshop at CVPR 2020.

Evaluation Metric

We use mean Average Precision (mAP) as our evaluation metric, which is the same as the ActivityNet localization metric.

Interpolated Average Precision (AP) is used to evaluate the results for each activity category. The AP is then averaged over all activity categories to obtain the mAP. To determine whether a detection is a true positive, we compute its temporal intersection over union (tIoU) with a ground-truth segment and check whether it is greater than or equal to a given threshold (e.g. tIoU ≥ 0.5). The official metric used in this task is the average mAP, defined as the mean of the mAP values computed at tIoU thresholds from 0.5 to 0.95 (inclusive) with a step size of 0.05.
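As a rough illustration of the quantities described above, the following Python sketch computes the tIoU of two segments, lists the ten tIoU thresholds, and averages per-threshold mAP values. It is not the official evaluation code: the function names (temporal_iou, average_map) and the dictionary layout passed to average_map are assumptions made for this example, and the per-threshold mAP values are assumed to have been computed elsewhere.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two [start, end] segments given in seconds."""
    # Overlap is clamped at zero for disjoint segments.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# tIoU thresholds from 0.5 to 0.95 inclusive, step 0.05 (ten values).
TIOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def average_map(map_per_threshold):
    """Mean of the mAP values over all tIoU thresholds.

    map_per_threshold is a hypothetical dict {threshold: mAP}.
    """
    values = [map_per_threshold[t] for t in TIOU_THRESHOLDS]
    return sum(values) / len(values)


def is_true_positive_candidate(pred, gt, threshold=0.5):
    """A detection can match a ground-truth segment if tIoU >= threshold."""
    return temporal_iou(pred, gt) >= threshold
```

For instance, a prediction of [0, 10] against a ground-truth segment of [5, 15] overlaps for 5 seconds out of a 15-second union, so its tIoU is 1/3 and it would not count as a match at the 0.5 threshold.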

Submission Format

Submit a JSON file, compressed into a .zip archive, in the following format, where each video ID maps to a list of predicted action segments. The submission portal will be available on August 1st.

  {
    "results": {
      "--0edUL8zmA": [
        {
          "label": "Dodgeball",
          "score": 0.84,
          "segment": [5.40, 11.60]
        },
        {
          "label": "Dodgeball",
          "score": 0.71,
          "segment": [12.60, 88.16]
        }
      ]
    }
  }
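One way to produce such a file is sketched below in Python, using only the standard library. The function name write_submission, the default file names, and the sanity checks are assumptions for illustration; the page does not say whether the server expects a particular file name inside the archive or enforces these checks.

```python
import json
import zipfile


def write_submission(results, json_name="submission.json",
                     zip_path="submission.zip"):
    """Write {"results": results} as JSON inside a .zip archive.

    results maps each video ID to a list of dicts with the keys
    "label", "score" and "segment" ([start, end] in seconds).
    The file names are hypothetical defaults, not server requirements.
    """
    # Basic sanity checks (assumed here, not documented server rules).
    for video_id, segments in results.items():
        for seg in segments:
            start, end = seg["segment"]
            if not isinstance(seg["label"], str):
                raise ValueError(f"{video_id}: label must be a string")
            if start > end:
                raise ValueError(f"{video_id}: start {start} > end {end}")
            float(seg["score"])  # raises if the score is not numeric

    payload = json.dumps({"results": results}, indent=2)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(json_name, payload)
    return zip_path


if __name__ == "__main__":
    example = {
        "--0edUL8zmA": [
            {"label": "Dodgeball", "score": 0.84, "segment": [5.40, 11.60]},
            {"label": "Dodgeball", "score": 0.71, "segment": [12.60, 88.16]},
        ]
    }
    print(write_submission(example))
```

Writing the JSON string directly into the archive with writestr avoids leaving an intermediate .json file on disk; writing the file first and then adding it to the archive would work equally well.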

Challenge Rules

You may submit at most once per day and 30 times in total.


Description: Challenge Phase: Please compress your .json file into a .zip archive for submission. The results on the test set will be revealed when the organizers make them available.
