Welcome to the ICCV DeeperAction Challenge - Kinetics-TPS Track on Part-level Action Parsing and Action Recognition.
This challenge is Track 3 of the ICCV DeeperAction Challenge. The goal of this track is to recognize human actions by compositional learning of body part states in videos. The challenge is carried out on the Kinetics-TPS dataset; more information on the dataset and downloads can be found on the Kinetics-TPS Page.
Our goal is part state parsing for boosting action recognition. Hence, participants should predict human locations, body part locations, and part states at the frame level, and then integrate these results to predict the human action at the video level.
Participants are required to provide two types of results. (1) Part State Parsing Result: for each frame in a test video, participants should provide the predicted boxes of human instances, the predicted boxes of body parts, and the predicted part state of each body part box. Note that, to reduce the uploading burden, we evaluate these results only on sampled frames of each test video (with a sampling interval of 5 frames); hence, we encourage participants to provide results on these frames only. (2) Action Recognition Result: participants should also provide the predicted action for each test video.
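As an illustration, here is a minimal Python sketch of which frames fall on the 5-frame sampling grid; the 1-based img_XXXXX naming follows the example format below, and the helper itself is ours, not part of the challenge toolkit.

# Sketch: enumerate the frame keys evaluated under a 5-frame sampling interval,
# i.e. img_00001.json, img_00006.json, img_00011.json, ...
def sampled_frame_names(num_frames, interval=5):
    return [f"img_{i:05d}.json" for i in range(1, num_frames + 1, interval)]

print(sampled_frame_names(12))
# ['img_00001.json', 'img_00006.json', 'img_00011.json']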
Since our goal is to leverage part state parsing for action recognition, we develop a new evaluation metric for this task, in which Part State Correctness is used as a condition for evaluating the action recognition accuracy of each test video.
Definition of Part State Correctness (PSC)
Action Recognition Conditioned on PSC
You can find our implementation of the evaluation criteria at Kinetics-TPS-evaluation; it is the same code that runs in the online evaluation system.
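To give a rough intuition for the metrics above, here is an illustrative Python sketch; the IoU threshold of 0.5, the [x1, y1, x2, y2] box convention, and exact verb matching are our assumptions, and the Kinetics-TPS-evaluation code remains the authoritative definition.

# Illustrative sketch only. Assumption: a predicted part state counts as
# correct when its box matches a ground-truth box of the same part with
# IoU above a threshold and its predicted verb equals the ground-truth verb.
def iou(a, b):
    # Boxes assumed to be [x1, y1, x2, y2].
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def part_state_correct(pred_box, pred_verb, gt_box, gt_verb, thr=0.5):
    return iou(pred_box, gt_box) > thr and pred_verb == gt_verb

Conditioning action recognition on PSC then means, roughly, that a test video is counted as correct only when its predicted action matches the ground truth and its part state parsing quality (PSC) clears a threshold.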
To submit your results to the leaderboard, you must construct a submission zip file that contains two JSON files: pred_part_result.json for the part state parsing results and pred_vid_result.json for the action recognition results. pred_part_result.json must follow this format:
{
    "video_name": {
        "label_name": {
            "humans": [
                {
                    "number": 1,
                    "parts": {
                        "part_name": {
                            "number": 1,
                            "box": [BOX1, BOX2, ...],
                            "verb": [PART STATE1, PART STATE2, ...],
                            "name": part_name
                        }
                    }
                },
                ...
            ]
        },
        ...
    },
    ...
}
For example:

{
    "video_name": {
        "img_00001.json": {
            "humans": [
                {
                    "number": 1,
                    "parts": {
                        "left_arm": {
                            "number": 1,
                            "box": [
                                [275, 91, 300, 165],
                                [260, 85, 310, 155]
                            ],
                            "verb": [
                                "unbend",
                                "bend"
                            ],
                            "name": "left_arm"
                        },
                        "right_leg": {
                            "number": 2,
                            "box": [
                                [296, 188, 317, 264],
                                [266, 199, 312, 254]
                            ],
                            "verb": [
                                "step_on",
                                "unbend"
                            ],
                            "name": "right_leg"
                        },
                        ...
                    }
                },
                ...
            ]
        },
        "img_00006.json": {
            ...
        },
        ...
    },
    "video_name": {
        ...
    },
    ...
}
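For concreteness, here is a short Python sketch that assembles one frame entry in this format and writes pred_part_result.json; the boxes and verbs are made-up placeholders, and only the key layout is prescribed by the format above.

import json

# Sketch: one human with one left_arm detection for one sampled frame.
frame_entry = {
    "humans": [
        {
            "number": 1,
            "parts": {
                "left_arm": {
                    "number": 1,
                    "box": [[275, 91, 300, 165]],  # placeholder box
                    "verb": ["unbend"],            # placeholder part state
                    "name": "left_arm",
                }
            },
        }
    ]
}

pred_part_result = {
    "Y0-KQvJjKAw_000057_000067": {"img_00001.json": frame_entry}
}

with open("pred_part_result.json", "w") as f:
    json.dump(pred_part_result, f)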
pred_vid_result.json maps each test video name to its predicted action class:

{
    "video_name": "predicted_class",
    ...
}
For example:

{
    "Y0-KQvJjKAw_000057_000067": "predicted_class",
    "hRisIK4NSds_000096_000106": "predicted_class",
    "sFWnb5LJEbw_000000_000010": "predicted_class",
    "95vwx9AidR8_000205_000215": "predicted_class",
    ...
}
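Once both files are ready, the submission archive can be assembled in a few lines of Python (a sketch; the archive name is arbitrary, but the two JSON file names must match those expected by the evaluation server):

import json
import zipfile

# Sketch: write the video-level predictions, then pack both JSON files
# (pred_part_result.json is assumed to exist already) into the zip.
pred_vid_result = {"Y0-KQvJjKAw_000057_000067": "predicted_class"}
with open("pred_vid_result.json", "w") as f:
    json.dump(pred_vid_result, f)

with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("pred_part_result.json")
    zf.write("pred_vid_result.json")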
Start: June 1, 2021, midnight
Description: The Development Leaderboard is based on a fixed random subset of 50% of the test dataset. To submit, upload a .zip file containing a pred_part_result.json and a pred_vid_result.json.
Start: Sept. 1, 2021, midnight
Description: To submit, upload a .zip file containing a pred_part_result.json and a pred_vid_result.json. The leaderboard will not be public until the testing competition ends. If a submission beats your previous best score, CodaLab will place it on the leaderboard.
End: Sept. 12, 2021, 11:59 p.m.