The number of research papers on 3D pose estimation, in particular in uncontrolled settings, has increased dramatically in recent years. Most papers demonstrate performance qualitatively, showing images side by side with the corresponding 3D body model or stick figure. Quantitative evaluation is limited to indoor datasets such as H3.6M or HumanEva, or to outdoor datasets with a limited recording volume such as MuPoTs-3D or MPI-INF-3DHP.
This changed with 3DPW (https://virtualhumans.mpi-inf.mpg.de/3DPW/), which constitutes the only dataset with accurate reference 3D poses in natural scenes (e.g., people shopping in the city, having coffee, or doing sports recorded with a moving hand-held camera). The purpose of this challenge is to standardize protocols and metrics so that researchers compare their methods in a consistent manner in future publications, ultimately to advance the state of the art in 3D human pose estimation in the wild.
This competition is split into two parallel phases/tracks; you are free to participate in either one or both.
Ground-truth gender, camera intrinsics, and camera extrinsics cannot be used during inference in either track.
About the data -- the 3DPW Dataset:
For evaluation, we make use of the 3DPW dataset - the first dataset of images collected in the wild annotated with 3D poses. The dataset was recorded with a hand-held video camera while the subjects wore IMUs to capture their activities. The IMUs are used to estimate the 3D poses of the people, and these 3D poses are then assigned to the 2D poses detected in the image by a state-of-the-art pose estimator. In this challenge, we do not use the original splits of the dataset; instead, the entire dataset, including its train, validation, and test splits, is used for evaluation. Your algorithm MUST NOT use any part of the 3DPW dataset for training. You may use any other dataset, such as Human3.6M, AMASS, or MPI-INF-3DHP, for training.
3D human pose performance is evaluated according to the following metrics:
1) MPJPE: Mean Per Joint Position Error (in mm). It measures the average Euclidean distance between predicted and ground-truth joint positions. The evaluation adjusts the translation (tx, ty, tz) of the prediction to match the ground truth.
2) MPJPE_PA: Mean Per Joint Position Error (in mm) after Procrustes analysis (rotation, translation, and scale are adjusted).
3) PCK: Percentage of Correct Keypoints. A joint is considered correct when it is less than 50 mm away from the ground truth. The joints considered for PCK are: shoulders, elbows, wrists, hips, knees, and ankles.
4) AUC: the total area under the PCK-threshold curve, calculated by computing PCK while varying the threshold at which a predicted joint is considered correct from 0 to 200 mm.
5) MPJAE: Mean Per Joint Angle Error. It measures the angle (in degrees) between the predicted and ground-truth part orientations, where the orientation difference is the geodesic distance in SO(3). The 9 parts considered are: left/right upper arm, left/right lower arm, left/right upper leg, left/right lower leg, and root.
6) MPJAE_PA: same as MPJAE, but computed after rotating all predicted orientations by the rotation matrix obtained from the Procrustes matching step.
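The metrics above can be sketched in NumPy for a single person, with predicted and ground-truth joints given as (J, 3) arrays in mm. Note that this is an illustrative sketch, not the official evaluation code: the exact joint subset, the translation-alignment convention for MPJPE, and the threshold sampling for AUC are assumptions here and may differ in the official scripts.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error (mm); translation of the
    prediction is adjusted to match the ground truth (here via
    mean-centering, which is an assumption of this sketch)."""
    pred_aligned = pred - pred.mean(axis=0) + gt.mean(axis=0)
    return np.linalg.norm(pred_aligned - gt, axis=1).mean()

def procrustes_align(pred, gt):
    """Similarity alignment (rotation, translation, scale) of pred to gt."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # optimal rotation from the SVD of the cross-covariance matrix
    U, S, Vt = np.linalg.svd(p.T @ g)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against reflections
        Vt[-1] *= -1
        S[-1] *= -1
        R = Vt.T @ U.T
    scale = S.sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g

def mpjpe_pa(pred, gt):
    """MPJPE after Procrustes analysis."""
    return np.linalg.norm(procrustes_align(pred, gt) - gt, axis=1).mean()

def pck(pred, gt, thresh=50.0):
    """Fraction of joints less than `thresh` mm from the ground truth."""
    return (np.linalg.norm(pred - gt, axis=1) < thresh).mean()

def auc(pred, gt, max_thresh=200.0, steps=20):
    """Area under the PCK curve for thresholds from 0 to 200 mm
    (the number of sampled thresholds is an assumption)."""
    thresholds = np.linspace(0.0, max_thresh, steps + 1)
    return np.mean([pck(pred, gt, t) for t in thresholds])

def geodesic_angle_deg(R_pred, R_gt):
    """Geodesic distance in SO(3) between two part orientations, in degrees."""
    R_rel = R_pred.T @ R_gt
    cos = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```

MPJAE would then average `geodesic_angle_deg` over the 9 body parts, and MPJAE_PA would apply the Procrustes rotation to the predicted orientations first.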
Although the main purpose of the challenge is not to rank algorithms, it is better to standardize rankings than to have multiple competing criteria across papers.
We will invite 3 winners of the challenge to give a talk at the workshop.
To merge the metrics into a single score, we average the ranking order across metrics. E.g., a method ranked 1st in one metric and 3rd in another receives an average rank of 2.
To participate in this challenge you need to agree with the following conditions: https://virtualhumans.mpi-inf.mpg.de/3DPW/license.html
Start: May 1, 2020, midnight
Description: In this track the association is assumed known: you may use the ground-truth 2D poses to determine the ID of each subject in the scene. The ground-truth 2D poses MUST NOT be used to crop the image; the bounding-box crop must come from your algorithm or an existing off-the-shelf person detector.
Start: May 10, 2020, midnight
Description: In this track you are not allowed to use the ground-truth data in any form, with one exception: the ground-truth data of the first frame of each sequence may be used, solely to ascertain the number and identity of the people being tracked.
End: Dec. 31, 2022, 2 a.m.