Is there any restriction regarding using the 2D joint detections in the 3DPW dataset (i.e. 'poses2d') or other 2D pose detectors (e.g. OpenPose)?
Posted by: xuchen @ July 19, 2020, 10:30 p.m.

For the phase 'Estimation with known association', you can use the GT 2D pose for each image to identify the person being tracked. You must not use the GT pose to obtain the bounding box around the person, but you can use a standard detection algorithm to detect all the people in the image and then use the GT 2D poses to eliminate all the detected people who are not being tracked.
In the other phase, only the ground-truth 2D pose of the first image in each sequence can be used, and only to obtain the identity and the number of people being tracked. The GT pose of no other image may be used.
There is no restriction on the use of OpenPose or any other 2D pose estimation algorithm.
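The filtering step described above can be sketched as follows. This is an illustrative, unofficial example (the function name and thresholds are assumptions, not challenge-provided code): it assigns each GT 'poses2d' entry to the closest detection by mean keypoint distance and discards the rest.

```python
import numpy as np

def keep_tracked_people(detections, gt_poses2d, conf_thresh=0.3, max_dist=50.0):
    """Keep only the detections whose 2D pose matches a GT 'poses2d' entry.

    detections: list of (J, 3) arrays of (x, y, confidence) from any detector.
    gt_poses2d: list of (J, 3) arrays, the dataset's poses for tracked people.
    Returns the indices of the detections assigned to tracked people.
    """
    kept = []
    for gt in gt_poses2d:
        valid_gt = gt[:, 2] > conf_thresh          # joints visible in the GT pose
        best, best_dist = None, max_dist
        for i, det in enumerate(detections):
            both = valid_gt & (det[:, 2] > conf_thresh)
            if not both.any():
                continue
            # mean pixel distance over joints visible in both poses
            dist = np.linalg.norm(det[both, :2] - gt[both, :2], axis=1).mean()
            if dist < best_dist:
                best, best_dist = i, dist
        if best is not None:
            kept.append(best)
    return kept
```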
Posted by: aymen @ July 20, 2020, 8:49 a.m.

Thank you very much for your clarification. If I understand correctly, we can run OpenPose and use the 'poses2d' in the 3DPW dataset to filter out unrelated persons. But then wouldn't this just give us the same thing as 'poses2d'? Since 'poses2d' is also detected by OpenPose, as reported in the 3DPW paper. So I am wondering:
- Does GT 2D poses refer to 'poses2d' in the dataset or the actual ground-truth 2D pose which can be obtained by projecting 3D joint ground truth to 2D?
- Is there any difference between 'poses2d' and standard OpenPose detections after unrelated persons being removed?
We are particularly interested in this because our method is optimization-based (fitting SMPL to 2d joint detections, similar to SMPLify), and we are wondering if we could use 'poses2d' as the input to our algorithm. Thank you very much for your time and help!
Posted by: xuchen @ July 20, 2020, 10:59 a.m.

GT 2D poses refer to 'poses2d' in the dataset. The dataset uses a specific version of OpenPose, which may not be exactly the same as the publicly available one.
Posted by: aymen @ July 20, 2020, 11:18 a.m.

I see. Many thanks for your quick response and for the clarification.
Posted by: xuchen @ July 20, 2020, 11:31 a.m.

By the way, can we know the differences between that specific OpenPose version and the standard one?
Posted by: xuchen @ July 20, 2020, 11:38 a.m.

Unfortunately I do not have that information.
Posted by: aymen @ July 20, 2020, 11:40 a.m.

Okay. That's a pity, but thanks!
Posted by: xuchen @ July 20, 2020, 11:46 a.m.

An addendum to this thread: using a version of OpenPose that gives the exact same output as 'poses2d' would be within the letter of the rules of this competition but against their spirit. We decided not to disallow the use of OpenPose because of its ease of use and the fact that it produces near state-of-the-art results. Using 'poses2d' directly would provide an unfair advantage, because 'poses2d' is an input to the original algorithm used to generate the ground-truth 3D poses for the 3DPW dataset. If a submission is found to violate the spirit of the rules, we reserve the right to exclude it from the competition.
Posted by: aymen @ July 20, 2020, 5:09 p.m.

Thank you for your further clarification.
As we mentioned in our previous post, our method fits the SMPL model to 2D joint detections, similar to the very first single-view SMPL mesh recovery method SMPLify. Such methods inherently require a 2D pose detector, and OpenPose is a natural choice.
Before submitting any results, we would like to explicitly make sure:
Would the following process be considered as "violating the spirit of the rules"? We use the latest, unmodified OpenPose to detect 2D joints, then use 'poses2d' to eliminate all the detected people not being tracked. Our core algorithm then takes the remaining 2D detections as input.
We can also provide additional information to address fairness concerns. Besides (1) the main result using the latest, unmodified OpenPose, we can also provide (2) a result using 'poses2d' and (3) a result using the 2D projection of the ground-truth 3D pose. In our experiments, (1) is noticeably better than (2), and (3) is significantly better than both (1) and (2). This suggests that our method simply needs a reasonable 2D pose detector and is not exploiting the dependency between the pseudo-ground-truth 3D pose and the 2D detections from that specific detector ('poses2d'). If there is any other information you would like, we are always happy to supply it.
Thank you for your time. We look forward to your response.
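For readers unfamiliar with SMPLify-style fitting as described above, the core of such methods is minimizing a confidence-weighted 2D reprojection error. Below is a toy sketch under strong simplifying assumptions (all names are illustrative): it optimizes only a camera translation over fixed 3D joints, whereas a real pipeline would also optimize SMPL pose and shape parameters and add priors.

```python
import numpy as np
from scipy.optimize import minimize

def project(joints3d, t, focal=1000.0):
    """Perspective-project 3D joints after translating them by camera translation t."""
    p = joints3d + t
    return focal * p[:, :2] / p[:, 2:3]

def fit_translation(joints2d, conf, joints3d, focal=1000.0):
    """Recover a camera translation by minimizing the confidence-weighted
    squared 2D reprojection error (a tiny slice of a SMPLify-style objective)."""
    def loss(t):
        r = project(joints3d, t, focal) - joints2d
        return np.sum(conf * np.sum(r ** 2, axis=1))
    res = minimize(loss, x0=np.array([0.0, 0.0, 5.0]), method="Powell")
    return res.x
```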
Posted by: xuchen @ July 21, 2020, 3:17 p.m.

Thank you for the note. Submitting results of (1) to this competition is perfectly fine. If you are also submitting a paper to the workshop via CMT and think that the results of (2) and (3) will aid the reader in understanding your method, you are of course welcome to include them in your paper.
Posted by: aymen @ July 21, 2020, 5:16 p.m.

Great to know! Thank you.
Posted by: xuchen @ July 23, 2020, 7:18 p.m.