HANDS19 Challenge: Task 3 - RGB-Based 3D Hand Pose Estimation while Interacting with Objects

Organized by guiggh

Task 3
Start: July 22, 2019, midnight UTC
End: Oct. 9, 2025, 11:59 p.m. UTC

Introduction

We present the HANDS19 Challenge, a public competition hosted by the HANDS 2019 workshop at ICCV 2019, designed to evaluate 3D hand pose estimation in both depth and colour modalities, in the presence and absence of objects. The main goals of this challenge are to assess the performance of state-of-the-art approaches in terms of their interpolation and extrapolation capabilities along the four main axes of hand variation (shape, articulation, viewpoint, objects), and the use of synthetic data to fill the gaps of current datasets along these axes. The challenge builds on the recent BigHand2.2M, F-PHAB and HO-3D datasets, which were designed to exhaustively cover multiple hand shapes, viewpoints and articulations, as well as both self-occlusion and occlusion from objects, using both depth and RGB cameras. Despite being the most exhaustive datasets available for their respective tasks, they lack full coverage of hand variability. To fill these gaps, the parameters of a fitted hand model (MANO) and a toolkit to synthesize data are provided to participants. Training and test splits are carefully designed to study the interpolation and extrapolation capabilities of participants' techniques along the aforementioned axes, and the potential benefit of using such synthetic data. The challenge consists of a standardized dataset, an evaluation protocol for three different tasks, and a public competition. Participating methods will be analyzed and ranked according to their performance along the mentioned axes. Winners and prizes will be announced and awarded during the workshop, and results will be disseminated in a subsequent challenge publication.

Challenge overview

In each task the aim is to predict the 3D locations of the 21 hand joints for each given image (details on the annotation below). For training, both hand pose annotations and MANO fitting parameters are provided for each image. For inference, only depth/RGB images and hand bounding boxes are provided.
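As a concrete illustration of this input/output contract, here is a minimal Python sketch. The function body, file name and output layout are assumptions for illustration, not the official submission format:

```python
import numpy as np

NUM_JOINTS = 21  # joints annotated per hand in the challenge

def predict_pose(rgb_image, hand_bbox):
    """Placeholder for a participant's model: map an RGB frame and its
    provided hand bounding box to 21 3D joint locations (x, y, z)."""
    x0, y0, x1, y1 = hand_bbox
    crop = rgb_image[y0:y1, x0:x1]  # crop to the provided hand region
    # ... run your network on `crop` here ...
    return np.zeros((NUM_JOINTS, 3), dtype=np.float32)  # dummy prediction

# One (21, 3) prediction per test frame, stacked into an (N, 21, 3) array.
frames = [np.zeros((480, 640, 3), dtype=np.uint8)]  # stand-in test images
bboxes = [(100, 80, 340, 320)]                      # stand-in hand boxes
preds = np.stack([predict_pose(f, b) for f, b in zip(frames, bboxes)])
np.savetxt("task3_predictions.txt", preds.reshape(len(preds), -1))  # hypothetical file name
```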

  • Task 1: Depth-Based 3D Hand Pose Estimation. This task builds on the BigHand2.2M dataset, in a similar format to the HANDS 2017 challenge. Some hand shapes, articulations and viewpoints are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions. No objects appear in this task. Hands appear in both third-person and egocentric viewpoints.
  • Task 2: Depth-Based 3D Hand Pose Estimation while Interacting with Objects. This task builds on the F-PHAB dataset. Objects are manipulated by the subject in an egocentric viewpoint. Some hand shapes and objects are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions.
  • Task 3: RGB-Based 3D Hand Pose Estimation while Interacting with Objects. This task builds on the HO-3D dataset. Objects are manipulated by the subject in a third-person viewpoint. Some hand shapes and objects are strategically excluded from the training set in order to measure the interpolation and extrapolation capabilities of submissions.

Task 3: RGB-Based 3D Hand Pose Estimation while Interacting with Objects

This task builds on the HO-3D dataset, with the following characteristics:

  • Hands appear in a third-person viewpoint, interacting with different objects in continuous sequences. The subject moves the object around the scene, changing the relative viewpoint between hand and camera; the relative pose between hand and object does not change during a sequence.
  • Training set: contains images of 3 different subjects manipulating 4 different objects, for a total of 12 sequences.
  • Test set: contains images of 5 different subjects (2 of whom appear in the training set) manipulating 5 different objects (3 of which appear in the training set), for a total of 5 full sequences plus frames sampled from the training sequences.
  • The following performance scores (mean joint error, in mm) will be evaluated; a sketch of this metric follows this list:
    • Interpolation (INTERP.): performance on test frames sampled from the training sequences (not present in the training set).
    • Extrapolation:
      • Total (EXTRAP.): performance on test samples whose hand shapes and objects are not present in the training set.
      • Shape (SHAPE): performance on test samples whose hand shapes are not present in the training set; the objects are present in the training set.
      • Object (OBJECT): performance on test samples involving objects not present in the training set; the hand shapes are present in the training set.
  • Use of the object and MANO models for synthesizing data is encouraged (see the synthesis sketch at the end of this section). The 6D object pose is available for all training images.
  • Both RGB and depth images are available for training. Only RGB images are available for the test set.
  • Images are captured with an Intel RealSense SR300 camera at 640 × 480 pixel resolution.
  • Use of other labelled datasets (either real or synthetic) is not allowed. Use of the fitted MANO model for synthesizing data is encouraged. Use of external unlabelled data is allowed (self-supervised and unsupervised methods).
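All four scores reduce to the same metric, the mean per-joint Euclidean distance in mm, computed over different subsets of the test frames. Below is a minimal sketch assuming hypothetical per-frame axis labels; the official split files define the real frame-to-axis assignment:

```python
import numpy as np

def mean_joint_error(pred, gt):
    """Mean Euclidean distance (mm) over all joints and frames.
    pred, gt: arrays of shape (N, 21, 3) in mm."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def score_by_axis(pred, gt, axis_labels):
    """axis_labels: length-N sequence with values in
    {'INTERP', 'EXTRAP', 'SHAPE', 'OBJECT'} (illustrative labels)."""
    labels = np.asarray(axis_labels)
    return {axis: mean_joint_error(pred[labels == axis], gt[labels == axis])
            for axis in np.unique(labels)}

# Example with random data:
rng = np.random.default_rng(0)
pred = rng.normal(size=(4, 21, 3))
gt = rng.normal(size=(4, 21, 3))
print(score_by_axis(pred, gt, ['INTERP', 'EXTRAP', 'SHAPE', 'OBJECT']))
```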

Participation and submission
  • The submission deadline is 1 October 2019.
  • To participate, fill in this form and accept the terms and conditions.
  • For participants to be eligible for competition prizes and to be included in the official rankings (presented during the workshop and in subsequent publications), information about their submission must be provided to the organizers. This may include, but is not limited to, details on their method, their use of synthetic and real data, and their architecture and training details. Check the previous challenge publication for an idea of the information needed.
  • Winning methods may be asked to provide their source code to reproduce their results, under strict confidentiality rules if requested by participants.
  • For each submission, participants must keep the parameters of their method constant across all testing data for a given task.
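For the encouraged synthetic-data route, the sketch below uses the manopth PyTorch implementation of MANO (github.com/hassony2/manopth) to sample hand meshes and joints. The layer arguments follow that library's README; the camera intrinsics at the end are hypothetical placeholders rather than the challenge's calibration:

```python
import torch
from manopth.manolayer import ManoLayer  # https://github.com/hassony2/manopth

# PCA-parameterized MANO layer; `mano_root` must point at the downloaded
# MANO model files (e.g. MANO_RIGHT.pkl), obtained from the MANO website.
ncomps = 6
mano = ManoLayer(mano_root='mano/models', use_pca=True, ncomps=ncomps)

batch = 16
# The first 3 pose values are the global rotation; the remaining `ncomps`
# are PCA articulation coefficients. `betas` are the 10 MANO shape
# parameters. Sampling them at random (or around the provided fits) is one
# way to fill gaps along the shape and articulation axes.
pose = torch.rand(batch, ncomps + 3) * 0.5
betas = torch.rand(batch, 10) * 0.1
hand_verts, hand_joints = mano(pose, betas)  # (16, 778, 3) and (16, 21, 3)

# Project the synthesized joints with a pinhole model into a 640 x 480
# image; fx, fy, cx, cy are hypothetical placeholders -- HO-3D provides the
# real camera intrinsics with the data.
fx = fy = 475.0
cx, cy = 320.0, 240.0
# Place the hand ~0.5 m in front of the camera (assuming manopth's default mm scale).
joints = hand_joints + torch.tensor([0.0, 0.0, 500.0])
u = fx * joints[..., 0] / joints[..., 2] + cx
v = fy * joints[..., 1] / joints[..., 2] + cy
print(hand_verts.shape, hand_joints.shape, u.shape, v.shape)
```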


Leaderboard

  #  Username    Score (mean joint error, mm)
  1  potato      19.06
  2  sbaek       23.63
  3  Fractality  28.81