DeepFashion2 Challenge 2019 - Track 1 Clothes Landmark Estimation

Organized by tttoaster


Phase 1 (validation) starts: May 27, 2019, midnight UTC

Phase 2 (test) starts: July 1, 2019, midnight UTC

Competition Ends
July 30, 2019, midnight UTC


This task aims to predict landmarks for each detected clothing item in an image. We adopt the same evaluation metric employed by the COCO dataset. Unlike COCO, where only one category has keypoints, DeepFashion2 defines a total of 294 landmarks across 13 categories. Besides the coordinates of the landmarks of a detected clothing item, its category must also be included in the final prediction files.

Data Description

         Train     Validation   Test
Images   191,961   32,153       62,629
Labels   312,186   51,490       -

Each image in each image set has a unique six-digit file name such as 000001.jpg. A corresponding annotation file in JSON format, such as 000001.json, is provided in the annotation set.
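The one-to-one naming makes the image-to-annotation mapping trivial. A minimal sketch (the helper name is ours, not part of the dataset tooling):

```python
import os

def annotation_name(image_name):
    # Each image (e.g. 000001.jpg) pairs with a same-numbered
    # annotation file (000001.json) in the annotation set.
    stem, _ = os.path.splitext(os.path.basename(image_name))
    return stem + ".json"
```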

Label Description

Each annotation file is organized as below:

source: a string, where 'shop' indicates that the image is from a commercial store and 'user' indicates that the image was taken by a consumer.
pair_id: a number. Images from the same shop and their corresponding consumer-taken images share the same pair id.

Each clothing item in the image is then described by the following fields:

    category_name: a string indicating the category of the item.
    category_id: a number corresponding to the category name: 1 represents short sleeve top, 2 long sleeve top, 3 short sleeve outwear, 4 long sleeve outwear, 5 vest, 6 sling, 7 shorts, 8 trousers, 9 skirt, 10 short sleeve dress, 11 long sleeve dress, 12 vest dress, and 13 sling dress.
    style: a number to distinguish between clothing items from images with the same pair id. Clothing items with different style numbers from images with the same pair id differ in attributes such as color, printing, and logo. A clothing item from a shop image and one from a user image form a positive commercial-consumer pair if they have the same style number greater than 0 and come from images with the same pair id.
    bounding_box: [x1,y1,x2,y2], where (x1,y1) is the upper-left corner of the bounding box and (x2,y2) is the lower-right corner (coordinates are measured from the top-left image corner).
    landmarks: [x1,y1,v1,...,xn,yn,vn], where v represents visibility: v=2 visible; v=1 occluded; v=0 not labeled. Landmarks are defined differently for each category. The order of landmark annotations is shown in the figure below.
    segmentation: [[x1,y1,...,xn,yn],[...]], where each [x1,y1,...,xn,yn] is one polygon; a single clothing item may contain more than one polygon.
    scale: a number, where 1 represents small scale, 2 modest scale, and 3 large scale.
    occlusion: a number, where 1 represents slight occlusion (including no occlusion), 2 medium occlusion, and 3 heavy occlusion.
    zoom_in: a number, where 1 represents no zoom-in, 2 medium zoom-in, and 3 large zoom-in.
    viewpoint: a number, where 1 represents no wear, 2 frontal viewpoint, and 3 side or back viewpoint.
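The fields above can be read straight out of a loaded annotation dict. A hedged sketch, assuming the per-item fields sit under keys such as "item1", "item2", ... alongside the image-level "source" and "pair_id" (the item key names are an assumption; check the actual files):

```python
def parse_items(anno):
    """Pull per-item fields out of one loaded annotation dict.

    Assumes clothing items are stored under keys like "item1",
    "item2", ... next to the image-level fields (an assumption).
    """
    items = []
    for key, val in anno.items():
        if not key.startswith("item"):
            continue  # skip image-level fields: source, pair_id
        lm = val["landmarks"]  # flat [x1, y1, v1, ..., xn, yn, vn]
        triples = [tuple(lm[i:i + 3]) for i in range(0, len(lm), 3)]
        items.append({
            "category_id": val["category_id"],
            "style": val["style"],
            "box": tuple(val["bounding_box"]),
            "num_landmarks": len(triples),
            "num_visible": sum(1 for _, _, v in triples if v == 2),
        })
    return items
```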




The definitions of the landmarks and skeletons of the 13 categories are shown below. The numbers in the figure represent the order of landmark annotations for each category in the annotation file. A total of 294 landmarks covering 13 categories are defined.


We provide code to generate COCO-style annotations from our dataset.

Evaluation Metric

We employ the evaluation metrics used by COCO for human pose estimation, calculating the average precision for keypoints: AP at OKS=0.50:0.05:0.95, AP at OKS=0.50, and AP at OKS=0.75, where OKS denotes object keypoint similarity. Unlike the COCO dataset, where only one category has keypoints, a total of 294 landmarks across 13 categories are defined here. Besides the coordinates of the landmarks of a detected clothing item, its category must also be included in the results for evaluation. Note that once the category of a detected clothing item is predicted, only the predicted landmarks pertaining to that category are actually evaluated, not all 294 landmarks. (For example, if a detected clothing item is predicted as trousers, evaluation is done between the predicted and ground-truth landmarks for trousers: of the 294 ground-truth landmarks, the 169th to 182nd are non-zero for trousers, so only the 169th to 182nd predicted landmarks are evaluated.)
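For local sanity checks, the OKS between one prediction and its ground truth can be sketched as below, following the COCO formulation. The per-landmark constants k come from the official evaluation code and are treated here as inputs:

```python
import math

def oks(pred, gt, vis, area, k):
    """Object keypoint similarity between predicted and ground-truth
    landmarks, following the COCO formulation.

    pred, gt: lists of (x, y); vis: ground-truth visibility flags;
    area: object area (COCO's scale term squared); k: per-landmark
    constants. Only landmarks labeled in the ground truth (v > 0)
    contribute, which is also how category-specific evaluation works:
    landmarks outside the predicted category are all-zero and drop out.
    """
    num, cnt = 0.0, 0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, vis, k):
        if v <= 0:
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        num += math.exp(-d2 / (2 * area * ki ** 2))
        cnt += 1
    return num / cnt if cnt else 0.0
```

A perfect prediction on a labeled landmark yields an OKS of 1.0; unlabeled landmarks are ignored entirely.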

Submission Format

Please use the following format: [{ "image_id" : int, "category_id" : int, "keypoints" : [x1,y1,v1,...,xk,yk,vk], "score" : float },{},{},...,{}]

Note: image_id is the numeric part of the image file name (for example, the image_id of image 000001.jpg is 1). Keypoint coordinates are floats measured from the top-left image corner and are 0-indexed. Note also that the visibility flags vi are not currently used (except for controlling visualization); we recommend simply setting vi=1. That is, the keypoint detector is not required to predict per-keypoint visibilities or confidences. An example result JSON file is available in example_keys_results.json.

Results for all validation images or all test images should be written to a single JSON file. If you upload validation results in phase 1, the file should be named val_keypoints.json; if you upload test results in phase 2, it should be named test_keypoints.json. A zip file should be generated to pack the JSON file. Each zip file should contain only one evaluation result; do not pack multiple submissions into a single zip file. The evaluation server only accepts a zip file as valid input.
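Putting the format together, a phase 1 submission could be written and packed like this (file names follow the instructions above; the result values are made up for illustration):

```python
import json
import zipfile

def write_submission(results, json_name, zip_name):
    # Write all results to a single JSON file, then pack that one
    # file into the zip the evaluation server expects.
    with open(json_name, "w") as f:
        json.dump(results, f)
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as z:
        z.write(json_name)

results = [{
    "image_id": int("000001"),  # digits of the file name -> 1
    "category_id": 8,           # e.g. trousers
    "keypoints": [100.0, 150.0, 1, 102.0, 160.0, 1],  # flat x,y,v triples
    "score": 0.92,
}]
write_submission(results, "val_keypoints.json", "val_submission.zip")
```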

Please note that when you upload results to the evaluation server, you may need to refresh the page to see the score after the evaluation is done. 


Validation results for evaluation on visible landmarks only, and for evaluation on both visible and occluded landmarks, are shown below. When you submit results, you will receive the score for evaluation on both visible and occluded landmarks, and this score determines the challenge winner. During validation, you can either evaluate your results locally or upload them to the evaluation server to get the score.

                     AP (OKS=0.50:0.05:0.95)   AP (OKS=0.50)   AP (OKS=0.75)
visible only         0.605                     0.790           0.684
visible + occluded   0.529                     0.775           0.596

Terms and Conditions

General Rules

Participants are recommended, but not restricted, to train their algorithms on the provided train and validation sets. The CodaLab page of each track has links to the respective data. When participating in the task, please keep in mind:

1. Any and all external data used for training must be specified in the "method description" when uploading results to the evaluation server.

2. Results in the correct format must be uploaded to the evaluation server. The evaluation page on the individual site of each challenge track lists detailed information regarding how results will be evaluated.

3. Each entry must be associated with a team and provide its affiliation.

4. The results must be submitted through the CodaLab competition site of each challenge track.

5. The organizer reserves the absolute right to disqualify entries that are incomplete or illegible, submitted late, or in violation of the rules.

6. The best entry of each team will be publicly visible on the leaderboard at all times.

7. To compete for awards, participants must fill out a fact sheet briefly describing their methods. There is no other publication requirement. Please download the fact sheet from fact_sheet.

Datasets and Annotations

The datasets are released for academic research only and are free to researchers from educational or research institutions for non-commercial purposes. By downloading the dataset you agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit for any commercial purpose any portion of the images or any portion of derived data.

Contact Us

For more information, please contact us at


Phase 1 - Start: May 27, 2019, midnight

Description: In this phase, you can submit results on the validation set and see your rank on the leaderboard.


Phase 2 - Start: July 1, 2019, midnight

Description: In this phase, we will release the test set, and the leaderboard will show results on the test set.

Competition Ends

July 30, 2019, midnight
