DeepFashion2 Challenge 2019 - Track 2 Clothes Retrieval

Organized by tttoaster

Development phase starts: May 27, 2019, midnight UTC
Final phase starts: July 1, 2019, midnight UTC
Competition ends: July 30, 2019, midnight UTC

Overview

Given a detected clothing item from a consumer-taken photo, this task aims to retrieve the corresponding items from a gallery of commercial images. This challenge uses a realistic setting: instead of being given the ground-truth query clothing item, participants must detect clothing items in consumer images. For each detected clothing item, participants submit the top-10 retrieved clothing items detected from shop images. Top-10 retrieval accuracy is the evaluation metric. We emphasize retrieval performance while still accounting for the influence of the detector: if a clothing item fails to be detected, that query item is counted as missed.

In particular, we have 337K commercial-consumer clothes pairs in the training set. In the validation set, there are 10,844 consumer images with 12,377 query items, and 21,309 commercial images with 36,961 items in the gallery. In the test set, there are 20,681 consumer images with 23,390 query items, and 41,948 commercial images with 72,337 items in the gallery.

Data Description

          Train     Validation        Test
Images    191,961   32,153            62,629
Labels    312,186   51,490            -
Pairs     337,293   query: 10,844     query: 20,681
                    gallery: 21,309   gallery: 41,948


Each image in each image set has a unique six-digit filename, such as 000001.jpg. A corresponding annotation file in JSON format, such as 000001.json, is provided in the annotation set.
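For example, an image id maps to its file names like this (paths are hypothetical, following the naming scheme above):

    image_id = 1
    image_name = f"{image_id:06d}.jpg"   # '000001.jpg'
    anno_name = f"{image_id:06d}.json"   # '000001.json'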

Label Description

Each annotation file is organized as below:

source: a string, where 'shop' indicates that the image comes from a commercial store and 'user' indicates that the image was taken by a consumer.
pair_id: a number. Images from the same shop and their corresponding consumer-taken images share the same pair id.

item1:
    category_name: a string indicating the category of the item.
    category_id: a number corresponding to the category name: 1 short sleeve top, 2 long sleeve top, 3 short sleeve outwear, 4 long sleeve outwear, 5 vest, 6 sling, 7 shorts, 8 trousers, 9 skirt, 10 short sleeve dress, 11 long sleeve dress, 12 vest dress, 13 sling dress.
    style: a number that distinguishes clothing items across images with the same pair id. Clothing items with different style numbers in images with the same pair id differ in style, such as color, printing, or logo. Thus a clothing item from a shop image and a clothing item from a consumer image form a positive commercial-consumer pair if they have the same style number greater than 0 and their images share the same pair id. (If style is confusing, please refer to issue #10.)
    bounding_box: [x1,y1,x2,y2], where (x1,y1) is the upper-left corner of the bounding box and (x2,y2) is the lower-right corner, in image coordinates.
    landmarks: [x1,y1,v1,...,xn,yn,vn], where v denotes visibility: v=2 visible; v=1 occluded; v=0 not labeled. Landmarks are defined differently for different categories; the order of landmark annotations is shown in the figure below.
    segmentation: [[x1,y1,...,xn,yn],[...]], where each [x1,y1,...,xn,yn] is one polygon; a single clothing item may consist of more than one polygon.
    scale: a number, where 1 represents small scale, 2 modest scale, and 3 large scale.
    occlusion: a number, where 1 represents slight occlusion (including no occlusion), 2 medium occlusion, and 3 heavy occlusion.
    zoom_in: a number, where 1 represents no zoom-in, 2 medium zoom-in, and 3 large zoom-in.
    viewpoint: a number, where 1 represents no wear, 2 frontal viewpoint, and 3 side or back viewpoint.

item2:

......

itemn:
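For illustration, here is a minimal sketch of reading one annotation file with Python, assuming the layout described above (the file path is hypothetical):

    import json

    # Load one annotation file (path is hypothetical).
    with open("train/annos/000001.json") as f:
        anno = json.load(f)

    print(anno["source"])   # 'shop' or 'user'
    print(anno["pair_id"])  # image-level pair id

    # Clothing items are stored under keys 'item1', 'item2', ...
    for key, item in anno.items():
        if not key.startswith("item"):
            continue  # skip 'source' and 'pair_id'
        print(key, item["category_id"], item["category_name"],
              item["style"], item["bounding_box"])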

The definitions of landmarks and skeletons for the 13 categories are shown below. The numbers in the figure give the order of landmark annotations for each category in the annotation file. In total, 294 landmarks are defined across the 13 categories.

[Figure: landmark definitions and skeletons for the 13 categories]

We provide code to generate COCO-style annotations from our dataset in deepfashion2_to_coco.py.

Please note that we do not provide data in pairs. In the training set, images are ordered by 'pair_id', with consumer and shop images interleaved. (For example: 000001.jpg (pair_id: 1; from consumer), 000002.jpg (pair_id: 1; from shop), 000003.jpg (pair_id: 2; from consumer), 000004.jpg (pair_id: 2; from consumer), 000005.jpg (pair_id: 2; from consumer), 000006.jpg (pair_id: 2; from consumer), 000007.jpg (pair_id: 2; from shop), 000008.jpg (pair_id: 2; from shop), ...) A clothing item from a shop image and a clothing item from a consumer image form a positive commercial-consumer pair if they have the same style number greater than 0 and their images share the same pair id; otherwise they form a negative pair. In this way, you can construct positive and negative training pairs at the instance level, as sketched in the code below.
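A minimal sketch of this pairing rule, assuming each clothing item has been flattened into a dict carrying its image's 'source' and 'pair_id' together with its own 'style' (the helper names are hypothetical):

    from itertools import product

    def is_positive_pair(consumer_item, shop_item):
        """Positive iff both items share a pair_id and a style greater than 0."""
        return (consumer_item["pair_id"] == shop_item["pair_id"]
                and consumer_item["style"] == shop_item["style"]
                and consumer_item["style"] > 0)

    def build_pairs(items):
        """Split items by source and enumerate positive/negative pairs."""
        consumers = [i for i in items if i["source"] == "user"]
        shops = [i for i in items if i["source"] == "shop"]
        positives, negatives = [], []
        for c, s in product(consumers, shops):
            (positives if is_positive_pair(c, s) else negatives).append((c, s))
        return positives, negatives

In practice you would sample negatives rather than enumerate every consumer-shop combination.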

As shown in the figure below, the first three images are from consumers and the last two are from shops; all five share the same 'pair_id'. Clothing items in orange bounding boxes share 'style' 1, and clothing items in green bounding boxes share 'style' 2. Other clothing items, whose bounding boxes are not drawn, have 'style' 0 and cannot form positive commercial-consumer pairs. One positive commercial-consumer pair is the annotated short sleeve top in the first image and the annotated short sleeve top in the last image. Our dataset thus makes it possible to construct instance-level pairs in a flexible way.

Please note that 'pair_id' is an image-level label. All clothing items in an image share the same 'pair_id'.

[Figure: five images sharing the same pair_id, with style 1 and style 2 items marked by orange and green bounding boxes]

Evaluation Metric

For the clothes retrieval task we use a realistic evaluation setting: instead of being given the ground-truth query clothing item, you must detect clothing items in consumer images. For each detected clothing item, you submit the top-10 retrieved clothing items detected from shop images. During evaluation, for each ground-truth query item (whose style is greater than 0), we select one detected item to represent it: first, each detected query item is assigned a ground-truth label according to its IoU with the ground-truth items; then, among all detected items assigned that label and classified correctly, the one with the highest score is selected. The retrieved results of this selected query item are evaluated. A retrieved shop item is counted as positive if its IoU with one of the ground-truth corresponding gallery items is over the threshold (we set the threshold to 0.5). If no detected item is assigned a given query item's label, that query item is counted as missed. Evaluation code is available in deepfashion2_retrieval_test.py. For detailed evaluation information, please refer to Evaluation Code.
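The per-query scoring described above can be sketched as follows; this is a simplified illustration, not the official deepfashion2_retrieval_test.py (the function names are hypothetical):

    def iou(box_a, box_b):
        """IoU of two [x1, y1, x2, y2] boxes."""
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0

    def hit_at_k(retrieved_boxes, gt_gallery_boxes, k=10, thresh=0.5):
        """True if any of the top-k retrieved shop boxes overlaps one of the
        ground-truth gallery boxes with IoU over the threshold."""
        return any(iou(r, g) > thresh
                   for r in retrieved_boxes[:k]
                   for g in gt_gallery_boxes)

A full evaluation would additionally match retrieved boxes to gallery items by image id; this sketch assumes that matching has already been done.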

Submission Format

Please use the following format:

    [{
      "query_image_id": int,
      "query_bbox": [x1, y1, x2, y2],
      "query_cls": int,
      "query_score": float,
      "gallery_image_id": [int, int, int, int, int, int, int, int, int, int],
      "gallery_bbox": [[x1, y1, x2, y2], ..., []]
    }, {}, {}, ..., {}]

For each detected clothing item from consumer images, the top-10 retrieved clothing items from shop images should be included in the results. Note: image_id refers to the numeric part of the image name (for example, the image_id of 000001.jpg is 1). An example result JSON file is available in example_retrieval_results.json.

Results for all validation images or all test images should be written to a single JSON file. If you upload validation results in phase 1, name it val_retrieval.json. If you upload test results in phase 2, name it test_retrieval.json. Pack the JSON file into a zip file named submission.zip. Each zip file should contain only one evaluation result; do not pack multiple submissions into a single zip file. The evaluation server only accepts a zip file as valid input.
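A minimal sketch of writing and packaging results in this format (all result values below are placeholders):

    import json
    import zipfile

    # One entry per detected query item; values here are placeholders.
    results = [{
        "query_image_id": 1,
        "query_bbox": [50, 60, 200, 300],
        "query_cls": 1,
        "query_score": 0.98,
        "gallery_image_id": [2, 7, 9, 11, 15, 18, 21, 24, 30, 33],
        "gallery_bbox": [[40, 50, 210, 310]] * 10,
    }]

    with open("val_retrieval.json", "w") as f:
        json.dump(results, f)

    # The server accepts only a zip containing a single result file.
    with zipfile.ZipFile("submission.zip", "w") as z:
        z.write("val_retrieval.json")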

Please note that after you upload results to the evaluation server, you may need to refresh the page to see the score once evaluation is done.

Baseline

Validation results of clothes retrieval using different features are shown below. When you submit results, the top-10 accuracy is reported as your score, and this score determines the challenge winner. During validation, you can either evaluate your results locally or upload them to the evaluation server to get the score.

Features       top-1   top-10   top-20
class          0.079   0.273    0.366
keypoints      0.182   0.416    0.510
segmentation   0.135   0.350    0.447
keys+class     0.192   0.435    0.524
seg+class      0.152   0.379    0.477

Terms and Conditions

General Rules

Participants are encouraged, but not required, to train their algorithms on the provided train and val sets. The CodaLab page of each track links to the respective data. When participating in the task, please be reminded that:

1. Any and all external data used for training must be specified in the "method description" when uploading results to the evaluation server.

2. Results in the correct format must be uploaded to the evaluation server. The evaluation page on the individual site of each challenge track lists detailed information regarding how results will be evaluated.

3. Each entry must be associated with a team and provide its affiliation.

4. The results must be submitted through the CodaLab competition site of each challenge track.

5. The organizer reserves the absolute right to disqualify entries that are incomplete or illegible, late, or in violation of the rules.

6. The best entry of each team will be public on the leaderboard at all times.

7. To compete for awards, participants must fill out a fact sheet briefly describing their methods. There is no other publication requirement. Please download the fact sheet at fact_sheet.

Datasets and Annotations

The datasets are released for academic research only and are free to researchers from educational or research institutions for non-commercial purposes. By downloading the dataset you agree not to reproduce, duplicate, copy, sell, trade, resell, or exploit for any commercial purpose any portion of the images or any portion of the derived data.

Contact Us

For more information, please contact us at yyge13@gmail.com.

Development

Start: May 27, 2019, midnight

Description: In this phase, you can submit results on the validation set and see your rank on the leaderboard.

Final

Start: July 1, 2019, midnight

Description: In this phase, we will release the test set, and the leaderboard will show results on the test set.

Competition Ends

July 30, 2019, midnight
