WIDER Face & Person Challenge 2019 - Track 4: Person Search by Language

Organized by wider


Overview

Searching for persons in large-scale image databases using natural language descriptions as queries has important applications in video surveillance. Given a textual description of a person, your person search algorithm must rank all samples in the person database and retrieve the sample most relevant to the queried description. The dataset adopted here is the CUHK Person Description Dataset (CUHK-PEDES) [1], a large-scale person description dataset with detailed natural language annotations and person samples from various sources. Please note that the validation data of the original CUHK-PEDES dataset is added to the train set, and the test data of the original CUHK-PEDES dataset is used as the validation set. New test data will be collected from MSMT17 [7]. Example code for training and validation on the original CUHK-PEDES dataset is also available.

Data Description

We collected 43,264 images of 14,533 persons from six existing person re-identification datasets, CUHK03 [2], Market-1501 [3], SSM [4], VIPeR [5], CUHK01 [6] and MSMT17 [7], as the subjects for language descriptions. Since persons in Market-1501 and CUHK03 have many similar samples, we randomly selected four images for each person in these two datasets to balance the number of persons from different domains. All the images were labeled by crowd workers from Amazon Mechanical Turk (AMT). Each image was annotated with two sentence descriptions, for a total of 86,528 sentences. The dataset incorporates rich details about person appearances, actions, poses and interactions with other objects. The sentence descriptions are generally long (more than 23 words on average) and have a rich vocabulary with little repetitive information.

We provide all the meta data for this task in a JSON file. The structure of the JSON file is:


[
    {
        "file_path": "train_query/p11600_s14885.jpg", 
        "captions": [
            "A woman is wearing a gray shirt, a pair of brown pants and a pair of shoes.",
            "She has shoulder length brown hair. She is wearing a dark grey top and light colored pants."
        ],
        "id": 11003,
        "processed_tokens": [
            ["a", "woman", "wearing", "a", "gray", "shirt", "a", "pair", "of", "brown", "pants", "and", "a", "pair", "of", "shoes"],
            ["she", "has", "shoulder", "length", "brown", "hair", "she", "is", "wearing", "a", "dark", "grey", "top", "and", "light", "colored", "pants"]
        ],
        "split": "train"
    },

    ...
]

"split" Whether the image belongs to the train or test set.
"captions" The two natural language descriptions of the image.
"file_path" The path where the image is saved.
"processed_tokens" The tokenized (lowercased, punctuation-free) form of each caption.
"id" Person ID of the image. There are 13,003 persons, so the "id" ranges from 1 to 13,003.
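
As a rough illustration of how the metadata described above might be read, the following Python sketch parses the JSON list and groups the records by their "split" field. The function names and the tiny inline sample are assumptions made for this example; only the field names come from the structure shown above.

```python
import json
from collections import defaultdict

def parse_annotations(text):
    """Parse the metadata JSON text into a list of record dicts."""
    return json.loads(text)

def load_annotations(json_path):
    """Read the metadata file from disk (the path is supplied by the caller)."""
    with open(json_path, "r", encoding="utf-8") as f:
        return parse_annotations(f.read())

def group_by_split(records):
    """Map each value of the "split" field to the records that carry it."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["split"]].append(rec)
    return dict(groups)

def caption_triples(records):
    """Flatten records into (file_path, person id, caption) triples, one per caption."""
    return [(rec["file_path"], rec["id"], cap)
            for rec in records
            for cap in rec["captions"]]

# Hypothetical two-record sample that follows the documented structure.
sample_text = """
[
  {"file_path": "a.jpg", "captions": ["A man in a red coat.", "He wears jeans."],
   "id": 1, "processed_tokens": [["a", "man", "in", "a", "red", "coat"],
                                 ["he", "wears", "jeans"]], "split": "train"},
  {"file_path": "b.jpg", "captions": ["A woman with a bag.", "She wears a hat."],
   "id": 2, "processed_tokens": [["a", "woman", "with", "a", "bag"],
                                 ["she", "wears", "a", "hat"]], "split": "test"}
]
"""
sample = parse_annotations(sample_text)
```

Calling `group_by_split(sample)` yields one list per split, and `caption_triples(sample)` gives one training or query item per caption, which is a convenient shape for building text-image pairs.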

 

Submission Format

The submission file should be a zipped txt file. Do not put the txt file in a folder; zip the file directly.

For each query sentence in the test set, you must predict a comma-delimited list of candidates. The list should be sorted such that the first candidate is the most relevant one and the last is the least relevant one. Each line of the file should contain the language description used as the query, followed by the candidate list, with the two separated by '#'. An example is shown below:

    Woman with her hair up is wearing a dark blue jacket with a bag on her right arm, black veiled stockings and black shoes. # 2376.jpg,2318.jpg,2060.jpg,2481.jpg,86.jpg,742.jpg,2710.jpg,1398.jpg,1614.jpg,1821.jpg
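
A minimal Python sketch of producing such a file is given here, assuming one line per query and the " # " separator seen in the example above. The function names and the file names `submission.txt` and `submission.zip` are hypothetical, since the page does not state the names the server expects.

```python
import zipfile

def format_line(query, ranked_images):
    """Build one submission line: query text, ' # ', then the comma-delimited ranking."""
    return f"{query} # {','.join(ranked_images)}"

def write_submission(results, zip_path="submission.zip", txt_name="submission.txt"):
    """Write all lines into a txt file stored at the root of a zip archive.

    results: iterable of (query, ranked image names with the best match first).
    zip_path: a file path or a writable binary file-like object.
    """
    body = "\n".join(format_line(q, imgs) for q, imgs in results) + "\n"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # txt_name has no directory component, so the txt sits directly in the zip.
        zf.writestr(txt_name, body)
    return zip_path
```

Writing the text with `writestr` under a bare file name avoids the folder nesting that the instructions above warn against.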

 

General Rules

Please check the terms and conditions for further details.

Reference

[1] S. Li, T. Xiao, H. Li, B. Zhou, D. Yue, and X. Wang. Person search with natural language description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[2] W. Li, R. Zhao, T. Xiao, and X. Wang. Deepreid: Deep filter pairing neural network for person re-identification. In CVPR, pages 152–159, 2014.
[3] L. Zheng, L. Shen, L. Tian, S. Wang, J. Bu, and Q. Tian. Person re-identification meets image search. arXiv preprint arXiv:1502.02171, 2015.
[4] T. Xiao, S. Li, B. Wang, L. Lin, and X. Wang. Joint detection and identification feature learning for person search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
[5] D. Gray, S. Brennan, and H. Tao. Evaluating appearance models for recognition, reacquisition, and tracking. In Proc. IEEE International Workshop on Performance Evaluation for Tracking and Surveillance (PETS), number 5, 2007.
[6] W. Li, R. Zhao, and X. Wang. Human reidentification with transferred metric learning. In ACCV, pages 31–44, 2012.
[7] L. Wei, S. Zhang, W. Gao, and Q. Tian. Person transfer GAN to bridge domain gap for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.

Evaluation Criteria

We adopt the top-1 accuracy to evaluate the performance of person retrieval. Given a query sentence, all test images are ranked according to their affinities with the query.
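
The ranking and scoring described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the affinity scores are taken as given, and a query is assumed to count as correct when its top-ranked image is in a per-query set of correct images. The page does not specify how the evaluation server defines a correct match, and the function names are hypothetical.

```python
def rank_gallery(scores):
    """Return image names sorted by affinity with one query, highest first.

    scores: dict mapping image name -> affinity value for that query.
    """
    return sorted(scores, key=scores.get, reverse=True)

def top1_accuracy(rankings, ground_truth):
    """Fraction of queries whose first-ranked image is a correct match.

    rankings: dict mapping query -> ranked list of image names.
    ground_truth: dict mapping query -> set of correct image names (an assumption).
    """
    if not rankings:
        return 0.0
    hits = sum(
        1
        for query, ranked in rankings.items()
        if ranked and ranked[0] in ground_truth[query]
    )
    return hits / len(rankings)
```

For instance, with two queries of which only one has a correct image in first position, `top1_accuracy` returns 0.5.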

Terms and Conditions

General Rules

Participants are recommended, but not required, to train their algorithms on the provided train and val sets. The CodaLab page of each track has links to the respective data. The test set is divided into two splits: test-dev and test-challenge. Test-dev is the default test set for testing under general circumstances and is used to maintain a public leaderboard. Test-challenge is used for the workshop competition; results will be revealed at the workshop. When participating in the task, please be reminded that:

  • Any and all external data used for training must be specified in the "method description" when uploading results to the evaluation server.
  • Results in the correct format must be uploaded to the evaluation server. The evaluation page on the individual site of each challenge track lists detailed information regarding how results will be evaluated.
  • Each entry must be associated with a team and provide its affiliation.
  • The results must be submitted through the CodaLab competition site of each challenge track. The participants can make up to 5 submissions per day in the development phases. A total of 5 submissions are allowed during the final test phase. Using multiple accounts to increase the number of submissions is strictly prohibited.
  • The organizer reserves the absolute right to disqualify entries that are incomplete or illegible, late entries, or entries that violate the rules.
  • The best entry of each team will be shown publicly on the leaderboard at all times.
  • To compete for awards, the participants must fill out a fact sheet briefly describing their methods. There is no other publication requirement.

Datasets and Annotations

The datasets are released for academic research only and are free to researchers from educational or research institutions for non-commercial purposes. By downloading the dataset, you agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes any portion of the images and any portion of derived data.

Software

Copyright © 2019, WIDER Consortium. All rights reserved. Redistribution and use of the software in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of the WIDER Consortium nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE AND ANNOTATIONS ARE PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Contact Us

For more information, please refer to the challenge webpage or contact us at wider-challenge@ie.cuhk.edu.hk.

Development

Start: May 10, 2019, 6:59 a.m. UTC

Description: In this phase, you can submit results on the validation set and see your rank on the leaderboard.

Final Test

Start: June 11, 2019, 6:59 a.m. UTC

Description: In this phase, we will release the test set, and the leaderboard will show results on the test set.

Competition Ends

Aug. 2, 2019, 6:59 a.m. UTC

# Username Score
1 Xiaojing 0.5106
2 Nuanyang 0.5049
3 ac5462 0.4431