Million-AID is a large-scale dataset for scene parsing in aerial images. It can be used to develop and evaluate aerial scene classification algorithms. Recent years have witnessed great progress in aerial image interpretation and its wide range of applications. With aerial images more accessible than ever before, there is an increasing demand for their automatic interpretation. This challenge aims to develop and test intelligent interpretation algorithms for multi-class aerial scene classification, which requires identifying the semantic category of each aerial image in Million-AID.
Detailed information about the Million-AID dataset employed in this challenge, including an FAQ, can be found on the dataset pages.
If you make use of Million-AID, please cite the following papers:
@article{Long2021DiRS,
  title={On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances and Million-AID},
  author={Yang Long and Gui-Song Xia and Shengyang Li and Wen Yang and Michael Ying Yang and Xiao Xiang Zhu and Liangpei Zhang and Deren Li},
  journal={IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
  year={2021},
  volume={14},
  pages={4205-4230}
}

@misc{Long2022ASP,
  title={Aerial Scene Parsing: From Tile-level Scene Classification to Pixel-wise Semantic Labeling},
  author={Yang Long and Gui-Song Xia and Liangpei Zhang and Gong Cheng and Deren Li},
  year={2022},
  eprint={2201.01953},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
The multi-class scene classification for this challenge requires participants to distinguish images with similar semantic content from the massive images, and assign a scene category label to each aerial image in Million-AID.
There are over 1M scene instances covering 51 semantic scene categories in Million-AID. The scene names and corresponding indices for this challenge are: (0) dry_field, (1) greenhouse, (2) paddy_field, (3) terraced_field, (4) meadow, (5) forest, (6) orchard, (7) commercial_area, (8) oil_field, (9) storage_tank, (10) wastewater_plant, (11) works, (12) mine, (13) quarry, (14) solar_power_plant, (15) substation, (16) wind_turbine, (17) swimming_pool, (18) church, (19) cemetery, (20) baseball_field, (21) basketball_court, (22) golf_course, (23) ground_track_field, (24) stadium, (25) tennis_court, (26) apartment, (27) detached_house, (28) mobile_home_park, (29) apron, (30) helipad, (31) runway, (32) bridge, (33) intersection, (34) parking_lot, (35) road, (36) roundabout, (37) viaduct, (38) pier, (39) railway, (40) train_station, (41) bare_land, (42) desert, (43) ice_land, (44) island, (45) rock_land, (46) sparse_shrub_land, (47) beach, (48) dam, (49) lake, (50) river.
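For convenience, the mapping above can be transcribed as a Python list whose position equals the category index. This is only a transcription of the listing given here, not an official artifact of the challenge:

```python
# Million-AID scene categories; list position == category index (0-50).
MILLION_AID_CLASSES = [
    "dry_field", "greenhouse", "paddy_field", "terraced_field", "meadow",
    "forest", "orchard", "commercial_area", "oil_field", "storage_tank",
    "wastewater_plant", "works", "mine", "quarry", "solar_power_plant",
    "substation", "wind_turbine", "swimming_pool", "church", "cemetery",
    "baseball_field", "basketball_court", "golf_course", "ground_track_field",
    "stadium", "tennis_court", "apartment", "detached_house",
    "mobile_home_park", "apron", "helipad", "runway", "bridge",
    "intersection", "parking_lot", "road", "roundabout", "viaduct", "pier",
    "railway", "train_station", "bare_land", "desert", "ice_land", "island",
    "rock_land", "sparse_shrub_land", "beach", "dam", "lake", "river",
]
assert len(MILLION_AID_CLASSES) == 51
```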
OneDrive: Million-AID Download
The Million-AID images are collected from Google Earth. All images are sampled with RGB channels and stored in "jpg" format. Use of the Google Earth imagery must respect the "Google Earth" terms of use.
Participants need to submit a zip file containing classification results for all test images in Million-AID. The classification results are stored in a text file named "answer.txt", in which each line gives the name of a test image and its predicted category index, separated by a space. The results are organized in the following format:
image_name category_index
image_name category_index
image_name category_index
...
A submission example for multi-class scene classification on Million-AID
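As a rough sketch of how a submission could be assembled, assuming predictions are held in a mapping from image name to category index (the `predictions` contents and the "submission.zip" file name are hypothetical; only "answer.txt" is prescribed by the challenge):

```python
import zipfile

# Hypothetical predictions: image name -> category index (0-50).
predictions = {"P0000001.jpg": 35, "P0000002.jpg": 49}

# Write one "image_name category_index" pair per line, as required.
with open("answer.txt", "w") as f:
    for image_name, category_index in predictions.items():
        f.write(f"{image_name} {category_index}\n")

# Package the results file into the zip archive to be submitted.
with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("answer.txt")
```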
The performance of multi-class scene classification is evaluated by the commonly used overall accuracy (OA), average accuracy (AA), and Kappa coefficient (Kappa). Let c_ij denote the number of images of actual class i predicted as class j, let s_i = Σ_j c_ij be the total number of images that belong to class i, and let ŝ_j = Σ_i c_ij be the total number of images predicted as class j. N is the total number of classes in the dataset, and T = Σ_i s_i is the total number of test images. The OA is defined as the number of correctly predicted images divided by the total number of test images: OA = Σ_i c_ii / T. The AA is the mean of the per-class classification accuracies: AA = (1/N) Σ_i (c_ii / s_i). The Kappa is calculated from the confusion matrix, in which each row corresponds to an actual class, each column to a predicted class, and entry (i, j) is c_ij. The Kappa is then calculated as: Kappa = (p_0 - p_e) / (1 - p_e), where p_0 = Σ_i c_ii / T is the observed agreement and p_e = Σ_n (s_n · ŝ_n) / T², with 1 ≤ n ≤ N, is the agreement expected by chance. Results in the leaderboard are ranked by the Kappa metric.
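As a minimal sketch (not the official evaluation code), the three metrics can be computed from a confusion matrix with numpy; the function name and argument are illustrative:

```python
import numpy as np

def evaluate(conf):
    """Compute OA, AA, and Kappa from an N x N confusion matrix,
    where conf[i, j] counts images of actual class i predicted as class j."""
    conf = np.asarray(conf, dtype=np.float64)
    total = conf.sum()                          # T: total number of test images
    actual = conf.sum(axis=1)                   # s_i: images per actual class
    predicted = conf.sum(axis=0)                # s_hat_j: images per predicted class
    correct = np.trace(conf)                    # sum of diagonal entries c_ii

    oa = correct / total                        # overall accuracy
    aa = np.mean(np.diag(conf) / actual)        # average accuracy
    pe = np.sum(actual * predicted) / total**2  # chance agreement
    kappa = (oa - pe) / (1.0 - pe)              # Cohen's kappa
    return oa, aa, kappa
```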
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Feel free to contact us if you have any questions or need clarification regarding the rules of the challenge or the licensing of the data.
Start: Oct. 10, 2021, midnight
Description: 51 semantic scene categories. Submit your model's prediction results and the performance will be evaluated.
End: Never
# | Username | Score
---|---|---
1 | YangLong | 0.5029