I tried to reproduce Match R-CNN for instance segmentation, and I am not getting the reported segmentation AP results.
Specifically, I am getting a much lower APsmall (42.5 vs 63.4) and APmedium (34.5 vs 70.0). Does anyone have the same issue?
Are there any specific hyperparameters that helped you deal with smaller objects (say, image resizing or anchor sizes)?
Below are my model's detection and segmentation metrics on the Val split:
-------------------------------------------------------------------------------------
('bbox', {'AP': 67.86324832728609,
'AP-long_sleeved_dress': 56.541174163962225,
'AP-long_sleeved_outwear': 74.16449293306097,
'AP-long_sleeved_shirt': 73.75971339840385,
'AP-short_sleeved_dress': 72.8306767642245,
'AP-short_sleeved_outwear': 43.61379907240868,
'AP-short_sleeved_shirt': 81.9145365194973,
'AP-shorts': 73.37486624090114,
'AP-skirt': 75.49207732009415,
'AP-sling': 45.75785464789856,
'AP-sling_dress': 67.33678391960328,
'AP-trousers': 76.54638488474107,
'AP-vest': 67.84181992205902,
'AP-vest_dress': 73.04804846786432,
'AP50': 81.15095138585319,
'AP75': 77.16118635751069,
'APl': 68.08711310612466,
'APm': 46.284572835916634,
'APs': 40.04950495049505}),
('segm',
{'AP': 65.48012105746339,
'AP-long_sleeved_dress': 55.73186013884395,
'AP-long_sleeved_outwear': 65.80836195914864,
'AP-long_sleeved_shirt': 73.21708850317896,
'AP-short_sleeved_dress': 69.94720795596251,
'AP-short_sleeved_outwear': 39.933479439930345,
'AP-short_sleeved_shirt': 81.9019089354212,
'AP-shorts': 67.99202822354722,
'AP-skirt': 72.64990467430529,
'AP-sling': 45.614371104823185,
'AP-sling_dress': 68.66755489278036,
'AP-trousers': 73.53251749175784,
'AP-vest': 66.23205415164291,
'AP-vest_dress': 70.01323627568168,
'AP50': 80.736844713428,
'AP75': 76.3392207370798,
'APl': 65.88659721926655,
'APm': 34.59617538897983,
'APs': 42.57425742574257})
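For reference, the APs / APm / APl entries in printouts like the one above follow the COCO area-range convention, where an instance is binned by its pixel area:

```python
# COCO-style object-area ranges (in pixels^2) behind APs / APm / APl.
# An instance counts as "small" if its area is below 32*32 pixels, etc.
area_rng = {
    "small":  (0, 32 ** 2),         # APs: area < 1024 px^2
    "medium": (32 ** 2, 96 ** 2),   # APm: 1024 <= area < 9216 px^2
    "large":  (96 ** 2, 1e5 ** 2),  # APl: area >= 9216 px^2
}

for name, (lo, hi) in area_rng.items():
    print(f"{name}: [{lo}, {hi})")
```

So the gap reported here is concentrated on instances smaller than roughly 96x96 pixels.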
Hi, below is our config file for training the segmentation baseline:
MODEL:
  TYPE: generalized_rcnn
  CONV_BODY: FPN.add_fpn_ResNet50_conv5_body
  NUM_CLASSES: 14
  FASTER_RCNN: True
  MASK_ON: True
NUM_GPUS: 8
SOLVER:
  WEIGHT_DECAY: 0.0001
  LR_POLICY: steps_with_decay
  BASE_LR: 0.01
  GAMMA: 0.1
  MAX_ITER: 300000
  STEPS: [0, 200000, 270000]
FPN:
  FPN_ON: True
  MULTILEVEL_ROIS: True
  MULTILEVEL_RPN: True
FAST_RCNN:
  ROI_BOX_HEAD: fast_rcnn_heads.add_roi_2mlp_head
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 7
  ROI_XFORM_SAMPLING_RATIO: 2
MRCNN:
  ROI_MASK_HEAD: mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs
  RESOLUTION: 28  # (output mask resolution) default 14
  ROI_XFORM_METHOD: RoIAlign
  ROI_XFORM_RESOLUTION: 14  # default 7
  ROI_XFORM_SAMPLING_RATIO: 2  # default 0
  DILATION: 1  # default 2
  CONV_INIT: MSRAFill  # default GaussianFill
TRAIN:
  IMS_PER_BATCH: 2
  USE_FLIPPED: True
  AUTO_RESUME: True
  SNAPSHOT_ITERS: 5000
  WEIGHTS: ../R-50.pkl
  DATASETS: ('seg_deepfashion2_train_192k',)
  SCALES: (800,)
  MAX_SIZE: 1333
  BATCH_SIZE_PER_IM: 512
  RPN_PRE_NMS_TOP_N: 2000  # Per FPN level
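On the small-object question above: two knobs commonly tried in Detectron-style configs are multi-scale training and a smaller base anchor on the finest FPN level. A sketch of what that could look like (these values are illustrative, not the settings used for the paper; `FPN.RPN_ANCHOR_START_SIZE` is the Detectron key for the anchor size at the highest-resolution level, default 32):

```yaml
FPN:
  # Smaller base anchor on the finest pyramid level; Detectron default is 32.
  RPN_ANCHOR_START_SIZE: 16
TRAIN:
  # Multi-scale training: the shorter side is sampled from this tuple
  # each iteration instead of being fixed at 800.
  SCALES: (640, 672, 704, 736, 768, 800)
  MAX_SIZE: 1333
```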
Also, the performance reported in the paper was obtained by training on the full DeepFashion2 training set. For performance when training on the released DeepFashion2 training set, please refer to https://github.com/switchablenorms/DeepFashion2.
Posted by: geyuying @ April 9, 2020, 6:17 a.m.