CodaLab -

> 10th place method

First of all, big thanks to MAFAT for organizing the competition! I had so much fun in the process, and for me personally it is an important achievement, as it was my first competition of this kind. I liked the overall flow of the competition and how the organizers handled their part. The starter notebook with the baseline model and submission generation code really made the learning curve smoother. Again, I don't know if this is the usual practice, but I never felt left solving the unimportant bits and could concentrate on the main problem.

For legal reasons I cannot share the code, but I can say more about the methods used...

Data:
The entire training set was used. In addition, a small amount of the synthetic test data and a big chunk of the synthetic aux data were used to add many "human" instances. The final dataset, aside from the public test set, contained 15809 entries, 8398 of them "human", which is my idea of balanced.
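A quick sanity check on that balance, using only the two counts quoted above (the variable names are illustrative, and the remainder is taken to be "animal" since the challenge has two classes):

```python
total_entries = 15809   # final dataset size, excluding the public test set
human_entries = 8398    # entries labelled "human"

animal_entries = total_entries - human_entries
human_share = human_entries / total_entries

print(animal_entries)          # 7411
print(round(human_share, 3))   # 0.531
```

So roughly 53% of the entries are "human" and 47% are "animal".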
The same preprocessing steps as in the baseline were used, except that I did not use max_value_on_doppler. I did shift the zero velocity to the middle of the array (mostly for convenience), but I don't think it should have changed anything.
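For readers unfamiliar with this kind of shift, a minimal NumPy sketch follows. It assumes the spectrogram comes from an FFT whose zero-Doppler bin sits at index 0 and that the Doppler axis is axis 0. Both are assumptions made for illustration, not details confirmed by the post, and `center_zero_velocity` is a hypothetical helper name.

```python
import numpy as np

def center_zero_velocity(spectrogram, doppler_axis=0):
    """Move the zero-Doppler bin from index 0 to the middle of the Doppler axis.

    Assumes FFT-style bin ordering (zero frequency first); this layout is an
    assumption, not something stated in the original post.
    """
    return np.fft.fftshift(spectrogram, axes=doppler_axis)

# Toy 8x3 array: row 0 stands in for the zero-velocity bin, marked with 1s.
toy = np.zeros((8, 3))
toy[0, :] = 1.0
centered = center_zero_velocity(toy)
print(np.argmax(centered[:, 0]))  # 4, the middle row for 8 bins
```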

Augmentation:
Width shift 0.05 (this shifts along the velocity axis, I think)
Height shift 0.25 (slow time shift)
Horizontal and vertical flip. Fill mode "nearest" for the shifts; "wrap" was consistently giving me worse LB scores.
So a very small amount of augmentation, I guess.
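The difference between the two fill modes can be illustrated with a small NumPy sketch. This is not the code used for the submission; `shift_rows` is a hypothetical helper, and the toy array only demonstrates what happens to the rows vacated by a shift.

```python
import numpy as np

def shift_rows(image, shift, mode="nearest"):
    """Shift a 2-D array along axis 0 by `shift` rows (positive = downward).

    mode="wrap":    rows pushed off one edge re-enter at the other edge.
    mode="nearest": vacated rows are filled with copies of the nearest edge row.
    """
    if mode == "wrap":
        return np.roll(image, shift, axis=0)
    if mode != "nearest":
        raise ValueError(f"unsupported mode: {mode}")
    if shift == 0:
        return image.copy()
    n_rows = image.shape[0]
    # Pad with edge values on the side the data moves away from, then crop.
    pad = (shift, 0) if shift > 0 else (0, -shift)
    padded = np.pad(image, (pad, (0, 0)), mode="edge")
    return padded[:n_rows] if shift > 0 else padded[-n_rows:]

toy = np.arange(5).reshape(5, 1)
print(shift_rows(toy, 2, mode="wrap")[:, 0].tolist())     # [3, 4, 0, 1, 2]
print(shift_rows(toy, 2, mode="nearest")[:, 0].tolist())  # [0, 0, 0, 1, 2]
```

With "wrap", content from the far end of the track reappears at the start, which creates a discontinuity that never occurs in real data; "nearest" only repeats the edge row.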

Models:
The final submission was generated using two types of models. The main model was a modified ResNet50. I halved the number of filters in all conv blocks, but never properly tested how this impacts the results. I added dropout layers after the first few conv blocks.
The second model evolved from the original baseline model. Its final version contained around 13-15 conv layers split into three blocks, with batch normalization after each block. GlobalAveragePooling2D was used instead of Flatten before the final Dense layer. Kernel regularizers and bias regularizers were used on most of the layers to reduce overfitting. Weight decay was set to 1e-3.
I used two types of models based purely on the observation that one of them had much higher accuracy on "human" instances, while the other was much more accurate on "animal".
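One way to spot this kind of complementarity is to compute accuracy separately for each true class. The sketch below is plain Python with made-up labels; `per_class_accuracy` is a hypothetical helper, not part of the competition code.

```python
def per_class_accuracy(y_true, y_pred):
    """Return {label: accuracy}, computed separately for each true label."""
    correct, total = {}, {}
    for truth, pred in zip(y_true, y_pred):
        total[truth] = total.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (truth == pred)
    return {label: correct[label] / total[label] for label in total}

# Made-up labels, purely to show the call.
y_true = ["human", "human", "animal", "animal"]
y_pred = ["human", "animal", "animal", "animal"]
print(per_class_accuracy(y_true, y_pred))  # {'human': 0.5, 'animal': 1.0}
```

Running this for each model on the same held-out samples shows which class each one handles better.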

Training method and ensembling:
The final submission was generated using an ensemble of models saved during training with Cyclical Learning. I used 5 and 4 cycles for the main and the secondary model respectively, saving weights after each cycle and repeating the whole process several times. This way I had several tens of models for the ensemble. The final result was just an average of the predictions of all these models.
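The general idea can be sketched in plain Python as below. The post does not give the schedule shape or the learning-rate bounds, so the cosine shape, `lr_max`, `lr_min`, and all function names are assumptions for illustration only.

```python
import math

def cyclic_lr(step, steps_per_cycle, lr_max=1e-3, lr_min=1e-5):
    """Cosine-annealed learning rate that restarts at lr_max every cycle.

    The cosine shape and the default bounds are assumed, not from the post.
    """
    position = (step % steps_per_cycle) / steps_per_cycle  # in [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * position))

def is_cycle_end(step, steps_per_cycle):
    """True on the last step of a cycle, where a snapshot would be saved."""
    return (step + 1) % steps_per_cycle == 0

def average_predictions(per_model_scores):
    """Element-wise mean of several models' scores for the same samples."""
    n_models = len(per_model_scores)
    return [sum(scores) / n_models for scores in zip(*per_model_scores)]

# 4 cycles of 5 steps each: one snapshot at the end of every cycle.
snapshot_steps = [s for s in range(20) if is_cycle_end(s, 5)]
print(snapshot_steps)  # [4, 9, 14, 19]

# Two snapshots scoring the same two samples (made-up scores).
print(average_predictions([[0.25, 1.0], [0.75, 0.5]]))  # [0.5, 0.75]
```

Each snapshot is taken when the learning rate is at its lowest point in the cycle, and the restart to a high learning rate pushes the next snapshot toward a different solution, which is what makes the averaged models usefully different.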

Most decisions were based on the LB results, as I had no other indication of whether my model was getting better or worse. I don't know what the usual case is with such a big mismatch between the public test set and the training set, but here it was frustrating not to have a clear way to tell whether one's model was improving.

What worked (at least for the public phase):
Batch normalization layers after the convolutions for the baseline model
Explicit regularization methods - kernel regularization (l1, l2)
Implicit regularization - augmentation
Adding aux synthetic data
Ensembling (duh!)
Some of the above while using deeper and deeper architectures
Cyclic Learning as an ensembling method

What didn't work:
All the creative things I hoped would work: adding noise to the data, removing some obviously bad data, cropping only part of the spectrum, splitting into more classes (by SNR), adding random background samples to the data.
Many of the above may have failed because of my inexperience with the Keras (or TF) interfaces in general, as I was struggling whenever I implemented something outside of the basic tutorials. ;)
Bigger augmentation ranges
More or less dropout
Cross validation (at least I couldn't make any decisions based on it)
Wrapping method for data shifts while augmenting.
Enriching the dataset by combining tracks from the training set and creating shifted samples.

There are so many things I didn't try which, as I later found out, are the standard things to try in Kaggle competitions! But I think it was too much to handle for the first try. Deterministic results were one of the many topics previously unknown to me.
Again, thanks to the organizers and all the competitors who motivated me to improve.
Hope to see some more thoughts!

Posted by: michaild @ Oct. 21, 2020, 9:01 a.m.

Thank you very much for sharing!

We also kinda implemented a modified mini ResNet with some augmentation steps and extra data synthesized by resampling. We ended up in 18th position. I would like to recreate your solution or integrate it into ours and try to reach your accuracy. I hope I can contact you if I have any doubts about the steps you mentioned.

Regards,
Shankar

Posted by: shankarkumarj @ Oct. 21, 2020, 1:11 p.m.

Sure, you can drop me an e-mail at michail.drozdov@gmail.com.

Posted by: michaild @ Oct. 22, 2020, 10 a.m.


MAFAT Radar Challenge - Can you distinguish between humans and animals in radar tracks? Forum
