Among the evaluation criteria, I have some questions about "activations" and "depth".
1) I'm wondering why the "activation" function is included in the evaluation criteria. Could you explain it briefly?
2) Also, is a model evaluated as better when it has fewer "activation" functions?
3) How do you measure "depth"? For example, in the case of a model structure that is divided into two branches each having a depth of 16 and then merged, is the depth 16 or 32?
Thank you in advance.
You do not need to optimize all the aspects.
The evaluation criteria are there to define efficiency.
Try your best to improve the efficiency with your own understanding of efficiency.
For the activations, please refer to the paper entitled "Designing Network Design Spaces".
"Activations" there means the size of the output tensors of all conv layers, not the number of activation functions.
The number of parameters and runtime are more important than depth.
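To make the activations definition above concrete, here is a rough sketch that sums the output tensor sizes of a chain of conv layers. It is only an illustration, not the official scoring script: the helper names (conv2d_output_hw, count_activations), the layer dictionaries, the 64x64 input, and the batch size of 1 are all assumptions made for the example, and it only handles a plain sequential stack of convs.

```python
def conv2d_output_hw(h, w, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a 2D convolution (standard floor formula)."""
    out_h = (h + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1
    out_w = (w + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1
    return out_h, out_w


def count_activations(input_hw, layers):
    """Sum the number of elements in the output tensor of every conv layer.

    Assumes batch size 1 and a purely sequential chain of conv layers
    (hypothetical layer description: out_channels, kernel, stride, padding).
    """
    h, w = input_hw
    total = 0
    for layer in layers:
        h, w = conv2d_output_hw(
            h, w,
            layer["kernel"],
            stride=layer.get("stride", 1),
            padding=layer.get("padding", 0),
        )
        # One conv layer contributes out_channels * out_h * out_w elements.
        total += layer["out_channels"] * h * w
    return total


# Hypothetical three-layer network on a 64x64 input.
layers = [
    {"out_channels": 16, "kernel": 3, "padding": 1},
    {"out_channels": 16, "kernel": 3, "padding": 1},
    {"out_channels": 3, "kernel": 3, "padding": 1},
]
print(count_activations((64, 64), layers))  # 143360
```

For a real model with branches, a common approach is to attach forward hooks to the conv modules and add up the element counts of their outputs, which gives the same quantity without assuming a sequential structure.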