Predicting Generalization in Deep Learning Forum


> About the weak and strong baselines

In the competition description document (NeurIPS 2020 Competition: Predicting Generalization in Deep Learning, Version 1.0), section 1.6 says:
We will be providing baselines in the form of 2 different measures. The first measure is the VC-dimension of the models and the second measure is the true generalization gap of the models with added noise. The VC-dimension of convolutional neural networks can be found in [13]. The former is meant to be a weak baseline from classical machine learning literature, and the latter is meant to be a strong baseline, which we expect few solutions to beat since it is essentially a noisy version of the true quantity of interest.

For now, there are 3 baselines in the starting kit: distance_from_init, jacobian_norm, and sharpness.
I would like to ask where the two baselines described in the document are. Are they in the starting kit zip file?

Posted by: zhanyu @ July 21, 2020, 5 p.m.

Also, in the 3 baseline examples, the code for sharpness seems to be wrong in the loss_fn part.
# =================================
@tf.function
def optimize(loss_fn, optimizer, data):
    x, y = data
    y = tf.one_hot(y, 10)
    with tf.GradientTape() as tape:
        logits = model(x)
        loss = loss_fn(logits, y)
    variables = model.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
# =================================

Here it uses loss_fn(logits, y) instead of loss_fn(y, logits).
From the official TF2 documentation, https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy, the call signature is loss_fn(y_true, y_pred), so loss_fn(y, logits) should be the correct usage.
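
For reference, a minimal corrected sketch of the step (assuming, as in the snippet above, that model is defined elsewhere in the starting kit and loss_fn is a tf.keras loss such as CategoricalCrossentropy(from_logits=True)):

# =================================
import tensorflow as tf  # model is assumed to be defined as in the starting kit

@tf.function
def optimize(loss_fn, optimizer, data):
    x, y = data
    y = tf.one_hot(y, 10)
    with tf.GradientTape() as tape:
        logits = model(x)
        loss = loss_fn(y, logits)  # y_true first, y_pred second
    variables = model.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
# =================================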

Posted by: zhanyu @ July 21, 2020, 11:46 p.m.

Thank you for spotting the errors! I have corrected it and also added a VC-dimension baseline.
You can find them at: https://drive.google.com/file/d/1MAH-usCGX-rLwGw11pwWunLzkxj62d2P/view?usp=sharing
Alternatively, you can go to the "get data" tab under Participate to download them.
Regarding oracles, we realized that letting a submission have access to the ground truth is not ideal for a competition, so we cannot provide a functioning baseline that returns a noisy version of the actual test accuracy. However, you should be able to do this quite easily locally on the public data by looking up each model's test accuracy in model_configs.json.
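
For anyone trying this locally, a minimal sketch of the lookup could look like the following; the key names "metrics" and "test_accuracy" are assumptions made for illustration, so check the actual structure of model_configs.json in the public data:

# =================================
import json

# Load the per-model metadata shipped with the public data.
with open("model_configs.json") as f:
    configs = json.load(f)

# Map each model id to its recorded test accuracy
# (the key path below is assumed, not guaranteed).
test_acc = {mid: cfg["metrics"]["test_accuracy"] for mid, cfg in configs.items()}
# =================================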

Posted by: ydjiang @ July 22, 2020, 7:01 p.m.

If you are interested in a quick number, an oracle with uniform noise of plus or minus 1 percent achieves a score of around 60 on the public data.
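
Building on the lookup sketched above (purely illustrative, using the same assumed test_acc dictionary), such a noisy oracle could be formed as:

# =================================
import random

# True test accuracy perturbed by uniform noise of plus/minus 1 percent.
noisy_oracle = {mid: acc + random.uniform(-0.01, 0.01) for mid, acc in test_acc.items()}
# =================================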

Posted by: ydjiang @ July 22, 2020, 7:02 p.m.

Thanks for updating the sharpness baselines, ydjiang! However, I am getting the following error while running sharpness_v2:
"""NameError: name 'outer_n' is not defined"""
Should we set outer_n=20, just as in the first version (sharpness/complexity.py)?

Posted by: graddescent @ July 23, 2020, 10:53 a.m.

Yes, that should work.

Posted by: ydjiang @ July 23, 2020, 4:59 p.m.

Hi Yiding, sorry to bother you again, but I have another submission (07/23/2020 06:04:09) that has been running for about 11 hours, even though it finishes in my local environment (V100) within 10 minutes. Could you help me check its status or just terminate the run? Thank you!

Posted by: zhanyu @ July 23, 2020, 5:03 p.m.

Rerunning it right now. The ingestion only took 40 minutes, but scoring is not getting scheduled properly...

Posted by: ydjiang @ July 23, 2020, 5:26 p.m.

It's done. Do you see it on your end?

Posted by: ydjiang @ July 23, 2020, 6:07 p.m.

Yes. I see. Thank you very much!! Really appreciate it!

Posted by: zhanyu @ July 23, 2020, 6:08 p.m.

What exactly should the paths **path/to/prediction** and **path/to/output** point to if we want to run the scoring script locally?

Posted by: deepq-lab @ July 30, 2020, 2:27 p.m.