Hi, our submission with timestamp 11/02/2020 00:52:18 failed, but the error logs are empty! Can you share any more information on this? Posted by: vamshichowdary @ Nov. 2, 2020, 4:59 a.m.
It ran into a memory error. I'll run it again on the private server for you so it doesn't take up your submission count.
It crashed the Docker container. This is the first time I am seeing an error like this. Posted by: ydjiang @ Nov. 2, 2020, 6:04 a.m.
Thanks for the reply. Wow, not sure what caused the Docker crash. It is working fine on our local system. Were you able to re-run it successfully? If not, can we make another submission?
I am re-running it right now. The failed one shouldn't count towards the limit. You should be able to resubmit. Posted by: ydjiang @ Nov. 2, 2020, 6:20 a.m.
Any update on this? Also, our third submission is still showing as "Running" on the submissions page (timestamp: 11/02/2020 00:53:20 server time, which is 11/01/2020 16:53:20 PST local time). Can you give an update on that too? Posted by: vamshichowdary @ Nov. 2, 2020, 5:31 p.m.
Both of them still appear to be running, but I am pretty sure the one I ran for you failed.
Is it possible that you are running close to the memory limit? Some previous solutions failed the same way because they filled up the memory and crashed the Docker container.
After that, the status cannot be updated anymore.
All of the submissions ran without any problems on our machine, which is a Xeon 6138 + Nvidia Titan RTX (24 GB). Not really sure what is happening here. Can we submit the same code again if these fail? Posted by: vamshichowdary @ Nov. 2, 2020, 6:12 p.m.
We also tested the same code on an NVIDIA 1080Ti (11 GB) and it is working fine. Is it possible that you are setting any host memory limits when you run Docker? Posted by: vamshichowdary @ Nov. 2, 2020, 6:26 p.m.
I was referring to the system memory, not GPU memory. Posted by: ydjiang @ Nov. 2, 2020, 6:27 p.m.
Oh, I see. Our model is consuming ~3.9G of system memory. Is that more than the limit? Posted by: vamshichowdary @ Nov. 2, 2020, 6:37 p.m.
That shouldn't be a problem. Maybe there are some other problems then. Posted by: ydjiang @ Nov. 2, 2020, 7:50 p.m.
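For anyone hitting the same silent failure: since the discussion turned on system (host) memory rather than GPU memory, one way to check the figure before submitting is to log the peak resident memory of the process itself. The following is only a minimal sketch using Python's standard `resource` module (Unix only). `run_inference` is a hypothetical stand-in for the real submission entry point, and nothing here reflects how the competition server actually measures or limits memory.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_inference() -> int:
    # Hypothetical placeholder for the real model code: it just
    # allocates a list so that the process uses some memory.
    data = [0] * 1_000_000
    return len(data)


if __name__ == "__main__":
    run_inference()
    print(f"peak system memory: {peak_rss_mib():.1f} MiB")
```

Note that this reports only the current process. Memory used by child processes (for example data-loader workers) is not included, so the real total may be higher than the printed value.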