Welcome to the official page of the Project module of the Introduction to Machine Learning course, taught to third-year bachelor students at Innopolis University. The project is run as a competition. The details are as follows.
The problem you will solve here is digit classification. You will build a model that takes as input a batch of grayscale digit images, each of shape (28, 28). Please take a look at the figure below. Note that the batch has shape (n_samples, 28, 28, 1), where the last axis is the single grayscale channel.
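As a minimal sketch of the input shape described above, the NumPy snippet below turns a stack of (28, 28) images into a batch of shape (n_samples, 28, 28, 1). The helper name to_batch and the random stand-in data are assumptions made for illustration; they are not part of the competition kit.

```python
import numpy as np


def to_batch(images):
    """Return images as an array of shape (n_samples, 28, 28, 1)."""
    images = np.asarray(images)
    if images.ndim == 2:
        # A single (28, 28) image: add the batch axis first.
        images = images[np.newaxis, ...]
    if images.ndim == 3:
        # (n_samples, 28, 28): add the trailing grayscale channel axis.
        images = images[..., np.newaxis]
    if images.shape[1:] != (28, 28, 1):
        raise ValueError(f"unexpected shape {images.shape}")
    return images


# Stand-in data: 5 random grayscale images (hypothetical, not real digits).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
batch = to_batch(fake_images)
print(batch.shape)  # (5, 28, 28, 1)
```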
In this competition we are interested in how well your model generalizes, so there are three datasets: a train set, a public test set, and a private test set.
The train set is accessible to you, but both the public and private test sets are hidden.
The competition is divided into two phases:
- Development phase: You develop your model and train it on the train set. After training, you test the model on the public test set by submitting it as described in the "Instructions" menu. This gives you the model's score on the public test set, which you can use to tune the model further on the train set. You can repeat this process up to 100 times in total (no more than 10 times per day) until the final submission.
- Evaluation phase: Once the submission deadline of the first phase has passed, the submission system stops accepting submissions. We take your last model on the leaderboard and evaluate it on the private test set to get its true performance on data that was never revealed to it. Both the score and your rank among all students on the private test set are used to calculate your grade. This lets us assess how well your model generalizes. You can read more about this in the "Evaluation" menu.
- The F1-score is used in the evaluation.
- The first phase of the competition will be evaluated on the public test set.
- The second phase of the competition will be offline, where we will evaluate your last model on another dataset (private test set).
In short, the first phase is for you to tune your model, and the second phase is for us to evaluate it.
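To make the metric concrete, here is a small sketch of an F1 computation on integer class labels. The page does not say which averaging is used or how predictions are turned into labels, so macro averaging and the toy labels below are assumptions, and macro_f1 is a hypothetical helper rather than the scoring code.

```python
def macro_f1(y_true, y_pred, n_classes=10):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    f1_scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Convention used here: a class with no true and no predicted samples scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


# Toy labels for a 3-class problem (illustrative values only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(y_true, y_pred, n_classes=3), 3))  # 0.656
```

In this toy case the per-class F1 values are 0.5, 0.8, and about 0.667, so their mean is about 0.656.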
The final grade will be a combination of your score and your rank.
Let your score on the private test set be a ∈ [0, 1] and your rank be b ∈ [1, S], where S is the number of participants (students).
For example, if there are 120 students:
- if your score is 0.75 and your rank is 20th, then the grade will be 0.65 + (1 - (19/120)) * (1 - 0.65) ≈ 0.945 = 94.5%
- if your score is 0.85 and your rank is 100th, then the grade will be 0.75 + (1 - (99/120)) * (1 - 0.75) ≈ 0.794 = 79.4%
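The general formula is not stated on this page, but both worked examples are consistent with the following reading: subtract 0.1 from the score to get a base value, then add the remaining gap to 1 scaled by 1 - (b - 1)/S. The sketch below encodes that inferred rule; the function name final_grade and the offset parameter are assumptions, so treat this as a way to check the arithmetic rather than as the official grading code.

```python
def final_grade(score, rank, n_students, offset=0.1):
    """Grade inferred from the two worked examples (not an official formula).

    base  = score - offset
    grade = base + (1 - (rank - 1) / n_students) * (1 - base)
    """
    base = score - offset
    rank_factor = 1 - (rank - 1) / n_students
    return base + rank_factor * (1 - base)


# The two examples from the text, with S = 120 students.
print(f"{final_grade(0.75, 20, 120):.3f}")   # 0.945
print(f"{final_grade(0.85, 100, 120):.3f}")  # 0.794
```

Under this reading, a better rank can outweigh a lower raw score, which is what the two examples illustrate.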
- Cheating is a serious academic offense and will be dealt with strictly for all parties involved. Delivering nothing is always better than delivering a copy.
- You should develop your model using Keras from the TensorFlow library. Please DO NOT use "from keras import ..."; instead, USE "from tensorflow.keras import ...".
- Both the public and private test sets have shape (n_samples, 28, 28, 1), representing grayscale images with values in the range [0, 255], the same as the training set.
- The submitted model will be used as follows: "model.predict(x)", where x has shape (n_samples, 28, 28, 1).
- You can load the data using the NumPy function: "x = np.load('x.npy')".
- After developing and training your model, save it using: "model.save('model.h5')". Take care: the file must be named 'model.h5'.
- Zip the model to get 'model.h5.zip'. DO NOT put it in a folder and then zip the folder; zip the h5 file directly.
- Submit the file 'model.h5.zip' and wait for the response.
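The packaging step above can be sketched with Python's standard zipfile module. The helper zip_model is a hypothetical name, and the demo writes a placeholder file instead of a real saved Keras model, so this only illustrates how to keep 'model.h5' at the root of the archive with no enclosing folder.

```python
import os
import tempfile
import zipfile


def zip_model(h5_path, zip_path=None):
    """Zip a single .h5 file so it sits at the archive root (no folder)."""
    if zip_path is None:
        zip_path = h5_path + ".zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any directory prefix, so the archive holds just 'model.h5'.
        archive.write(h5_path, arcname=os.path.basename(h5_path))
    return zip_path


# Demo with a placeholder file standing in for a real saved model.
with tempfile.TemporaryDirectory() as workdir:
    h5_path = os.path.join(workdir, "model.h5")
    with open(h5_path, "wb") as f:
        f.write(b"placeholder, not a real Keras model")
    zip_path = zip_model(h5_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive_names = archive.namelist()

print(archive_names)  # ['model.h5']
```

Listing the archive contents before submitting is a quick way to confirm that the file is not nested inside a folder.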
* Special thanks to AbdelRahman Abounegm for his help in developing the Docker image for this competition.