ChaLearn Automatic Machine Learning Challenge (AutoML) :: Round 0

Organized by lukasz.romaszko

AutoML hackathon, Paris, April 22, 2015


This is a "supervised learning" challenge in machine learning. We are making available 30 datasets, all pre-formatted in given feature representations (this means that each example consists of a fixed number of numerical coefficients). The challenge is to solve classification and regression problems, without any further human intervention.

The difficulty is that there is a broad diversity of data types and distributions (including balanced or unbalanced classes, sparse or dense feature representations, with or without missing values or categorical variables, various metrics of evaluation, and various proportions of numbers of features and examples). The problems are drawn from a wide variety of domains and include medical diagnosis from laboratory analyses, speech recognition, credit rating, prediction of drug toxicity or efficacy, classification of text, prediction of customer satisfaction, object recognition, protein structure prediction, action recognition in video data, etc. While there exist machine learning toolkits including methods that can solve all these problems, it still takes considerable human effort to find, for a given combination of dataset, task, metric of evaluation, and available computational time, the combination of methods and hyper-parameter settings that is best suited. Your challenge is to create the "perfect black box" that eliminates the human in the loop.


This challenge is brought to you by ChaLearn. Contact the organizers.



This challenge is concerned with regression and classification problems (binary, multi-class, or multi-label) from data already formatted in fixed-length feature-vector representations. Each task is associated with a dataset coming from a real application. The domains of application are very diverse and are drawn from: biology and medicine, ecology, energy and sustainability management, image, text, audio, speech, video and other sensor data processing, internet social media management and advertising, market analysis and financial prediction.
All datasets present themselves in the form of data matrices with samples in rows and features (or variables) in columns. For instance, in a medical application, the samples may represent patient records and the features may represent results of laboratory analyses. The goal is to predict a target value, for instance the diagnosis "diseased" or "healthy" in the case of a medical diagnosis problem.
The identity of the datasets and the features is concealed (except in round 0) to avoid the use of domain knowledge and push the participants to design fully automated machine learning solutions.
In addition, the tasks are constrained by:

  • A Time Budget.
  • A Scoring Metric.

Task, scoring metric and time budget are provided with the data, in a special "info" file.

Time Budget

The Codalab platform provides computational resources shared by all participants. To ensure the fairness of the evaluation, when a code submission is evaluated, its execution time is limited to a given Time Budget, which varies from dataset to dataset. The time budget is provided with each dataset in its "info" file. The organizers reserve the right to adjust the time budget by supplying the participants with new info files.
The participants who submit results (instead of code) are NOT constrained by the Time Budget, since they can run their code on their own platform. This may be advantageous for entries counting towards the Final phases (immediately following a Tweakathon). The participants wishing to also enter the AutoML phases, which require submitting code, can submit BOTH results and code (simultaneously). See the Instructions for details.

Scoring Metrics

The scoring program computes a score by comparing submitted predictions with reference "target values". For each sample i, i=1:P, the target value is:

  • a continuous numeric coefficient yi, for regression problems;
  • a vector of binary indicators [yik] in {0, 1}, for multi-class or multi-label classification problems (one per class k);
  • a single binary indicator yi in {0, 1}, for binary classification problems.

The participants must turn in prediction values matching as closely as possible the target value, in the form of:

  • a continuous numeric coefficient qi for regression problems;
  • a vector of numeric coefficients [qik] in the range [0, 1] for multi-class or multi-label classification problems (one per class k);
  • a single numeric coefficient qi in the range [0, 1] for binary classification problems.
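The three prediction formats above can be illustrated with NumPy arrays (the sample count, class count, and values below are invented for illustration):

```python
import numpy as np

P = 4  # number of samples (illustrative)
C = 3  # number of classes (illustrative)

# Regression: one continuous value q_i per sample.
q_regression = np.array([1.7, -0.2, 3.4, 0.8])

# Binary classification: one score q_i in [0, 1] per sample.
q_binary = np.array([0.9, 0.1, 0.5, 0.7])

# Multi-class: C scores [q_ik] in [0, 1] per sample (rows may sum to 1
# for multi-class problems, but need not for multi-label problems).
q_multiclass = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.5, 0.3],
                         [0.1, 0.1, 0.8],
                         [0.3, 0.4, 0.3]])

assert q_binary.min() >= 0 and q_binary.max() <= 1
assert q_multiclass.shape == (P, C)
```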

The Starting Kit contains the Python implementation of all scoring metrics used to evaluate the entries. Each dataset has its own metric (scoring criterion), specified in its "info" file. All scores are re-normalized such that the expected value of the score for a "trivial guess" based on class prior probabilities is 0 and the optimal score is 1. Multi-label problems are treated as multiple binary classification problems and are evaluated by the average of the scores of each binary classification sub-problem.
The scores are taken from the following list:

  • R2: R-square or "coefficient of determination" used for regression problems: R2 = 1-MSE/VAR, where MSE=< (yi - qi)2> is the mean-square-error and VAR= < (yi - m)2> is the variance, with m=< yi >.
  • ABS: A coefficient similar to the R2 but based on mean absolute error (MAE) and mean absolute deviation (MAD): ABS =  1-MAE/MAD, with MAE=< abs(yi - qi) > and MAD=< abs(yi - m) >.
  • BAC: Balanced accuracy, which is the average of class-wise accuracy for classification problems (or the average of sensitivity (true positive rate) and specificity (true negative rate) for the special case of binary classification). For binary classification problems, the class-wise accuracy is the fraction of correct class predictions when qi is thresholded at 0.5, for each class. The class-wise accuracy is averaged over all classes for multi-label problems. For multi-class classification problems, the predictions are binarized by selecting the class with maximum prediction value argmaxk qik before computing the class-wise accuracy. We normalize the BAC with the formula BAC := (BAC-R)/(1-R), where R is the expected value of BAC for random predictions (i.e. R=0.5 for binary classification and R=(1/C) for C-class classification problems).
  • AUC: Area under the ROC curve, used for ranking and for binary classification problems. The ROC curve is the curve of sensitivity vs. 1-specificity, when a threshold is varied on the predictions. The AUC is identical to the BAC for binary predictions. The AUC is calculated for each class separately before averaging over all classes. We normalize it with the formula: AUC := 2AUC-1, making it de-facto identical to the so-called Gini index.
  • F1 score: The harmonic mean of precision and recall. Precision = positive predictive value = true_positive/all_called_positive. Recall = sensitivity = true positive rate = true_positive/all_real_positive. Prediction thresholding and class averaging are handled similarly to the case of the BAC. We also normalize F1 with F1 := (F1-R)/(1-R), where R is the expected value of F1 for random predictions (i.e. R=0.5 for binary classification and R=(1/C) for C-class classification problems).
  • PAC: Probabilistic accuracy PAC = exp(-CE) based on the cross-entropy or log loss, CE = - < sumk yik log(qik) > for multi-class classification and CE = - < yi log(qi) + (1-yi) log(1-qi) > for binary classification and multi-label problems. Class averaging is performed after taking the exponential in the multi-label case. We normalize with PAC := (PAC-R)/(1-R), where R is the score obtained using qi = < yi > or qik = < yik > (i.e. using as predictions the fraction of positive class examples as an estimate of the prior probability).

We note that for R2, ABS, and PAC the normalization uses a "trivial guess" corresponding to the average target value qi =< yi > or qik=< yik >. In contrast, for BAC, AUC, and F1 the "trivial guess" is a random prediction of one of the classes with uniform probability.
In all formulas the brackets < . > designate the average over all P samples indexed by i: < yi > = (1/P) sumi (yi). Only R2 and ABS make sense for regression; we compute the other scores for completeness by replacing the target values by binary values after thresholding them in the mid-range.
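As an illustration, here is a minimal Python sketch of two of these scores, R2 and the normalized BAC for binary classification. The official implementations ship with the Starting Kit and may differ in details; the sample labels and scores below are invented:

```python
import numpy as np

def r2_score(y, q):
    """R2 = 1 - MSE/VAR: 0 for the trivial guess q = mean(y), 1 for perfect."""
    mse = np.mean((y - q) ** 2)
    var = np.mean((y - np.mean(y)) ** 2)
    return 1 - mse / var

def normalized_bac(y, q):
    """Balanced accuracy for binary labels y in {0,1} and scores q in [0,1],
    renormalized so random guessing scores 0 (R = 0.5 for binary problems)."""
    pred = (q >= 0.5).astype(int)                 # threshold at 0.5
    sensitivity = np.mean(pred[y == 1] == 1)      # true positive rate
    specificity = np.mean(pred[y == 0] == 0)      # true negative rate
    bac = 0.5 * (sensitivity + specificity)
    return (bac - 0.5) / (1 - 0.5)

y = np.array([1, 1, 0, 0, 1, 0])
q = np.array([0.9, 0.6, 0.2, 0.4, 0.3, 0.1])
# One false negative: sensitivity 2/3, specificity 1, BAC 5/6, normalized 2/3.
```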

Leaderboard score calculation

Each round includes five datasets from different application domains, spanning various levels of difficulty. The participants (or their submitted programs) provide prediction results for the withheld target values (called "solution"), for all 5 datasets. Independently of any intervention of the participants, the original version of the scoring program supplied by the organizers is run on the server to compute the scores. For each dataset, the participants are ranked in decreasing order of performance for the prescribed scoring metric associated with the given task. The overall score is computed by averaging the ranks over all 5 datasets and shown in the column <rank> on the leaderboard.
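The rank-averaging scheme can be sketched as follows (participant names and scores are invented, and tie-breaking is ignored, which the platform may handle differently):

```python
# Hypothetical per-dataset scores for three participants on the 5 datasets
# of a round (higher is better).
scores = {
    "alice": [0.81, 0.65, 0.90, 0.40, 0.75],
    "bob":   [0.79, 0.70, 0.85, 0.45, 0.60],
    "carol": [0.60, 0.72, 0.88, 0.30, 0.70],
}

participants = list(scores)
n_datasets = 5
ranks = {p: [] for p in participants}
for d in range(n_datasets):
    # Rank 1 goes to the best (highest) score on this dataset.
    ordered = sorted(participants, key=lambda p: scores[p][d], reverse=True)
    for rank, p in enumerate(ordered, start=1):
        ranks[p].append(rank)

# The overall leaderboard score is the average rank (lower is better).
leaderboard = {p: sum(r) / float(n_datasets) for p, r in ranks.items()}
```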

We ask the participants to test their systems regularly while training to produce intermediate prediction results, which will allow us to make learning curves (performance as a function of training time). Using such learning curves, we will adjust the "time budget" in subsequent rounds (eventually giving you more computational time!). But only the last point (corresponding to the file with the largest order number) is used for leaderboard calculations.
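A sketch of such an anytime loop is shown below; the function and argument names are invented, and `models` stands in for whatever model search you run. It writes numbered prediction files as training progresses, so the organizers can draw learning curves:

```python
import os
import time

def save_anytime_predictions(models, data, basename, outdir, budget):
    """Repeatedly dump numbered prediction files while a model search runs.
    `models` is any iterable yielding successively better predictors (here,
    plain callables); `data` maps set names ("valid", "test") to inputs."""
    start = time.time()
    written = []
    for num, model in enumerate(models):
        if time.time() - start > budget:
            break  # respect the time budget
        for setname, X in data.items():
            path = os.path.join(outdir,
                                "%s_%s_%03d.predict" % (basename, setname, num))
            with open(path, "w") as f:
                for q in model(X):
                    f.write("%g\n" % q)
            written.append(path)
    return written
```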

The results of the LAST submission made are used to compute the leaderboard results (so you must re-submit an older entry that you prefer if you want it to count as your final entry). This is what is meant by “Leaderboard modifying disallowed”. 

Training, validation and test sets

For each dataset, a labeled training set is provided for training and two unlabeled sets (validation set and test set) are provided for testing.

Phases and rounds

The challenge is run in multiple Phases grouped in rounds, alternating AutoML contests and Tweakathons. There are six rounds: Round 0 (Preparation), followed by 5 rounds of progressive difficulty (Novice, Intermediate, Advanced, Expert, and Master). Except for round 0 (preparation) and round 5 (termination), all rounds include 3 phases, alternating Tweakathons and AutoML contests (NOTE: THERE ARE NO PRIZES FOR THE HACKATHONS):

Phase in round [n] (Goal; Duration; Submissions; Data; Leaderboard scores; Prizes):

  • [+] AutoML[n]: blind test of code; short; NONE (code migrated); new datasets, not downloadable; test set results; prizes: yes.
  • Tweakathon[n]: manual tweaking; 1 month; code and/or results; datasets downloadable; validation set results; prizes: no.
  • [+] Final[n]: results of Tweakathon revealed; short; NONE (results migrated); NA; test set results; prizes: yes.

The results of the last submission made are shown on the leaderboard. Submissions are made in Tweakathon phases only. The last submission of one phase migrates automatically to the next one. If code is submitted, this makes it possible to participate in subsequent phases without making new submissions.

Code vs. result submission

To participate in the AutoML[n] phase, code must be submitted in Tweakathon[n-1]. To participate in Final[n], code or results must be submitted in Tweakathon[n]. If both code and (well-formatted) results are submitted in Tweakathon[n], the results are used for scoring in Tweakathon[n] and Final[n] rather than re-running the code. The code is executed only when results are unavailable or not well formatted. Hence there is no disadvantage to submitting both results and code. There is no obligation to submit the code that produced the results provided. Using mixed submissions of results and code, different methods can be used to enter the Tweakathon/Final phases and to enter the AutoML phases.


There are 5 datasets in each round spanning a range of difficulties:

  • Different tasks: regression, binary classification, multi-class classification, multi-label classification.
  • Class balance: Balanced or unbalanced class proportions.
  • Sparsity: Full matrices or sparse matrices.
  • Missing values: Presence or absence of missing values.
  • Categorical variables: Presence or absence of categorical variables.
  • Irrelevant variables: Presence or absence of additional irrelevant variables (distractors).
  • Number Ptr of training examples: Small or large number of training examples.
  • Number N of variables/features: Small or large number of variables.
  • Aspect ratio Ptr/N of the training data matrix: Ptr>>N, Ptr~=N or Ptr<<N.

We will progressively introduce difficulties from round to round, each round accumulating the difficulties of the previous ones plus new ones. Some datasets may be recycled from previous challenges, but reformatted into new representations, except for the final MASTER round, which includes only completely new data.

  1. NOVICE: Binary classification problems only; no missing data; no categorical variables; moderate number of features (<2000); balanced classes; BUT sparse and full matrices; presence of irrelevant variables; various Ptr/N.
  2. INTERMEDIATE: Multi-class and binary classification problems + additional difficulties including: unbalanced classes; small and large number of classes (several hundred); some missing values; some categorical variables; up to 5000 features.
  3. ADVANCED: All types of classification problems, including multi-label + additional difficulties including: up to 300,000 features.
  4. EXPERT: Classification and regression problems, all difficulties.
  5. MASTER: Classification and regression problems, all difficulties, completely new datasets.




Challenge Rules

  • General Terms: This challenge is governed by the General ChaLearn Contest Rule Terms, the Codalab Terms and Conditions, and the specific rules set forth below.
  • Announcements: To receive announcements and be informed of any change in rules, the participants must provide a valid email.
  • Conditions of participation: Participation requires complying with the rules of the challenge. THERE ARE NO PRIZES FOR THE HACKATHONS.
  • Dissemination: The participants will be invited to attend a workshop organized in conjunction with a major machine learning conference and contribute to the proceedings. The challenge is part of the competition program of the IJCNN 2015 conference.
  • Registration: The participants must register to Codalab and provide a valid email address. Teams must register only once and provide a group email, which is forwarded to all team members. Teams or solo participants registering multiple times to gain an advantage in the competition may be disqualified.
  • Anonymity: The participants who do not present their results at the workshop can elect to remain anonymous by using a pseudonym. Their results will be published on the leaderboard under that pseudonym, and their real name will remain confidential. See our privacy policy for details.
  • Submission method: The results must be submitted through this CodaLab competition site. The participants can make up to 5 submissions per day in the Tweakathon phases. Using multiple accounts to increase the number of submissions is NOT permitted. There are NO submissions in the Final and AutoML phases (the submissions from the previous Tweakathon phase migrate automatically). In case of problems, contact the organizers by email. The entries must be formatted as specified on the Evaluation page.
  • Awards: There are no awards for the hackathon/bootcamp, enter the AutoML challenge if you want to earn awards. 





The datasets are downloadable from the Dataset page.

Code or result submission

The participants must submit a zip file with their code and/or results via the Submission page. To get started in minutes, we provide a Starting Kit including sample submissions and step-by-step instructions.

Participation does not require submitting code, but, if you submit code for evaluation in a given AutoML phase, it must be submitted during the Tweakathon of the PREVIOUS round. ONLY TWEAKATHON PHASES TAKE SUBMISSIONS. Phases marked with a [+] report results on submissions that are forwarded automatically from the previous phase.

The sample submission can be used to submit results, code, or both:

  • Result submission: To submit prediction results, you must run your code on your own machine. You will need first to download the Datasets and the Starting Kit. Always submit both validation and test set results simultaneously, to be ranked on the leaderboard during the "Tweakathon" phase (using the validation set) and during the "Final" phase (using the test set). Result submissions will NOT allow you to participate in the "AutoML" phase.
  • Code submission: We presently support submission of Python code. An example is given in the Starting Kit. If you want to make entries with other languages, please contact us. In principle, the Codalab platform can accept submissions of any Linux executable, but this has not been tested yet. If you submit code, make sure it produces results on both validation and test data. It will be used for training and testing in all subsequent phases and rounds until you submit new code.
  • Result and code submission: If you submit both results and code, your results will be used for the Tweakathon and Final phases of the present round; your code will be used for the next AutoML phase (and all subsequent phases and rounds), unless you submit new code.

There is no disadvantage to submitting both results and code. The results do not need to have been produced by the code you submit. For instance, you can submit the sample code together with your results if you do not want to submit your own code. You can submit results of models manually tweaked during the Tweakathon phases.

Input format and computational restrictions

The input format is specified on the Dataset page. It includes the prescribed "time budget" for each task (in seconds), which is different for each dataset. In round 0, the total time allowed for all tasks is about half an hour, so BE PATIENT: this is how long it will take for the sample code we provide to run when you submit it. Submissions of results are processed much faster, in a few minutes.

Result submission format

A sample result submission is provided with the Starting Kit. All result files should be formatted as text files ending with a ".predict" extension, with one result per sample per line, in the order of the samples:

  • Regression problems: one numeric value per line.
  • Binary classification problems: one numeric value between 0 and 1 per line, indicating a score of class 1 membership (1 is certainty of class 1, 0.5 is a random guess, 0 is certainty of class 0).
  • Multiclass or multilabel problems: for C classes, C numeric values between 0 and 1 per line, indicating the scores of membership of the C classes. The scores add up to 1 for multiclass problems only.

We ask the participants to test their models regularly and produce intermediate prediction results, numbered from num=0 to n. The files should be named following the convention basename_setname_num.predict,
where "basename" is the dataset name (e.g. adult, cadata, digits, dorothea, or newsgroups, in the first round), "setname" is either "valid" (validation set) or "test" (test set), and "num" is the order number of the prediction results submitted. Please use a zero-padded three-digit number (format 03d) because we sort the file names in alphabetical order to determine the result order.

For example, in the first round, you would bundle for submission the following files in a zip archive (no directory structure):

  • adult_valid_000.predict
  • adult_valid_001.predict
  • adult_valid_002.predict
  • ...
  • adult_test_000.predict
  • adult_test_001.predict
  • adult_test_002.predict
  • ...
  • cadata_valid_000.predict
  • cadata_valid_001.predict
  • cadata_valid_002.predict
  • ...
  • cadata_test_000.predict
  • cadata_test_001.predict
  • cadata_test_002.predict
  • ...
  • etc.

The last result file for each set (with largest number num) is used for scoring. It is useful however to provide intermediate results: ALL the results are used by the organizers to make learning curves and infer whether performance improvements could be gained by increasing the time budget. This will affect the time budget allotted in subsequent rounds.
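The flat-archive requirement can be met with a few lines of Python; the function name below is illustrative:

```python
import glob
import os
import zipfile

def bundle_predictions(results_dir, zip_path):
    """Bundle all .predict files from a results directory into a flat zip
    archive (no internal directory structure), as required above."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(glob.glob(os.path.join(results_dir, "*.predict"))):
            # arcname drops the directory prefix so files sit at the zip root.
            zf.write(path, arcname=os.path.basename(path))
```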




Please subscribe to our Google group to post messages on the forum, or contact the organizers by email.




Can I enter the challenge if I did not make submissions to previous rounds?


Can I enter during a Final or AutoML phase?

No: all entries must be made during the Tweakathon phases.

Where can I download the data?

From the Data page, under the Participate tab. You first need to register to have access to it.

How do I make submissions?

Register and go to the Participate tab where you find data, and a submission form.

Do you provide tips on how to get started?

We provide a Starting Kit, see Step-by-step instructions.

Are there prizes?

NO. There are no prizes for the hackathons.

Do I need to submit code to participate?

No. You can submit prediction results only, if you don't want to submit code. This will allow you to see your performance on the leaderboard during the Tweakathon and Final phases, provided that you submit results on both validation data and test data during the Tweakathon phase.

If I submit code, do I surrender all rights to that code to the sponsors or organizers?

No. You just grant to the organizers a license to use your code for evaluation purposes during the challenge. You retain all other rights.

How much computational power and memory are available?

You are initially sharing with other users an 8 core x86_64 machine with 16 GB RAM. We will ramp up the compute power as needed. You can get the specifics of the current system when you make a submission and look at the scoring error log.

The sample result submission includes code, why?

All submissions are in fact code submissions, but you do not need to supply your own code, you can keep the sample code. All you need to do is to include your own prediction results in the "res/" subdirectory. In this way, the platform will use those results to compute the scores. Respect the file name convention because the scoring program looks for files with dataset names corresponding to the datasets of the corresponding phase.

If I submit both results and code, what is taken into account, results or code?

The sample code is written such that the program first searches for results in the "res/" subdirectory. If it finds files named after the datasets of the current phase, the results are copied to the output directory so they can directly be processed by the scoring program. If there is at least one missing file, the program proceeds with training models and making predictions to produce results.
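This results-first logic can be summarized in a simplified sketch (the actual sample code may differ; `copy_or_train` and the `train_and_predict` callback are hypothetical names, and only one numbered file per dataset is checked for brevity):

```python
import os
import shutil

def copy_or_train(datanames, res_dir, output_dir, train_and_predict):
    """If res/ holds a prediction file for every dataset of the phase,
    copy them to the output directory; otherwise fall back to training.
    `train_and_predict` is a callback that trains models and writes
    prediction files into output_dir."""
    wanted = ["%s_test_000.predict" % name for name in datanames]
    if all(os.path.exists(os.path.join(res_dir, w)) for w in wanted):
        for w in wanted:
            shutil.copy(os.path.join(res_dir, w), output_dir)
        return "copied"
    train_and_predict(datanames, output_dir)
    return "trained"
```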

Why do I need to submit results both on validation and test data?

Validation results are used to rank the participants on the leaderboard during the Tweakathon phases. Test results are used during the Final phases to determine Tweakathon winners. But, we do not let participants make any submission during Final phases. So you must submit results both on validation and test data during the Tweakathon phases (or submit code that produces those results). In this way, the results will quickly appear during the Final phase (lasting 1 day) because they will be precomputed during the Tweakathon (lasting 4 weeks).

Can I use the unlabeled validation and test data for training?

This is not explicitly forbidden, but it is discouraged. Likewise, we prefer if you preprocess data in a way that validation and test data are not preprocessed together with the training data, but the preprocessing parameters are obtained from training data only, then applied to preprocess the validation and test data.
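The recommended pattern (estimate preprocessing parameters on training data only, then apply them frozen to the other sets) looks like this; the data below are invented, and the same effect can be obtained with scikit-learn's fit/transform idiom:

```python
import numpy as np

# Invented data standing in for a challenge dataset split.
rng = np.random.RandomState(0)
X_train = rng.randn(100, 5)
X_valid = rng.randn(20, 5)
X_test = rng.randn(20, 5)

# Estimate standardization parameters from the TRAINING data only...
mean = X_train.mean(axis=0)
scale = X_train.std(axis=0)

# ...then apply the same frozen transformation to all three sets,
# so no information leaks from validation/test into the preprocessing.
X_train_s = (X_train - mean) / scale
X_valid_s = (X_valid - mean) / scale
X_test_s = (X_test - mean) / scale
```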

Can I submit results that were not generated by the code submitted?

Yes. The results do not need to have been produced by the code you submit (see "Code vs. result submission").

Does it make sense to migrate result submissions to the next round?

No. The datasets change between rounds, so the results of one round are useless for the next one. Submission migration from round to round is only useful for code submissions. If you submit results and do not make changes to the sample code, your performance in the next round will be that of the sample code.

What is the unit of the time budget?

Seconds. The "info" file of each dataset specifies its time budget in seconds.

Does the time budget correspond to wall time or CPU time?

CPU time.

My submission seems stuck, how long will it run?

In round 0, in principle no more than 300+200+300+100+300=1200 seconds. We kill the process after 1 hour if something goes wrong.

If I under use my time budget in one phase, can I use it later?

No. Each code run has its own time budget that is the sum of the budgets allocated to the five tasks. You can make 5 runs per day at most.

What happens if I exceed my time budget?

There is some mild tolerance during Tweakathon phases, but eventually your process gets killed. To avoid losing all your results, save intermediate results regularly.

How can "anytime" phases be comparable to phases with a time budget?

They are not directly comparable. In Tweakathon phases (and their associated Final phases), you may submit results computed using your own system with unbounded computational resources. Thus people having large systems are at an advantage. In AutoML phases, your code runs on the challenge platform. All users are compared in the same fair way.

Will people submitting code be disadvantaged during Tweakathon phases?

No, because they can submit both results and code with their last submission. The results will be taken into account in the Final phase, allowing them to get all the advantages of using their own system, while the code will be migrated to the AutoML phase, allowing them to enter the AutoML contest as well.

The time budget is too small, can you increase it?

We may eventually increase it if we see that the learning curves of the participants are far from reaching an asymptote. This is why it is so important that you compute and save predictions regularly during your model search.

Why are you switching metrics all the time?

This is part of the AutoML problem: each task has its own metric. However, we compute also all the other metrics so you can see how robust your method is against metric change. You can of course tune your method (automatically) to the particular metric of the task.

Can I use something else than Python code?

In theory yes: any Linux executable can run on the system. However, we only prepared a starting kit with Python at this stage and have not tested this option. If you are interested, please contact us.

Are there publication opportunities?

Yes, we are part of the IJCNN 2015 competition program, and we are planning one or several workshops in conjunction with major machine learning conferences (IJCNN, ICML, or NIPS) and proceedings in JMLR Workshop and Conference Proceedings (pending acceptance).

What is meant by "Leaderboard modifying disallowed"?

Your last submission is shown automatically on the leaderboard. You cannot choose which submission to select. Your last submission before the Tweakathon phase ends is your final submission and the submission that will be forwarded to the next round.

What is the file called metadata?

This is a file that you should have in your submitted bundle to indicate to the platform which program must be executed.

How can I debug my code?

Install on your local computer the exact same version of Python and libraries that are installed on Codalab: Anaconda 2.0 for Python 2.7. This should be sufficient to troubleshoot most problems. In particular, check that you are using the following library versions:

  • scikit-learn 0.15.2
  • numpy 1.9.1
  • scipy 0.14
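A quick sanity check that your local installation matches these versions (a convenience sketch, not part of the Starting Kit):

```python
# Compare installed library versions against those listed above.
import importlib

expected = {"numpy": "1.9.1", "scipy": "0.14", "sklearn": "0.15.2"}
report = {}
for module, want in expected.items():
    try:
        have = importlib.import_module(module).__version__
    except ImportError:
        have = "not installed"
    report[module] = have
    status = "OK" if have.startswith(want) else "MISMATCH"
    print("%-8s expected %-8s found %-15s %s" % (module, want, have, status))
```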

To exactly reproduce the environment used on Codalab, the participants can perform the following steps:

  • Create a Windows Azure account.
  • Login to management portal.
  • Create a new virtual machine (Quick Create, Linux Server, Medium size; Small size is fine too, but CodaLab uses Medium).
  • Go to dashboard of new VM and connect via RDP.
  • Install Anaconda.
  • Verify your code runs in this environment.

Can I register multiple times?

No. If you accidentally register multiple times or have multiple accounts from members of the same team, please notify the organizers. Teams or solo participants with multiple accounts may be disqualified.

But I want both to submit code and not to be limited by the time budget in the Final phases. How can I do that with a single account?

This is easy: you can submit both code and results. In the Final phase, the results of that phase will be used to compute your score. In the AutoML phase, the code will be run on the new datasets. This way you get the best of both!

Can I give an arbitrarily hard time to the organizers?


Where can I get additional help?

For questions of general interest, the participants may subscribe to our Google group to post messages on the forum, or contact the organizers by email.




The organization of this challenge would not have been possible without the help of many people who are gratefully acknowledged.


Any opinions, findings, and conclusions or recommendations expressed in material found on this website are those of their respective authors and do not necessarily reflect the views of the sponsors. The support of the sponsors does not give them any particular right to the software and findings of the participants.


Microsoft supported the organization of this challenge and donated the prizes for the main challenge. There are no prizes for the hackathons.


This challenge is part of the official selection of IJCNN competitions.


This project received additional support from the Laboratoire d'Informatique Fondamentale (LIF, UMR CNRS 7279) of the University of Aix Marseille, France, via the LabeX Archimede program. Computing resources were provided generously by Joachim Buhmann, ETH Zuerich.


Isabelle Guyon, ChaLearn, Berkeley, California, USA
Evelyne Viegas, Microsoft Research, Redmond, Washington, USA

Data providers:

We selected the 30 datasets used in the challenge among 72 datasets that were donated or formatted using data publicly available by:
Yindalon Aphinyanaphongs, New-York University, New-York, USA
Olivier Chapelle, Criteo, California, USA
Hugo Jair Escalante, INAOE, Puebla, Mexico    
Sergio Escalera, University of Barcelona, Catalonia, Spain
Isabelle Guyon, ChaLearn, Berkeley, California, USA
Zainab Iftikhar Malhi, University of Lahore, Pakistan
Vincent Lemaire, Orange research, Lannion, Brittany, France
Chih Jen Lin, National Taiwan University, Taiwan
Meysam Madani, University of Barcelona, Catalonia, Spain
Bisakha Ray, New-York University, New-York, USA
Mehreen Saeed, University of Lahore, Pakistan
Alexander Statnikov, American Express, New-York, USA
Gustavo Stolovitzky, IBM Computational Biology Center, Yorktown Heights, New York, USA
Hans-Jürgen Thiesen, Universität Rostock, Germany
Ioannis Tsamardinos, University of Crete, Greece

Committee members, advisors and beta testers:

Kristin Bennett, RPI, New-York, USA
Marc Boullé, Orange research, Lannion, Brittany, France
Cecile Capponi, University of Aix-Marseille, France
Richard Caruana, Microsoft Research, Redmond, Washington, USA
Gavin Cawley, University of East Anglia, UK
Gideon Dror, Yahoo!, Haifa, Israel
Hugo Jair Escalante, INAOE, Puebla, Mexico
Sergio Escalera, University of Barcelona, Catalonia, Spain
Cécile Germain, Université de Paris Sud, France
Tin Kam Ho, IBM Watson Group, Yorktown Heights, New-York, USA
Balázs Kégl, Université de Paris Sud, France
Hugo Larochelle, Université de Sherbrooke, Canada
Vincent Lemaire, Orange research, Lannion, Brittany, France
Chih Jen Lin, National Taiwan University, Taiwan
Víctor Ponce López, University of Barcelona, Catalonia, Spain
Nuria Macia, Universitat Ramon Llull, Barcelona, Spain
Simon Mercer, Microsoft, Redmond, Washington, USA
Florin Popescu, Fraunhofer First, Berlin, Germany
Mehreen Saeed, University of Lahore, Pakistan
Michèle Sebag, Université de Paris Sud, France
Danny Silver, Acadia University, Wolfville, Nova Scotia, Canada
Alexander Statnikov, American Express, New-York, USA
Ioannis Tsamardinos, University of Crete, Greece

Codalab and other software development

Eric Carmichael, Tivix, San Francisco, California, USA
Isabelle Guyon, ChaLearn, Berkeley, California, USA
Ivan Judson, Microsoft, Redmond, Washington, USA
Christophe Poulain, Microsoft Research, Redmond, Washington, USA
Percy Liang, Stanford University, Palo Alto, California, USA
Arthur Pesah, Lycée Henri IV, Paris, France
Xavier Baro Sole, University of Barcelona, Barcelona, Spain
Lukasz Romaszko, ChaLearn, California, USA
Erick Watson, Sabthok International, Redmond, Washington, USA
Michael Zyskowski, Microsoft Research, Redmond, Washington, USA





Start: April 1, 2015, midnight

Description: Practice phase on toy data drawn from well-known publicly available data. In preparation for phase 1, submit code capable of producing predictions on both VALIDATION AND TEST DATA. The phase 0 data are available from the 'Get Data' page. The leaderboard shows scores on phase 0 validation data only.

[+] Final0

Start: Jan. 24, 2016, 8 p.m.

Description: Results on test data of phase 0. There is NO NEW SUBMISSION. The results on test data of the last submission are shown.

Competition Ends

