In many real-world machine learning applications, AutoML is strongly needed because developers often have limited machine learning expertise. Moreover, in many of these applications, batches of data arrive daily, weekly, monthly, or yearly, and the data distributions change relatively slowly over time. This presents a continuous learning, or Lifelong Machine Learning, challenge for an AutoML system. Typical learning problems of this kind include customer relationship management, online advertising, recommendation, sentiment analysis, fraud detection, spam filtering, transportation monitoring, econometrics, patient monitoring, climate monitoring, manufacturing, and so on. In this competition, which we are calling AutoML for Lifelong Machine Learning, large-scale datasets collected from some of these real-world applications will be used. Compared with previous AutoML competitions (http://automl.chalearn.org/), the focus of this competition is on drifting concepts, getting away from the simpler i.i.d. cases. Participants are invited to design a computer program capable of autonomously (without any human intervention) developing predictive models that are trained and evaluated in a lifelong machine learning setting under restricted resources and time.
Although the scenario is fairly standard, this challenge introduces the following difficulties:
• Algorithm scalability. We provide datasets that are 10-100 times larger than in previous challenges we organized.
• Varied feature types. Datasets include continuous, binary, ordinal, categorical, multi-value categorical, and temporal features, among them categorical variables with a large number of values following a power-law distribution.
• Concept drift. The data distribution is slowly changing over time.
• Lifelong setting. All datasets in this competition are chronologically split into 10 batches, meaning that the batches of every dataset are chronologically ordered (note that instances within a single batch are not guaranteed to be chronologically ordered). The algorithms will be tested for their capability of adapting to changes in the data distribution by being exposed to the successive test batches in chronological order. After each test batch is evaluated, its labels are revealed to the learning machines and can be incorporated into the training data (see the sketch after this list).
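For concreteness, the following self-contained Python sketch simulates this protocol on synthetic drifting data. The data generator, the simple logistic-regression model, and all variable names are illustrative assumptions; the real protocol is driven by the platform's ingestion and scoring programs.

```python
# Simulation of the lifelong evaluation protocol on synthetic drifting data
# (illustrative only; not the platform's actual ingestion program).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.RandomState(0)

def make_batch(t, n=1000, d=20):
    """Synthetic binary-classification batch whose decision boundary drifts with t."""
    X = rng.randn(n, d)
    w = np.ones(d) + 0.3 * t                      # slowly drifting concept
    y = (X @ w + 0.5 * rng.randn(n) > 0).astype(int)
    return X, y

batches = [make_batch(t) for t in range(10)]      # 10 chronologically ordered batches

model = LogisticRegression(max_iter=1000)
X0, y0 = batches[0]
model.fit(X0, y0)                                 # batch 1 is the labeled training batch

aucs = []
for X, y in batches[1:]:                          # test batches, delivered one by one
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])  # evaluate before labels are revealed
    aucs.append(auc)
    model.fit(X, y)                               # labels revealed: retrain/update the model

print("per-batch AUC:", np.round(aucs, 3))
print("average AUC :", np.mean(aucs))
```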
There are two phases in the competition:
The Feedback phase is a phase with code submission: you can practice on 5 datasets that are of a similar nature to the datasets of the second phase. You can make a limited number of submissions. You can download the labeled training data and the unlabeled test set, so that you can prepare your code submission at home and submit it later. The LAST code submission will be forwarded to the next phase for final testing.
The AutoML phase is the blind test phase with no submission. The last submission of the previous phase is blind tested on 5 new datasets. Your code will be trained and tested automatically, without human intervention. The final score will be computed from the results of this blind testing.
The goal of this challenge is to expose the research community to real-world datasets exhibiting the concept drift phenomenon, under a lifelong ML evaluation scenario. Participants must develop AutoML solutions for dealing with these problems. All datasets are formatted in a uniform way, though the types of features may differ from dataset to dataset (numerical, categorical, multi-value categorical, and time features may be available). The data are provided as preprocessed matrices so that participants can focus on classification, although participants are welcome to use additional feature transformation/extraction procedures (as long as they do not violate any rule of the challenge). All problems are binary classification tasks and are assessed with the Area Under the ROC Curve (AUC) metric. The considered datasets exhibit the concept drift phenomenon to different degrees.
The identity of the datasets and the type of data are concealed, though their structure (number of patterns, inputs, feature types, etc.) is revealed. The final score in phase 2 (the phase considered for delivering prizes) will be the average rank of each participant's performance on the individual datasets. Winners will be determined by ranking participants according to this final score (smallest average rank is best). The overall execution time of the solutions will be used as a tie-breaking criterion.
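For concreteness, here is a small illustrative Python sketch of how such an average-rank score could be computed from a table of per-dataset AUCs; the numbers and array layout are made up, and the organizers' scoring program is authoritative.

```python
# Illustrative average-rank computation over 5 datasets (not the official
# scoring program). Higher AUC is better, so the highest AUC gets rank 1.
import numpy as np
from scipy.stats import rankdata

# rows: participants, columns: datasets (per-dataset AUC; made-up numbers)
auc = np.array([
    [0.81, 0.75, 0.90, 0.66, 0.72],
    [0.79, 0.78, 0.88, 0.70, 0.71],
    [0.83, 0.74, 0.85, 0.69, 0.73],
])

# Rank participants on each dataset, then average the ranks per participant.
ranks = np.column_stack([rankdata(-auc[:, j]) for j in range(auc.shape[1])])
final_score = ranks.mean(axis=1)      # lower average rank is better
print(final_score)
```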
The tasks are constrained by a time budget, where each dataset has its own (non-cumulative) budget. The CodaLab platform provides computational resources shared by all participants. Each code submission will be executed on a compute worker with the following characteristics: 4 cores / 16 GB memory / 80 GB SSD, running Ubuntu. To ensure the fairness of the evaluation, the execution time of each code submission is limited (the time budget for each dataset is provided in the input data).
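The sketch below shows one way a submission might guard against exhausting the per-dataset budget; the function and variable names are assumptions, and the actual budget value is read from the dataset info supplied by the platform.

```python
import time

def fit_within_budget(train_steps, time_budget, safety_margin=0.1):
    """Run incremental training steps, but reserve ~10% of the time budget
    for prediction and I/O. `train_steps` is any iterable of callables
    (e.g. boosting rounds or epochs); `time_budget` is in seconds."""
    start = time.time()
    for step in train_steps:
        step()
        if time.time() - start > (1.0 - safety_margin) * time_budget:
            break   # stop early rather than risk exceeding the budget
```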
A simulated lifelong ML evaluation scenario is considered. Each dataset is divided into 10 batches with approximately the same number of instances. Instances are chronologically sorted within each batch (and across batches). The participants' code will have access to the data and labels of the first batch (the training batch). After that, it must make predictions for the next, i-th, batch (the code will have access to the data of the new batch) and performance will be evaluated. Next, the labels of the i-th batch will be revealed to the code, which can update its model to make predictions for batch i+1. Your code must implement at least two methods: fitting/training (using the data available at time i; this method could also store data, perform instance selection, subsampling, etc.) and prediction (your model makes predictions for an unlabeled batch). Please look at the sample code submission included in the starting kit for guidance on how to design your model/code. The average performance across batches will be used for the evaluation on each dataset.
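A minimal sketch of such a two-method interface is given below. The exact class name, method signatures, and any extra arguments (dataset metadata, remaining time budget, etc.) are defined by the starting kit, so treat everything here as an assumption.

```python
import numpy as np

class Model:
    """Skeleton of a lifelong-learning submission: fit() is called whenever
    labels become available, predict() on each new unlabeled batch."""

    def __init__(self):
        self.X_hist, self.y_hist = [], []   # optionally keep (a subsample of) past batches

    def fit(self, X, y):
        # Store the newly labeled batch; a real submission would also
        # (re)train an internal classifier here, possibly after instance
        # selection or subsampling to respect the time budget.
        self.X_hist.append(X)
        self.y_hist.append(y)

    def predict(self, X):
        # Must return one score per row of X; this placeholder returns
        # constant predictions and should be replaced by real model output.
        return np.zeros(len(X))
```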
The challenge has two phases:
During the feedback phase, the results of your last submission on test data are shown on the leaderboard. Prizes will be awarded in Phase 2 only.
Important: for Phase 1, we provide you with the first 4 test batches (in addition to the labeled training batch) so that you can more easily design your models at home. For the final phase, only the training batch will be made available to your code initially (each test batch will then be delivered progressively to your code, as outlined above).
See the Terms and Conditions page.
Prizes sponsored by 4paradigm will be granted to top-ranking participants (the execution time of your submission will be used as a tie-breaking criterion), provided they comply with the rules of the challenge (see the Terms and Conditions section). The distribution of prizes will be as follows.
To be eligible for prizes you must: publicly release your code under an open-source license, submit a fact sheet describing your solution, present the solution at the competition session at NIPS 2018, sign the prize acceptance form, and adhere to the rules of the challenge.
Feedback phase. Start: July 31, 2018, midnight. Description: Practice on five datasets similar to those of the second phase. Code submission only.
AutoML phase. Start: Oct. 23, 2018, midnight. Description: No new submissions; your last submission of the first phase will be blindly tested.
Competition end: Nov. 6, 2018, 11 p.m.