ChaLearn Fast Causation Coefficient Challenge

Organized by chalearn

The challenge is over, but is open for post-challenge submissions.

Congratulations to the winners:

  • First place: José Adrián Rodríguez Fonollosa [code]
  • Second place: Wei Zhang [code]
  • Third place: David Lopez-Paz (also winner of the fastest code prize) [code]

The fact sheets are available for inspection.

Score pairs of variables {A, B} with a positive coefficient if A causes B, a negative one if B causes A, and zero otherwise.

This is a competition with code submission. Contact the organizers for instructions to get admitted.

The problem of attributing causes to effects is pervasive in science, medicine, economics, and almost every aspect of everyday life that involves human reasoning and decision making. What affects your health? The economy? Climate change? The gold standard for establishing causal relationships is the randomized controlled experiment. However, experiments are costly, while non-experimental "observational" data, collected routinely around the world, are readily available. Unravelling potential cause-effect relationships from such observational data could save a lot of time and effort.

Consider for instance a target variable B, like occurrence of "lung cancer" in patients. The goal would be to find whether a factor A, like "smoking", might cause B. The objective of the challenge is to rank pairs of variables {A, B} to prioritize experimental verifications of the conjecture that A causes B. As is known, "correlation does not mean causation". More generally, observing a statistical dependency between A and B does not imply that A causes B or that B causes A; A and B could be consequences of a common cause. But, is it possible to determine from the joint observation of samples of two variables A and B that A should be a cause of B or vice versa?

This challenge is limited to pairs of variables deprived of their context and of the time ordering of the samples. Consequently, neither constraint-based methods relying on conditional independence tests and/or graphical models nor Granger-causality-type methods using time ordering are applicable. The goal is to push the state of the art in complementary methods.

We ran a first version of this challenge from March 29 to September 2, 2013, under the name Cause-Effect Pairs challenge. The participants built on top of algorithms from the literature and greatly improved performance (from an AUC near 0.6 to an AUC near 0.8). The results were discussed at a workshop at NIPS 2013. But most methods remain relatively slow. Your turn to make it better and faster!

CHALEARN

This challenge is brought to you by ChaLearn, see our credits page. Contact the organizers.

Evaluation

For the purpose of this challenge, two variables A and B are causally related if:

B = f (A, noise) or A = f (B, noise).

In the former case, A is a cause of B; in the latter case, B is a cause of A. All other factors are lumped into the "noise". We provide samples of joint observations of A and B, not organized in a time series. We exclude feedback loops and consider only 4 types of relationships:

Notation    Relationship                                    Class
A->B        A causes B                                      Positive class
B->A        B causes A                                      Negative class
A - B       A and B are consequences of a common cause      Null class
A | B       A and B are independent                         Null class
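As an illustration of the four relationship types, here is a minimal sketch that simulates one toy pair of each kind. The function names, the tanh mechanism, the Gaussian noise, and the sample size are all assumptions made for illustration; they do not describe how the challenge data were generated.

```python
import math
import random


def mechanism(cause, noise):
    # Hypothetical f(cause, noise); the real mechanisms are not specified.
    return math.tanh(cause) + 0.3 * noise


def sample_pair(kind, n=100, seed=0):
    """Return (a, b, label) for one toy pair of the requested kind."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    if kind == "A->B":          # positive class
        a = draw()
        b = [mechanism(x, rng.gauss(0.0, 1.0)) for x in a]
        label = 1
    elif kind == "B->A":        # negative class
        b = draw()
        a = [mechanism(x, rng.gauss(0.0, 1.0)) for x in b]
        label = -1
    elif kind == "A - B":       # common cause, null class
        c = draw()
        a = [mechanism(z, rng.gauss(0.0, 1.0)) for z in c]
        b = [mechanism(-z, rng.gauss(0.0, 1.0)) for z in c]
        label = 0
    elif kind == "A | B":       # independent, null class
        a, b = draw(), draw()
        label = 0
    else:
        raise ValueError("unknown relationship type: %r" % kind)

    # Shuffle the joint observations: samples carry no time ordering.
    pairs = list(zip(a, b))
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs], label
```

For example, `sample_pair("A - B")` returns two dependent variables whose truth value is nevertheless 0, because neither causes the other.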

We bring the problem back to a classification problem: for each pair of variables {A, B}, you must answer the question: is A a cause of B? (or, since the problem is symmetrical in A and B, is B a cause of A?)

We expect the participants to produce a score between -Inf and +Inf, large positive values indicating that A is a cause of B with certainty, large negative values indicating that B is a cause of A with certainty. Middle range scores (near zero) indicate that neither A causes B nor B causes A.
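One consequence of this sign convention is that swapping the roles of A and B should flip the sign of the score. A minimal sketch of one way to guarantee this: take any directional evidence function and subtract its value in the reverse direction. Both function names are hypothetical, and `toy_evidence` is only a placeholder (a crude smoothness measure), not a method from the challenge and not expected to perform well.

```python
def antisymmetric_score(direction_evidence, a, b):
    """Score > 0 favors 'A causes B', score < 0 favors 'B causes A'."""
    return direction_evidence(a, b) - direction_evidence(b, a)


def toy_evidence(cause, effect):
    # Placeholder heuristic (an assumption for illustration only):
    # order the effect values by the cause and penalize large jumps
    # between neighbors. Requires at least two samples.
    ordered = [e for _, e in sorted(zip(cause, effect))]
    jumps = [abs(y - x) for x, y in zip(ordered, ordered[1:])]
    return -sum(jumps) / len(jumps)
```

By construction, `antisymmetric_score(g, a, b)` equals `-antisymmetric_score(g, b, a)` for any evidence function `g`, so a single program answers both directions consistently.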

For each pair of variables, we have a ternary truth value indicating whether A is a cause of B (+1), B is a cause of A (-1), or neither (0). We use the scores provided by the participants as a ranking criterion and evaluate their entries with the area under the ROC curve (AUC). We average two AUCs for the problem of classifying "A causes B" vs. everything else (forward AUC) and the problem of classifying "B causes A" vs. everything else (backward AUC).
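The averaged score described above can be sketched as follows. This is an illustrative implementation, not the organizers' scoring program: the function names are made up, and it assumes that the backward AUC ranks pairs by the negated score, which the text above implies but does not state explicitly.

```python
def auc(is_positive, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / float(len(pos) * len(neg))


def bidirectional_auc(truth, scores):
    """Average of forward and backward AUC for ternary truth values."""
    # Forward: "A causes B" (+1) versus everything else.
    forward = auc([t == 1 for t in truth], scores)
    # Backward: "B causes A" (-1) versus everything else, ranked by
    # the negated score (assumption, see the lead-in).
    backward = auc([t == -1 for t in truth], [-s for s in scores])
    return 0.5 * (forward + backward)
```

Because AUC depends only on the ranking, any monotonically increasing rescaling of the scores leaves the forward AUC unchanged; the pairwise version here is quadratic in the number of pairs and is meant for clarity rather than speed.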

Instructions

The participants must submit a zip file containing a program that calculates a fast causation coefficient. The program must be written in Python. To that end, we provide:

  • Starting Kit: A small/fast example of submission derived from the basic Python benchmark of the first challenge (original baseline). Step-by-step instructions on how to run it on your local machine and modify it.
  • Baseline Submission: A competitive example derived from the code of the team jarfo in the first edition of the challenge.

Code submitted as a zip file will be executed automatically on the Codalab platform, and the scoring results will be posted on the leaderboard for immediate feedback. The results in the development phase are computed on validation data. The final ranking will be done during the final phase on the test data. The results on test data will remain hidden from the participants until the end of the challenge.

To develop their code, we encourage the participants to use the exact version of Python and libraries that are installed on Codalab: Anaconda 1.6.2, which can be found at http://repo.continuum.io/archive/. More specifically, we installed http://repo.continuum.io/archive/Anaconda-1.6.2-Windows-x86_64.exe.

Help

Are there prizes?

Yes, see our Terms and Conditions. There are travel awards for the three top-ranking participants to attend the workshop held in conjunction with the Microsoft Faculty Summit in July 2014, and one Grand Prize of $1000 for the winner of the fast code prize (among the three top-ranking participants).

Are there publication opportunities?

Yes, you may submit a paper to the JMLR special topic on causality, deadline September 15, 2014.

Do you provide tips on how to get started?

We ran a first version of this challenge from March 29 to September 2, 2013, under the name Cause-Effect Pairs challenge. A lot of help can be found on the website of our previous challenge.

Why does my entry not show up on the leaderboard?

Codalab lets you choose which entry you want to display on the leaderboard. You must submit it by clicking "Submit to Leaderboard".

How can I debug my code?

To exactly reproduce the environment used on Codalab, the participants can perform the following steps:

  • Create a Windows Azure account.
  • Log in to the management portal.
  • Create a new virtual machine (Quick Create, Windows Server 2012, medium size; small size also works, but CodaLab uses medium).
  • Go to the dashboard of the new VM and connect via RDP.
  • Install Anaconda 1.6.2: http://repo.continuum.io/archive/Anaconda-1.6.2-Windows-x86_64.exe.
  • Verify your code runs in this environment.
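For the last step, a small script can record the interpreter and library versions so that the local machine and the VM can be compared. This is a minimal sketch: the function name and the default module list are assumptions, and the script does not know which versions Anaconda 1.6.2 ships, so it only reports what it finds.

```python
import importlib
import platform


def environment_report(modules=("numpy", "scipy", "pandas", "sklearn")):
    """Collect Python and library versions; missing modules map to None."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None  # not installed in this environment
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```

Running the script in both environments and comparing the two printouts gives a quick check that the code is being tested against the same versions.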

Can I give an arbitrary hard time to the organizers?

ALL INFORMATION, SOFTWARE, DOCUMENTATION, AND DATA ARE PROVIDED "AS-IS". ISABELLE GUYON, CHALEARN, KAGGLE, MICROSOFT AND/OR OTHER ORGANIZERS AND SPONSORS DISCLAIM ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR ANY PARTICULAR PURPOSE, AND THE WARRANTY OF NON-INFRINGEMENT OF ANY THIRD PARTY'S INTELLECTUAL PROPERTY RIGHTS. IN NO EVENT SHALL ISABELLE GUYON AND/OR OTHER ORGANIZERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF SOFTWARE, DOCUMENTS, MATERIALS, PUBLICATIONS, OR INFORMATION MADE AVAILABLE FOR THE CHALLENGE.

In case of dispute about prize attribution or possible exclusion from the competition, the participants agree not to take any legal action against the organizers or sponsors. Decisions can be appealed by submitting a letter to Vincent Lemaire, secretary of ChaLearn, and disputes will be resolved by the board of ChaLearn.

Where can I get further help?

Post messages of general interest to our Google group.

Challenge Rules

  • General Terms: This challenge is governed by the General ChaLearn Contest Rule Terms, the Codalab Terms and Conditions, and the specific rules set forth.
  • Announcements: To receive announcements and be informed of any change in rules, the participants should subscribe to the Google group causalitychallenge.
  • Conditions of participation: Participation requires complying with the rules of the challenge. Prize eligibility is restricted by US government export regulations, see the General ChaLearn Contest Rule Terms. The organizers, sponsors, their students, close family members (parents, siblings, spouse or children) and household members, as well as any person having had access to the truth values or to any information about the data or the challenge design giving him (or her) an unfair advantage, are excluded from participation. A disqualified person may submit one or several entries in the challenge and request to have them evaluated, provided that they notify the organizers of their conflict of interest. If a disqualified person submits an entry, this entry will not be part of the final ranking and does not qualify for prizes. The participants should be aware that CHALEARN and the organizers reserve the right to evaluate for scientific purposes any entry made in the challenge, whether or not it qualifies for prizes.
  • Previous challenge and prerequisites: This challenge is a new edition of the ChaLearn Cause Effect Pair Challenge, using new versions of the validation and test data. It is not required to have participated in the first challenge to enter this new challenge.
  • Dissemination: The participants will be invited to attend a workshop organized in conjunction with the Microsoft Faculty Summit held at Microsoft Research in Redmond in July 2014. In addition, the participants will be invited to submit a paper to the JMLR special topic on causality, second call-for-paper deadline September 15, 2014.
  • Anonymity: The participants who do not present their results at the workshop can elect to remain anonymous by using a pseudonym. Their results will be published on the leaderboard under that pseudonym, and their real name will remain confidential. However, the participants must disclose their real identity to the organizers. See our privacy policy for details.
  • Submission method: The results must be submitted through this CodaLab competition site. The participants can make up to 100 total submissions BUT ONLY ONE FINAL SUBMISSION (final rule). In case of problem, send email to causality@chalearn.org. The entries must be formatted as specified on the evaluation page.
  • Awards: To compete for awards, the participants must fill out a fact sheet briefly describing their methods. There is no other publication requirement. The winners will be required to make their code publicly available under Apache license 2.0 or later, if they accept their prize, within a week of the deadline for submitting the final results. The winners will be determined according to the best "bidirectional AUC" on test data (see the evaluation page):
    • First place: 750 USD travel award (*) + Award certificate
    • Second place: 750 USD travel award (*) + Award certificate
    • Third place: 750 USD travel award (*) + Award certificate
  • Grand prize: Fastest code among the three award winners: 1000 USD + Award certificate.
    In case of a tie, the prize will be split evenly among the winners. We will declare a two-way tie if the difference in execution time between the two fastest codes is less than 10% of that of the fastest code. We will declare a three-way tie if there is a two-way tie and the difference in execution time between the third fastest code and the median execution time of the two fastest ones is less than 10% of that of the fastest code.

(*) The travel award may be used to attend a workshop organized in conjunction with the Microsoft Faculty Summit held at Microsoft Research in Redmond in July 2014. The award money will be granted in reimbursement of expenses including airfare, ground transportation, hotel, or workshop registration. Reimbursement is conditioned on (i) attending the workshop, (ii) making an oral presentation of the methods used in the challenge, and (iii) presenting original receipts and boarding passes. For winners traveling from outside North America, the travel award will be doubled.
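The tie rule for the grand prize can be sketched as follows. This is one reading of the rule, not the organizers' procedure: the function names are hypothetical, and the median of the two fastest execution times is taken to be their mean.

```python
def fastest_code_winners(times, tolerance=0.10):
    """times: dict mapping each of the three award winners to an
    execution time. Returns the list of grand prize winners."""
    if len(times) != 3:
        raise ValueError("expected exactly three award winners")
    ranked = sorted(times.items(), key=lambda item: item[1])
    (n1, t1), (n2, t2), (n3, t3) = ranked
    threshold = tolerance * t1  # 10% of the fastest execution time

    # No tie: the second fastest is not within 10% of the fastest.
    if t2 - t1 >= threshold:
        return [n1]

    # Two-way tie established; check the third against the median
    # (here: the mean) of the two fastest times.
    median_of_two = 0.5 * (t1 + t2)
    if t3 - median_of_two < threshold:
        return [n1, n2, n3]
    return [n1, n2]


def prize_shares(times, prize=1000.0):
    """Split the prize evenly among the tied winners."""
    winners = fastest_code_winners(times)
    return dict((name, prize / len(winners)) for name in winners)
```

With made-up execution times of 100, 104, and 108 for three hypothetical teams, the two fastest differ by 4 (less than 10), and the third is 6 above their median of 102, so all three would share the prize under this reading.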

Development Phase

Start: April 9, 2014, noon

Final Phase

Start: June 15, 2014, midnight

Post Challenge

Start: June 18, 2014, 5:31 p.m.

Competition Ends

Never

# Username Score
1 david.lopez.paz 91.60
2 reference 287.77