TrackML Throughput Phase

Organized by VictorEstrade
Reward $15,000

Current

Development
Sept. 7, 2018, midnight UTC

Next

Final
Nov. 5, 2018, 11:59 p.m. UTC

End

Competition Ends
Nov. 12, 2018, 11:59 p.m. UTC

Welcome!

This competition is an official NIPS 2018 competition.

To explore what our universe is made of, scientists at CERN are colliding protons, essentially recreating mini big bangs, and meticulously observing these collisions with intricate silicon detectors. Event rates have already reached hundreds of millions of collisions per second, meaning physicists must sift through tens of petabytes of data per year. And, as the resolution of detectors improves, ever better software is needed for real-time pre-processing and filtering of the most promising events, producing even more data. To help address this problem, a team of machine learning experts and physicists working at CERN (the world's largest high energy physics laboratory) has partnered with prestigious sponsors to answer the question: can machine learning assist high energy physics in discovering and characterizing new particles?

In this competition, you are challenged to build an algorithm that quickly reconstructs particle tracks from 3D points left in the silicon detectors.

A 3D image of the points (white) and tracks (red):

 

A simplified view in 2D: the name of the game is to associate the points into tracks.


The challenge is organized in two phases:

  • The Accuracy phase (finished) was run on Kaggle from May to August 13, 2018 (winners to be announced at the end of September). This first phase focused on the highest score, irrespective of the evaluation time. It was an official IEEE WCCI competition (Rio de Janeiro, July 2018).
  • The Throughput phase runs NOW (this Codalab competition), starting in September 2018. Participants must submit their software, which is evaluated by the platform. The incentive is on the throughput (or speed) of the evaluation, while reaching a good accuracy score. Note that the new dataset is slightly different. This phase is an official NIPS competition (Montreal, December 2018).

Having participated in the Accuracy phase is not at all necessary to participate in this Throughput phase. All the necessary information for the Throughput phase is available here on Codalab. The overall TrackML challenge web site is there. Kernels developed and discussions on the Kaggle forum are available to jump-start participants in the Throughput phase. Questions should be posted on the forum or directed to trackml.contact at gmail.com.

Evaluation Criteria

Participants submit software (following a provided template), which is run on the platform on 50 test events; these are different from, but share the same characteristics as, the training set.

The software produces a submission file containing particle tracking predictions, from which an accuracy score (how good the software is at finding the tracks) is evaluated. The time to run the software is also measured. Accuracy and time are then combined to produce the ranking score.

Accuracy score

In one line: it is the intersection between the reconstructed tracks and the ground truth particles, normalized to one for each event, and averaged over the events of the test set.

First, each hit is assigned a weight: 

  • the weight is non-zero only for hits left by particles coming from within a cylinder centered at (0,0,0), with its axis along the z-axis, a radius of 2 mm, a half-length of 16.5 cm, and with at least 8 hits (this is the only difference with respect to the scoring of the Accuracy phase, where about 10% of the total possible score was for particles not fulfilling this condition)
  • the first few hits (starting from the center of the detector) and the last few hits left by a particle have a larger weight
  • hits from the straighter tracks (rarer, but more interesting) have a larger weight
  • random hits or hits from very short tracks have weight zero
  • the sum of the weights of all the hits of one event is 1 by construction
  • the hit weights are available in the truth file; they are not revealed for the test dataset
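The per-event normalization of the weights can be checked mechanically. The snippet below uses a synthetic stand-in for a truth table (column names mimic the trackml convention, but the data is made up):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a truth file: 3 events with random positive hit
# weights, normalized so that each event's weights sum to 1 (as the rules
# above guarantee for the real data).
rng = np.random.default_rng(0)
rows = []
for event_id in range(3):
    w = rng.random(100)
    w /= w.sum()                       # per-event normalization
    rows += [(event_id, hit_id, weight) for hit_id, weight in enumerate(w)]
truth = pd.DataFrame(rows, columns=["event_id", "hit_id", "weight"])

# Check: the weights of every event sum to 1.
per_event = truth.groupby("event_id")["weight"].sum()
assert np.allclose(per_event, 1.0)
```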

Then, the accuracy score is constructed as follows: 

  • tracks are uniquely matched to particles by the double majority rule:
    • for a given track, the matching particle is the one to which the absolute majority (strictly more than 50%) of the track points belong
    • the track must also contain the absolute majority of the points of the matching particle; if either of these constraints is not met, the score for this track is zero
  • the score of a surviving track is the sum of the weights of the points in the intersection between the track and the matching particle
  • the score of an event is the sum of the scores of all its tracks
  • the final accuracy score is the average over the events of the test set

A perfect algorithm will have an accuracy score of 1, while a random one will have an accuracy score of 0. An implementation can be found in the trackml python library.
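The matching and scoring rules above can be sketched as follows. This is a simplified, illustrative reimplementation (pandas-based, with hypothetical column names `hit_id`, `particle_id`, `weight`, `track_id`), not the official trackml scorer:

```python
import pandas as pd

def accuracy_score_event(truth, submission):
    """Illustrative per-event accuracy score.

    truth:      DataFrame with columns hit_id, particle_id, weight
    submission: DataFrame with columns hit_id, track_id
    """
    df = submission.merge(truth, on="hit_id")
    # Size and total weight of each (track, particle) intersection.
    inter = (df.groupby(["track_id", "particle_id"])
               .agg(n_shared=("hit_id", "size"), w_shared=("weight", "sum"))
               .reset_index())
    track_size = df.groupby("track_id")["hit_id"].size()
    particle_size = truth.groupby("particle_id")["hit_id"].size()
    inter["track_size"] = inter["track_id"].map(track_size)
    inter["particle_size"] = inter["particle_id"].map(particle_size)
    # Double majority: the intersection must hold a strict majority of both
    # the track's hits and the matching particle's hits.
    good = inter[(2 * inter["n_shared"] > inter["track_size"])
                 & (2 * inter["n_shared"] > inter["particle_size"])]
    # Surviving tracks contribute the total weight of their intersection.
    return float(good["w_shared"].sum())
```

For example, a submission that reproduces every particle exactly scores the full event weight of 1, while one that puts every hit in its own one-hit track fails the particle-majority test and scores 0.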

Evaluation time

The software is run by Codalab on 3 dedicated 48-core machines (Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz), within a Docker container limiting the resources used by the software to two cores and 4 GB of memory.

The single-threaded template software takes care of the I/O and hands the data to the user code one event after the other. The user code may be multi-threaded. Only the wall clock time spent in the user code is accounted for. The input to the ranking score is the average time per event.
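The timing scheme can be sketched as follows. `reconstruct_event` is a hypothetical stand-in for the user code, and the real template's internals may differ; the point is that the clock runs only around the user call, not around the I/O:

```python
import time

def reconstruct_event(hits):
    # Hypothetical user code: assign every hit to a single dummy track.
    return [0] * len(hits)

events = [list(range(1000)) for _ in range(5)]   # stand-in event data

total = 0.0
for hits in events:
    start = time.perf_counter()          # clock starts after I/O is done
    tracks = reconstruct_event(hits)     # only the user code is timed
    total += time.perf_counter() - start

time_per_event = total / len(events)     # this average feeds the ranking score
```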

Tests have shown the time measurement to be reproducible within a few percent, which is sufficient for the online public leaderboard.


It is probably possible to hack the time measurement; the public leaderboard and the submitted software will be regularly inspected by the organisers. Any attempt at deliberate hacking will cause the participant to be disqualified and their contributions deleted.

Ranking score

Given the accuracy and the time per event (in seconds), they are combined into a final score as follows:

if accuracy > 0.5 and time < 600:
    score = sqrt( log(1 + 600 / time) * (accuracy - 0.5)**2 )
else:
    score = 0
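A runnable version of the formula (the base of the logarithm is not stated here; a natural logarithm is assumed in this sketch, which is illustrative rather than the official scoring code):

```python
import math

def ranking_score(accuracy, time_per_event):
    """Combine accuracy and average time per event (in seconds)."""
    if accuracy > 0.5 and time_per_event < 600:
        return math.sqrt(math.log(1 + 600 / time_per_event)
                         * (accuracy - 0.5) ** 2)
    return 0.0
```

Both axes help: at fixed accuracy a faster submission scores higher, and at fixed time a more accurate one does; an accuracy at or below 0.5 scores zero regardless of speed.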

This picture indicates how the ranking score depends on accuracy and time:

 

 

Minimum performance

In practice, the participant's software is first run on one test event (always the same one). If, on this single event, the accuracy is less than 5% or the time is more than 600 seconds, a score of -1 is reported. These values might be adjusted by the organisers during the challenge.

 

Final leaderboard

Once the competition is finished, the public leaderboard will be purged of submissions showing signs of unfair practices. Submissions will be run on a new 50-event dataset several times, in order to obtain an accurate time measurement.

The final scores (and final leaderboard) will then be determined from the average time, and the accuracy.

 

Terms and conditions

  • General Terms: This challenge is governed by the General ChaLearn Contest Rule Terms, the Codalab Terms and Conditions, and the specific rules set forth.

  • Announcements: To receive announcements and be informed of any change in rules, the participants must provide a valid email.

  • Conditions of participation: Participation requires complying with the rules of the challenge. Prize eligibility is restricted by US government export regulations, see the General ChaLearn Contest Rule Terms. The organizers, sponsors, their students, close family members (parents, sibling, spouse or children) and household members, as well as any person having had access to the truth values or to any information about the data or the challenge design giving him (or her) an unfair advantage, are excluded from participation. Participants to the previously run TrackML Accuracy phase on Kaggle can participate.  A disqualified person may submit one or several entries in the challenge and request to have them evaluated, provided that they notify the organizers of their conflict of interest. If a disqualified person submits an entry, this entry will not be part of the final ranking and does not qualify for prizes. The participants should be aware that ChaLearn and the organizers reserve the right to evaluate for scientific purposes any entry made in the challenge, whether or not it qualifies for prizes.

  • Training data access: Only registered participants having accepted the rules can access the training data. By downloading the training data, the participants agree to keep it for their own use and not to re-distribute it in any form, including giving direct access to the URL to download the data.

  • Dissemination: The participants may be invited to attend a workshop organized in conjunction with a major machine learning conference and contribute to the proceedings. This competition is an official NIPS 2018 competition.

  • Registration: The participants must register to Codalab and provide a valid email address. Teams should register only once using a group email, which will be shared by all team members. Teams or solo participants registering multiple times to gain an advantage in the competition may be disqualified.

  • Anonymity: The participants who do not present their results at the workshop can elect to remain anonymous by using a pseudonym. Their results will be published on the leaderboard under that pseudonym, and their real name will remain confidential. However, the participants must disclose their real identity to the organizers to claim any prize they might win. See our privacy policy for details. If a participant provides their real name, it will appear on the leaderboard and may be used by the Codalab platform provider at its discretion.

  • Submission method: The results must be submitted through this CodaLab competition site. The participants can make up to N_SUB_MAX submissions per day. The value of N_SUB_MAX is indicated as "Max submissions per day" on the challenge website. The organisers reserve the right to change it during the challenge. Using multiple accounts to increase the number of submissions is NOT permitted. The public leaderboard will be updated by automatic evaluation of the submissions. The final leaderboard will be determined by running the best submission in a controlled way on a new dataset with the same statistical properties as the one provided. The entries must be formatted as specified on the Evaluation page. As specified in the General ChaLearn Contest Rule Terms, any attempt at hacking, such as trying to export the test dataset, to use more than the allocated resources, or to manipulate the public leaderboard, can lead to disqualification.

  • Prizes: The top performers are eligible for prizes (see the Prizes page). To be eligible for any prize, the participants must make their code publicly available under an OSI-approved license (for instance Apache 2.0, MIT, or a BSD-like license), fill out a fact sheet briefly describing their methods, and provide a short documentation following a template, no later than one week after the deadline for submitting the final results. There is no other publication requirement. In case of a tie, the prize will go to the participant who submitted their entry first. Non-winners or entrants who decline their prize retain all rights to their entries and are not obliged to publicly release their code.

  • Travel awards: The travel awards may be used to attend a workshop organized in conjunction with the challenge. The award money will be granted in reimbursement of expenses including airfare, ground transportation, hotel, or workshop registration. Reimbursement is conditioned on (i) attending the workshop, (ii) making an oral presentation of the methods used in the challenge, and (iii) presenting original receipts and boarding passes. The reimbursements will be made after the workshop.

CHALEARN

This challenge is brought to you by ChaLearn and the sponsors listed on the Sponsors page. Questions should be posted on the forum or directed to trackml.contact at gmail.com.

The TrackML Throughput phase offers a new set of prizes (in addition to the first Accuracy phase prizes).

Cash Prizes

Participants with the best scores on the final evaluation (which will take place after the challenge is over, on a new test set, under tightly controlled timing) are eligible to receive:

  • 1st Place - $ 7,000
  • 2nd Place - $ 5,000
  • 3rd Place - $ 3,000

HEP meets ML prize

A second set of prizes will be attributed by a jury (of international experts in particle physics tracking algorithms and machine learning), which will select the submissions with the most promising balance between score, evaluation speed, and originality with respect to traditional particle physics combinatorial approaches. We will provide an entry form for those of you interested in these prizes.

Prizes that will be distributed under this category:

  • One NVIDIA Tesla V100 GPU.
  • Invitations (at least one) to NIPS (Montreal, December 2018).
  • Invitations (at least one) to a grand finale workshop at CERN (Geneva) in spring 2019.

Conditions

To be eligible for both types of prizes, the teams must submit their complete code (training and evaluation) with an open source license within one week after the end of the competition. The participating team will decide how the amount of the award will be divided internally among the team members. The award will not cover the travel expenses of team members who belong to the ATLAS or CMS collaborations or are based at CERN.

 

 

The (human) organizers are here

The International Advisory Committee is there

 

Platinum sponsors

NVIDIA: NVIDIA GPUs are powering the world's fastest supercomputers. GPU computing is the most pervasive, accessible, energy-efficient path forward for HPC data centers. GPUs are ushering in the era of convergence, where modeling and simulation are combined with AI to spur a wave of innovation and insight unmatched since computers were first applied to science problems.
UNIGE: The University of Geneva and its faculty of science are heavily invested in fundamental research and machine learning applications. Its department of particle physics (DPNC) is a strong and long-standing member of the ATLAS collaboration at CERN's LHC.
Gold sponsors
ChaLearn: a not-for-profit organization dedicated to educating the public through the organization of scientific competitions, particularly in machine learning.
The DATAIA Institute: aims to gather and structure, on one scientific site, multidisciplinary expertise of great scope and high visibility, to better address the major challenges of data science, artificial intelligence and their applications, through decompartmentalization between mathematics, computer science, and the legal, economic and social sciences.
Silver sponsors
CERN openlab is a unique public-private partnership that accelerates the development of cutting-edge ICT solutions for the worldwide LHC community and wider scientific research. Through CERN openlab, CERN collaborates with leading ICT companies and research institutes.
Paris-Saclay CDS: using data science to advance domain sciences.
INRIA is a French research institute for computer science.
ERC mPP: This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement no. 772369). mPP is an ERC Consolidator Grant coordinated by CERN aiming to promote applications based on modern machine learning for particle physics experiments.
ERC RECEPT: This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement no. 724777). RECEPT is an ERC Consolidator Grant concerned with studies of lepton universality and real-time reconstruction and analysis of particle trajectories.
Common Ground: the only professional online platform tailored to academics with STEM qualifications looking for a rewarding career in the private sector. We combine job opportunities in Machine Learning, Data Science and Software Engineering with unique company, education and support services so academics can discover the companies they really want to work for and better prepare themselves for the transition into industry.
University Paris-Sud: a place dedicated to high-level research and a member of the League of European Research Universities. It is particularly famous for its very high level in basic research, especially in mathematics and physics, while hosting numerous research programs in computer science, chemistry and biology.
INQNET (INtelligent Quantum NEtworks and Technologies) is a research program of the Alliance of Quantum Technologies. It aims to accelerate progress in areas of fundamental QIS&T, including quantum AI.
Fermilab is America's particle physics and accelerator laboratory. We bring the world together to solve the mysteries of matter, energy, space and time.
PyTorch is an open source deep learning platform built to be flexible and modular for research, with the stability and support needed for production deployment. It enables fast, flexible experimentation through a tape-based autograd system designed for immediate and Python-like execution.
  • 5 November 2018, midnight UTC: end of submissions.
  • 12 November 2018, 11:59 UTC: deadline to submit the survey and short software document (for leaderboard prizes and the HEP meets ML prize).
  • End of November 2018: winners announcement.

Step by step

How to participate?

  1. Register to Codalab (top right corner).
  2. Register to this competition (click on the Participate folder).
  3. Download the Starting kit and Public Data (made available as soon as you register).
  4. Submit sample_code_submission.zip found in the Starting kit. This is a very simple DBSCAN algorithm: fast, but with poor accuracy, so the score should be zero.
  5. Download additional information and files from the "Data and starting kit description" page, browse the Forum, etc.
  6. Build a new zip file with your own software, following the documentation in the Starting kit.
  7. Submit your own submission. Only when the accuracy exceeds 50% and the time per event is less than 600 s will you get a non-zero score. Note that you can choose to make your submission public, for other participants to download.
  8. Go back to 6 and have fun...
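To give a flavor of what a DBSCAN-based sample submission does, here is a hypothetical, self-contained toy (synthetic hits and made-up column names following the trackml convention, not the actual starting-kit code): cluster hits by their direction as seen from the origin, and emit a submission-style (hit_id, track_id) table.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

# Synthetic stand-in for detector hits: 50 slightly noisy points on each of
# three straight tracks leaving the origin along the coordinate axes.
rng = np.random.default_rng(42)
directions = np.eye(3)
hits = np.vstack([t * d + rng.normal(scale=0.005, size=3)
                  for d in directions for t in np.linspace(1.0, 10.0, 50)])
hit_ids = np.arange(len(hits))

# Naive feature: the unit direction of each hit.  Hits of the same straight
# track share (almost) the same unit vector.
features = hits / np.linalg.norm(hits, axis=1, keepdims=True)

# Cluster in direction space; each cluster label becomes a track id.
labels = DBSCAN(eps=0.05, min_samples=3).fit_predict(features)

# Submission-style table: one (hit_id, track_id) pair per hit.
submission = pd.DataFrame({"hit_id": hit_ids, "track_id": labels})
```

On real TrackML events, tracks curve in the magnetic field, so a plain direction-space clustering like this scores poorly; that is why the sample submission's score is essentially zero.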

This page informs participants of what has happened since the beginning of the Throughput phase of the TrackML challenge.

  • Thu 12 September: maximum number of submissions per day increased to 2
  • Mon 10 September: teething problems, solved
  • Fri 7 September, 2 PM UTC: challenge is online

Use the forum as much as possible. 

Organisers can be reached at trackml.contact at gmail.com

Development

Start: Sept. 7, 2018, midnight

Description: During this phase, participants can submit code that will run on the validation data and get feedback from the platform (the maximum number of submissions per day and in total might be adjusted by the organizers during the competition). Failed submissions are not counted.

Final

Start: Nov. 5, 2018, 11:59 p.m.

Description: In the final phase, each participant's best submission will be tested offline against the private dataset. No new submissions are allowed.

Competition Ends

Nov. 12, 2018, 11:59 p.m.

# Username Score
1 fastrack 0.7774
2 cubus 0.7719
3 Taka 0.0000