Matrix factorization and deconvolution methods to quantify tumor heterogeneity in cancer research

Organized by raphbacher


Challenge #2 - phase 2
Dec. 13, 2018, 9 a.m. UTC


Competition Ends
Dec. 14, 2018, 1 p.m. UTC

 /!\ Registration is only open to people physically attending the challenge in Aussois /!\

Successful treatment of cancer remains a challenge, partly because tumor composition is highly heterogeneous across the patient population. Unfortunately, accounting for such heterogeneity is very difficult. Clinical evaluation of tumor heterogeneity often requires the expertise of anatomical pathologists and radiologists.

This challenge is dedicated to the quantification of intra-tumor heterogeneity using appropriate statistical methods on cancer omics data.

In particular, it focuses on estimating cell types and their proportions in biological samples based on averaged DNA methylation and full patient history.
The goal is to explore various statistical methods for source separation/deconvolution analysis (Non-negative Matrix Factorization, Surrogate Variable Analysis, Principal Component Analysis, Latent Factor Models, ...).



Evaluation Criteria

The matrix D of shape (N patients, M methylation sites) is provided.
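The model is D = TA, with T the cell-type profiles (k cell types, M variables) and A the cell-type proportions per patient (N patients, k cell types). With the shapes as listed here, the product is written D = AT; the form D = TA corresponds to the transposed convention (D of shape M × N, T of shape M × k, A of shape k × N).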

Participants have to provide an estimate of the A matrix and a reconstructed D matrix.
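As an illustration of the task, here is a minimal sketch of one of the listed approaches, non-negative matrix factorization with multiplicative updates, written with NumPy only. It assumes the shapes given above (D: N × M, A: N × k, T: k × M), so the product is written `A @ T` in code. The function name `nmf_deconvolve`, its parameters, and the toy data are hypothetical and are not taken from the starting kit. The final row normalization of A is a simple heuristic to obtain proportions.

```python
import numpy as np


def nmf_deconvolve(D, k, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative D (N x M) into A (N x k) and T (k x M).

    Hypothetical helper: plain multiplicative updates for the
    Frobenius loss ||D - A T||^2, not the organizers' reference method.
    """
    rng = np.random.default_rng(seed)
    n_patients, n_sites = D.shape
    # Strictly positive random initialization.
    A = rng.random((n_patients, k)) + eps
    T = rng.random((k, n_sites)) + eps
    for _ in range(n_iter):
        # Update the cell-type profiles, then the per-patient weights.
        T *= (A.T @ D) / (A.T @ A @ T + eps)
        A *= (D @ T.T) / (A @ T @ T.T + eps)
    # Reconstruction from the fitted (unnormalized) factors.
    D_hat = A @ T
    # Heuristic: rescale each patient's weights so that they sum to 1.
    A_hat = A / A.sum(axis=1, keepdims=True)
    return A_hat, T, D_hat


# Toy data (assumption: noise-free mixture of k simulated cell types).
rng = np.random.default_rng(42)
N, M, k = 30, 200, 3
A_true = rng.dirichlet(np.ones(k), size=N)  # each row sums to 1
T_true = rng.random((k, M))                 # methylation-like values in [0, 1)
D = A_true @ T_true

A_hat, T_hat, D_hat = nmf_deconvolve(D, k)
print(A_hat.shape, D_hat.shape)
```

Note that the columns of the estimated A come out in an arbitrary order, so they generally need to be matched to the true cell types before any comparison.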

During challenge #1, they have to submit a reproducible script (with their implemented solution) that computes D and A.

During challenge #2 - Phase 1, they have to submit the estimates of D and A directly (to avoid computation delays).

During challenge #2 - Phase 2, they have to submit their final script that computes D and A. This script will be executed on a new noisy realisation of the simulated dataset.

For each challenge, the discriminating metric will be computed on the A matrix (mean absolute error between the estimate and the ground truth).

The root mean squared error on the D matrix is reported as an indicator only.
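The two metrics can be sketched as follows. This is a minimal illustration, not the official scoring program: it assumes both errors are averaged over all matrix entries and that the columns of the estimated A have already been matched to the ground-truth cell types. The function names and the small example matrices are hypothetical.

```python
import numpy as np


def mean_absolute_error(A_est, A_true):
    """Mean of |A_est - A_true| over all entries (assumed definition)."""
    A_est = np.asarray(A_est, dtype=float)
    A_true = np.asarray(A_true, dtype=float)
    if A_est.shape != A_true.shape:
        raise ValueError("A_est and A_true must have the same shape")
    return float(np.mean(np.abs(A_est - A_true)))


def root_mean_squared_error(D_est, D_true):
    """Square root of the mean squared difference over all entries."""
    D_est = np.asarray(D_est, dtype=float)
    D_true = np.asarray(D_true, dtype=float)
    if D_est.shape != D_true.shape:
        raise ValueError("D_est and D_true must have the same shape")
    return float(np.sqrt(np.mean((D_est - D_true) ** 2)))


# Small hand-made example (2 patients, 2 cell types, 2 sites).
A_true = np.array([[0.5, 0.5], [0.2, 0.8]])
A_est = np.array([[0.6, 0.4], [0.2, 0.8]])
D_true = np.array([[0.0, 1.0], [1.0, 0.0]])
D_est = np.array([[0.0, 0.5], [1.0, 0.0]])

mae_A = mean_absolute_error(A_est, A_true)
rmse_D = root_mean_squared_error(D_est, D_true)
print(mae_A, rmse_D)
```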


Terms and Conditions

By participating in this challenge, you agree to publicly share your submissions.

Challenge #1

Start: Dec. 5, 2018, 10:30 a.m.

Description: In this first phase, you need to submit working code (follow the starting kit example). The code must run in under 3 minutes.

Challenge #2 - phase 1

Start: Dec. 12, 2018, 8 a.m.

Description: In this second phase, you are provided with a dataset that takes several real omics data issues into account. You must only provide the result matrices A and D.

Challenge #2 - phase 2

Start: Dec. 13, 2018, 9 a.m.

Description: In this final phase, you must provide the last version of the working code you used during phase 2. Your code must run in less than 20 minutes.

Competition Ends

Dec. 14, 2018, 1 p.m.
