The main goal of the SdSV Challenge 2020 is to evaluate new technologies for text-dependent (TD) and text-independent (TI) speaker verification (SV) in short-duration scenarios.
The challenge evaluates SdSV with varying degrees of phonetic overlap between the enrollment and test utterances. It is the first challenge with a broad focus on systematic benchmarking and analysis of the effect of varying degrees of phonetic variability on short-duration speaker recognition.
Each team needs at least one CodaLab account to be able to submit their results. When creating an account, please select a team name that can be the name of your organization or any anonymous identity. There are two separate tasks in the challenge. Participants can register for either of the two tasks or both. The same user account (i.e., team name) should be used if a team decides to participate in both tasks. This page corresponds to the second task (Task 2) of the challenge.
Task 2 of the SdSV Challenge is speaker verification in text-independent mode: given a test segment of speech and the target speaker enrollment data, automatically determine whether the test segment was spoken by the target speaker.
Each trial in this task contains a test segment of speech along with a model identifier which indicates one to several enrollment utterances. The net enrollment speech for each model is uniformly distributed between 3 and 120 seconds (after applying an energy-based VAD). The system is required to process each trial independently and produce a log-likelihood ratio (LLR) for each trial.
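As a minimal sketch of the trial-processing loop described above — assuming fixed-size speaker embeddings have already been extracted by some front-end, and using cosine similarity as an illustrative stand-in for a calibrated LLR (function and variable names below are hypothetical, not part of the challenge protocol):

```python
import numpy as np

def score_trials(trials, model_enrollments, test_embeddings):
    """Score each trial independently, as the task requires.

    trials: list of (model_id, test_id) pairs, one per trial.
    model_enrollments: model_id -> list of enrollment embeddings (one to several).
    test_embeddings: test_id -> embedding of the test segment.
    Returns one score per trial; cosine similarity stands in for a true LLR here.
    """
    scores = []
    for model_id, test_id in trials:
        # Pool the (possibly multiple) enrollment embeddings into one model.
        enroll = np.mean(model_enrollments[model_id], axis=0)
        test = np.asarray(test_embeddings[test_id])
        scores.append(float(enroll @ test /
                            (np.linalg.norm(enroll) * np.linalg.norm(test))))
    return scores
```

A real submission would replace the cosine score with a properly calibrated LLR (e.g., from a PLDA back-end), but the per-trial independence shown here is what the evaluation requires.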
The in-domain training data in this task contains text-independent Persian utterances from 588 speakers. This data can be used for any purpose, such as LDA/PLDA training, score normalization, neural network training, reducing the effect of language for cross-lingual trials, etc.
There are two partitions in this task. The first partition consists of typical text-independent trials where the enrollment and test utterances are in the same language (Persian). The second partition consists of text-independent cross-language trials where the enrollment utterances are in Persian and the test utterances are in English. For this partition, the system should reduce the language effects in order to verify test utterances in a different language. Similar to Task 1, there are no cross-gender trials in Task 2. Note that no further information about the test language will be provided, but participants may train a language identification system on the designated data if they need one.
Similar to Task 1, we adopted a fixed training condition where the system should only be trained using a designated set. The available training data is as follows:
The use of other public or private speech data for training is forbidden, while the use of non-speech data for data augmentation purposes is allowed. The in-domain DeepMine training data can be used for any purpose, such as neural network training, LDA or PLDA model training, and score normalization. Part of the data could also be used as a development set, since no separate development data is provided for the challenge. Note, however, that the use of Task 1 in-domain data for this task is not allowed.
The enrollment data in Task 2 consists of one to several variable-length utterances. The net speech duration for each model is roughly 3 to 120 seconds. Since each enrollment utterance is a complete recording without trimming to a specific duration, the overall duration might not be exactly uniform. Note that using enrollment utterances from other models (for example, for calculating score-normalization parameters) is forbidden.
Each trial in the evaluation contains a test utterance and a target model. The duration of the test utterances varies between 1 and 8 seconds. Similar to Task 1, the whole set of trials is divided into two subsets: a progress subset (30%) and an evaluation subset (70%). The progress subset is used to monitor progress on the leaderboard, while the evaluation subset is used to generate the official results at the end of the challenge.
The main metric for the challenge is the normalized minimum Detection Cost Function (minDCF) as defined in SRE08. This detection cost function is defined as a weighted sum of the miss and false-alarm error probabilities:
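The equation itself does not appear on this page. For reference, the SRE08 cost function is C_Det = C_Miss · P_Miss|Target · P_Target + C_FA · P_FA|NonTarget · (1 − P_Target), with C_Miss = 10, C_FA = 1, and P_Target = 0.01, normalized by dividing by min(C_Miss · P_Target, C_FA · (1 − P_Target)); the minimum is taken over all decision thresholds. A minimal sketch of computing the minimum normalized DCF from raw scores (the function name is illustrative, and the parameter defaults assume the SRE08 values above; confirm them against the official evaluation plan):

```python
import numpy as np

def min_norm_dcf(target_scores, nontarget_scores,
                 c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum normalized DCF, sweeping the threshold over all scores."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(nontarget_scores))])
    order = np.argsort(scores)
    labels = labels[order]
    n_tgt, n_non = len(target_scores), len(nontarget_scores)
    # At each candidate threshold:
    # P_miss = fraction of targets scoring below it,
    # P_fa   = fraction of non-targets scoring at or above it.
    p_miss = np.concatenate([[0.0], np.cumsum(labels) / n_tgt])
    p_fa = np.concatenate([[1.0], 1.0 - np.cumsum(1 - labels) / n_non])
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)
    # Normalize by the cost of the best trivial (always-accept/always-reject) system.
    return dcf.min() / min(c_miss * p_target, c_fa * (1 - p_target))
```

Under this normalization, a perfect system scores 0 and a system no better than a trivial one scores 1 or above.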
By using this competition page, teams will be able to submit the system scores and see the results on the progress set. To participate, open the competition page and click on the "Participate" tab. Then accept the Terms and Conditions and click on the "Register" button. After this, you will be able to upload files via the "Participate" tab. Note that a team must only submit to CodaLab the output scores of its system, not the system itself. Make sure you have read the "Submission Instructions" under the "Participate" tab before uploading any files.
To submit a file you need to click on the "Submit/View Results" link under the "Participate" tab. After this, you will be able to see two buttons, corresponding to the Competition Phases. Click on one of the buttons to choose the phase you want to submit to. The available phases are:
Each team can make one submission per day during the Challenge Period phase; after that, the limit will be increased to 5 submissions per day. See more details under "Submit/View Results".
Participation in this challenge is open to all who are interested. There is no cost to participate other than writing a system description of at least two pages. We highly recommend submitting a corresponding paper to the challenge's special session at Interspeech 2020.
We kindly ask participants to use their organization name as the team name. Also, each organization is allowed to participate using only one account.
There will be three cash prizes. The winners will be selected based on the results of the primary systems on the evaluation subset. In addition to the cash prize, each winner will receive a certificate for their achievement. The cash prizes are as follows:
The evaluation dataset used for the challenge is drawn from the recently released multi-purpose DeepMine corpus. The dataset has three parts: Part 1 is used for TD-SV, while Part 3 is used for TI-SV. Since the evaluation dataset is a subset of the DeepMine corpus, in addition to the CodaLab account, teams need to complete the dataset's License Agreement. After signing the agreement, a scanned copy of the signed agreement should be sent to the challenge organizers at the challenge email: sdsvc2020[at]gmail.com. The dataset download links will be sent to the team's corresponding user. More information can be found on the SdSV Challenge 2020 page at https://sdsvc.github.io/
Start: Jan. 15, 2020, midnight
Description: Submissions for evaluating systems on the Progress set. Note that in this phase you can only see results on the progress set, which is 30% of all trials.
Start: March 15, 2020, midnight
Description: Submissions for post-evaluation. Note that in this phase the reported results are for the Evaluation set, while the results in the Challenge Period phase are for the Progress set.
Dec. 31, 2020, midnight