Short-duration Speaker Verification (SdSV) Challenge 2021 - Task 2 : Text-Independent

Organized by sdsvc


Short-duration Speaker Verification (SdSV) Challenge 2021

Task 2 : Text-Independent Speaker Verification

Evaluate New Technologies in Short Duration Scenarios

The main goal of the SdSV Challenge 2021 is to evaluate new technologies for text-dependent (TD) and text-independent (TI) speaker verification (SV) in short-duration scenarios. By providing a new set of tools on the evaluation platform, this year we aim to focus on advancing new methods and on deeper analysis of results on the challenge dataset.

The challenge evaluates SdSV with varying degrees of phonetic overlap between the enrollment and test utterances. It is a continuation of the first challenge, with a broad focus on systematic benchmarking and analysis of short-duration speaker recognition under varying degrees of phonetic variability.

The full challenge evaluation plan can be found at this link. If you have any questions regarding the challenge, you can contact the organizers via sdsv.challenge[at]gmail.com.

Each team needs at least one CodaLab account to be able to submit their results. When creating an account, please select a team name that is either the name of your organization or an anonymous identity. There are two separate tasks in the challenge. Participants can register for either task or both. The same user account (i.e., team name) should be used if a team decides to participate in both tasks.


Evaluation Plan

Task 2 of the SdSV Challenge 2021 is speaker verification in text-independent mode: given a test segment of speech and the target speaker enrollment data, automatically determine whether the test segment was spoken by the target speaker.

Each trial in this task contains a test segment of speech along with a model identifier that refers to one or several enrollment utterances. The net enrollment speech for each model ranges from 3 to 180 seconds (after applying an energy-based VAD). The system is required to process each trial independently and produce a log-likelihood ratio (LLR) for each one.
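As a concrete sketch, trial scoring amounts to producing one score per (model, test) pair. A PLDA back-end would yield true LLRs; the snippet below uses cosine similarity between speaker embeddings as an uncalibrated stand-in, with hypothetical identifiers and toy embeddings (none of these names come from the challenge data):

```python
import numpy as np

def score_trials(trials, model_emb, test_emb):
    """Produce one score per trial (model_id, test_id).

    A PLDA back-end would output calibrated log-likelihood ratios;
    here cosine similarity serves as a simple, uncalibrated stand-in.
    """
    scores = []
    for model_id, test_id in trials:
        e = model_emb[model_id]  # e.g., averaged enrollment embedding
        t = test_emb[test_id]
        s = float(np.dot(e, t) / (np.linalg.norm(e) * np.linalg.norm(t)))
        scores.append((model_id, test_id, s))
    return scores

# Hypothetical toy inputs for illustration only
model_emb = {"model_0001": np.array([1.0, 0.0])}
test_emb = {"utt_0001": np.array([0.8, 0.6])}
print(score_trials([("model_0001", "utt_0001")], model_emb, test_emb))
```

In a real system the scores would then be written out one trial per line in the submission format required by the platform.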

The in-domain training data in this task contains text-independent Persian utterances from 588 speakers. This data can be used for any purpose, such as LDA/PLDA training, score normalization, neural network training, reducing language effects for cross-lingual trials, etc.

Trials:

There are two partitions in this task. The first partition consists of typical text-independent trials where the enrollment and test utterances are in the same language (Persian). The second partition consists of text-independent cross-language trials where the enrollment utterances are in Persian and the test utterances are in English. For this partition, the system should reduce language effects in order to verify test utterances in a different language. Similar to Task 1, there are no cross-gender trials in Task 2. Note that no information about the test language will be provided, but participants may train a language identification system if needed.

Training condition:

Similar to Task 1, we adopt a fixed training condition: the system may only be trained using a designated set. The available training data is as follows:

  • VoxCeleb1
  • VoxCeleb2
  • LibriSpeech
  • Mozilla Common Voice Farsi
  • DeepMine (Task 2 Train Partition)

The use of other public or private speech data for training is forbidden, while the use of non-speech data for data augmentation purposes is allowed. The in-domain DeepMine training data can be used for any purpose, such as neural network training, LDA or PLDA model training, and score normalization. Unlike SdSVC 2020, this year we provide a separate development set for the challenge. Note, however, that using the Task 1 in-domain data and its development set for this task is not allowed.

Enrollment Condition:

The enrollment data in Task 2 consists of one to several variable-length utterances. The net speech duration for each model is roughly 4 to 180 seconds. Since each enrollment utterance is a complete recording without trimming to a specific duration, the overall duration is not exactly uniform. Note that using enrollment utterances from other models, for example to calculate score normalization parameters, is forbidden.
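Score normalization statistics must therefore come from permitted material, such as a cohort drawn from the in-domain training set, and never from other models' enrollment utterances. As one common choice, a minimal sketch of symmetric score normalization (S-norm) under that assumption:

```python
import numpy as np

def s_norm(raw, enroll_cohort_scores, test_cohort_scores):
    """Symmetric score normalization (S-norm).

    raw                  : raw score of the trial
    enroll_cohort_scores : scores of the enrollment model against a cohort
                           drawn from the allowed training data
    test_cohort_scores   : scores of the test utterance against the same cohort
    """
    mu_e, sd_e = np.mean(enroll_cohort_scores), np.std(enroll_cohort_scores)
    mu_t, sd_t = np.mean(test_cohort_scores), np.std(test_cohort_scores)
    # Average of the two z-normalized views of the raw score
    return 0.5 * ((raw - mu_e) / sd_e + (raw - mu_t) / sd_t)
```

The cohort composition and size are design choices left to the participant; the only hard constraint from the rules above is that it excludes other models' enrollment data.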

Test Condition:

Each trial in the evaluation contains a test utterance and a target model. The duration of the test utterances varies between 1 and 8 seconds. Unlike SdSVC 2020, this year there is only a single, smaller evaluation set on CodaLab (the separate progress set is eliminated); it is used both to monitor progress on the leaderboard and to determine the final ranks of the participants. For this reason, the leaderboard will be hidden during the last few days before the challenge deadline.

Performance Measurement:

The main metric for the challenge is the normalized minimum Detection Cost Function (minDCF) as defined in SRE08. The detection cost function is a weighted sum of the miss and false-alarm error probabilities:

    C_Det = C_Miss * P_Miss|Target * P_Target + C_FalseAlarm * P_FalseAlarm|NonTarget * (1 - P_Target)

where, following SRE08, C_Miss = 10, C_FalseAlarm = 1, and P_Target = 0.01. The cost is normalized by dividing by C_Default = min(C_Miss * P_Target, C_FalseAlarm * (1 - P_Target)), and the minimum of the normalized cost over all decision thresholds is reported.
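A minimal sketch of computing the normalized minimum DCF from target and non-target score lists, using the SRE08 cost parameters (the function name and array handling are illustrative, not an official scoring script):

```python
import numpy as np

def min_dcf(target_scores, nontarget_scores,
            p_target=0.01, c_miss=10.0, c_fa=1.0):
    """Normalized minimum DCF with SRE08 cost parameters."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones(len(target_scores)),
                             np.zeros(len(nontarget_scores))])
    labels = labels[np.argsort(scores)]  # labels sorted by ascending score
    n_tar = labels.sum()
    n_non = len(labels) - n_tar
    # Miss / false-alarm rates at thresholds between consecutive sorted scores
    p_miss = np.concatenate([[0.0], np.cumsum(labels) / n_tar])
    p_fa = np.concatenate([[1.0], 1.0 - np.cumsum(1 - labels) / n_non])
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)
    c_default = min(c_miss * p_target, c_fa * (1 - p_target))
    return float(dcf.min() / c_default)
```

Perfectly separated scores give a minDCF of 0, while a system no better than always rejecting gives a normalized cost of 1.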

Participating:

By using this competition page, teams can submit their system scores and see the results. To participate, open the competition page and click on the "Participate" tab. Then accept the Terms and Conditions and click on the "Register" button. After this, you need to send an email to the challenge organizers at sdsv.challenge[at]gmail.com. Your registration will be approved shortly, and after that you will be able to upload files via the "Participate" tab. Note that a team must only submit the output scores of its system to CodaLab, not the system itself. Make sure you have read the "Submission Instructions" under the "Participate" tab before uploading any files.

To submit a file you need to click on the "Submit/View Results" link under the "Participate" tab. After this, you will be able to see two buttons, corresponding to the Competition Phases. Click on one of the buttons to choose the phase you want to submit to. The available phases are:

  • Challenge Period: This is the main phase of the challenge and participants can use it to evaluate their systems.
  • Post Evaluation: During this phase, new submissions will still be accepted, and participants can use it for post-evaluation analyses when writing papers.

Each team can make 10 submissions per day during the Challenge Period phase; after that, the limit is increased to 20 submissions per day. See more details under "Submit/View Results".


Terms and Conditions

Participation in this challenge is open to all who are interested. There is no cost to participate other than writing a system description of at least two pages, which must be submitted on time. We highly recommend that participants submit full system description papers to the challenge's special session at Interspeech 2021, which will be an analysis session.

We kindly ask participants to use their organization name as the team name. We cannot accept requests from individual researchers due to the terms and conditions of the challenge dataset (the DeepMine dataset). Also, each organization is allowed to participate in the challenge using only one account. To enforce this, first create an account on the CodaLab platform and request registration in this competition from the "Participate" tab. Then, after you send an email to the challenge address (sdsv.challenge[at]gmail.com), your registration will be approved.


Evaluation Dataset

The evaluation dataset used for the challenge is drawn from the recently released multi-purpose DeepMine corpus. The corpus has three parts; Part 1 is used for TD-SV and Part 3 for TI-SV. Since the evaluation dataset is a subset of the DeepMine corpus, in addition to the CodaLab account, teams need to complete the dataset's License Agreement if they did not do so for the previous SdSV Challenge (2020). After signing, the scanned agreement should be sent back to the challenge organizers at sdsv.challenge[at]gmail.com. The dataset download links will be sent to the team's corresponding user. More information can be found on the SdSV Challenge page at https://sdsvc.github.io/

Challenge Period

Start: Jan. 13, 2021, midnight

Description: Submissions for evaluating systems on the evaluation set. Participants will be ranked at the end of this phase.

Post Evaluation

Start: March 20, 2021, 10 p.m.

Description: Submissions for doing post evaluation.

Competition Ends

Dec. 31, 2021, midnight

Leaderboard:

  #  Username         Score
  1  IDVoice          0.0319
  2  NeteaseHZAILab   0.0322
  3  JTBD             0.0386