For the fully supervised speaker verification task, can we use the video or face data of the VoxCeleb2 dev dataset? If we can use the video or face data, which track does it belong to: Track 1 or Track 2?
Posted by: ab3007 @ Aug. 16, 2021, 2:38 a.m.
Hi, thank you for posting this.
Unfortunately, you cannot use the VoxCeleb1 video data, because we're focusing on audio-only speaker verification.
Posted by: vgg_oxford @ Aug. 16, 2021, 9:08 a.m.