SemEval-2020 Task 5: Modelling Causal Reasoning in Language: Detecting Counterfactuals Forum


> Question about pre-trained models

The evaluation criteria say that, to ensure fairness, we may only use the Task 1 dataset to build models for Task 1 and the Task 2 dataset to build models for Task 2.
Could we still use pre-trained models, such as pre-trained word2vec embeddings or BERT, which are trained on large amounts of unannotated data, and build our models on top of them using only the provided annotated data?

Posted by: kliao @ Nov. 18, 2019, 1:58 a.m.

Sorry for the confusion. The labeled data you use should be the data we provide; of course, you may use other unlabeled data or pre-trained models such as BERT.

Posted by: Ariel_yang @ Nov. 18, 2019, 5:03 p.m.

OK. Thanks!

Posted by: kliao @ Nov. 20, 2019, 2:09 a.m.

So, if I understand correctly, for Task 2, for example, data augmentation and using annotations from similar tasks are allowed; we just cannot use data from Task 1.
Is that right?

Posted by: Martin @ Dec. 16, 2019, 11:30 a.m.