Triangular MT: Using English to improve Russian-to-Chinese machine translation Forum


> About pre-trained language models

Hi Ajay,

I have some questions about using pre-trained language models (e.g., BERT):

1. Can we use publicly released checkpoints of pre-trained language models such as XLM and mBART for NMT (e.g., from HuggingFace or fairseq)?

2. Can we use a pre-trained model that we train from scratch on our own data (i.e., data we crawled ourselves, not the official data)?

Thanks for your help. Have a nice day.

Cheers,
Jeonghyeok Park

Posted by: JeonghyeokPark @ May 7, 2021, 5:50 a.m.

Hi Jeonghyeok,

To answer your questions:

1. Yes, you can use publicly released pre-trained models (see the loading sketch below).

2. No, the rules forbid using any data other than the data released by us, so you cannot use pre-trained models trained on other private data.
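
For reference, a minimal sketch of what loading a publicly released checkpoint with the HuggingFace Transformers API could look like; the mBART-50 model name and language codes here are illustrative examples, not an organizer recommendation:

```python
# Minimal sketch (illustrative only): load a publicly released mBART-50
# checkpoint from the HuggingFace Hub and translate Russian to Chinese.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_name = "facebook/mbart-large-50-many-to-many-mmt"  # example public checkpoint
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

tokenizer.src_lang = "ru_RU"  # source language: Russian
encoded = tokenizer("Пример предложения для перевода.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.lang_code_to_id["zh_CN"],  # target language: Chinese
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

Such a checkpoint could then be fine-tuned on the released ru-zh data, subject to the data rules above.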

- Ajay

Posted by: ajaynagesh @ May 7, 2021, 2:36 p.m.

Hi Ajay,

I have another question about using pre-trained language models:

1. Can we use a pre-trained model that we train from scratch on the shared data (en-zh, en-ru, zh-ru)?

Thanks for your help. Have a nice day.

Cheers,
Jeonghyeok Park

Posted by: JeonghyeokPark @ May 10, 2021, 6:16 a.m.

Hi Jeonghyeok,

Sorry for the delayed response, I was on leave.

To answer your question: you could use pre-trained models trained on the shared data, as long as they are released publicly. The rules forbid the use of custom data for training in any form.

- Ajay

Posted by: ajaynagesh @ May 13, 2021, 1:48 p.m.