Indonesian is the fourth most widely used language on the internet, with around 171 million internet users across the globe. Despite the large amount of Indonesian data available online, progress in Indonesian NLP research has been slow. This is largely because the available datasets are scattered, poorly documented, and supported by minimal community engagement.
To address this problem, we propose IndoNLU, the first-ever Indonesian natural language understanding benchmark: a collection of 12 diverse tasks. The tasks are categorized by input, such as single sentences and sentence pairs, and by objective, such as sentence classification and sequence labeling, and span different levels of difficulty, domains, and styles. The benchmark is designed to cover both formal and colloquial Indonesian, which are highly diverse.
To establish a strong baseline, we collect a large, clean Indonesian dataset, called Indo4B, and use it to train monolingual contextual pre-trained language models, IndoBERT and IndoBERT-lite. We demonstrate the effectiveness of our dataset and pre-trained models in capturing sentence-level semantics and apply them to the classification and sequence labeling tasks.
To support the reproducibility of the benchmark, we release the pre-trained models along with the collected data and code. To accelerate community engagement and benchmark transparency, we have set up a leaderboard website for the NLP community at https://www.indobenchmark.com/, and we also provide the models and the data at https://github.com/indobenchmark/indonlu.
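As a quick way to get started with the released models, here is a minimal sketch that loads an IndoBERT checkpoint for fine-tuning. It assumes the checkpoints are published on the Hugging Face Hub under the indobenchmark namespace (the model name `indobert-base-p1` is an assumption; see the GitHub repository above for the exact model names).

```python
# Minimal sketch: loading a released IndoBERT checkpoint for fine-tuning.
# Assumes the pre-trained models are available on the Hugging Face Hub under
# the "indobenchmark" namespace (e.g. "indobenchmark/indobert-base-p1");
# check the GitHub repository for the exact names and configurations.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "indobenchmark/indobert-base-p1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Encode a colloquial Indonesian sentence and run a forward pass.
inputs = tokenizer("bukunya keren banget!", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (1, num_labels)
```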
To participate in the challenge, submit your predictions to this CodaLab competition!
Best of luck!
Evaluation is based on accuracy, macro-precision, macro-recall, and macro-F1 for the classification subtasks. We use macro-F1 as our main evaluation metric.
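For reference, the snippet below shows one way these metrics can be computed with scikit-learn; the label values are hypothetical placeholders and are not taken from any specific task.

```python
# Illustrative sketch of the classification metrics, using scikit-learn.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical gold labels and predictions for a 3-class task.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "neutral", "neutral", "positive"]

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={acc:.4f} macro-P={prec:.4f} macro-R={rec:.4f} macro-F1={f1:.4f}")
```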
We limit submissions to 3 per day.
Please check the submission example directory; the format differs for each task. Every submission file starts with the `index` column (the ID of the test sample, following the order of the masked test set).
For your submission, first rename your prediction file to 'pred.txt', then zip the file.
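As an illustration only, the sketch below packages predictions for a simple classification task; the actual columns and label values differ per task, so always follow the corresponding file in the submission example directory.

```python
# Hedged sketch of packaging a submission. The exact columns differ per task
# (see the submission example directory); here we assume a single-label
# classification task with a tab-separated `index` and `label` column.
import csv
import zipfile

predictions = ["positive", "negative", "neutral"]  # hypothetical model outputs

# Write predictions in test-set order, starting with the `index` column.
with open("pred.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerow(["index", "label"])
    for idx, label in enumerate(predictions):
        writer.writerow([idx, label])

# Zip the prediction file before uploading it to CodaLab.
with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("pred.txt")
```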
All task phases start on Sept. 20, 2020, at midnight. The benchmark includes the following tasks:
- Emotion Twitter Classification Task
- Sentence-level Sentiment Analysis Task
- Car Reviews Aspect-based Sentiment Analysis Task
- Hotel Aspect-based Sentiment Analysis Task
- The Wiki Revision Edits Textual Entailment Task
- The Prosa Part-of-Speech Task
- The PAN Localization Project Part-of-Speech Task
- The Airy Span Extraction Task
- The Keyphrase Extraction Task
- The Grit-ID Named Entity Recognition Task
- The Prosa Named Entity Recognition Task
- The Factoid Question Answering Task