HASOC-Dravidian-CodeMix - FIRE 2020

Organized by dravidiancodemixed

Current

First phase
June 19, 2020, 6:53 p.m. UTC

End

Competition Ends
Never

HASOC-Offensive Language Identification- DravidianCodeMix FIRE 2020

There is an increasing demand for offensive language detection on social media texts which are largely code-mixed. Code-mixing is a prevalent phenomenon in a multilingual community and the code-mixed texts are sometimes written in non-native scripts. Systems trained on monolingual data fail on code-mixed data due to the complexity of code-switching at different linguistic levels in the text. This shared task presents a new gold standard corpus for offensive language detection of code-mixed text in Dravidian languages (Malayalam-English and Tamil-English). 

The goal of this task is to identify offensive language in a code-mixed dataset of comments/posts in Dravidian languages (Malayalam-English and Tamil-English) collected from social media. A comment/post may contain more than one sentence, but the average number of sentences per comment/post in the corpora is 1. Each comment/post is annotated with an offensive language label at the comment/post level. The dataset is also class-imbalanced, reflecting real-world scenarios.
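Because the dataset is class-imbalanced, participants may want to inspect the label distribution before training. The following is a minimal sketch, not part of the official task materials: the label strings follow the task description, while the toy label list and the inverse-frequency weighting scheme are illustrative assumptions.

```python
from collections import Counter


def class_weights(labels):
    """Return inverse-frequency weights: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical toy labels; the real distribution comes from the released data.
toy_labels = ["not-offensive"] * 8 + ["offensive"] * 2
weights = class_weights(toy_labels)
print(weights)
```

Weights of this kind can be passed to a classifier's loss so that errors on the rarer class count for more.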

Participants will be provided with development, training, and test datasets.

Task1:

This is a message-level classification task. Given a YouTube comment, systems have to classify it as offensive or not-offensive. To download the data and participate, go to the "Participate" tab.

As far as we know, this is the first shared task on Offensive language in Dravidian Code-Mixed text.

Task2:

This is a message-level classification task. Given a tweet, systems have to classify it as offensive or not-offensive. To download the data and participate, go to the "Participate" tab.
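As an illustration of what a message-level classifier for either task might look like, here is a minimal sketch using character n-grams, which tend to tolerate the spelling variation of code-mixed text in non-native scripts. It is not an official baseline: the `NaiveBayes` class, the `char_ngrams` helper, and the English placeholder comments are all invented for this example and are not taken from the dataset.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Lowercase the text, pad it with spaces, and return overlapping n-grams."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes over character n-grams with add-one smoothing."""

    def fit(self, texts, labels):
        self.ngram_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.ngram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each n-gram.
            score = math.log(count / total)
            denom = sum(self.ngram_counts[label].values()) + len(self.vocab)
            for gram in char_ngrams(text):
                score += math.log((self.ngram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented placeholder comments (assumption: not from the task data).
train_texts = [
    "great trailer, waiting for the movie",
    "super song, love it",
    "you are a useless idiot",
    "shut up you idiot",
]
train_labels = ["not-offensive", "not-offensive", "offensive", "offensive"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("what an idiot"))
print(model.predict("love the song"))
```

A real submission would train on the released training data and would likely use a stronger model; the sketch only shows the input and output shape of the task.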


 

More details at https://hasocfire.github.io/hasoc/2020/index.html

  • Each team is allowed to submit up to three systems per task for evaluation.
  • The test data will be sent to participants on 1 August 2020, and they will have a window of 10 days (i.e., until 10 August 2020) to test their systems and send back the labels for the test instances. Further instructions on submitting systems and test-data labels will be sent to participants in due course.
  • We expect each team to submit a system description paper after the evaluation. The deadline, length of submission, and other instructions for system description papers will be the same as those for FIRE 2020 conference papers. All system papers will be published in the proceedings, and the best systems will be given slots for demos and presentations at the workshop.

Bharathi Raja Chakravarthi, PhD Researcher, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway

Dr. Anand Kumar, Assistant Professor, Department of Information Technology, National Institute of Technology Karnataka Surathkal, India

Dr John P. McCrae, Lecturer-above-the-bar, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway

Prof. K P Soman, Head, CEN, Amrita Vishwa Vidyapeetham

Mr. Premjith, Faculty Associate, CEN, Amrita Vishwa Vidyapeetham

 

HASOC Organizers

 

Thomas Mandl :- University of Hildesheim, Germany

Sandip Modha :- DA-IICT, Gandhinagar, India

Prasenjit Majumder :- DA-IICT, Gandhinagar, India

Daksh Patel :- Dalhousie University, Halifax, Canada

Gautam Kishore Shahi :- University of Duisburg-Essen

Johannes Schäfer :- University of Hildesheim

Amit Kumar Jaiswal :- University of Bedfordshire

Task announcement: 15 June

Release of Trial data: 20 June

Release of Training data: 1 July

Release of Test data: 1 August

Run submission deadline: 10 August

Results declared: 20 August

Paper submission: 31 August

Revised paper: 30 September

Terms and Conditions

By downloading the data or by accessing it in any manner, you agree not to redistribute the data except for non-commercial and academic-research purposes. The data must not be used for providing surveillance, analyses, or research that isolates a group of individuals or any single individual for any unlawful or discriminatory purpose.

 
