OffensEval 2019 (SemEval 2019 - Task 6)

Organized by Shervin


OffensEval: Identifying and Categorizing Offensive Language in Social Media (SemEval 2019 - Task 6)

This is the website for the OffensEval 2019 shared task organized at SemEval 2019.

Training Data Release

The OffensEval training set is now available. You can find it in the Participate tab: under Files there is a Public Data link from which you can download the file.
This file contains 1) offenseval-training-v1.tsv with 13,240 training instances, 2) readme-trainingset-v1.txt with important information about the training set and practice submissions, and 3) offenseval-annotation.txt with some information about the annotation.
Practice submissions should be carried out in the PRACTICE CodaLab installation; the link is included in the README file. This official CodaLab competition will be used ONLY for test submissions in January. Do not upload practice submissions to the official CodaLab; use the PRACTICE installation instead.
If you have any questions please write to (general mailing list) or (organizer's mailing list).


Offensive language is pervasive in social media. Individuals frequently take advantage of the perceived anonymity of computer-mediated communication, using this to engage in behavior that many of them would not consider in real life. Online communities, social media platforms, and technology companies have been investing heavily in ways to cope with offensive language to prevent abusive behavior in social media.

One of the most effective strategies for tackling this problem is to use computational methods to identify offense, aggression, and hate speech in user-generated content (e.g. posts, comments, microblogs). This topic has attracted significant attention in recent years, as evidenced by recent publications (Waseem et al., 2017; Davidson et al., 2017; Malmasi and Zampieri, 2018; Kumar et al., 2018) and workshops such as ALW and TRAC.

In OffensEval, we break offensive content down into three sub-tasks, taking the type and target of offenses into account.


  • Sub-task A - Offensive language identification;
  • Sub-task B - Automatic categorization of offense types;
  • Sub-task C - Offense target identification.


The data is retrieved from social media and distributed in tab-separated format. The trial and training data are available in the "Participate" tab. Please register for the competition to download the files.
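A tab-separated file such as offenseval-training-v1.tsv can be loaded with Python's standard csv module. The column layout below (an id, the tweet text, and one label column per sub-task) is an assumption for illustration only; readme-trainingset-v1.txt is the authoritative description of the format.

```python
import csv
import io

# Inline sample standing in for offenseval-training-v1.tsv.
# Column names and labels here are assumptions, not the official schema.
SAMPLE = (
    "id\ttweet\tsubtask_a\n"
    "1\t@USER some example tweet\tNOT\n"
    "2\t@USER another one\tOFF\n"
)

def load_tsv(handle):
    """Parse a header-first TSV stream into a list of row dicts."""
    # QUOTE_NONE: tweets may contain quote characters that are not
    # CSV-style quoting, so treat every character literally.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = load_tsv(io.StringIO(SAMPLE))
```

In practice you would pass `open("offenseval-training-v1.tsv", encoding="utf-8")` instead of the in-memory sample.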

Participants are allowed to use external resources and other datasets for this task. Please indicate which resources were used when submitting your results.


  •     28 Nov 2018: Training Data Release
  •     15 Jan 2019: Sub-task A test data release
  •     17 Jan 2019: Submission sub-task A
  •     22 Jan 2019: Sub-task B test data release
  •     24 Jan 2019: Submission sub-task B
  •     29 Jan 2019: Sub-task C test data release
  •     31 Jan 2019: Submission sub-task C 
  •     5 Feb 2019: Results announced
  •     10 Mar 2019: System and task description paper submissions due
  •     10 Apr 2019: Author notifications
  •     20 Apr 2019: Camera ready submissions due

Task Organizers

  • Marcos Zampieri (University of Wolverhampton, UK)
  • Shervin Malmasi (Harvard Medical School, USA)
  • Preslav Nakov (Qatar Computing Research Institute, Qatar)
  • Sara Rosenthal (IBM Research, USA)
  • Noura Farra (Columbia University, USA)
  • Ritesh Kumar (Bhim Rao Ambedkar University, India)


Davidson, T., Warmsley, D., Macy, M. and Weber, I. (2017) Automated Hate Speech Detection and the Problem of Offensive Language. Proceedings of ICWSM.

Kumar, R., Ojha, A.K., Malmasi, S. and Zampieri, M. (2018) Benchmarking Aggression Identification in Social Media. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC). pp. 1-11.

Malmasi, S. and Zampieri, M. (2018) Challenges in Discriminating Profanity from Hate Speech. Journal of Experimental & Theoretical Artificial Intelligence. Volume 30, Issue 2, pp. 187-202. Taylor & Francis.

Waseem, Z., Davidson, T., Warmsley, D. and Weber, I. (2017) Understanding Abuse: A Typology of Abusive Language Detection Subtasks. Proceedings of the Abusive Language Online Workshop.

Evaluation Criteria

Classification systems will be evaluated using the macro-averaged F1-score.
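Macro-averaging computes F1 for each class separately and then takes the unweighted mean, so a minority class (typically the offensive one) counts as much as the majority class. A minimal pure-Python sketch of the metric; the OFF/NOT label names are illustrative assumptions, not taken from the official label set description:

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        # One-vs-rest counts for this class.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    # Average without weighting by class frequency.
    return sum(scores) / len(scores)

# Each class gets precision = recall = 0.5 here, so the macro-F1 is 0.5.
score = macro_f1(["OFF", "NOT", "NOT", "OFF"], ["OFF", "NOT", "OFF", "NOT"])
```

This is equivalent to scikit-learn's `f1_score(gold, pred, average="macro")` for systems that predict a label for every instance.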

Submission format information is available from the 'Participate' tab above.


Other Datasets

Some participants have asked us to provide links to other datasets that could be used together with OffensEval's training set.

OffensEval's annotation scheme differs from previously released offensive language, aggression, and hate speech datasets. Nevertheless, some datasets may be used to provide additional training material for sub-task A (offensive vs. not offensive). We recommend the TRAC shared task dataset on aggression identification.

The links to the training set are listed below, and more information can be found in the shared task report.

Sub-task A (Test)

Start: Jan. 17, 2019, midnight

Description: Submit predictions on the test set.

Sub-task B (Test)

Start: Jan. 24, 2019, midnight

Description: Submit predictions on the test set.

Sub-task C (Test)

Start: Jan. 31, 2019, midnight

Description: Submit predictions on the test set.

Competition Ends

