Welcome to the FoLT competition (winter term 2019/2020)!

For this semester's software project you are going to develop your own classification system to assess whether an online comment is toxic or non-toxic. You will be given a large set of annotated Wikipedia comments to train and develop your classifier.
In the second part of the project, you will augment (i.e. increase) your data set to counter any inherent imbalances it may have in terms of gender bias. Your classifier has to be trained on this enlarged training set, too.
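One common way to augment a data set against gender bias is to add a gender-swapped copy of every comment that contains gendered words. The following is only a minimal sketch of that idea, not the method prescribed by the task description: the word list `SWAPS` and the helper names `swap_gender` and `augment` are assumptions made for illustration, and a real word list would need to be far larger.

```python
import re

# Hypothetical, deliberately tiny swap list; extend it for real use.
# Note: "her" is ambiguous (him/his), so the mapping back is lossy.
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
    "boy": "girl", "girl": "boy",
}

# Match whole words only, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b", re.IGNORECASE
)


def swap_gender(text):
    """Return text with every listed gendered word replaced by its counterpart."""
    def repl(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        if word.isupper():          # "HE" -> "SHE"
            return swapped.upper()
        if word[0].isupper():       # "He" -> "She"
            return swapped.capitalize()
        return swapped
    return _PATTERN.sub(repl, text)


def augment(examples):
    """Append a gender-swapped copy of each (text, label) pair that changes."""
    augmented = list(examples)
    for text, label in examples:
        swapped = swap_gender(text)
        if swapped != text:
            # The toxicity label is assumed to be unaffected by the swap.
            augmented.append((swapped, label))
    return augmented
```

Calling `augment` on a list of `(comment, label)` pairs returns the original pairs followed by the swapped ones, so the enlarged set can be passed to the same training code as before.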
One week before the deadline, the test set will be released on Moodle, so you can compute and compare your output for both tasks. For more information, check the task description on Moodle.
To make a submission for a phase, upload a ZIP archive containing a single file called submission.txt. Each line contains an id from the test set and a toxicity label, separated by an underscore.
Please keep in mind that every post in the test set has to be assigned a label (1 = toxic or 0 = non-toxic). The entire test set must be classified, not just a fraction of it. Furthermore, pay attention to the exact format of your submission. Do not use any other label names and do not use any other file names. We use accuracy as our final evaluation metric.
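The submission format can be produced with a few lines of Python. This is a sketch under assumptions: the function name `write_submission`, the archive name `submission.zip`, and the dictionary of predictions are illustrative choices; only the file name submission.txt, the labels 0 and 1, and the underscore separator come from the description above.

```python
import zipfile
from pathlib import Path


def write_submission(predictions, out_dir="."):
    """Write submission.txt (one "<id>_<label>" line per post) and zip it.

    predictions: dict mapping each test-set id to its label (0 or 1).
    Returns the path of the created ZIP archive.
    """
    lines = []
    for post_id, label in predictions.items():
        if label not in (0, 1):
            raise ValueError(f"invalid label {label!r} for id {post_id!r}")
        lines.append(f"{post_id}_{label}")

    out_dir = Path(out_dir)
    txt_path = out_dir / "submission.txt"
    with open(txt_path, "w", encoding="utf-8", newline="\n") as f:
        f.write("\n".join(lines) + "\n")

    # The archive must contain exactly one file, named submission.txt.
    zip_path = out_dir / "submission.zip"  # archive name is an assumption
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(txt_path, arcname="submission.txt")
    return zip_path
```

For example, `write_submission({"a1": 1, "b2": 0})` writes the lines `a1_1` and `b2_0`; the ids here are made up, so use the ids from the released test set.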

After uploading a solution, please click on "View scoring output log" to see the accuracy you have obtained according to our evaluation script. Please use the Moodle forum for further questions.

Have fun!

Evaluation Criteria

Your submission is compared to the gold standard and the accuracy score is computed. Your classification system should yield the highest possible accuracy score. In Task 2, the bias of your submission is evaluated as well: for pairs of male/female sentences, the classification should differ only very rarely.
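To check a system locally before uploading, both criteria can be approximated as follows. This is a sketch, not the official evaluation script: accuracy is the standard fraction of correctly labelled posts, while `pair_disagreement_rate` is an assumed stand-in for the bias measure, which is not specified in detail here. Both function names and the dictionary-based inputs are illustrative.

```python
def accuracy(gold, predicted):
    """Fraction of posts whose predicted label equals the gold label.

    gold, predicted: dicts mapping post id -> label (0 or 1).
    """
    if not gold:
        raise ValueError("gold standard is empty")
    if set(gold) != set(predicted):
        raise ValueError("every post needs exactly one prediction")
    correct = sum(1 for post_id in gold if gold[post_id] == predicted[post_id])
    return correct / len(gold)


def pair_disagreement_rate(pairs, predicted):
    """Fraction of male/female sentence pairs that receive different labels.

    pairs: list of (id_a, id_b) tuples linking the two variants of a sentence.
    Lower is better; 0.0 means the two variants are always labelled alike.
    """
    if not pairs:
        raise ValueError("no pairs given")
    differing = sum(1 for a, b in pairs if predicted[a] != predicted[b])
    return differing / len(pairs)
```

Running both functions on a held-out part of the training data gives a rough idea of how a submission might score, although the numbers reported in the scoring output log remain the authoritative ones.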


You should work in the same groups as for your homework. Please upload your full code on Moodle and add a short manual. Your submission must be the result of your own code. If your code cannot be executed, your project will not be graded.


Start: Jan. 1, 2020, 1 a.m.

Evaluation for Toxicity Classification (Task 1)

Start: Jan. 1, 2020, 1 a.m.

Evaluation for Data Augmentation (Task 2)

Start: Jan. 1, 2020, 1 a.m.

Evaluation Task 3 (Optional)

Start: Jan. 1, 2020, 1 a.m.

Competition Ends

Feb. 3, 2020, 11:59 p.m.

# Username Score
1 yq_liu 0.7743
2 licht 0.7571
3 melexi 0.7471