I fixed a bug in the evaluation script that counted the same span multiple times when a tweet was duplicated, which inflated the score artificially. The corrected script now also checks the coherence of the annotations and rejects any prediction file in which one instance of a tweet is predicted with no medication mention and another instance of the same tweet is predicted with a mention. For example, a file containing the following two instances of the same tweet will be rejected:
821578306491487905 1486889286 2017-02-18 Love trivada https://t.co/KeZ1JleTSb - - - -
821578306491487905 1486889286 2017-02-18 Love trivada https://t.co/KeZ1JleTSb 5 11 trivada trivada
The script has been updated on Codalab and the current version can be found on Bitbucket.
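For anyone who wants to check a prediction file before submitting, the two rules described above can be sketched roughly as follows. This is not the code of the official script, only a minimal illustration. It assumes that each prediction is one tab-separated line with eight fields (tweet ID, user ID, date, text, start, end, span, drug name) and that "-" marks the absence of a mention; the function names `find_incoherent_tweets` and `unique_spans` are hypothetical.

```python
from collections import defaultdict

NO_MENTION = "-"       # assumed placeholder for "no medication mention"
EXPECTED_FIELDS = 8    # assumed: id, user, date, text, start, end, span, drug


def _parse(lines):
    """Yield (tweet_id, start, end) for each non-empty prediction line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != EXPECTED_FIELDS:
            raise ValueError(f"expected {EXPECTED_FIELDS} fields, got {len(fields)}")
        yield fields[0], fields[4], fields[5]


def find_incoherent_tweets(lines):
    """Return IDs of tweets predicted both with and without a mention."""
    flags = defaultdict(set)  # tweet_id -> set of has-mention booleans seen
    for tweet_id, start, _end in _parse(lines):
        flags[tweet_id].add(start != NO_MENTION)
    return sorted(tid for tid, seen in flags.items() if len(seen) == 2)


def unique_spans(lines):
    """Return the set of (tweet_id, start, end) spans, so duplicates count once."""
    return {
        (tweet_id, start, end)
        for tweet_id, start, end in _parse(lines)
        if start != NO_MENTION
    }


if __name__ == "__main__":
    rows = [
        "821578306491487905\t1486889286\t2017-02-18\tLove trivada https://t.co/KeZ1JleTSb\t-\t-\t-\t-",
        "821578306491487905\t1486889286\t2017-02-18\tLove trivada https://t.co/KeZ1JleTSb\t5\t11\ttrivada\ttrivada",
    ]
    print(find_incoherent_tweets(rows))
```

A file for which `find_incoherent_tweets` returns a non-empty list would contain the kind of contradiction shown in the example above. Collecting spans in a set, as `unique_spans` does, is one simple way to make sure a span predicted on a duplicated tweet is only counted once.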