Dear participants,
We have just published baselines on the dev corpus for the four tasks of the challenge (submitted under the user luischir). The baselines are as follows:
- task 1: Naive Bayes with TF-IDF features (0.6493 F1 over the dev corpus)
- task 2: SVM regression with TF-IDF features (0.6532 RMSE over the dev corpus)
- task 3: Naive Bayes with TF-IDF features (0.1038 macro-F1 over the dev corpus)
- task 4: assign label X if the tweet contains one of the "top" words for label X in the training corpus, where the top words are the 50th to 60th most frequent words for that label (0.0595 F1 over the dev corpus)
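For anyone who wants to reproduce something close to the task 4 rule, here is a minimal sketch in Python. It is only an illustration and not the exact baseline code: it assumes lowercased whitespace tokenization, treats the 50th and 60th ranks as inclusive, and expects the training data as (text, label) pairs. The function names `top_words_per_label` and `predict_labels` are hypothetical.

```python
from collections import Counter, defaultdict


def top_words_per_label(train, first_rank=50, last_rank=60):
    """Map each label to its words ranked first_rank..last_rank by frequency.

    `train` is an iterable of (text, label) pairs (assumed format).
    Ranks are 1-based and inclusive (an assumption about the description).
    """
    counts = defaultdict(Counter)
    for text, label in train:
        # Assumed tokenization: lowercase, split on whitespace.
        counts[label].update(text.lower().split())

    top = {}
    for label, counter in counts.items():
        # most_common() sorts words by descending frequency.
        ranked = [word for word, _ in counter.most_common()]
        top[label] = set(ranked[first_rank - 1:last_rank])
    return top


def predict_labels(text, top):
    """Return every label for which the text contains at least one top word."""
    tokens = set(text.lower().split())
    return sorted(label for label, words in top.items() if tokens & words)
```

A typical call would be `top = top_words_per_label(train_pairs)` followed by `predict_labels(tweet_text, top)` for each dev tweet. A tweet can receive zero, one, or several labels under this rule.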
We encourage all participants to submit their development results.
Regards,
Luis