SemEval-2017 Task 3 Subtask E

Organized by DorisHoogeveen

Schedule

  • Development: Aug. 1, 2016, midnight UTC
  • Testing: Jan. 23, 2017, midnight UTC
  • Competition Ends: Jan. 31, 2017, noon UTC

Welcome!

Community Question Answering (CQA) forums are gaining popularity online. They are seldom moderated and largely open, and thus have few restrictions, if any, on who can post and who can answer a question. On the positive side, this means that one can freely ask any question and expect some good, honest answers. On the negative side, it takes effort to go through all possible answers and to make sense of them. For example, it is not unusual for a question to have hundreds of answers, which makes it very time-consuming for the user to inspect and winnow them. The challenge we propose may help automate the process of finding good answers to new questions in a community-created discussion forum (e.g., by retrieving similar questions in the forum and identifying the posts in the answer threads of those questions that answer the new question well).

We build on the success of the previous editions of our SemEval tasks on CQA, SemEval-2015 Task 3 and SemEval-2016 Task 3, and present an extended edition for SemEval-2017, which incorporates several novel facets.


This CodaLab competition is for Subtask E of SemEval-2017 Task 3: the Multi-Domain Duplicate Detection subtask (CQADupStack task).

The task is about identifying duplicate questions in StackExchange.

Given:

  • a new question (aka the original question),
  • a set of 50 candidate questions,

rerank the 50 candidate questions according to their relevance to the original question, and truncate the result list so that only "PerfectMatch" questions appear in it. "Related" and "Irrelevant" questions should not be returned in the truncated list. The gold labels are contained in the RELC_RELEVANCE2ORGQ field of the related XML file. We will evaluate both the position of good questions in the ranking and the length of the returned result list relative to the number of good questions that exist for each original question; thus, this is a ranking task and, at the same time, a result-list truncation task.
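The rerank-and-truncate step can be sketched as follows. The scores, question ids, and the score-threshold cutoff below are hypothetical illustrations; each system is free to choose its own relevance model and truncation strategy for deciding which candidates count as PerfectMatch.

```python
# Sketch of reranking 50 candidates and truncating the result list.
# Scores, ids, and the 0.5 threshold are made-up stand-ins for a real
# relevance model's decision about which candidates are PerfectMatch.

def rerank_and_truncate(candidates, threshold=0.5):
    """candidates: list of (question_id, relevance_score) pairs.
    Returns question ids sorted by descending score, cut off where the
    score drops below `threshold`."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [qid for qid, score in ranked if score >= threshold]

candidates = [("q12", 0.91), ("q7", 0.34), ("q3", 0.66), ("q44", 0.12)]
print(rerank_and_truncate(candidates))  # ['q12', 'q3']
```

Because the evaluation penalizes both misranked and wrongly retained questions, the threshold (or any other stopping criterion) matters as much as the ranking itself.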

More information on the task and all the subtasks can be found on the SemEval Task website.

Evaluation Criteria

Subtask E is evaluated using a variant of MAP that can handle truncated result lists. The details of the MAP variant can be found in this paper: Quit While Ahead: Evaluating Truncated Rankings, by Liu et al. (2016).
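The exact formulation is given in that paper; as a rough, simplified illustration (not the official scorer), plain average precision over a truncated ranking, normalised by the total number of relevant questions so that truncating away a relevant question lowers the score, can be sketched as:

```python
# Simplified average precision for a truncated ranking. This is NOT the
# official Liu et al. (2016) variant (which, e.g., also rewards stopping
# early on queries with no relevant candidates); it only illustrates why
# both rank positions and list length affect the score.

def average_precision_truncated(ranked_ids, relevant_ids):
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, ap = 0, 0.0
    for rank, qid in enumerate(ranked_ids, start=1):
        if qid in relevant:
            hits += 1
            ap += hits / rank
    # Normalise by all relevant items, not just the retrieved ones.
    return ap / len(relevant)

# One of the two relevant questions was truncated away, so AP is halved:
print(average_precision_truncated(["q12", "q3"], {"q12", "q9"}))  # 0.5
```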

Apart from providing test data in the same domains as the development and training data, we will also supply two test sets from other domains. The best system will therefore be the one that not only performs well at ranking and result-list truncation on the test data in the same domains, but also performs well in a cross-domain setting.

The test data for this subtask will be released on the 23rd of January, instead of the 9th (as is the case for the other subtasks), due to the nature of this subtask and the data used.

The winning system will be the one with the highest macro-average across the four given domains and the two mystery domains.
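Macro-averaging means each subforum contributes equally to the final score, regardless of how many queries it contains. A minimal sketch, with entirely made-up subforum names and scores:

```python
# Macro-average over the per-domain MAP scores. Names and values below
# are hypothetical; the real evaluation uses the four given domains plus
# the two mystery domains revealed at test time.
domain_scores = {
    "domainA": 0.41, "domainB": 0.38, "domainC": 0.45, "domainD": 0.36,
    "mystery1": 0.22, "mystery2": 0.27,
}
macro_avg = sum(domain_scores.values()) / len(domain_scores)
print(round(macro_avg, 4))
```

Note that leaving a subforum out of your submission gives it a score of 0, which directly drags down this macro-average.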

Note: the dataset is formatted for training in this subtask. The format required for the output of your systems will be detailed in the scorer and format-checker README files. These can be found here.

The name of the development file you submit needs to be allsubforums_truncated_rankings_subtaskE.pred, and it needs to be zipped.

The name of the test file you submit needs to be allsubforums_truncated_rankings_subtaskE_test.pred, and it needs to be zipped.

Each line in the file needs to start with the subforum name, followed by a TAB. After that, the format is the same as for the other subtasks. You can choose which subforums to evaluate by including only their data in the submission file. For the subforums you leave out, the score on the leaderboard will be 0. The two mystery subforums on the leaderboard will remain empty in the development phase, but will be filled in the test phase.
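Packaging a development submission can be sketched as below. The columns after the subforum name and TAB are an assumption for illustration only; consult the scorer and format-checker README files for the authoritative per-line format.

```python
# Sketch of writing and zipping a development submission file.
# The fields after "subforum\t" are ASSUMED for illustration; the real
# layout follows the other subtasks' format (see the scorer README).
import zipfile

lines = [
    "android\tQ101\tQ101_R3\t1\t0.91\ttrue",   # subforum, then ranking fields
    "android\tQ101\tQ101_R7\t2\t0.34\tfalse",
]

pred_name = "allsubforums_truncated_rankings_subtaskE.pred"
with open(pred_name, "w") as f:
    f.write("\n".join(lines) + "\n")

# CodaLab expects the .pred file inside a zip archive.
with zipfile.ZipFile(pred_name + ".zip", "w") as zf:
    zf.write(pred_name)
```

For the test phase, the same sketch applies with the file name `allsubforums_truncated_rankings_subtaskE_test.pred`.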

Terms and Conditions

By participating in this competition and submitting results in CodaLab, you agree to the public release of your results in the proceedings of SemEval-2017. Furthermore, you accept that the choice of evaluation metric is made by the task organizers, and that the organizers have the right to decide the winner of the competition and to disqualify teams that do not follow the rules of the competition.

