SemEval-2017 Task 7, Subtask 3

Organized by Logological

Overview

This is the CodaLab Competition for Subtask 3 of SemEval-2017 Task 7: Detection and Interpretation of English Puns. The competition took place in January 2017 and the official results were presented at the SemEval-2017 workshop in August 2017.  The CodaLab Competition has now been re-opened on an unofficial basis for the benefit of others who wish to use the automated scoring system.

Background

A pun is a form of wordplay in which one signifier (e.g., a word or phrase) suggests two or more meanings by exploiting polysemy, or phonological similarity to another signifier, for an intended humorous or rhetorical effect. For example, the first of the following two punning jokes exploits contrasting meanings of the word "interest", while the second exploits the sound similarity between the surface form "propane" and the latent target "profane":

I used to be a banker but I lost interest.

When the church bought gas for their annual barbecue, proceeds went from the sacred to the propane.

Puns where the two meanings share the same pronunciation are known as homophonic or perfect, while those relying on similar- but not identical-sounding signs are known as heterophonic or imperfect. Where the signs are considered as written rather than spoken sequences, a similar distinction can be made between homographic and heterographic puns.

Conscious or tacit linguistic knowledge – particularly of lexical semantics and phonology – is an essential prerequisite for the production and interpretation of puns. This has long made them an attractive subject of study in theoretical linguistics, and has led to a small but growing body of research into puns in computational linguistics. Most computational treatments of puns to date, however, have focused on generational algorithms or on modelling the phonological properties of puns.

Task description

Participants will be provided with two data sets:

Data set 1: Homographic puns.
The first data set will contain several hundred short contexts (jokes, slogans, aphorisms, etc.). In each of these contexts, a single word is used as a homographic pun, and that word is marked.
Data set 2: Heterographic puns.
The second data set will be similar to the first, except that the puns will be heterographic rather than homographic.

This subtask is a word sense disambiguation task. Participating systems must annotate each pun word with its two meanings, using sense keys from WordNet 3.1.

Evaluation criteria

The evaluation for this subtask will be carried out in two simultaneous phases, one for the homographic data set and one for the heterographic data set. Systems may participate in either or both phases.

Systems participating in a given phase may provide a single guess for any or all of the contexts in the data set.

The results for each phase must be submitted in a delimited text file named answer.txt. Each line of the text file consists of three fields separated by horizontal whitespace (a single tab or space character). The first field is the ID of a pun word from the data set. The second field is a semicolon-delimited list of WordNet 3.1 sense keys that match one meaning of the pun. The third field is a semicolon-delimited list of WordNet 3.1 sense keys that match the other meaning of the pun. Sample data and results files are available in the trial data.
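As an illustration of the file format described above, the following is a minimal Python sketch of how a line of answer.txt could be read. It is not the official scorer. The function names parse_answer_line and parse_answers are hypothetical, and the handling of blank lines and of malformed lines is an assumption.

```python
def parse_answer_line(line):
    """Split one line into (pun_id, first_senses, second_senses).

    Fields are separated by horizontal whitespace; each of the two sense
    fields is a semicolon-delimited list of WordNet 3.1 sense keys.
    """
    fields = line.split()  # splits on tabs and spaces alike
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    pun_id, first, second = fields
    # Sets are used because the order of keys within a list is not significant.
    first_senses = frozenset(key for key in first.split(";") if key)
    second_senses = frozenset(key for key in second.split(";") if key)
    return pun_id, first_senses, second_senses


def parse_answers(text):
    """Parse the full contents of an answer file into a dict keyed by pun ID."""
    answers = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines are ignored
        pun_id, first_senses, second_senses = parse_answer_line(line)
        answers[pun_id] = (first_senses, second_senses)
    return answers
```

Because sense keys contain no whitespace, splitting on any run of whitespace recovers the three fields whether the file uses tabs or spaces.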

To submit the results, place answer.txt in a ZIP file (in the top-level directory), and then upload it to CodaLab according to the instructions at Participating in a Competition.
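The packaging step can be scripted. The sketch below uses Python's standard zipfile module to store a results file as answer.txt in the top-level directory of the archive. The function name package_submission and the default archive name submission.zip are illustrative choices, not requirements stated by the organizers.

```python
import zipfile


def package_submission(answer_path, zip_path="submission.zip"):
    """Write the file at answer_path into a ZIP archive as top-level answer.txt."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname fixes the name inside the archive and drops any directory
        # components, so the file always sits at the top level.
        archive.write(answer_path, arcname="answer.txt")
    return zip_path
```

Calling package_submission("results/run1.txt") would produce submission.zip containing a single member named answer.txt, regardless of where the source file is stored or what it is called locally.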

Systems will be scored using the standard coverage, precision, recall, and F1 measures as used in word sense disambiguation:

coverage: # of guesses ÷ # of contexts
precision: # of correct guesses ÷ # of guesses
recall: # of correct guesses ÷ # of contexts
F1: ( 2 × precision × recall ) ÷ ( precision + recall )
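The four measures can be computed directly from three counts. The following Python sketch is illustrative only; the function name score_phase is hypothetical, and returning 0.0 when a denominator is zero is an assumption, since the behaviour of the official scorer in that case is not stated here.

```python
def score_phase(num_contexts, num_guesses, num_correct):
    """Return coverage, precision, recall, and F1 for one phase."""
    coverage = num_guesses / num_contexts if num_contexts else 0.0
    precision = num_correct / num_guesses if num_guesses else 0.0
    recall = num_correct / num_contexts if num_contexts else 0.0
    total = precision + recall
    # Assumption: F1 is reported as 0.0 when precision + recall is zero.
    f1 = (2 * precision * recall) / total if total else 0.0
    return {
        "coverage": coverage,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For instance, a system that guesses on 80 of 100 contexts and gets 60 of those guesses right has coverage 0.8, precision 0.75, recall 0.6, and F1 of about 0.667.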

A guess is considered to be "correct" if one of its sense lists is a non-empty subset of one of the sense lists from the gold standard, and the other of its sense lists is a non-empty subset of the other sense list from the gold standard. That is, the order of the two sense lists is not significant, nor is the order of the sense keys within each list. If the gold standard sense lists contain multiple senses, then it is sufficient for the system to correctly guess only one sense from each list.

For example, take the following gold standard key:

t_1_17	propane%1:27:00::	profane%3:00:00::;profane%3:00:00:unholy:00

Any of the following system guesses would be considered correct:

t_1_17	propane%1:27:00::	profane%3:00:00::;profane%3:00:00:unholy:00
t_1_17	propane%1:27:00::	profane%3:00:00:unholy:00;profane%3:00:00::
t_1_17	propane%1:27:00::	profane%3:00:00::
t_1_17	propane%1:27:00::	profane%3:00:00:unholy:00
t_1_17	profane%3:00:00::;profane%3:00:00:unholy:00	propane%1:27:00::
t_1_17	profane%3:00:00:unholy:00;profane%3:00:00::	propane%1:27:00::
t_1_17	profane%3:00:00::	propane%1:27:00::
t_1_17	profane%3:00:00:unholy:00	propane%1:27:00::
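The correctness rule can be expressed as a pair of subset tests tried in both orders. The Python sketch below follows the definition given above and is not the official scoring code; the function names is_correct and _matches are hypothetical, and representing each sense list as a set is an implementation choice.

```python
def _matches(guess_senses, gold_senses):
    """True if guess_senses is a non-empty subset of gold_senses."""
    return bool(guess_senses) and guess_senses <= gold_senses


def is_correct(guess, gold):
    """Check a guess against the gold standard.

    Both arguments are pairs of sets of sense keys. The order of the two
    sense lists is not significant, so both pairings are tried.
    """
    guess_a, guess_b = guess
    gold_a, gold_b = gold
    same_order = _matches(guess_a, gold_a) and _matches(guess_b, gold_b)
    swapped = _matches(guess_a, gold_b) and _matches(guess_b, gold_a)
    return same_order or swapped


# The gold standard key from the example above.
gold = (
    {"propane%1:27:00::"},
    {"profane%3:00:00::", "profane%3:00:00:unholy:00"},
)

# One sense from each list, given in swapped order.
guess = ({"profane%3:00:00:unholy:00"}, {"propane%1:27:00::"})
print(is_correct(guess, gold))
```

Under this reading of the rule, a guess that adds a sense key absent from the gold standard list fails the subset test, and so does a guess that fills both fields with senses from the same gold list.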

Terms and conditions

By submitting results to this competition, you consent to the public release of your scores at the SemEval-2017 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgements, qualitative judgements, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.

You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgement that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the competition's rules. Inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.

You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.

You agree not to redistribute the test data except in the manner prescribed by its licence.

Trial

Start: Dec. 5, 2016, midnight

Test (Homographic)

Start: Jan. 23, 2017, midnight

Test (Heterographic)

Start: Jan. 23, 2017, midnight

Competition Ends

Jan. 30, 2050, 11:59 p.m.

Top Three

Rank Username Score
1 Logological 0.5000