SemEval-2019 Task 12 - Toponym Resolution in Scientific Papers Forum


> Why are some abbreviations not annotated?

Hi Davy, I have a question about the abbreviation annotation. The corpus seems to annotate abbreviations inconsistently: sometimes they are annotated as LOC, sometimes not.

In file: PUB20975994.txt.
Context (in table): between sequence pairs. Th, Thailand; VN, Vietnam; In, Indonesia; HK, Hong Kong; Gd, Guangdong; YN, Yunnan; Sh, Shanghai; Sd, Shandong; ST, Shantou;

Why are 'Th', 'VN', 'HK', 'In', 'Gd', 'YN', 'Sh', 'Sd' and 'ST' not annotated? Some of these abbreviations can be retrieved by the GeoNames search engine (e.g. Th, VN). According to your annotation guideline (https://drive.google.com/file/d/1NCtHmesaXwaPHNHhDQWY1ZUjYk93HX4J/view), they should be annotated as LOC.
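
Cases like this could probably be flagged automatically. Below is a minimal Python sketch, not part of the task's tooling. It assumes the legend is available as one string of "abbreviation, expansion;" entries and that the LOC annotations for the file have already been loaded into a set of surface strings. The names `parse_legend`, `missing_abbreviations` and `annotated` are hypothetical, and the annotation set in the usage note is illustrative only, not taken from the corpus.

```python
def parse_legend(legend: str) -> dict[str, str]:
    """Split a legend such as 'Th, Thailand; VN, Vietnam;' into {abbreviation: expansion}."""
    pairs = {}
    for entry in legend.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip the empty piece after a trailing ';'
        abbr, sep, expansion = entry.partition(",")
        if sep:  # keep only well-formed 'abbr, expansion' entries
            pairs[abbr.strip()] = expansion.strip()
    return pairs


def missing_abbreviations(legend: str, annotated: set[str]) -> list[str]:
    """Return abbreviations whose expansion is annotated as LOC but which are not themselves.

    `annotated` is assumed to hold the surface strings annotated as LOC in the file.
    """
    return [
        abbr
        for abbr, expansion in parse_legend(legend).items()
        if expansion in annotated and abbr not in annotated
    ]
```

For example, with a hypothetical annotation set `{"Thailand", "Hong Kong", "HK"}`, calling `missing_abbreviations("Th, Thailand; DK, duck; HK, Hong Kong;", ...)` would report only `Th`: `DK` is skipped because 'duck' is not a location, and `HK` is already annotated.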

In file: PMC3773574
Context: DK, duck; GS, goose; SbD, spot-billed duck; MD, mallard duck; BbM, black-billed
magpie; CHU, chukkar; GF, guinea fowl; PG, pigeon; PH, pheasant; QA, quail; FR, ferret;
PT, partridge; SW, swine; WD, wild duck; HK, Hong Kong.

Why is 'HK' annotated as LOC here?

Posted by: chengchen.xpj @ Jan. 10, 2019, 2:22 p.m.

Dear Chengchen,

Regarding the abbreviations in PUB20975994.txt, these would be FNs (false negatives). As stated in previous answers, the abbreviation annotation was inconsistent in a subset of the training corpus which was released in 2015 with the publication of a paper. In PMC3773574, the context is from a description of a figure (which is not shown in the text file), so I assume that the annotator thought it was a reference to a location and annotated both ‘HK’ and ‘Hong Kong’ as LOC.

Best regards,
Davy

Posted by: dweissen @ Jan. 10, 2019, 7:10 p.m.