> Which type (geoname code) of toponym is preferred?

Hi Davy, I am confused by something I found while analyzing PMC1184074.ann. The toponym Bangkok has two different geonameids, 1609348 and 1609350. The two entries are almost the same and differ only in feature code (1609348 is ADM1, 1609350 is PPLC). Similar confusing annotations exist in many other files. So I have two questions:

1. Are these annotations appropriate? 

2. The annotation guideline says "This rule applies to other instances in which the location mention can be referring to more than one entity, with each entity having different specificities." Does this mean that the toponym with class A is always preferred when there are candidates of both class A and class P?
Thank you.

Posted by: TTCoding @ Jan. 13, 2019, 12:52 p.m.

Dear TTCoding,

There are instances where the same toponym name refers to locations with different geographic boundaries. Bangkok, for example, is the name of both a province and a city in Thailand. Our guidelines allowed the annotators to use the text of the article to disambiguate which toponym is being referenced:
‘The annotators should use the context of the paragraph to disambiguate a toponym that can have more than one latitude and longitude. However, if another mention of this toponym previously occurred in the text, the annotators can disambiguate the new mention with the same coordinates, provided that no opposite evidence for this location is found in the paragraph.’

In PMC1184074, the authors referred to both the city (‘…sterilization program has been in place only in Bangkok <<GeoName ID: 1609350>> City since June 2002.’) and the province (‘Bangkok <<GeoName ID: 1609348>> (Chatuchak)’). In the latter example, Chatuchak is a district in the province of Bangkok, hence the code for the province is the correct annotation.

In cases where the text did not provide any contextual clues for disambiguating the level of specificity, the least specific candidate, i.e. the one with the highest code in the hierarchy, was selected. So it is not correct to say that class A is always preferred when there are candidates of both class A and class P. Rather, class A is preferred when there is no way to tell whether the author is referring to class A or class P.
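
For anyone who wants to reproduce this rule in code, here is a minimal sketch of the fallback described above. It is only an illustration, not the tooling used to build the corpus: the `Candidate` type, the `pick_candidate` function, the `context_class` argument, and the `SPECIFICITY_RANK` table are all hypothetical names, and how the contextual clue is obtained from the text is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: class A (administrative) counts as less specific than class P
# (populated place). Only these two classes are ranked in this sketch.
SPECIFICITY_RANK = {"A": 0, "P": 1}


@dataclass(frozen=True)
class Candidate:
    geoname_id: int
    feature_class: str  # e.g. "A" or "P"
    feature_code: str   # e.g. "ADM1" or "PPLC"


def pick_candidate(
    candidates: Sequence[Candidate],
    context_class: Optional[str] = None,
) -> Candidate:
    """Pick one candidate for an ambiguous toponym.

    context_class is the feature class suggested by the surrounding text,
    or None when the text gives no clue.
    """
    if not candidates:
        raise ValueError("no candidates to choose from")

    # Contextual evidence wins whenever a candidate matches it.
    if context_class is not None:
        matches = [c for c in candidates if c.feature_class == context_class]
        if matches:
            return matches[0]

    # Otherwise fall back to the least specific candidate.
    # Unranked classes sort after the ranked ones.
    return min(
        candidates,
        key=lambda c: SPECIFICITY_RANK.get(c.feature_class, len(SPECIFICITY_RANK)),
    )


bangkok = [
    Candidate(1609350, "P", "PPLC"),  # the city
    Candidate(1609348, "A", "ADM1"),  # the province
]
```

With these two candidates, calling `pick_candidate(bangkok)` with no context selects the province (1609348), while `pick_candidate(bangkok, context_class="P")` selects the city (1609350), which mirrors the two mentions in PMC1184074.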

Hope this helps.

Best regards,

Posted by: dweissen @ Jan. 14, 2019, 7:34 p.m.

Thanks, Davy.

Posted by: TTCoding @ Jan. 15, 2019, 6:22 a.m.
