We believe that the levels of linguistic structure should be annotated jointly; otherwise, errors at one annotation level propagate to the levels that follow. Existing “tokenization - morphology - lemmatization - syntax” pipelines accumulate errors at each stage.
During the competition, participants aim to build systems that define:
We invite participants to build systems that perform complete morphological and syntactic annotation with lemmatization within the Universal Dependencies framework.
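To make the target annotation concrete, here is a minimal sketch that reads one sentence annotated in CoNLL-U, the standard exchange format of Universal Dependencies, and extracts the layers named above: lemma, part of speech, morphological features, and the dependency head and relation. It assumes that data is exchanged in CoNLL-U, which this page does not state. The sample sentence, its annotation, and the names `Token`, `parse_feats`, and `parse_conllu_sentence` are illustrative and are not taken from the competition data or tooling.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int      # 1-based position in the sentence (ID column)
    form: str     # surface form
    lemma: str    # lemmatization layer
    upos: str     # universal part-of-speech tag
    feats: dict   # morphological features, e.g. {"Case": "Nom"}
    head: int     # ID of the syntactic head, 0 for the root
    deprel: str   # dependency relation to the head


def parse_feats(field):
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_conllu_sentence(block):
    """Parse one CoNLL-U sentence block into a list of Token objects."""
    tokens = []
    for line in block.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines such as '# text = ...'
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (1-2) and empty nodes (1.1)
        tokens.append(Token(int(cols[0]), cols[1], cols[2], cols[3],
                            parse_feats(cols[5]), int(cols[6]), cols[7]))
    return tokens


# Hypothetical sample; columns are ID, FORM, LEMMA, UPOS, XPOS, FEATS,
# HEAD, DEPREL, DEPS, MISC, joined with tabs as CoNLL-U requires.
ROWS = [
    ["1", "Мама", "мама", "NOUN", "_",
     "Animacy=Anim|Case=Nom|Gender=Fem|Number=Sing", "2", "nsubj", "_", "_"],
    ["2", "мыла", "мыть", "VERB", "_",
     "Aspect=Imp|Gender=Fem|Mood=Ind|Number=Sing|Tense=Past|VerbForm=Fin|Voice=Act",
     "0", "root", "_", "_"],
    ["3", "раму", "рама", "NOUN", "_",
     "Animacy=Inan|Case=Acc|Gender=Fem|Number=Sing", "2", "obj", "_", "SpaceAfter=No"],
    ["4", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "# text = Мама мыла раму.\n" + "\n".join("\t".join(row) for row in ROWS)

if __name__ == "__main__":
    for tok in parse_conllu_sentence(SAMPLE):
        print(tok.idx, tok.form, tok.lemma, tok.upos, tok.head, tok.deprel)
```

Each token carries all annotation levels at once, so a system's output for a sentence can be compared with the reference level by level.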
The cumulative evaluation consists of:
The testing procedure will include tests on gold-standard Russian texts of many genres and from different sources. We welcome systems that robustly process the full variety of Russian texts, including texts that differ in style, domain, genre, region, and time of creation.
We are open to questions about data, metrics, and testing procedures.
Start: Feb. 1, 2020, midnight
Start: Feb. 23, 2020, 9 p.m.
Feb. 25, 2020, 11:59 p.m.