Hierarchical multi-label classification (HMC) of patents is the task of assigning multiple labels to a patent text, where each label is part of an underlying hierarchy of categories. The increasing amount of available digital documents and the need for more and finer-grained categories call for new, more robust and sophisticated text classification methods. Large datasets often incorporate a hierarchy that can be used to categorize documents at different levels of specificity. Traditional multi-class text classification is thoroughly researched; however, with the increase of available data and the necessity of more specific hierarchies, and since traditional approaches fail to generalize adequately, the need for more robust and sophisticated classification methods grows.
With this task we aim to foster research in this context. The task focuses on classifying English, German and French patents into their respective hierarchically structured categories.
The workshop of this shared task will be held in conjunction with the SwissText Conference on Natural Language Processing and KONVENS 2020 in Zurich, Switzerland.
System submissions are done in teams. There is no restriction on the number of people in a team. However, keep in mind that a participant is allowed to be in multiple teams, so splitting up into teams with overlapping members is possible. Every participating team is allowed to submit 3 different systems to the competition. For submission in the final evaluation phase, every team must name their submission (the .zip and the actual submission .txt file) in the form "[Teamname]__[Systemname]" (note the two underscores!). For example, your submission could look like:
Funtastic4__SVM_NAIVEBAYES_ensemble1.zip
|
+-- Funtastic4__SVM_NAIVEBAYES_ensemble1.txt
We also ask you to put exactly this name into the description before submitting your system. This identification method is needed to correctly associate each submitted system with its description paper. Thus, please make sure to write the name exactly as it will appear in your description paper (it is case-sensitive). If your submission does not follow these rules, it might not be evaluated. The evaluation script has been adapted to perform a formality check.
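As an illustration, the "[Teamname]__[Systemname]" convention can be verified with a short script. This is a sketch, not the organizers' actual formality check, and the allowed character set is an assumption:

```python
import re

# Hypothetical check for the "Teamname__Systemname" scheme: a team name,
# exactly two underscores as separator, a system name, and a .zip or .txt
# extension. The permitted characters are an assumption for illustration.
NAME_PATTERN = re.compile(r"^(?P<team>[A-Za-z0-9_-]+?)__(?P<system>[A-Za-z0-9_-]+)\.(zip|txt)$")

def check_submission_name(filename: str) -> bool:
    """Return True if the file name follows the Teamname__Systemname scheme."""
    return NAME_PATTERN.match(filename) is not None

print(check_submission_name("Funtastic4__SVM_NAIVEBAYES_ensemble1.zip"))  # True
print(check_submission_name("Funtastic4_SVM.zip"))  # False: only one underscore
```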
Only the person who makes the submission is required to register for the competition. All team members must be stated in the description paper of the submitted system. The last submission of a system will be used for the final evaluation. Participants will see whether their submission succeeded; however, there will be no feedback regarding the score. The leaderboard will thus be disabled during the test phase.
The evaluation script is provided with the data so that participants can evaluate their own data splits. evaluation.py takes two parameters: the path to the input folder and the path to the output folder. The input folder must contain two files: the output of the system, named according to the scheme described above, and the gold data, gold.txt. The result files will then be written into the output folder.
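A minimal sketch of the interface described above, assuming the folder layout given in the text. The real evaluation.py computes the official metrics; the placeholder score here only illustrates the input/output contract:

```python
import os
import sys

def run_evaluation(input_dir: str, output_dir: str) -> str:
    """Sketch of the evaluation.py interface: read the system output and
    gold.txt from input_dir, write a score report into output_dir."""
    # Find the submission file (any .txt that is not gold.txt).
    files = os.listdir(input_dir)
    system_file = next(f for f in files if f != "gold.txt" and f.endswith(".txt"))
    with open(os.path.join(input_dir, system_file)) as f:
        predictions = f.read().splitlines()
    with open(os.path.join(input_dir, "gold.txt")) as f:
        gold = f.read().splitlines()

    # Placeholder score: fraction of identical lines (the real script
    # computes micro/macro F1 and subset accuracy instead).
    matches = sum(p == g for p, g in zip(predictions, gold))
    score = matches / max(len(gold), 1)

    os.makedirs(output_dir, exist_ok=True)
    out_path = os.path.join(output_dir, "scores.txt")
    with open(out_path, "w") as f:
        f.write(f"score: {score:.4f}\n")
    return out_path

if __name__ == "__main__" and len(sys.argv) == 3:
    run_evaluation(sys.argv[1], sys.argv[2])
```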
Classification systems will be ranked on the leaderboard based on a combination of the micro- and macro-averaged F1-score. This way, the ranking neither focuses only on classes with high importance nor gives too much weight to low-importance classes. The detailed test report will additionally include micro recall, micro precision, as well as the subset accuracy. The latter metric is included because it measures how well labels are selected in relation to each other.
The label order is irrelevant. The scoring program handles the assigned labels as a set; duplicates are thus ignored. This is not an issue for the hierarchical task because every child has exactly one parent, which makes the label set unambiguous.
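The reported metrics can be reproduced by treating each patent's labels as a set. A self-contained sketch with toy data (the official script may differ in details such as how the two F1 scores are combined):

```python
# Toy gold and predicted label sets; order and duplicates do not matter,
# since both sides are treated as sets.
gold = [{"A", "A1"}, {"B"}, {"A", "A2"}]
pred = [{"A", "A1"}, {"B", "A"}, {"A"}]

labels = sorted(set().union(*gold, *pred))

def f1(tp: int, fp: int, fn: int) -> float:
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Micro-averaged scores pool the counts over all labels and documents.
tp = sum(len(g & p) for g, p in zip(gold, pred))
fp = sum(len(p - g) for g, p in zip(gold, pred))
fn = sum(len(g - p) for g, p in zip(gold, pred))
micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
micro_f1 = f1(tp, fp, fn)

# Macro-averaged F1 averages the per-label F1 scores.
per_label_f1 = []
for label in labels:
    ltp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
    lfp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
    lfn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
    per_label_f1.append(f1(ltp, lfp, lfn))
macro_f1 = sum(per_label_f1) / len(per_label_f1)

# Subset accuracy: the whole predicted set must match the gold set exactly.
subset_accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)

print(micro_f1, macro_f1, subset_accuracy)  # 0.8 0.7 0.333...
```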
The patents were released by the EPO under a Creative Commons license. This dataset is redistributed under the Creative Commons license CC BY-NC.
By participating in this competition, you consent to the public release of your scores at the GermEval-2019 workshop and in the respective proceedings, at the task organizers' discretion. The scoring of a system may include, but is not limited to, the metrics mentioned on this page. The final decision on the metric choice and score value is made by the task organizers. In case the organizers judge a submission as incomplete, deceptive, or in violation of the competition's rules, scores may be withheld. The system of a participating team will be named according to the team name provided at the time of submission, or an abbreviation selected by the task organizers.
All due times are at 23:59 (AoE)
This shared task consists of two subtasks, described below. You can participate in one of them or in both.
Subtask A: Classify patents in a standard multilingual hierarchical multi-label document classification setup with a large number of patents.
Subtask B: In this subtask, a zero-shot/few-shot approach is needed, since some labels in the test set have very few or even zero training samples. We provide the ontology with the descriptions of the classes.
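One simple baseline for such labels (an illustration, not part of the task): compare the patent text to each class description in the same vector space and rank the classes by similarity. A minimal bag-of-words version, with made-up class codes and descriptions:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical class descriptions standing in for the provided ontology.
class_descriptions = {
    "A01": "agriculture forestry animal husbandry",
    "B60": "vehicles wheels brakes transport",
}

def zero_shot_rank(text: str) -> list:
    """Rank classes by similarity of the patent text to each description."""
    doc = Counter(text.lower().split())
    scores = {c: cosine(doc, Counter(d.split())) for c, d in class_descriptions.items()}
    return sorted(scores, key=scores.get, reverse=True)

print(zero_shot_rank("a braking system for transport vehicles"))  # ['B60', 'A01']
```

In practice, multilingual sentence embeddings would be a stronger choice than raw token overlap, since the patents are in English, German and French.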
The dataset consists of patents (abstract, title, description), obtained from the European Patent Office. The data was released under the Creative Commons Attribution 4.0 International Public license. The data is divided into four major files: the file with the ids for the train, dev and test splits, the file for titles patent_TITLE.csv, the file for abstracts patent_ABSTR.csv, and the file for descriptions patent_description.csv. The titles, abstracts and descriptions can be in multiple languages.
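Loading and joining the files per patent id might look as follows. This is a sketch on toy in-memory data; the column names ("id", "lang", "title", "abstract") and the comma separator are assumptions and should be checked against the real CSV files:

```python
import csv
import io

# Toy stand-ins for patent_TITLE.csv and patent_ABSTR.csv; the column
# names are hypothetical and must be verified against the actual data.
titles_csv = "id,lang,title\nEP1,en,Braking system\nEP2,de,Bremssystem\n"
abstracts_csv = "id,lang,abstract\nEP1,en,A brake for vehicles.\n"

def read_by_id(text: str, field: str) -> dict:
    """Index one CSV file by patent id."""
    return {row["id"]: row[field] for row in csv.DictReader(io.StringIO(text))}

titles = read_by_id(titles_csv, "title")
abstracts = read_by_id(abstracts_csv, "abstract")

# Join title and abstract per patent id; missing parts become empty strings.
documents = {pid: (titles.get(pid, ""), abstracts.get(pid, "")) for pid in titles}
print(documents["EP1"])  # ('Braking system', 'A brake for vehicles.')
```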
The dataset is available via the following links.
The following table shows important quantitative characteristics of the total dataset:
| Number of patents | 366k |
| Number of words | 4'287'596'100 |
| Number of classes | 815 |
Information for Subtask A/B: Exactly one parent is assigned to each child class. The underlying hierarchy is a forest. The most specific class of a patent is not necessarily a leaf node.
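Since each class has exactly one parent, the full hierarchical label set of a patent can be obtained by walking from every assigned class up to its root. A sketch with a toy parent map (the class codes are illustrative):

```python
# Toy parent map: each child class points to its single parent;
# root classes are absent from the map. The class names are made up.
parent = {"A01B": "A01", "A01": "A", "B60T": "B60", "B60": "B"}

def with_ancestors(labels: set) -> set:
    """Expand a label set with all ancestors up to the roots."""
    expanded = set(labels)
    for label in labels:
        while label in parent:
            label = parent[label]
            expanded.add(label)
    return expanded

print(sorted(with_ancestors({"A01B", "B60T"})))
# ['A', 'A01', 'A01B', 'B', 'B60', 'B60T']
```

Because the hierarchy is a forest, this walk always terminates and yields a unique ancestor set for every label.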
Evaluation: for Subtask A, the zero/few-shot labels are not considered; they are only taken into account when evaluating Subtask B.
The shared task is organized by:
Dr. Fernando Benites (Zurich University of Applied Sciences (ZHAW), Switzerland)
Dr. Ahmad Aghaebrahimian (ZHAW, Switzerland)
Steffen Remus (Uni Hamburg, Germany)
Prof. Dr. Mark Cieliebak (ZHAW, Switzerland).
Start: Jan. 14, 2020, midnight
Description: Submit predictions for the validation set. The scoreboard will be enabled.
Start: March 17, 2020, midnight
Description: Submit predictions for the test set. Results during this phase will be used to assess the performance of a submission for this shared task. The scoreboard is disabled.
Start: March 31, 2020, midnight
Description: For evaluation after the competition ends. Submit additional test set predictions.