Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions

Journal article

Carlos Ramisch, Agata Savary, Bruno Guillaume, J. Waszczuk, Marie Candito, Ashwini Vaidya, Verginica Barbu Mititelu, Archna Bhatia, U. Iñurrieta, Voula Giouli, T. Gungor, M. Jiang, Timm Lichte, Chaya Liebeskind, J. Monti, Renata Ramisch, Sara Stymne, Abigail Walsh, Hongzhi Xu
Workshop on Multiword Expressions, 2020

Cite

APA
Ramisch, C., Savary, A., Guillaume, B., Waszczuk, J., Candito, M., Vaidya, A., … Xu, H. (2020). Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions. Workshop on Multiword Expressions.

Chicago/Turabian
Ramisch, Carlos, Agata Savary, Bruno Guillaume, J. Waszczuk, Marie Candito, Ashwini Vaidya, Verginica Barbu Mititelu, et al. “Edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions.” Workshop on Multiword Expressions (2020).

MLA
Ramisch, Carlos, et al. “Edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions.” Workshop on Multiword Expressions, 2020.

BibTeX

@article{carlos2020a,
  title = {Edition 1.2 of the PARSEME Shared Task on Semi-supervised Identification of Verbal Multiword Expressions},
  year = {2020},
  journal = {Workshop on Multiword Expressions},
  author = {Ramisch, Carlos and Savary, Agata and Guillaume, Bruno and Waszczuk, J. and Candito, Marie and Vaidya, Ashwini and Barbu Mititelu, Verginica and Bhatia, Archna and Iñurrieta, U. and Giouli, Voula and Gungor, T. and Jiang, M. and Lichte, Timm and Liebeskind, Chaya and Monti, J. and Ramisch, Renata and Stymne, Sara and Walsh, Abigail and Xu, Hongzhi}
}

Abstract

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.
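
The abstract does not spell out how an expression counts as "unseen". The following is a minimal sketch in Python, assuming a VMWE in the test corpus is unseen when the multiset of its component lemmas was never annotated in the training data. That criterion, the helper names `lemma_key` and `split_seen_unseen`, and the toy data are illustrative assumptions, not the organizers' released tooling.

```python
def lemma_key(lemmas):
    """Return an order-insensitive key for a VMWE (assumed criterion).

    Sorting the lowercased lemmas into a tuple keeps duplicates, so the
    key behaves like a multiset of lemmas.
    """
    return tuple(sorted(lemma.lower() for lemma in lemmas))


def split_seen_unseen(train_vmwes, test_vmwes):
    """Partition test VMWEs into (seen, unseen) relative to the training set.

    Each VMWE is given as a list of the lemmas of its components.
    """
    seen_keys = {lemma_key(vmwe) for vmwe in train_vmwes}
    seen, unseen = [], []
    for vmwe in test_vmwes:
        if lemma_key(vmwe) in seen_keys:
            seen.append(vmwe)
        else:
            unseen.append(vmwe)
    return seen, unseen


# Toy data (hypothetical): lemmas of annotated VMWEs in each corpus.
train_vmwes = [["make", "decision"], ["give", "up"]]
test_vmwes = [["decision", "make"], ["take", "place"], ["give", "up"]]

seen, unseen = split_seen_unseen(train_vmwes, test_vmwes)
print("seen:", seen)
print("unseen:", unseen)
```

Under this assumption, "decision make" (for example, from a passive such as "a decision was made") is treated as seen because word order is ignored, while "take place" is unseen. A corpus split targeting a fixed number of unseen VMWEs could use a count like `len(unseen)` as its selection criterion.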