Abigail Walsh

Postdoctoral Researcher

PARSEME corpus release 1.3


Journal article


Agata Savary, Cherifa Ben Khelil, Carlos Ramisch, Voula Giouli, Verginica Barbu Mititelu, Najet Hadj Mohamed, Cvetana Krstev, Chaya Liebeskind, Hongzhi Xu, Sara Stymne, Tunga Güngör, Thomas Pickard, Bruno Guillaume, E. Bejcek, Archna Bhatia, Marie Candito, P. Gantar, U. Iñurrieta, Albert Gatt, Jolanta Kovalevskaite, Timm Lichte, Nikola Ljubešić, J. Monti, Carla Parra Escartín, M. Shamsfard, I. Stoyanova, V. Vincze, Abigail Walsh
Workshop on Multiword Expressions, 2023

Cite

APA
Savary, A., Ben Khelil, C., Ramisch, C., Giouli, V., Barbu Mititelu, V., Hadj Mohamed, N., … Walsh, A. (2023). PARSEME corpus release 1.3. Workshop on Multiword Expressions.


Chicago/Turabian
Savary, Agata, Cherifa Ben Khelil, Carlos Ramisch, Voula Giouli, Verginica Barbu Mititelu, Najet Hadj Mohamed, Cvetana Krstev, et al. “PARSEME Corpus Release 1.3.” Workshop on Multiword Expressions (2023).


MLA
Savary, Agata, et al. “PARSEME Corpus Release 1.3.” Workshop on Multiword Expressions, 2023.


BibTeX

@article{agata2023a,
  title = {PARSEME corpus release 1.3},
  year = {2023},
  journal = {Workshop on Multiword Expressions},
  author = {Savary, Agata and Ben Khelil, Cherifa and Ramisch, Carlos and Giouli, Voula and Barbu Mititelu, Verginica and Hadj Mohamed, Najet and Krstev, Cvetana and Liebeskind, Chaya and Xu, Hongzhi and Stymne, Sara and Güngör, Tunga and Pickard, Thomas and Guillaume, Bruno and Bejcek, E. and Bhatia, Archna and Candito, Marie and Gantar, P. and Iñurrieta, U. and Gatt, Albert and Kovalevskaite, Jolanta and Lichte, Timm and Ljubešić, Nikola and Monti, J. and Parra Escartín, Carla and Shamsfard, M. and Stoyanova, I. and Vincze, V. and Walsh, Abigail}
}

Abstract

We present version 1.3 of the PARSEME multilingual corpus annotated with verbal multiword expressions (VMWEs). Since the previous version, new languages have joined the effort, some existing corpora have been enriched with newly annotated texts, and others have been enhanced in various ways. The PARSEME multilingual corpus now covers 26 languages. All of its monolingual corpora use the Universal Dependencies v.2 tagset. They are (re-)split following the PARSEME v.1.2 standard, which puts emphasis on unseen VMWEs. With the current iteration, the corpus release process has been detached from shared tasks; instead, a process for continuous improvement and systematic releases has been introduced.

