Latin lexical semantic annotation

dataset

posted on 2021-11-12, 16:23 authored by Barbara McGillivray

This dataset is a collection of lexical semantic annotations of the corpus occurrences of 40 Latin lemmas. The corpus instances are taken from LatinISE, and the annotation process is described in Schlechtweg et al. (2020, 2021).

The annotation was coordinated by Barbara McGillivray, and done by Annie Burman, Daria Kondakova, Francesca Dell'Oro, Helena Bermudez Sabel, Hugo Burgess, Paola Marongiu, and Rozalia Dobos. The pre-annotation was coordinated and designed by Barbara McGillivray and done by Manuel Márquez Cruz.

References

McGillivray, B. and Kilgarriff, A. (2013). Tools for historical corpus research, and a corpus of Latin. In Paul Bennett, Martin Durrell, Silke Scheible, Richard J. Whitt (eds.), New Methods in Historical Corpus Linguistics. Tübingen: Narr.

McGillivray, B., Schlechtweg, D., Dubossarsky, H., Tahmasebi, N., Hengchen, S. (2021). DWUG LA: Diachronic Word Usage Graphs for Latin [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5255228

Schlechtweg, D., McGillivray, B., Hengchen, S., Dubossarsky, H., Tahmasebi, N. (2020). SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020. International Committee for Computational Linguistics. DOI: 10.18653/v1/2020.semeval-1.1

Schlechtweg, D., Tahmasebi, N., Hengchen, S., Dubossarsky, H., McGillivray, B. (2021). DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages. In Proceedings of EMNLP 2021.