Consumer Eroski parallel corpus

Asier Alcázar

doi:10.1387/asju.3874

Published 2007-04-13

Abstract

This paper introduces the Consumer Eroski Parallel Corpus, a collection of articles originally written in Spanish and later translated into three other languages spoken in Spain: Basque, Catalan and Galician. The articles have been aligned automatically at the sentence level across the four languages using Moore's (2002) bilingual sentence alignment tool. The Spanish section is also annotated morphosyntactically for parts of speech using SVMTool (Giménez and Márquez 2004). The Basque, Catalan and Galician sections may be annotated in a future release in collaboration with computational linguistics groups in Spain. To my knowledge, the Consumer Eroski Parallel Corpus is the first resource to encompass a substantial body of parallel text in these four languages. I would like to thank the Eroski Foundation for granting permission to share the corpus in the public domain. Making this resource public will provide additional opportunities to train, test and develop natural language processing tools in the computational linguistics community. It may also serve translators as a reference. With the addition of an advanced search interface, currently under development, the corpus may also be consulted by Basque and Romance linguists interested in cross-linguistic research.

How to cite

Alcázar, Asier. 2007. «Consumer Eroski Parallel Corpus». Anuario Del Seminario De Filología Vasca "Julio De Urquijo" 41 (2):1-10. https://doi.org/10.1387/asju.3874.

Issue

Vol. 41 No. 2 (2007): Proceedings of BIDE 2005

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.
