Linguistic Corpora as International Cultural Heritage: The Corpus of Bulgarian and Ukrainian Parallel Texts


  • Olena Siruk, Taras Shevchenko National University of Kyiv, Ukraine
  • Ivan Derzhanski, Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Sofia, Bulgaria



Text Corpus, Corpus Linguistics, Parallel Texts, Translation Equivalents, Cultural Heritage


The paper reports on our ongoing work on the creation of a corpus of Bulgarian and Ukrainian parallel texts. We discuss some differences in approach and in the interpretation of certain concepts, as well as various problems associated with the construction of our corpus, in particular the occasional ‘nonparallelism’ of original and translated texts. We give examples of the application of the parallel corpus to the study of lexical semantics and note the outstanding role of the corpus in the lexicographic description of Ukrainian and Bulgarian translation equivalents. We draw attention to the importance of creating parallel corpora as objects of national as well as global cultural heritage.


How to Cite

Siruk, O., & Derzhanski, I. (2013). Linguistic Corpora as International Cultural Heritage: The Corpus of Bulgarian and Ukrainian Parallel Texts. Digital Presentation and Preservation of Cultural and Scientific Heritage, 3, 91–98.