The Role of Language Technologies in Digital Humanities (The Case of Parliamentary Debates)


  • Petya Osenova, Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, 2 Georgi Bonchev Str., Sofia 1113, Bulgaria



Parliamentary Debates, ParlaMint, Comparable Corpora, Language Technology, Digital Humanities


The paper focuses on parliamentary debates as a use case within Digital Humanities. First, it outlines the ParlaMint project, a flagship initiative of the CLARIN ERIC infrastructure. The project makes content from national and regional parliaments visible, comparable, and accessible for policy making and research. The paper then reviews the approaches applied in creating 31 corpora from national and regional parliaments. Finally, it discusses the utility of the resulting multilingual resource.






How to Cite

Osenova, P. (2023). The Role of Language Technologies in Digital Humanities (The Case of Parliamentary Debates). Digital Presentation and Preservation of Cultural and Scientific Heritage, 13, 61–68.