Automatic Identification of Domain Terms: An Approach for Italian


  • Maria Teresa Artese IMATI – CNR, Via Bassini 15, 20133, Milan, Italy
  • Isabella Gagliardi IMATI – CNR, Via Bassini 15, 20133, Milan, Italy



Classification Methods, Word Embedding Models, Probability, Food, Italian Language


Building a domain-specific thesaurus fully automatically is a timely problem. This paper presents a novel method for addressing it in the Italian language. The main feature of the approach is the integration of several methods: machine learning classifiers that operate on the semantic representation of candidate terms, word embedding models that capture the semantics of words, and a computation of each term's degree of specialization. The work is in progress, and the results obtained so far are promising.
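The abstract does not say how the degree of specialization is computed. As a rough illustration only, one common proxy is the ratio of a term's relative frequency in a domain corpus to its relative frequency in a general corpus. The sketch below assumes that proxy; the function name `specialization`, the `smoothing` parameter, and the toy corpora are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def specialization(term, domain_tokens, general_tokens, smoothing=1.0):
    """Assumed proxy, not the paper's formula: smoothed relative frequency
    of `term` in the domain corpus divided by its smoothed relative
    frequency in the general corpus. Values above 1 suggest a domain term."""
    domain_counts = Counter(domain_tokens)
    general_counts = Counter(general_tokens)
    # Additive smoothing keeps the ratio finite when the term is absent
    # from one of the corpora.
    domain_rate = (domain_counts[term] + smoothing) / (len(domain_tokens) + smoothing)
    general_rate = (general_counts[term] + smoothing) / (len(general_tokens) + smoothing)
    return domain_rate / general_rate


# Toy, made-up corpora (a food sentence and a general sentence in Italian).
domain = "il risotto con lo zafferano e il brodo".split()
general = "il treno parte e il cane corre nel parco".split()

for word in ("risotto", "il"):
    print(word, round(specialization(word, domain, general), 3))
```

On these toy corpora the food word "risotto" scores higher than the function word "il", which occurs at a similar rate in both. In a pipeline like the one described, such a score could be one signal alongside the embedding-based classifier, though how the paper combines them is not stated here.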


Artese, M. T., & Gagliardi, I. (2020). Automatic Identification of Domain Terms: An Approach for Italian. Digital Presentation and Preservation of Cultural and Scientific Heritage, 10, 251–258.

