Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach

Vojtech Kase*, Petra Heřmánková, Adéla Sobotková

*Corresponding author for this work

Research output: Contribution to book/anthology/report/proceeding > Article in proceedings > Research > peer-review

Abstract

Large-scale synthetic research in ancient history is often hindered by the incompatibility of the taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions of the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Clauss-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features derived from several attributes. It classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
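The paired figures in the abstract (two thirds of the test set at 98% accuracy, 85% of it at 95%) describe a trade-off between how many records are labeled and how reliably. The abstract does not say how those subsets were selected; the sketch below assumes, purely for illustration, that predictions are kept only when the model's confidence (for example, the highest class probability returned by scikit-learn's `ExtraTreesClassifier.predict_proba`) reaches a threshold. The function name `accuracy_at_threshold`, the toy confidences and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def accuracy_at_threshold(
    confidences: Sequence[float],
    predicted: Sequence[str],
    actual: Sequence[str],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) for predictions with confidence >= threshold.

    coverage: share of all records whose prediction is kept.
    accuracy: share of kept predictions that match the true label.
    """
    if not (len(confidences) == len(predicted) == len(actual)):
        raise ValueError("inputs must have the same length")
    # Keep only (prediction, truth) pairs the model is confident about.
    kept = [(p, a) for c, p, a in zip(confidences, predicted, actual) if c >= threshold]
    coverage = len(kept) / len(actual) if actual else 0.0
    accuracy = sum(p == a for p, a in kept) / len(kept) if kept else 0.0
    return coverage, accuracy


# Toy, made-up values (assumption: one confidence score per test record).
confidences = [0.95, 0.90, 0.60, 0.55, 0.40, 0.85]
predicted = ["epitaph", "votive", "epitaph", "votive", "epitaph", "epitaph"]
actual = ["epitaph", "votive", "votive", "votive", "votive", "epitaph"]

for t in (0.0, 0.5, 0.8):
    cov, acc = accuracy_at_threshold(confidences, predicted, actual, t)
    print(f"threshold={t:.1f}  coverage={cov:.2f}  accuracy={acc:.2f}")
```

Raising the threshold shrinks the labeled share of the data while the accuracy on that share rises, which is the pattern the two reported pairs of figures follow.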
Original language: English
Title of host publication: CHR 2021: Computational Humanities Research 2021: Proceedings of the Conference on Computational Humanities Research 2021, Amsterdam, the Netherlands, November 17-19, 2021
Editors: M Ehrmann, F Karsdorp, M Wevers, T L Andrews, M Burghardt, M Kestemont, E Manjavacas, M Piotrowski, J van Zundert
Number of pages: 13
Volume: 2989
Publication date: 22 Oct 2021
Pages: 123-135
Chapter: 12
Publication status: Published - 22 Oct 2021
Event: Computational Humanities Research 2021
Duration: 17 Nov 2021 to 19 Nov 2021

Conference

Conference: Computational Humanities Research 2021
Period: 17/11/2021 to 19/11/2021
Series: CEUR Workshop Proceedings
Volume: 2989
ISSN: 1613-0073

Keywords

  • comparative analysis
  • document classification
  • epigraphy
  • Latin inscriptions
  • machine learning
  • Roman Empire
  • type of inscriptions
