Abstract
Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on the training, testing, and application of a machine-learning classification model that uses inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Clauss-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees (ET) algorithm and employs 10,055 features derived from several attributes. It classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
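The paired figures in the abstract (a share of the test set and the accuracy reached on that share) can be read as a coverage/accuracy trade-off. The sketch below shows one common way to compute such pairs, assuming that each prediction comes with a confidence score and that only predictions at or above a threshold are counted. The abstract does not state that this is how the reported figures were obtained; the function name `coverage_and_accuracy`, the labels, and the scores are all hypothetical, and the toy data are made up for illustration only.

```python
from typing import List, Tuple

# Each record is (true_label, predicted_label, confidence).
Prediction = Tuple[str, str, float]


def coverage_and_accuracy(
    predictions: List[Prediction], threshold: float
) -> Tuple[float, float]:
    """Return (coverage, accuracy) for predictions at or above `threshold`.

    coverage: share of all records whose confidence passes the threshold
    accuracy: share of correct labels among those retained records
    """
    kept = [(true, pred) for true, pred, conf in predictions if conf >= threshold]
    if not kept:
        # Nothing passes the threshold (or the input is empty).
        return 0.0, 0.0
    coverage = len(kept) / len(predictions)
    accuracy = sum(true == pred for true, pred in kept) / len(kept)
    return coverage, accuracy


# Made-up toy data with placeholder labels (not taken from EDH or EDCS).
toy: List[Prediction] = [
    ("type_a", "type_a", 0.97),
    ("type_a", "type_a", 0.91),
    ("type_b", "type_b", 0.88),
    ("type_b", "type_a", 0.55),
    ("type_c", "type_c", 0.62),
    ("type_c", "type_b", 0.40),
]

for threshold in (0.0, 0.6, 0.9):
    cov, acc = coverage_and_accuracy(toy, threshold)
    print(f"threshold={threshold:.1f}  coverage={cov:.2f}  accuracy={acc:.2f}")
```

Raising the threshold in this toy example keeps fewer records but makes the retained labels more reliable, which is the general shape of the trade-off the abstract reports.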
Original language | English |
---|---|
Title of host publication | CHR 2021: Computational Humanities Research 2021 : Proceedings of the Conference on Computational Humanities Research 2021 Amsterdam, the Netherlands, November 17-19, 2021. |
Editors | M Ehrmann, F Karsdorp, M Wevers, T L Andrews, M Burghardt, M Kestemont, E Manjavacas, M Piotrowski, J van Zundert |
Number of pages | 13 |
Volume | 2989 |
Publication date | 22 Oct 2021 |
Pages | 123-135 |
Chapter | 12 |
Publication status | Published - 22 Oct 2021 |
Event | Computational Humanities Research 2021 - Duration: 17 Nov 2021 → 19 Nov 2021 |
Conference
Conference | Computational Humanities Research 2021 |
---|---|
Period | 17/11/2021 → 19/11/2021 |
Series | CEUR Workshop Proceedings |
---|---|
Volume | 2989 |
ISSN | 1613-0073 |
Keywords
- comparative analysis
- document classification
- epigraphy
- latin inscriptions
- machine learning
- roman empire
- type of inscriptions
Fingerprint
Dive into the research topics of 'Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach'. Together they form a unique fingerprint.
Projects
- 1 Finished
SDAM: Small data - Big Challenges: Synthetic study of complexity in the Balkans and Black Sea
Sobotkova, A. (PI), Hermankova, P. (Participant), Kase, V. (Participant) & Ostoic, A. R. (Participant)
01/07/2019 → 31/12/2023
Project: Research