Classifier identification in Ancient Egyptian as a low-resource sequence-labelling task

Dmitry Nikolaev, Jorke Grotenhuis, Haleli Harel, Orly Goldwasser

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

The complex Ancient Egyptian (AE) writing system was characterised by widespread use of graphemic classifiers (determinatives): silent (unpronounced) hieroglyphic signs clarifying the meaning or indicating the pronunciation of the host word. The study of classifiers has intensified in recent years with the launch and quick growth of the iClassifier project, a web-based platform for annotation and analysis of classifiers in ancient and modern languages. Thanks to the data contributed by the project participants, it is now possible to formulate the identification of classifiers in AE texts as an NLP task. In this paper, we make first steps towards solving this task by implementing a series of sequence-labelling neural models, which achieve promising performance despite the modest amount of training data. We discuss tokenisation and operationalisation issues arising from tackling AE texts and contrast our approach with frequency-based baselines.

Original language: English
Title of host publication: ML4AL 2024 - 1st Workshop on Machine Learning for Ancient Languages, Proceedings of the Workshop
Editors: John Pavlopoulos, Thea Sommerschield, Yannis Assael, Shai Gordin, Kyunghyun Cho, Marco Passarotti, Rachele Sprugnoli, Yudong Liu, Bin Li, Adam Anderson
Publisher: Association for Computational Linguistics (ACL)
Pages: 42-47
Number of pages: 6
ISBN (Electronic): 9798891761445
State: Published - 2024
Event: 1st Workshop on Machine Learning for Ancient Languages, ML4AL 2024 - Hybrid, Bangkok, Thailand
Duration: 15 Aug 2024 → …

Publication series

Name: ML4AL 2024 - 1st Workshop on Machine Learning for Ancient Languages, Proceedings of the Workshop

Conference

Conference: 1st Workshop on Machine Learning for Ancient Languages, ML4AL 2024
Country/Territory: Thailand
City: Hybrid, Bangkok
Period: 15/08/24 → …

Bibliographical note

Publisher Copyright:
© 2024 Association for Computational Linguistics.
