Presentation at “Balisage: The Markup Conference 2021”

Encoding meaningful semantic relationships in literary texts is almost as difficult as defining and identifying them. Defining the types and components of the semantic relationships that can be extracted from literary texts is challenging because literature is full of implicit and oblique messages and references. Identifying and encoding those relationships is harder still, because relations often have neither a clear nor a standard linguistic form, and they frequently overlap. This paper discusses modeling and encoding issues in mapping relationships of cultural content in literary and humanities texts, as highlighted by the annotation campaign of the ECARLE project. To address these issues, the paper proposes a methodology that combines minimalistic and flexible annotation techniques to generate human-annotated training data for a Relation Extraction machine learning system. The proposed methodology uses the available TEI tagset and, without any further customization, allows relations formed by named entities to be mapped in a simple yet flexible way that is open to reuse, interchange, conversion and visualization.


Exploitation of Cultural Assets with computer-assisted Recognition, Labeling and meta-data Enrichment.

