Tonio Sebastian Richter, Ralph Birk, Konstantin Fackeldey, Marcus Weber
N. Alexia Raharinirina, Sarah Michelle Klasse
01.01.2021 − 31.12.2022
The linguistic abilities of scribes writing in Egyptien de Tradition have been severely contested. The underlying assumptions about the evolutionary process of languages are highly questionable. In this project, quantitative and novel non-quantitative approaches will be developed to assess scribal activities in the 1st millennium BCE. Our approach aims to achieve a meaningful clustering of the different texts belonging to Egyptien de Tradition. The results will be evaluated against the historical context in which the texts were created. We begin by using classical methods such as correspondence analysis to pinpoint the differences in grammatical aspects of different texts. This method clusters the texts according to their grammatical constructions. We will then explore other clustering methods belonging to the general family of spectral clustering. We will also develop a novel clustering approach based on the idea that the operations of a Boolean ring are highly suitable for text analysis. For example, clustering two texts A and B into two sets representing their commonalities and their differences is equivalent to computing the product and the sum of A and B in a suitable Boolean ring.
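The Boolean ring idea can be illustrated with a minimal sketch. It assumes the most familiar Boolean ring, the power set of a feature set, where the product is set intersection and addition is symmetric difference. The class name `Text` and the feature labels are hypothetical and chosen only for illustration; the ring actually used in the project may be constructed differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """A text modelled as a set of features (assumed representation)."""
    features: frozenset

    def __mul__(self, other):
        # Ring product = intersection: features shared by both texts.
        return Text(self.features & other.features)

    def __add__(self, other):
        # Ring addition = symmetric difference: features found in exactly one text.
        return Text(self.features ^ other.features)


# Hypothetical feature labels, invented for this example only.
A = Text(frozenset({"form_1", "form_2", "form_3"}))
B = Text(frozenset({"form_2", "form_3", "form_4"}))

common = A * B     # commonalities of A and B
different = A + B  # differences between A and B

print(sorted(common.features))     # ['form_2', 'form_3']
print(sorted(different.features))  # ['form_1', 'form_4']

# Defining identities of a Boolean ring: x * x = x and x + x = 0.
assert A * A == A
assert A + A == Text(frozenset())
```

In this representation every element is idempotent under the product and its own additive inverse, which is what makes the structure a Boolean ring rather than an ordinary ring.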
In our preliminary work, we studied the correspondence between grammatical forms and different texts of Ancient Egyptian. A talk presenting our results was given by R. Birk and can be found here
At the start of the project, we presented our ideas to members of the BBAW. Here is part of our presentation, which includes the idea of using Boolean rings for comparative text analysis.
From project members (also preliminary work):
Marcus Weber and Konstantin Fackeldey. The Complexity of Comparative Text Analysis — “The Gardener is always the Murderer” says the Fourth Machine (2020). arXiv:2012.07637 [cs.CL]
Confidential list of authors (including M. Weber): DIN SPEC 2343: Übertragung von sprachbasierten Daten zwischen Künstlichen Intelligenzen — Festlegung von Parametern und Formaten. Beuth Verlag GmbH, 09/2020.
Michael Greenacre and Trevor Hastie. The Geometric Interpretation of Correspondence Analysis. Journal of the American Statistical Association, Vol. 82, No. 398, pp. 437–447 (1987)
von Luxburg, U. A tutorial on spectral clustering. Stat Comput 17, 395–416 (2007). https://doi.org/10.1007/s11222-007-9033-z
A possible realization of “four” machines to train for comparative text analysis