The current ATMO-2 project runs from 15 August 2018 to 30 June 2022. (ATMO-1 ran from 1 January 2015 to 15 August 2018.) Both projects are funded in large part by the Henry Luce Foundation.
The projects support the digital scanning, publication, and analysis of selected manuscripts from the Jarring Collection, a body of manuscripts from Eastern Turkestan collected by Prof. (later Ambassador) Gunnar Jarring and donated by him to the Lund University Library (LUB).
All of the scanned manuscripts are or will be available online both from this ATMO site and from the Alvin portal, which now hosts the LUB collections, including the Jarring Collection. A Concordance of Jarring Collection ms. numbers with Alvin record numbers may be useful to those seeking a shortcut for navigating the Alvin portal.
Some newly scanned manuscripts are transcribed; in addition to the original Perso-Arabic script, transliterations into an extended Latin script are provided. Some transcriptions have English glosses; some of these are also annotated linguistically (segmented into morphemes and tagged with part of speech).
Automatically assigning parts of speech (such as verb or adjective) is generally done by training a machine learning model on previously annotated texts. The model 'learns' regularities from the already annotated texts. Generally, the more annotated texts we have to feed to the machine learner, the better the results. For ATMO-2, we faced two challenges: ATMO-1 had left us with only a few annotated texts, and we had created a very fine-grained set of parts of speech (with detailed morphological information), so the learner has many categories to distinguish and little data from which to learn them.
Our research shows that for such a case, a neural network produces the best results. The neural network is first trained on word classes only (e.g., verb or noun), and then retrained on the larger set of fine-grained parts of speech. More information can be found in the following paper:
Kenneth Steimel, Akbar Amat, Arienne Dwyer, and Sandra Kübler. 2020. Fine-Grained Morpho-Syntactic Analysis for the Under-Resourced Language Chaghatay. Proc. of the 19th International Workshop on Treebanks & Linguistic Theories, Hamburg (Published October 2020).
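The two-step idea described above (train on word classes first, then continue on the fine-grained tags) can be illustrated with a minimal sketch. This is not the architecture from the paper: it uses a single-layer softmax classifier written in plain Python, and the training examples, the suffix features, and the "CLASS.detail" tag names are all invented for illustration. The only point it shows is the warm start, in which each fine-grained tag begins from the weights already learned for its word class.

```python
import math

# Invented toy data: (features of a word form, fine-grained tag).
# The suffix features and "CLASS.detail" tag names are hypothetical,
# not drawn from the ATMO corpus or tag set.
DATA = [
    (["suf=di"], "V.PST"),
    (["suf=sa"], "V.COND"),
    (["suf=lar"], "N.PL"),
    (["suf=ni"], "N.ACC"),
]

def coarse(tag):
    """Word class of a fine-grained tag, e.g. 'V.PST' -> 'V'."""
    return tag.split(".")[0]

def scores(weights, feats):
    """Sum the weights of the active features for every label."""
    return {label: sum(w.get(f, 0.0) for f in feats)
            for label, w in weights.items()}

def predict(weights, feats):
    s = scores(weights, feats)
    return max(s, key=s.get)

def train(weights, data, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the softmax log-likelihood (in place)."""
    for _ in range(epochs):
        for feats, gold in data:
            s = scores(weights, feats)
            top = max(s.values())
            exps = {label: math.exp(v - top) for label, v in s.items()}
            total = sum(exps.values())
            for label, w in weights.items():
                grad = (1.0 if label == gold else 0.0) - exps[label] / total
                for f in feats:
                    w[f] = w.get(f, 0.0) + lr * grad
    return weights

# Stage 1: train on word classes only.
coarse_data = [(feats, coarse(tag)) for feats, tag in DATA]
coarse_w = train({c: {} for c in ("N", "V")}, coarse_data)

# Stage 2: every fine-grained tag starts from a copy of its word class's
# weights, then training continues on the fine-grained tags.
fine_tags = sorted({tag for _, tag in DATA})
fine_w = {tag: dict(coarse_w[coarse(tag)]) for tag in fine_tags}
warm_start_guess = predict(fine_w, ["suf=di"])  # before any fine-grained training
train(fine_w, DATA)

for feats, tag in DATA:
    print(feats, "->", predict(fine_w, feats), "(gold: " + tag + ")")
```

Before the second stage runs, the warm-started model can already separate verbs from nouns but cannot yet choose among the verb tags; the second stage only has to learn the distinctions inside each word class.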
In addition to transcribing and analyzing manuscripts to understand the transmission of medical ideas across Eurasia, we use network analysis to reveal connections in the texts that might otherwise be invisible. Network analysis helps us to visualize and understand the totality of the concepts for ailments and cures in the manuscripts. Among other things, it measures the centrality and connectedness of elements.
We began by analyzing the ailments and ingredients from the 169 medical formulae found in the digital edition of Prov. 351, Handbook of Medicine. Among many other elements, the most central and frequent "ailments" were catching colds and sexual stamina; of ingredients, the most central were sugar, honey, and rose water.
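As a rough illustration of what "central" means here, the sketch below links each ailment to the ingredients prescribed for it and computes degree centrality, that is, the share of the other nodes to which a node is directly connected. The three formulae are invented for the example and are not taken from Prov. 351; the project's actual measures and tooling may differ, and the function names are hypothetical.

```python
from collections import defaultdict

# Invented formulae for illustration only; NOT taken from Prov. 351.
FORMULAE = [
    {"ailment": "cold", "ingredients": ["honey", "sugar"]},
    {"ailment": "cold", "ingredients": ["honey", "rose water"]},
    {"ailment": "cough", "ingredients": ["honey"]},
]

def build_graph(formulae):
    """Undirected two-mode graph: ailments on one side, ingredients on the other."""
    graph = defaultdict(set)
    for formula in formulae:
        ailment = ("ailment", formula["ailment"])
        for name in formula["ingredients"]:
            ingredient = ("ingredient", name)
            graph[ailment].add(ingredient)
            graph[ingredient].add(ailment)
    return graph

def degree_centrality(graph):
    """Number of neighbours divided by the number of other nodes."""
    others = len(graph) - 1
    return {node: len(neighbours) / others
            for node, neighbours in graph.items()}

graph = build_graph(FORMULAE)
centrality = degree_centrality(graph)
for node, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(node, round(value, 2))
```

In this toy graph, "cold" is the most central ailment because it is linked to three of the four other nodes, and "honey" is the most central ingredient because it appears in remedies for both ailments.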
During 2019 and 2020, both Rydberg-Cox and Dwyer (separately and together) presented papers on the preliminary discoveries revealed by network analysis.
Currently, the project is applying lessons from this preliminary work to other medical manuscripts, as well as to the genealogical scroll (Prov. 561) that is the focus manuscript of the current ATMO-2 project. The latter manuscript is a 2D representation of a genealogical network, tracing the purported ancestry of the presumed patron of the scroll and his contemporaries back to Adam and Eve, with a fascinating mixture of religious, mythological, and historical figures in between. Virtually all named individuals are linked by connecting horizontal and vertical "kinship" lines. Representing this information as a network that algorithms can process will be our next challenge.
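One possible starting point, offered only as a sketch, is to record each drawn line as an edge between two figures and to keep the line's orientation as an attribute, leaving its interpretation open. Apart from Adam and Eve, the names below are placeholders, the orientation values are assigned arbitrarily, and the function names are hypothetical; none of this reflects the actual content of Prov. 561.

```python
from collections import deque

# Placeholder records: (figure_a, figure_b, orientation of the drawn line).
# Names other than Adam and Eve are stand-ins, and the orientations are
# assigned arbitrarily for the example.
LINES = [
    ("Adam", "Eve", "horizontal"),
    ("Adam", "figure_1", "vertical"),
    ("figure_1", "figure_2", "vertical"),
    ("figure_2", "patron", "vertical"),
    ("figure_2", "contemporary", "vertical"),
]
FIGURES = ["Adam", "Eve", "figure_1", "figure_2",
           "patron", "contemporary", "unlinked_figure"]

def build_network(figures, lines):
    """Map each figure to its linked figures and the orientation of each line."""
    network = {name: {} for name in figures}
    for a, b, orientation in lines:
        network[a][b] = orientation
        network[b][a] = orientation
    return network

def linked_figures(network, start):
    """Breadth-first search: every figure reachable from `start` along any line."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in network[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

network = build_network(FIGURES, LINES)
print(sorted(linked_figures(network, "Adam")))
print(sorted(linked_figures(network, "unlinked_figure")))
```

A structure like this makes it straightforward to check which named individuals are in fact connected to the main network and which stand apart, before any decision is made about what the two kinds of line mean.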
Gunnar Jarring was trained as a Turcologist and spent a year (1929-1930) in the Western Tarim Basin in (what was then) Eastern Turkestan; during this time he acquired a number of manuscripts for himself and on behalf of the Lund University Library. Many of the manuscripts in the Jarring Collection had originally been acquired by Swedish missionaries active in the area and by the linguist Gustav Raquette.
Later, after a long and distinguished career in the Swedish foreign service and at the United Nations, Ambassador Jarring donated his collection of manuscripts to the Lund University Library. The collection has since been augmented with further acquisitions.
It is from this collection that the manuscripts to be scanned, transcribed, or annotated in these projects are drawn.
Credit: Painting of Mahmud al-Kashgari © 1981 by Ghazi Emet.