
Digital Humanities and the Study of Mediterranean Mobilities Workshop

On January 18 and 19, 2024, the University of California, Berkeley hosted a workshop titled "Digital Humanities and the Study of Mediterranean Mobilities." The event brought together scholars from several disciplines, including historians, modelers, and data scientists, who shared their latest research in a collaborative environment.

I had the privilege of speaking at the workshop! I presented my exploration of historical data, using data analysis and AI image generation to depict population characteristics and movement patterns in the late Ottoman Empire. As a PhD student, this opportunity greatly expanded my academic perspective in two ways. First, it allowed me to network with established researchers across different fields, opening doors to potential collaborations and knowledge sharing. Second, it deepened my appreciation for the value of interdisciplinary work, particularly between historians and computer and data scientists.

A key discussion point at the workshop was the challenge historians face in processing and storing data transliterated from Ottoman Turkish into modern Turkish. Transliteration converts text between writing systems while preserving the original pronunciation [1]; beyond rendering individual names, concepts, or events, the resulting data must also be categorized and connected in an ontology [2][3]. Further hurdles arise in data analysis: historians may lack methods to extract information from their data, while data scientists may not understand Turkish, risking data loss or misinterpretation in translation.
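To see why this is hard, consider what a naive character-by-character transliteration would do. The sketch below is a toy illustration, not a real transliteration scheme: the mapping covers only a handful of letters, and real Ottoman-to-modern-Turkish transliteration is context-dependent, in part because short vowels are usually not written in the Arabic script.

```python
# Toy sketch: naive character-level transliteration from Ottoman Turkish
# (Arabic script) to Latin letters. The mapping is a tiny illustrative
# subset; it cannot recover unwritten vowels or sound changes.
NAIVE_MAP = {
    "ك": "k", "ت": "t", "ا": "a", "ب": "b", "س": "s", "ن": "n",
}

def naive_transliterate(word: str) -> str:
    # Characters without a known mapping are flagged with "?"
    return "".join(NAIVE_MAP.get(ch, "?") for ch in word)

# The word for "book": a naive mapping yields "ktab",
# while the modern Turkish form is "kitap".
print(naive_transliterate("كتاب"))  # "ktab"
```

The gap between "ktab" and "kitap" is exactly the kind of detail that is lost when the conversion is done without linguistic context.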

The workshop highlighted Wikidata as a useful tool for connecting and archiving data. As a collaborative database, Wikidata organizes information systematically, assigning a unique identifier (QID number) to each entity and detailing its characteristics. This allows for easy access and analysis by both humans and machines, supporting Wikipedia with linked data. FactGrid, another tool presented, shares Wikidata's structured approach but focuses on historical data, helping researchers collect, organize, and analyze details about people, institutions, and past events. FactGrid's graph-based organization enables users to uncover historical connections, although its integration with broader projects may be limited compared to Wikidata. One particularly exciting initiative is the "FactGrid Cuneiform" project, which aims to develop an ontology for all languages written in cuneiform, one of the oldest writing systems, originating in ancient Mesopotamia. By integrating interactive web applications and 3D virtual environments, this project seeks to make historical data more accessible and to link key databases containing cuneiform sources.
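The QID-and-property model that Wikidata and FactGrid use can be pictured as a set of subject-property-value triples. The Python sketch below is only an in-memory illustration, not the actual Wikidata query service: Q42 ("Douglas Adams"), P31 ("instance of"), Q5 ("human"), and P19 ("place of birth") are real Wikidata identifiers, while `Q_EX1`, `Q_EX2`, and the `query()` helper are hypothetical placeholders.

```python
# Sketch of Wikidata-style statements as (subject, property, value) triples.
# Q42/P31/Q5/P19 are real Wikidata identifiers; Q_EX1 and Q_EX2 stand in
# for a hypothetical historical person and place.
triples = [
    ("Q42", "P31", "Q5"),       # Douglas Adams -> instance of -> human
    ("Q_EX1", "P31", "Q5"),     # placeholder historical person -> human
    ("Q_EX1", "P19", "Q_EX2"),  # ... -> place of birth -> placeholder place
]

def query(triples, subject=None, prop=None):
    """Return all triples matching the given subject and/or property."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (prop is None or t[1] == prop)]

# All entities recorded as human:
print(query(triples, prop="P31"))
```

Because every statement follows the same triple shape, the same machinery serves both a pop-culture entity and an archival record, which is what makes these databases machine-readable at scale.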

As data archiving advances, some questions remain: How should this wealth of historical data be analyzed? And how can language barriers be navigated? My approach involved translating some of the data (via Google Translate) and conducting statistical analysis to identify commonalities in names, occupations, diseases, and migration trends. The analysis relied on frequency analysis, box plots, correlation matrices, and bivariate tables. I also used AI-generated images, based on people's physical descriptions in the data, to bring past societies to life visually. The analysis prompted reflection not only on translation challenges but also on the potential benefits these approaches bring to the analysis of historical data.
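The frequency analysis and bivariate tables mentioned above can be sketched in a few lines of Python. The records here are invented placeholders, not actual archival data, and the field names are hypothetical.

```python
from collections import Counter

# Illustrative records standing in for translated archival entries.
records = [
    {"name": "Mehmed", "occupation": "farmer",   "origin": "Anatolia"},
    {"name": "Ahmed",  "occupation": "merchant", "origin": "Rumelia"},
    {"name": "Mehmed", "occupation": "farmer",   "origin": "Anatolia"},
]

# Frequency analysis: how often does each occupation appear?
occupation_counts = Counter(r["occupation"] for r in records)

# Bivariate table: joint counts of occupation and region of origin.
cross_table = Counter((r["occupation"], r["origin"]) for r in records)

print(occupation_counts.most_common(1))  # [('farmer', 2)]
```

On a real dataset the same counts would feed the box plots and correlation matrices, with the usual caveat that translation errors upstream propagate into every downstream statistic.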

Historical research continues to rely on interdisciplinary collaboration to advance data analysis and archiving. The quality of the information deposited in the databases mentioned above still needs to be assessed, especially for languages and contexts not yet explored. Moreover, while exploring ancient writing systems and their contexts, the burgeoning field of AI offers new methodologies for data analysis. It is incumbent on data and computer scientists to seek engagement across disciplines, ensuring our skills remain pertinent and beneficial for other fields. Just as importantly, we can benefit from the perspectives of other disciplines on how they see data and what information is relevant to their practices. These considerations may lead to new solutions and creative ways of thinking about data.


[1] Bilac, S., & Tanaka, H. (2005, February). Direct combination of spelling and pronunciation information for robust back-transliteration. In International Conference on Intelligent Text Processing and Computational Linguistics (pp. 413-424). Berlin, Heidelberg: Springer.

[2] Lesmo, L., Mazzei, A., & Radicioni, D. P. (2011). An ontology based architecture for translation. In Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011).

[3] de Azevedo, R. R., Freitas, F., Rocha, R. G., Figueredo, D. S., de Almeida, S. C., & e Silva, G. D. F. P. (2013). Translating Natural Language into Ontology. In SBBD (Short Papers) (pp. 14-1).
