From Translation to Metadata Repair: Archival Visibility of Finno-Ugric Heritage in a Multilingual Knowledge Graph

Translation, multilingual metadata, and Wikibase-based knowledge infrastructures for reconnecting dispersed cultural heritage collections

Abstract

The presentation examines archival visibility, translation, and metadata repair through examples drawn from Estonian, Hungarian, and Finnish Finno-Ugric collections. Many Mari, Seto, Võro, Udmurt, and Moldavian Csángó materials entered museum and archival collections more than a century ago and were catalogued with inconsistent linguistic standards, phonetic transcriptions, and monolingual practices that are often difficult to interpret today, even for specialists and for the source communities themselves. Using the Finno-Ugric Data Sharing Space (FUDSS), the presentation demonstrates how multilingual Wikibase-based infrastructures, controlled vocabularies, lexicons, and community-driven metadata enrichment can reconnect fragmented archival materials with living cultural communities. It discusses how translation becomes a form of metadata repair and semantic reconciliation, making collections more understandable, searchable, and reusable across institutional and linguistic boundaries. It also shows how dispersed research legacies, for example those of János Jankó or Aladár Bán across Hungarian, Estonian, and Finnish collections, can be virtually reunited through interoperable knowledge graph infrastructures and multilingual metadata harmonisation. Particular emphasis is placed on the role of machine-readable vocabularies, thesauri, and linked identifiers in supporting both scholarly research and community-led interpretation of cultural heritage.

Date
May 27, 2026, 2:00 PM to 6:00 PM
Location
ELTE BTK, Building “A”, Dékáni Kistanácsterem (Dean's Small Council Room), 1st floor, room 144
Múzeum krt. 4/A, Budapest, H-1088

This presentation explores how multilingual knowledge graphs, translation, and metadata repair can reconnect dispersed Finno-Ugric cultural heritage collections with the communities to whom they belong.

Using the Finno-Ugric Data Sharing Space, the talk demonstrates how Wikibase-based infrastructures can improve archival visibility, semantic interoperability, and community participation across ethnographic and linguistic collections.

Key Ideas

  • Translation as archival infrastructure: translation is not only linguistic conversion but also a form of metadata repair and cultural interpretation.
  • Metadata enrichment: multilingual thesauri, controlled vocabularies, and linked identifiers make fragmented collections searchable and reusable.
  • Community participation: endangered language communities can help interpret, validate, and correct historical archival descriptions.
  • Knowledge graph federation: dispersed collections across museums, libraries, and archives can be virtually reunited through interoperable infrastructures.
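
The metadata enrichment idea above can be illustrated with a minimal Python sketch. It assumes vocabulary items stored in the general shape of Wikibase entity JSON (an id plus per-language labels and aliases). The identifiers, the sample labels, and the helper names normalise, build_index, and lookup are hypothetical and are not taken from FUDSS; the sample terms are illustrative only and would need checking against the actual vocabulary.

```python
import unicodedata
from collections import defaultdict

# Hypothetical vocabulary items in the shape of Wikibase entity JSON.
# The ids are placeholders, not real FUDSS identifiers.
ITEMS = [
    {
        "id": "Q-EXAMPLE-1",
        "labels": {
            "en": {"language": "en", "value": "Mari"},
            "hu": {"language": "hu", "value": "mari"},
        },
        "aliases": {
            # Older exonyms that may appear in historical catalogue entries.
            "en": [{"language": "en", "value": "Cheremis"}],
            "hu": [{"language": "hu", "value": "cseremisz"}],
        },
    },
    {
        "id": "Q-EXAMPLE-2",
        "labels": {"en": {"language": "en", "value": "Võro"}},
        "aliases": {},
    },
]


def normalise(text):
    """Strip diacritics and casefold, so 'Võro' and 'voro' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def build_index(items):
    """Map every normalised label and alias, in any language, to item ids."""
    index = defaultdict(set)
    for item in items:
        for label in item.get("labels", {}).values():
            index[normalise(label["value"])].add(item["id"])
        for alias_list in item.get("aliases", {}).values():
            for alias in alias_list:
                index[normalise(alias["value"])].add(item["id"])
    return index


def lookup(index, term):
    """Return the sorted ids of all items matching the term."""
    return sorted(index.get(normalise(term), set()))


index = build_index(ITEMS)
print(lookup(index, "cseremisz"))
print(lookup(index, "voro"))
```

In this sketch a historical exonym and a modern label resolve to the same item, which is one way a multilingual vocabulary can make an old catalogue entry findable without rewriting the original description.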

Examples

  • The Finno-Ugric Data Sharing Space connects materials related to Mari, Seto, Võro, Udmurt, and Moldavian Csángó communities across multiple institutional collections.
  • Historical ethnographic collections often contain outdated transcription systems, inconsistent naming conventions, or descriptions inaccessible to present-day communities and researchers.
  • The presentation discusses how the dispersed legacy of researchers such as János Jankó and Aladár Bán can be reconnected across Hungarian, Estonian, and Finnish collections through multilingual metadata harmonisation.
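
As a rough illustration of how a dispersed legacy might be reconnected, the following Python sketch groups catalogue records from several holders under one shared person identifier. Everything in it is an assumption made for the example: the collection codes, record numbers, placeholder identifiers, and the reunite function are hypothetical, and the name variants only reflect the general fact that Hungarian catalogues tend to give the family name first.

```python
from collections import defaultdict

# Hypothetical reconciliation table: name strings as they might appear in
# catalogues, mapped to one placeholder identifier per person.
PERSON_IDS = {
    "János Jankó": "Q-EXAMPLE-JANKO",
    "Jankó János": "Q-EXAMPLE-JANKO",
    "Jankó, János": "Q-EXAMPLE-JANKO",
    "Aladár Bán": "Q-EXAMPLE-BAN",
    "Bán Aladár": "Q-EXAMPLE-BAN",
}

# Hypothetical records held by three different institutions.
RECORDS = [
    {"holder": "collection-hu", "record": "hu-001", "creator": "Jankó János"},
    {"holder": "collection-fi", "record": "fi-014", "creator": "János Jankó"},
    {"holder": "collection-ee", "record": "ee-203", "creator": "Bán Aladár"},
    {"holder": "collection-hu", "record": "hu-077", "creator": "Aladár Bán"},
    {"holder": "collection-ee", "record": "ee-350", "creator": "J. Jankó"},
]


def reunite(records, person_ids):
    """Group records by reconciled person id; collect unmatched records."""
    grouped = defaultdict(list)
    unresolved = []
    for rec in records:
        person_id = person_ids.get(rec["creator"].strip())
        if person_id is None:
            # No known variant matched: leave the record for human review.
            unresolved.append(rec)
        else:
            grouped[person_id].append((rec["holder"], rec["record"]))
    return dict(grouped), unresolved


grouped, unresolved = reunite(RECORDS, PERSON_IDS)
for person_id, holdings in grouped.items():
    print(person_id, holdings)
print("needs review:", [rec["record"] for rec in unresolved])
```

The sketch deliberately does not guess at abbreviated or ambiguous name forms. Records that match no known variant are set aside for review, which is where expert or community validation would come in.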

Next Steps

  • Expand multilingual lexicons and controlled vocabularies for Finno-Ugric cultural heritage.
  • Improve interoperability between Wikibase, Wikimedia Commons, and institutional catalogues.
  • Develop community-based workflows for metadata correction and annotation.
  • Support machine-readable archival descriptions that remain understandable to both experts and source communities.

Federating Open Knowledge through Wikibase: The Case of The Finno-Ugric Data Sharing Space

Daniel Antal
Data and AI entrepreneur working with cultural data, with a life-long passion for photography.