By Edward Anderson, data engineer, and Chris Dijkshoorn, head of Collection IT in the Research Services Department at the Rijksmuseum.

This month we took part in the European Library Automation Group’s (ELAG) conference in Riga. After two long years of Zooms and chatrooms, it was a treat to take up the opportunity of in-person collaboration and knowledge sharing with our peers. And with this year’s programme showcasing multiple knowledge graph projects, it was a perfect context for explaining and demonstrating our current work on our bibliographic and collections data. Here’s a short summary of our session “The Rijksmuseum Integration Layer: a Linked Data Infrastructure for Sustainable Data Services”.

We presented an overview of the organizational and technical issues we have encountered while implementing a new Linked Data infrastructure. What have been the major challenges in integrating data from multiple domains? How are we supporting the rich and sustainable data services our users need? And with what sort of technology stack?

The Rijksmuseum is currently building a new platform to serve users with interconnected data about its collection, library and archive: data describing over a million artworks, objects, books and documents charting Dutch art and history. Our ELAG 2022 presentation introduced the Rijksmuseum Integration Layer, its architecture and implementation, and set out to explain how Linked Data principles are enabling us to bridge systems, integrate metadata and deliver client-focused data services.

The Integration Layer is inspired by microservice architectures and the Semantic Web. Linked Data is the key to its design, with identifiers and ontologies providing a foundational shared data structure. At the core of the platform is a knowledge graph fed by extract-transform-load pipelines that produce and abstract standards-compliant data models. This data layer is realized by a constellation of loosely coupled, containerized Python applications communicating asynchronously over message queues. It is deployed with Kubernetes in the Azure Cloud.
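As a rough illustration of the pattern described above (not the Rijksmuseum's actual code), here is a minimal sketch of three loosely coupled stages passing messages asynchronously over queues. Everything in it is an assumption made for the example: the in-process `asyncio.Queue` stands in for a real message broker, the stage names `extract`, `transform` and `load` are hypothetical, and the records are made-up sample data.

```python
import asyncio

# Assumption: None marks the end of the message stream. A real broker
# would use its own acknowledgement and shutdown mechanisms.
SENTINEL = None


async def extract(out_queue: asyncio.Queue) -> None:
    """Publish raw source records, then signal that no more will follow."""
    records = [  # made-up sample data
        {"id": "obj-1", "title": "  Sample title A "},
        {"id": "obj-2", "title": "Sample title B"},
    ]
    for record in records:
        await out_queue.put(record)
    await out_queue.put(SENTINEL)


async def transform(in_queue: asyncio.Queue, out_queue: asyncio.Queue) -> None:
    """Consume raw records, normalize them, and publish the result."""
    while True:
        record = await in_queue.get()
        if record is SENTINEL:
            await out_queue.put(SENTINEL)  # pass the shutdown signal on
            break
        cleaned = {"id": record["id"], "title": record["title"].strip()}
        await out_queue.put(cleaned)


async def load(in_queue: asyncio.Queue, store: dict) -> None:
    """Consume normalized records and write them to a target store."""
    while True:
        record = await in_queue.get()
        if record is SENTINEL:
            break
        store[record["id"]] = record["title"]


async def run_pipeline() -> dict:
    """Wire the stages together; each one only knows about its queues."""
    raw, clean = asyncio.Queue(), asyncio.Queue()
    store = {}  # stand-in for the knowledge graph
    await asyncio.gather(extract(raw), transform(raw, clean), load(clean, store))
    return store


if __name__ == "__main__":
    print(asyncio.run(run_pipeline()))
```

Because each stage depends only on the messages it reads and writes, any one of them could in principle be replaced or scaled independently, which is the main appeal of this style of architecture.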

Our Integration Layer is a stack of simple software components underpinned by Web and cultural heritage sector standards. It’s standards all the way down: HTTP, XML, XSLT, RDF, SPARQL, MARC, CIDOC-CRM, RDA, EDM and Dublin Core all glued together with minimal application code. The platform isn’t fully built yet, but it is maturing into a stable and maintainable infrastructure for the future.
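To give a flavour of what gluing standards together with minimal application code can look like, the sketch below converts a small XML record into RDF N-Triples using Dublin Core terms and only the Python standard library. It is an illustration under stated assumptions, not our production mapping: the XML layout, the `example.org` base URI, the `FIELD_TO_PREDICATE` mapping and the sample values are all hypothetical.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
BASE_URI = "https://example.org/id/"  # hypothetical identifier namespace

# Hypothetical mapping from source XML element names to Dublin Core terms.
FIELD_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "creator": DCTERMS + "creator",
}

# Made-up sample record; the real source schemas are not shown in this post.
SAMPLE_XML = """<records>
  <record id="obj-1">
    <title>Sample title A</title>
    <creator>Sample maker</creator>
    <internal-note>not mapped</internal-note>
  </record>
</records>"""


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def xml_to_ntriples(xml_text: str) -> list:
    """Return one N-Triples line per mapped, non-empty field of each record."""
    triples = []
    root = ET.fromstring(xml_text)
    for record in root.iter("record"):
        subject = "<" + BASE_URI + record.attrib["id"] + ">"
        for field in record:
            predicate = FIELD_TO_PREDICATE.get(field.tag)
            text = (field.text or "").strip()
            if predicate is None or not text:
                continue  # skip unmapped or empty fields
            triples.append(
                subject + " <" + predicate + '> "' + escape_literal(text) + '" .'
            )
    return triples


if __name__ == "__main__":
    for line in xml_to_ntriples(SAMPLE_XML):
        print(line)
```

A declarative XSLT stylesheet could express the same mapping; the Python version is used here only to keep the example self-contained.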

It’s been marvelous to share our progress implementing this technology stack. We’ve garnered useful feedback and recommendations from the community, and are looking forward to tackling the sector’s data integration challenges alongside the many bright technologists we were lucky to meet.

Edward and Chris's complete presentation can be found here.