The Rijksmuseum Integration Layer: Update

Edward Anderson, data engineer, and Chris Dijkshoorn, head of Collection IT in the Research Services Department at the Rijksmuseum.

Title page: Rijksmuseum Integration Layer: A Linked Data Infrastructure for Sustainable Data Services

This month we took part in the European Library Automation Group’s (ELAG) conference in Riga. After two long years of Zooms and chatrooms, it was a treat to take up the opportunity of in-person collaboration and knowledge sharing with our peers. And with this year’s programme showcasing multiple knowledge graph projects, it was a perfect context for explaining and demonstrating our current work on our bibliographic and collections data. Here’s a short summary of our session “The Rijksmuseum Integration Layer: a Linked Data Infrastructure for Sustainable Data Services”.

Figure 2: The data load from Adlib (collection management system) and Koha (library database) into the Triple Store, in preparation for making Linked Data available on the website

We presented an overview of the organizational and technical issues we have encountered while implementing a new Linked Data infrastructure. What have been the major challenges in integrating data from multiple domains? How are we supporting the rich and sustainable data services our users need? And with what sort of technology stack?

Figure 3: Unlocking data about objects from Adlib as well as about books from Koha using their Persistent Identifiers

The Rijksmuseum is currently building a new platform to serve users with interconnected data about its collection, library and archive: data describing over a million artworks, objects, books and documents charting Dutch art and history. Our ELAG 2022 presentation introduced the Rijksmuseum Integration Layer, its architecture and implementation, and set out to explain how Linked Data principles are enabling us to bridge systems, integrate metadata and deliver client-focused data services.

Figure 4: Match the data with external sources, automatically extend matches, reconcile based on alignments and save the institutional identifier in the source systems

The Integration Layer is inspired by microservice architectures and the Semantic Web. Linked Data is the key to its design, with identifiers and ontologies providing a foundational shared data structure. At the core of the platform is a knowledge graph served by extract-transform-load pipelines producing and abstracting standards-compliant data models. This data layer is realized by a constellation of loosely-coupled containerized Python applications communicating asynchronously over message queues. It is deployed with Kubernetes in the Azure Cloud.
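
To make the idea of small applications talking over message queues concrete, here is a minimal sketch, not our production code. It assumes an in-process asyncio.Queue as a stand-in for a real message broker, and the base URI, record fields and function names are all hypothetical. Each stage only knows about the queue it reads from and the queue it writes to, which is what keeps the stages loosely coupled.

```python
import asyncio

# Hypothetical base URI for minted identifiers (an assumption for illustration).
BASE = "https://example.org/id/"

# Sentinel that tells a downstream stage the stream has ended.
DONE = None


def with_identifier(record: dict) -> dict:
    """Return a copy of the record with a URI built from its source and local id."""
    return {**record, "uri": f"{BASE}{record['source']}/{record['id']}"}


async def extract(records, out_queue):
    """Publish raw source records, then signal the end of the stream."""
    for record in records:
        await out_queue.put(record)
    await out_queue.put(DONE)


async def transform(in_queue, out_queue):
    """Consume raw records and publish records that carry an identifier."""
    while (record := await in_queue.get()) is not DONE:
        await out_queue.put(with_identifier(record))
    await out_queue.put(DONE)


async def load(in_queue, store):
    """Consume transformed records and append them to the store (a list here)."""
    while (record := await in_queue.get()) is not DONE:
        store.append(record)


async def run_pipeline(records):
    """Run the three stages concurrently and return what was loaded."""
    raw, transformed, store = asyncio.Queue(), asyncio.Queue(), []
    await asyncio.gather(
        extract(records, raw),
        transform(raw, transformed),
        load(transformed, store),
    )
    return store


if __name__ == "__main__":
    sample = [
        {"source": "adlib", "id": "1", "title": "Example object"},
        {"source": "koha", "id": "2", "title": "Example book"},
    ]
    for item in asyncio.run(run_pipeline(sample)):
        print(item["uri"], item["title"])
```

In a deployed setting each stage would presumably run in its own container and the queues would live in a broker, but the shape of the code stays the same: read a message, do one small thing, publish a message.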

Figure 5: From source systems to a Triple Store on Kubernetes to Data Services like PID Resolver, OAI-PMH API, REST API and SPARQL Endpoint to serve our customers

Our Integration Layer is a stack of simple software components underpinned by Web and cultural heritage sector standards. It’s standards all the way down: HTTP, XML, XSLT, RDF, SPARQL, MARC, CIDOC-CRM, RDA, EDM and Dublin Core all glued together with minimal application code. The platform isn’t fully built yet, but it is maturing into a stable and maintainable infrastructure for the future.
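
As an illustration of how little glue code a standards-based mapping can need, the sketch below turns a small XML record into RDF serialized as N-Triples, using Dublin Core terms as predicates. It is an assumption-laden example rather than our actual mapping: the XML shape, the field names, the base URI and the function names are invented for illustration, and it uses only the Python standard library where a real pipeline might use XSLT or an RDF library.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from XML element names to Dublin Core predicates.
FIELD_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "description": DCTERMS + "description",
    "date": DCTERMS + "date",
}


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(xml_text: str, base_uri: str) -> list:
    """Map one XML record to N-Triples lines, skipping unmapped or empty fields."""
    root = ET.fromstring(xml_text)
    subject = f"<{base_uri}{root.attrib['id']}>"
    triples = []
    for child in root:
        predicate = FIELD_TO_PREDICATE.get(child.tag)
        text = (child.text or "").strip()
        if predicate is None or not text:
            continue
        triples.append(f'{subject} <{predicate}> "{escape_literal(text)}" .')
    return triples


if __name__ == "__main__":
    # Invented sample record; this is not the export format of any real system.
    sample = (
        '<record id="obj-1">'
        "<title>Example object</title>"
        "<date>1650</date>"
        "<note>Internal remark that has no mapping</note>"
        "</record>"
    )
    for line in record_to_ntriples(sample, "https://example.org/id/"):
        print(line)
```

Keeping the mapping in a plain dictionary means that supporting another field is a one-line change, which fits the idea of minimal application code sitting on top of shared standards.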

Figure 6: None of these steps can be done without thorough development and testing, as in this Python code example

It’s been marvelous to share our progress implementing this technology stack. We’ve garnered useful feedback and recommendations from the community, and are looking forward to tackling the sector’s data integration challenges alongside the many bright technologists we were lucky to meet.

Figure 7: Conclusion page: We love standards. Tiny apps are great. More models, more flexibility. Optimize later.

The complete presentation by Edward and Chris can be found here.
