Enriching metadata with organisation entities

This page describes how Europeana links information about providing organisations to Europeana entities for organisations.

Europeana Organisation Entities

Data partners include information about organisations involved in the aggregation chain in the metadata, like so:

<ore:Aggregation rdf:about="#Example_01Aggregation">
  <edm:aggregatedCHO rdf:resource="#Example_01"/>
  <edm:dataProvider>Bibliothèque nationale de France</edm:dataProvider>
  <edm:intermediateProvider>The European Library</edm:intermediateProvider>
  <edm:provider>Gallica</edm:provider>
  [ other Aggregation data ]
</ore:Aggregation>

Organisations are recorded in the Europeana customer relationship management system (CRM), which stores their main details, including the official name and its English translation, acronym, country of location, and website. Each Europeana organisation entity contains information extracted from the CRM and augmented with organisation data available in Wikidata, for example:

https://api.europeana.eu/entity/organization/1482250000002112001.json

Semantic enrichment process

The Enrichment Service (one of the ingestion steps in Metis) analyses the values mapped to the metadata fields that serve as source fields for enrichment:

  • edm:dataProvider

  • edm:intermediateProvider

  • edm:provider

Each of these values is matched against the following target fields of the Europeana organisation entity:

  • skos:prefLabel

  • skos:altLabel

  • edm:acronym

  • owl:sameAs
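The source fields listed above can be read from an aggregation with a few lines of Python. This is a minimal sketch, not the Metis implementation: the function name extract_source_values and the wrapping rdf:RDF element in the sample are assumptions made for illustration, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the prefixes used in the aggregation example.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ore": "http://www.openarchives.org/ore/terms/",
    "edm": "http://www.europeana.eu/schemas/edm/",
}
SOURCE_FIELDS = ("edm:dataProvider", "edm:intermediateProvider", "edm:provider")
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Sample record; the rdf:RDF wrapper is assumed so the snippet is well-formed XML.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ore="http://www.openarchives.org/ore/terms/"
    xmlns:edm="http://www.europeana.eu/schemas/edm/">
  <ore:Aggregation rdf:about="#Example_01Aggregation">
    <edm:aggregatedCHO rdf:resource="#Example_01"/>
    <edm:dataProvider>Bibliothèque nationale de France</edm:dataProvider>
    <edm:intermediateProvider>The European Library</edm:intermediateProvider>
    <edm:provider>Gallica</edm:provider>
  </ore:Aggregation>
</rdf:RDF>"""


def extract_source_values(xml_text):
    """Collect the value, language tag and field name of every source field."""
    root = ET.fromstring(xml_text)
    values = []
    for aggregation in root.iter("{%s}Aggregation" % NS["ore"]):
        for field in SOURCE_FIELDS:
            for element in aggregation.findall(field, NS):
                # A source field may hold a literal label or a URI reference.
                literal = (element.text or "").strip()
                values.append({
                    "field": field,
                    "value": literal or element.get(RDF_RESOURCE),
                    "lang": element.get(XML_LANG),  # None when untagged
                })
    return values


values = extract_source_values(SAMPLE)
for item in values:
    print(item["field"], "->", item["value"])
```

Each returned dictionary keeps the language tag alongside the value, because the matching rules treat tagged and untagged source data differently.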

Matching rules

Matching between source and target values follows a fixed set of rules. It is case-insensitive and takes the language tags of textual values into account when they are available:

  1. If source data is not language tagged, language tags of target fields are disregarded and matching relies solely on textual values.

  2. If source data is language tagged, language tags of the source and target fields must be the same for matching to be successful. 
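The two rules above can be sketched as a small comparison function. This is an illustrative sketch rather than the Metis code: the Label class and the labels_match function are hypothetical names, case-insensitivity is approximated with str.casefold, and treating an untagged target as a non-match for a tagged source is an assumption about how "must be the same" is applied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Label:
    """A textual value with an optional language tag (hypothetical helper type)."""
    text: str
    lang: Optional[str] = None


def labels_match(source: Label, target: Label) -> bool:
    # Matching is case-insensitive on the textual value.
    if source.text.casefold() != target.text.casefold():
        return False
    # Rule 1: an untagged source ignores the language tag of the target.
    if source.lang is None:
        return True
    # Rule 2: a tagged source requires the same tag on the target.
    # Assumption: an untagged target does not satisfy a tagged source.
    return target.lang is not None and source.lang.lower() == target.lang.lower()


# Untagged source: only the text is compared.
print(labels_match(Label("gallica"), Label("Gallica", "fr")))
# Tagged source: the tags must agree as well.
print(labels_match(Label("Gallica", "en"), Label("Gallica", "fr")))
```

Keeping the language check after the text check means the function rejects most candidates on the cheaper comparison first.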

Matching based on textual values

Metis takes the textual reference (label) mapped to the source field and looks for an organisation entity that has the same label in one of the target fields.

Data partners can provide the name of a providing organisation in its original language or in English translation, or its acronym, provided that the label is recorded in the CRM and is part of the entity.

Source fields:

<ore:Aggregation rdf:about="#Example_01Aggregation">
  <edm:aggregatedCHO rdf:resource="#Example_01"/>
  <edm:dataProvider>Bibliothèque nationale de France</edm:dataProvider>
  [ other Aggregation data ]
</ore:Aggregation>

 

Target fields:

 

When a match is found, the source value is replaced with the URI of the Europeana organisation entity, and the entity is added to the record. Each entity is an instance of the contextual class that EDM defines for representing organisations (foaf:Organization). Including the foaf:Organization class makes it possible to exploit rich data about the contextual resource while keeping that data separate from the Aggregation class:
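The replacement step can be sketched with the Python standard library as follows. This is a hedged illustration only: the helper names link_to_entity and organisation_element are hypothetical, the entity URI is a placeholder under example.org, and serialising the reference as an rdf:resource attribute is an assumption about the output form rather than a statement of what Metis writes.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EDM = "http://www.europeana.eu/schemas/edm/"
FOAF = "http://xmlns.com/foaf/0.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def link_to_entity(element, entity_uri):
    """Replace the literal value of a source field with a reference to the entity."""
    element.text = None
    element.set("{%s}resource" % RDF, entity_uri)


def organisation_element(entity_uri, pref_label, lang):
    """Build a minimal foaf:Organization element to add to the record."""
    org = ET.Element("{%s}Organization" % FOAF, {"{%s}about" % RDF: entity_uri})
    label = ET.SubElement(org, "{%s}prefLabel" % SKOS)
    label.set(XML_LANG, lang)
    label.text = pref_label
    return org


# Placeholder URI, not a real Europeana entity identifier.
ENTITY_URI = "http://example.org/organization/1"

data_provider = ET.Element("{%s}dataProvider" % EDM)
data_provider.text = "Bibliothèque nationale de France"

link_to_entity(data_provider, ENTITY_URI)
organisation = organisation_element(ENTITY_URI, "Bibliothèque nationale de France", "fr")

print(ET.tostring(data_provider, encoding="unicode"))
print(ET.tostring(organisation, encoding="unicode"))
```

The aggregation now carries only the reference, while the descriptive data lives in the separate foaf:Organization element, which mirrors the separation described above.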

 

Matching based on URI (co-)references

Metis takes the URI mapped to the source field and matches it against the co-reference links (expressed with the owl:sameAs relation) available for the organisation entity.
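A lookup of this kind can be sketched as a comparison of the source URI with each entity's owl:sameAs links. The function name find_entity_by_uri, the dictionary layout, and all URIs below are hypothetical, and exact string equality is an assumption; the source does not say whether any URI normalisation is applied.

```python
from typing import Iterable, Mapping, Optional


def find_entity_by_uri(source_uri: str, entities: Iterable[Mapping]) -> Optional[str]:
    """Return the id of the first entity whose owl:sameAs links contain source_uri."""
    for entity in entities:
        # Assumption: co-references are compared as exact strings.
        if source_uri in entity.get("sameAs", ()):
            return entity["id"]
    return None


# Placeholder entities and co-reference links for illustration only.
ENTITIES = [
    {
        "id": "http://example.org/organization/1",
        "sameAs": ["http://example.org/authority/abc"],
    },
    {
        "id": "http://example.org/organization/2",
        "sameAs": ["http://example.org/authority/def", "http://example.org/other/42"],
    },
]

print(find_entity_by_uri("http://example.org/other/42", ENTITIES))
print(find_entity_by_uri("http://example.org/authority/unknown", ENTITIES))
```

Because the comparison is exact in this sketch, a URI that differs only in scheme or trailing slash would not match, which is one reason the URI pattern supplied by the partner matters.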

Source fields:

 

Target fields:

 

When a match is found, the source value is replaced with the URI of the Europeana organisation entity, and the entity is added to the record:

 

Data partners can provide a range of persistent identifiers (PIs) for organisations, including those from the vocabularies and PI systems listed below (this list is not exhaustive):

  • Wikidata

  • Virtual International Authority File (VIAF)

  • The Getty - Union List of Artist Names (ULAN)

  • Gemeinsame Normdatei (GND)

  • International Standard Name Identifier (ISNI)

  • Archival Resource Key (ARK)

  • The Research Organization Registry (ROR)

 

Partners must provide URIs in the correct pattern for the chosen PI system and notify Europeana of the PI they wish to use, so that it can be recorded in the CRM.