Ops Team meeting topics (mid-2017)


New elements in EDM, EDM profiles

Changes to EDM schema 

DQC recommendations that will have technical implication at the level of the Europeana Products
Minutes of the meeting on EDM changes at: https://docs.google.com/document/d/1jPSTfFs3IpnMjO9JFoubGB7Q7gs6ouOsFzWtN4XWlIY/edit#

Last EDM changes reported at:
https://docs.google.com/document/d/1jopc4L9Mc55YV3iAY3JSYcEnf8x4eMWZavlSWzDZSSY/edit# 
Future changes listed in: https://europeanadev.assembla.com/spaces/europeana-ingestion/tickets/2404-edm-schema-changes-for-future-updates-/details#
EDM test records at https://app.assembla.com/spaces/europeana-ingestion/wiki/EDM_Test_records

Test record for EDM internal 
ACTION: Kirsten to create two separate test records as a Google doc -- ONGOING
 list of EDMInternal properties from Github: https://github.com/europeana/corelib/wiki/EDMObjectTemplatesEuropeana
https://docs.google.com/document/d/1z7cyuJuOKFO_JB_UTjDTEXuD7kLtZlm7nq7bINfD1Vo/edit?usp=sharing 

Test record for EDM external

EDM profiles

EDM extensions specified, implementation on the way

  • Annotations: The EDM extension is split into two documents, one that explains the basics of the model (as a concrete implementation of the Web Annotation Data Model), the classes and properties, and a companion document which explains how the model should be used to support each of the application scenarios that have been implemented. It is being implemented as part of the AnnotationAPI

The Annotation EDM profile spec: Main EDM Annotation profile, Modelling of the Application Scenarios

Cloud pilot, could be used as data: https://docs.google.com/spreadsheet/ccc?key=0ArFeVeAoD0YBdE1YSkJIT2hfMjZQR285QUxhdGxVbVE 
Kate is interested in mapping some of the CARARE data to the collection profile. ESounds used the profile. Creative has user generated sets and saved searches. TEL could migrate TEL collections in EDM to EDM
will be implemented via the User Sets (and one element in the annotations to represent membership in virtual exhibitions and thematic collections)
DDB starting to model their implementation

ACTION: Call with DDB.  They have now updated their Organisation profile and examples. We need to have a look again. 

EDM for collections 
Questionnaire at https://docs.google.com/forms/d/e/1FAIpQLSfo9UeMhI86F9Uzr0ST1DnT4CFMcQYSclJjnybny1RWlHDr3g/viewform?c=0&w=1 + discussion on Opsteam list 

Possible EDM extensions being discussed

  • Adaptation of the existing Rights part of EDM (for statements with deprecation date) will be discussed with the RightsStatements.org group. Proposal to RS.org group at https://basecamp.com/1768384/projects/11769988/messages/62190960
  • Representing automatic enrichments. Projects that have them: DDB, APEx, SOCH, MIMO, LOCloud, TEL, DM2E, Food and Drink, Fashion. Fashion profiles creates momentum for this. Implementation (not ideal solution, but still) of enrichments as annotations is being discussed
  • Representation of Full-text in EDM as part of TEL migration and Cloud (Europeana Newspapers)

Longer-term EDM items:

  • Representing Europeana links (internal to Europeana data space) derived from provider-sent links, for hierarchical objects, edm:isDerivativeOf, etc. This should be done on the Europeana proxies, e.g. an (edm:isRepresentationOf,dm2e.eu/blah) on the provider proxy is replicated into (edm:isRepresentationOf,europeana.eu/blah) on the Europeana proxy.

  • Validating rights / see ticket https://europeanadev.assembla.com/spaces/europeana-ingestion/tickets/realtime_list?ticket=561 (for Metis)
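
The link replication described in the first item above could be sketched roughly as follows. This is only an illustration, not the actual ingestion code: the function name, the representation of statements as (property, target) pairs, the selection of link properties and the `provider_to_europeana` lookup table are all assumptions made for the example, and the URIs are the placeholder ones used in these notes.

```python
# Hypothetical sketch: copy provider-proxy links onto the Europeana proxy,
# rewriting targets that are known inside the Europeana data space.

# Properties whose targets should be replicated (illustrative selection).
LINK_PROPERTIES = {"edm:isRepresentationOf", "edm:isDerivativeOf", "dcterms:isPartOf"}


def replicate_links(provider_statements, provider_to_europeana):
    """Return Europeana-proxy statements derived from provider-proxy ones.

    provider_statements: iterable of (property, target_uri) pairs.
    provider_to_europeana: dict mapping provider URIs to Europeana URIs.
    Links whose target has no Europeana counterpart are skipped.
    """
    europeana_statements = []
    for prop, target in provider_statements:
        if prop not in LINK_PROPERTIES:
            continue
        europeana_target = provider_to_europeana.get(target)
        if europeana_target is not None:
            europeana_statements.append((prop, europeana_target))
    return europeana_statements


provider_proxy = [
    ("edm:isRepresentationOf", "dm2e.eu/blah"),
    ("dc:title", "Some title"),
]
lookup = {"dm2e.eu/blah": "europeana.eu/blah"}
print(replicate_links(provider_proxy, lookup))
```

Skipping links without a Europeana counterpart keeps the Europeana proxy free of pointers outside the Europeana data space; whether that is the desired behaviour is one of the points still to be decided.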

Finished EDM task forces (for reference):

Data documentation

Re-shaping EDM Documentation Robina has proposal for reconfiguring the documents ("how to") to meet data provider needs.
Does Marie-Claire's metadata brief (https://docs.google.com/document/d/1PUwINOvMxyRg2qQzYLOJfufWvTKxDQWomyzgO3YvEWU) fit Robina's recommendation? Someone should check whether it would be possible to create the EDM "how to" by combining the brief and Robina's draft.

Last update of EDM guidelines and definition October 2017


ACTION: Valentine will update the IIIF Profile with the one change about svcs:Service + check if Dataset and organisation profile need update 

To be used internally, for the ingestion team. To keep track of the ideas behind (XSLT) mappings.


EDM ingestion from partners

Ongoing:

  • DDB (Kirsten) working on their profile  
  • Europeana Space (Pierre)

"AthenaRC has developed two new micro-services: (1) a micro-service that allows mapping of subject terms to a common thesaurus (in this case AAT) and (2) a micro-service that maps temporal information to a common thesaurus (in this case Perio.do). Both micro-services require that the content providers create the appropriate mappings using MORe."
http://perio.do/guide/#finding-and-using-a-uri-for-a-period-or-collection
http://www.europeana.eu/api/v2/search.json?wskey=api2demo&query=when:*n2t*
http://mint-events.image.ntua.gr/wp-content/uploads/2016/03/Gavrilis-MoRe-presentation-Technopolis.pdf

  • OpenUp/CommonNames: do we get CommonNames data in a useful way (i.e. do the URIs they send us dereference to RDF)?
  • There are two types of links; Pierre is talking with Gerda

Follow-up with Gerda now on the structure of the CommonNames vocabulary (Pierre, Valentine, Hugo) 
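
One way to check whether the CommonNames URIs dereference to RDF is to request each URI with an RDF `Accept` header and inspect the media type that comes back. A minimal sketch using only the Python standard library; the function names and the set of media types treated as RDF are assumptions for illustration, not an agreed procedure.

```python
import urllib.request

# Media types counted as RDF serialisations (illustrative, not exhaustive).
RDF_MEDIA_TYPES = {
    "application/rdf+xml",
    "text/turtle",
    "application/ld+json",
    "application/n-triples",
}

ACCEPT_HEADER = "application/rdf+xml, text/turtle;q=0.9, application/ld+json;q=0.8"


def is_rdf_content_type(content_type):
    """True if a Content-Type header value names an RDF serialisation."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in RDF_MEDIA_TYPES


def dereferences_to_rdf(uri, timeout=10):
    """Request the URI asking for RDF and report whether RDF came back."""
    request = urllib.request.Request(uri, headers={"Accept": ACCEPT_HEADER})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_rdf_content_type(response.headers.get("Content-Type"))
```

`dereferences_to_rdf` performs a real network request and lets network or HTTP errors propagate, so a batch check over many URIs would need to catch those per URI.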

  • Europeana 14-18: new revision: some data is now prepared by Richard with new mapping. Data being tested but issues within UIM (difference in number of CHO/records in proxy). Ready for ingestion. Just waiting for CRF 

first was https://europeanadev.assembla.com/spaces/europeana-npc/tickets/1684-europeana-1914-18--create-complete-list-of-complete-field-mapping-14-18-data/details#
then this https://europeanadev.assembla.com/spaces/europeana-npc/tickets/1689-spike--best-way-to-represent-stories-items/details#
then supposedly new mapping will be https://europeanadev.assembla.com/spaces/europeana-npc/tickets/1855-create-an-updated-mapping-that-takes-hierarchies-into-account/details#

  • IIIF ingestion

Marjolein working with Nuno: NLWales Photography to replace current data > some issues on their side, but looking good. Testing with Wellcome library of improvements with current data
Issues are on descriptive side or IIIF data?
Mostly richer descriptive data, IIIF is good. No news

Swedish museum [mysterious name]  with IIIF (Pablo) working to map to EDM
Follow-ups from IIIF Vatican conference, after Nuno's presentation on harvesting experiments:
- BSB is interested in IIIF harvesting. Maybe we should try to see when DDB would be ready, and potentially discuss with them whether they'd be ok with BSB sending us stuff 'on the side' if they won't be IIIF compatible for a long time?
Can be interesting for KPIs
ACTION: Kirsten to communicate with DDB about timeframe for implementing IIIF

- Durham University is also interested in contributing some IIIF collections
- Heidelberg as well, though much longer-term (that would go from a research dept through their UL)


ACTION: Ingestion team to update the list of vocabularies used by data partners 

NB: there is already a list of problems at https://europeanadev.assembla.com/spaces/europeana-ingestion/tickets/1992-list-of-bad-enrichments-to-be-removed-from-uim/details?tab=followers  To be updated with automatic enrichment problems (entities to be removed)

Changes from http to https might impact some embeddable players. Will impact BnF, Dismarc .... some sounds datasets

Data quality

Issues with language tags: https://europeanadev.assembla.com/spaces/europeana-npc/tickets/927/ (nothing will be done on this issue in the short term).  

  • Have a vocabulary agreed and available to normalise dc:type. ONGOING

https://docs.google.com/spreadsheets/d/1kqazJP74zNcsRmLsQgxoxqctQz8hMDdBChVSwHBwRzM/edit?usp=sharing

  • Vocabulary was reviewed
        - libraries for feedback on TEXT. Adina got a reply from Serbia. 
        - Fashion for feedback on Fashion types
        - shared the vocabulary  with the Rise of literacy

    Draft report from Pablo https://docs.google.com/document/d/1_o2k_aqc4qcE9QmA73MjiG26AocWvC8ift-tgRNAEhM/edit (ACTION: open comments rights) > missing words  'multilingual' and 'searchable' --> this needs to be in the report

    There may be need of guidelines for using specific values (or parts of the vocabularies) with specific EDM properties (dcterms:medium, format, etc).
    It's dangerous to risk that a concept used with dc:format (e.g. by Fashion) would be used with another property by another aggregator.

    Feedback from Kate Fernie
    Call with EFashion. They will provide feedback to refine the list for Fashion items 

  • Data Quality Plan (Kirsten and Pablo). https://docs.google.com/document/d/1bveUqx1KJP35UVrkpk3rJeC2axtm6MKaE-Af4fW5qC0/edit  (template; can be used as an introduction to the idea of Data Quality Plan)

- Introduction letter to DSI partners (Draft) 
https://docs.google.com/document/d/1uF7uv7dAviP5LGrvjv8D6vylTDYA6NyEtJbPYw-RELw/edit?usp=sharing

- Data Quality Plan progress report document
https://docs.google.com/document/d/1QqfYNrvWE_0oPI9yEyoderfi6JTaKyLrYOUWGl43LaU/edit?usp=sharing
MdV: Draft data quality plan has been shared with Fashion and should be agreed on by the 15th of August. 

Carare - still in progress - I would like to include those, or some of them, in the DQP. It needs to be agreed with Kate  
https://docs.google.com/document/d/1OjkcRG-CxTU1JJbgV2bvG0ndWcV53ArnrWNt0vjVp-M/edit?usp=sharing
Euscreen - done     https://docs.google.com/document/d/144lOhwL3dOIO1dkJBHzmIludPE00Kzfc5G22XiWMphM/edit?usp=sharing
Photoconsortium - sent, waiting for their reply
https://docs.google.com/document/d/19OhisD3jhUtvGXs1bj-k5C18KFj0qrdHinzozz2XeMI/edit?usp=sharing
Fashion (finalised): https://docs.google.com/document/d/1p_tjHEv0qy2HJv_U4DESZA0-l2fPSumxDzWQXX4HYuA/edit?usp=sharing
Museu - still in progress https://docs.google.com/document/d/1_e3eTodKgz-ElgpkxgHofc0lZy55alu5PjLWW2kWLBQ/edit?usp=sharing
Sounds (finalised) https://docs.google.com/document/d/1XAkxThdkPMvSzm8HLrr7q___LWb-kEklhlePRRtaE1M/edit?usp=sharing
Open-up - sent, waiting for their reply
https://docs.google.com/document/d/1iOtymVnaJlVCk0zhJXkwktJ2geM5ONYUp6SAQ0LOSOc/edit?usp=sharing
EFG: to come
APEF - sent, waiting for their reply
https://docs.google.com/document/d/1KmIvrVq0mdfE76APn_iF2dNpPpmB8UeWU4-yrjZUf3M/edit?usp=sharing

  • Problems pattern catalogue 

To be distributed  to data providers https://docs.google.com/spreadsheets/d/1atZr1w-h9AdWwWSBYLCCk6fAdJSxrCNP56QRLNY1jLg/edit#gid=1801176604

  • Update from DQC

Completeness measure 

Problem patterns

ACTION: everyone to look at the sheet and provide feedback
https://docs.google.com/spreadsheets/d/1zoU-1uPk2O5t5zRC1-MD3LakBQGJ2hsWlSnp3XS2iAk/edit?disco=AAAAA2-Id5E

Kirsten and Henning made a plan to disseminate patterns to providers. Currently in DQC.

20th century black hole

https://github.com/hugomanguinhas/europeana/blob/master/rd-exp/experiments/BlackHole.md
http://pro.europeana.eu/blogpost/the-missing-decades-the-20th-century-black-hole-in-europeana

  • Document was amended with comments from Hugo and sent to Kennisland. But we don’t know how Kennisland is using it

Date enrichment and normalization will be considered in the coming prioritizations for enrichment and normalization. Discussions were continued as part of the definition of the DSI 2 KPIs.

Kennisland is re-starting work. Pablo is liaising with them.

Enhance provider data 

use ISNI? 

Cécile: Move to URIs (replace data provider literals with URI) is one thing. We will need to have a workflow in place: how do organizations register to get an identifier? Do we then ask them to use the identifier in the data? If yes, we need a communication plan because the change is huge. 
Some data providers like DDB have organisation data. Should we harvest these data or not? 

 Relation with https://basecamp.com/1768384/projects/1000684/messages/68533674

Cécile: Workflow for organisations for METIS needs to be refined: how to ingest Organisation data from data providers, and what will be the process for attributing identifiers. 
No feedback yet to Aggregator Forum feedback. 
What type of data we accept: with Europeana ID. But after feedback from data providers do we continue on the same line, + owl:sameAs, form on Pro to get Org.
How do we create new organisations? Registry within Pro, or an email where they are provided with their Identifier?
Misunderstanding? By ID we mean a URI to resources, while their identifiers are catalogue numbers. 
Do we accept these other identifiers, or map them somewhere else? 
Pablo: I'd say stick to original plan. 
Cécile: there will be no changing data in METIS for first release. What data do we get in, and what do we want to get in. 

ACTION: DPS to think of process flow needed for getting Organisations identifiers. 
https://docs.google.com/spreadsheets/d/1BFmJidtdsSVEA10lcbKKAqZEH13cCkWhd-GYJaCUifU/edit 

--> the meeting happened, some things still need to be discussed about how we envision our organization inventory. Henning met with Dasha and Aubery following the first meeting to start evaluating the process with Zoho in mind. 

Probably nothing will be done before September. Cécile will follow-up with Dasha
Conversation required between Network team and ingestion team to discuss the relationships between the inventory and Zoho

DSI2 KPI on deduplication of providers at risk. Discuss how we want to report on this. 
Same applies to the completeness measure and dc:language normalized.

Normalization

Normalising providers and data provider names is on the way (Pablo) Plan had been written https://docs.google.com/document/d/1jLy971Zwpv9qu7hL1DzMsfREzAtMoJWUPhJnj6D25UY/edit
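
As a rough illustration of what normalising provider names can involve, the sketch below groups spelling variants of the same name under a common comparison key. It is not taken from the plan linked above: the key rules (accents, case, punctuation and spacing are ignored), the function names and the sample names are assumptions for the example.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Build a comparison key: no accents, case, punctuation or extra spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in without_accents.casefold())
    return " ".join(cleaned.split())


def group_variants(names):
    """Group names sharing a key; return only groups with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Made-up sample of dataProvider literals.
sample = [
    "Bibliothèque nationale de France",
    "bibliotheque nationale de france",
    "Rijksmuseum",
]
print(group_variants(sample))
```

Groups found this way are only candidates for merging; translated or abbreviated names would not be caught and would still need manual review.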

Normalization service for Metis requirements: see https://europeanadev.assembla.com/spaces/europeana-ingestion/tickets/2156-metis-requirements--cleaning-and-normalization-service-v1/details#

 https://docs.google.com/document/d/1nJKZk7xgXXiCBAzA423MLT4V94-b8VEgIv0OCz6X68Y/edit
Overview of the different quality work started or needed at Europeana available at https://docs.google.com/document/d/1YW6829VGl1LSc-tguSLFh-tvvwrw88-SXqe1RfeNCd4/edit

Normalisation language: Nuno developed a new plugin to normalise the values of dc:language. We are now evaluating the first results of this normalisation 
https://docs.google.com/spreadsheets/d/1Z-CGWr6rS7lkcGK75a4UxzRzrd80_O6L_s5DAuj_9uM/edit
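
To make the idea of dc:language normalisation concrete, here is a minimal sketch that maps raw values to two-letter codes. It is not Nuno's plugin and does not reflect its actual rules: the lookup table is a tiny hand-made sample, and the function name and the fallback on the primary subtag are assumptions for illustration.

```python
# Illustrative lookup only; a real service would use full ISO 639 tables.
LANGUAGE_CODES = {
    "en": "en", "eng": "en", "english": "en",
    "fr": "fr", "fre": "fr", "fra": "fr", "french": "fr", "français": "fr",
    "de": "de", "ger": "de", "deu": "de", "german": "de", "deutsch": "de",
}


def normalise_language(value):
    """Map a raw dc:language value to a two-letter code, or None if unknown."""
    key = value.strip().lower()
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]
    # Handle regional tags such as "en-GB" or "de_AT" via the primary subtag.
    primary = key.replace("_", "-").split("-", 1)[0]
    return LANGUAGE_CODES.get(primary)


for raw in ["English", " FRE ", "de-AT", "xx"]:
    print(repr(raw), "->", normalise_language(raw))
```

Returning `None` for unknown values, rather than guessing, makes it easy to list the values that still need attention when evaluating results.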

Enrichment and entity collection

Documentation on tickets and progress: 
https://www.assembla.com/spaces/europeana-ingestion/wiki/Enrichment_work_and_entities_collection

Old actions and pointers:

METIS Requirements 

Technical design plan version 3 published: https://docs.google.com/document/d/1zlOMDsrb1TTBtomdrzpkmOqmg4t9nMXc7mk6ryuUo-Y/edit
See also : https://www.assembla.com/spaces/europeana-ingestion/wiki/Metis_ALL

Design work progressing: https://projects.invisionapp.com/share/246QE59X7#/screens/219433550

Reindex

Document for re-index is here: https://docs.google.com/document/d/1lkea78ZgjkngiDqGCAqGT-Vuu_1eoElSE2B_kjg1EfY/edit

Events-Conferences

Participation of the Ingestion team in conferences: 
EuropeanaTech conference table at 
https://docs.google.com/spreadsheets/d/11r3O7XhQDYz_2wWwYFP1K0FI54lMSPdXcuBHzrlivsM/edit#gid=1291848083. Interesting conferences and call for papers can be flagged there. 
