Metis Sandbox Training

 

1 What is the Europeana Metis Sandbox?

The Europeana Metis Sandbox, referred to as the Metis Sandbox in the rest of this training, is an online tool for checking the quality of datasets that are offered for ingestion into the Europeana database. The Metis Sandbox allows users to gain insight into their records, check for errors and get a preview of what the data would look like on the Europeana website.

This training resource is in Beta stage. The instructions and exercises are published for public feedback. It can be used for training purposes, but please note that not all training information is available at this moment and that text and images are subject to change.

 

1.2 Who are the intended users of the Metis Sandbox?

The Metis Sandbox is developed with different users and use cases in mind. Examples of users and their use cases are:

  1. Metadata coordinators at an aggregator to 

    1. Get insight into the data quality checks and processes that apply to data;

    2. Check the data they are planning to send to Europeana for ingestion;

  2. Professionals at a cultural heritage institution working with metadata to check their datasets before sharing them with an aggregator;

  3. Employees of the Europeana Foundation to check and give feedback on datasets from aggregators.

1.3 What will you learn in this training?

After completing this training you will:

  1. Know how to access the Metis Sandbox

  2. Know how to use the Metis Sandbox

  3. Be able to interpret the outcomes of the use of the Metis Sandbox

  4. Get insight into how to improve the data you want to share with Europeana

1.4 How will you learn this?

There are several exercises in this training. You are asked to complete these exercises and answer the questions.

1.5 What will you need?

For this training you need:

  1. This training document.

  2. A computer with an internet connection.

  3. 45-60 minutes depending on your previous experience with working with datasets for Europeana.

  4. Example files for the first few exercises (see instructions below).

  5. A dataset of your own, preferably limited to 50 records (not yet included in this training)

  6. Something to take notes on, physically (a pen and notebook or sticky notes) or digitally (text editor)

  7. The Metis Sandbox User Guide for reference. You can use the user guide whenever you want to look up more information about the Sandbox.
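If your own dataset is larger than the suggested 50 records, a small script can package a sample for you. The sketch below is only an illustration and rests on one assumption that this training does not state: each record is stored as its own .xml file in a single folder. The function name and its parameters are hypothetical.

```python
import zipfile
from pathlib import Path


def build_sample_zip(source_dir: str, zip_path: str, max_records: int = 50) -> int:
    """Zip at most `max_records` XML files from `source_dir`.

    Assumption: every record is a separate .xml file in `source_dir`.
    Returns the number of files written to the archive.
    """
    # Sort the files so that repeated runs select the same sample.
    xml_files = sorted(Path(source_dir).glob("*.xml"))[:max_records]
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for xml_file in xml_files:
            # Store each file at the top level of the archive, without its folder path.
            archive.write(xml_file, arcname=xml_file.name)
    return len(xml_files)
```

For example, `build_sample_zip("my_records", "my_sample.zip")` would write up to 50 records from the folder `my_records` into `my_sample.zip`.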

Download the sample files

You need three files for exercises 2.1 and 2.3. The filenames indicate which exercise each file belongs to.

  1. Click on this link to go to the files for exercise 2.1.

  2. Click on the download button in the top right corner to download the files as a zip file.

  3. Choose a location on your computer to download the files to. Please leave the zip file as is; there is no need to unzip it.

 

  1. Click on this link to go to the files for exercise 2.3.

  2. Click on the download button in the top right corner to download the files as a zip file.

  3. Choose a location on your computer to download the files to. You will need this dataset in exercise 2.3. Unzip this file. You should then have a folder with two files: a zip file and an XSL file.


2 Upload a dataset

In the following exercises you will 

  1. Go through the different processes of uploading a test dataset to the Metis Sandbox.

  2. Review any errors in the dataset.

 

Use case

You are preparing a dataset to be processed for publication on the Europeana website. With the Sandbox you want to check whether there are any problems with the dataset before submitting your records to the Europeana Foundation DPS team for processing.

Ways to upload

There are three ways to upload a dataset to the Metis Sandbox. 

  1. Uploading a zip file that you have on your computer

  2. Uploading a zip file that is accessible on a web server

  3. Incremental harvesting through OAI-PMH

The files can be in EDM or can be converted to EDM during the upload process. Only zip files are supported. Other compression methods, such as tar.gz, cannot be used in the Sandbox.
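Before uploading, you can check on your own computer that a file really is a zip archive. The following is a minimal sketch using Python's standard `zipfile` module. The function name is hypothetical, and filtering on the .xml extension is an assumption about how the records are stored.

```python
import zipfile


def list_xml_records(path: str) -> list[str]:
    """Return the names of the .xml files inside a zip archive.

    Raises ValueError when the file is not a zip archive, for example
    when it is a tar.gz file.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a zip archive; repackage it as a zip file first")
    with zipfile.ZipFile(path) as archive:
        # Assumption: records are the entries whose names end in .xml.
        return [name for name in archive.namelist() if name.lower().endswith(".xml")]
```

Calling `list_xml_records("Sandbox_Training_Files_2.1.zip")` on the sample file from section 1.5 would list the XML files it contains, provided the file is in the current folder.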

In the exercises below you will learn how to upload

  1. a zip file in EDM format from your computer and on a server,

  2. a zip file not in EDM format from your computer.

OAI-PMH harvesting is currently excluded from this training. Please contact Henning Scholz if you would like to receive more information about this option.

It takes some time for a dataset to be processed by the Sandbox. The order of the exercises is chosen to make the best use of your time. You can choose to skip upload exercises 2.2 and/or 2.3 if you know that these are not relevant for you. If you skip them, you may have to wait a short time for your dataset from exercise 2.1 to finish processing before you can continue with the next exercises.


2.1 Upload a local zip file in EDM format

You need the dataset Sandbox_Training_Files_2.1.zip for this exercise.

2.1.1 Create a new dataset

  1. Open the Metis Sandbox.

  2. Create a new dataset by clicking on the “create new dataset” link.


2.1.2 Specify and upload the dataset

  1. Enter a name for the dataset. The name of a dataset is important because it lets you and others identify the dataset later. Choose something descriptive for this exercise. Components to consider are:

    1. Something to identify the uploader of the file, for example your name or the name of your organisation.

    2. Specifics of the dataset and/or its purpose for uploading, for example a version number or in this case that it’s for training purposes.

    3. Spaces are not allowed in names of datasets. 

  2. Select a random Country and random Language from the drop down menus. The Country and Language drop down menus are also for you to be able to identify your own datasets. 

  3. Under Harvest Protocol, select “File upload” if this is not selected.

  4. Click on the button left of “No file chosen” and select the sample file named Sandbox_Training_Files_2.1.zip that you downloaded in 1.5 of this training.

  5. Leave the checkbox “Records are not provided in the EDM (external) format” unchecked. This dataset has been converted to EDM already for this exercise.

  6. Click on the “Submit” button to start your upload.
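The naming advice in step 1 can be sketched as a small helper that joins the components and removes spaces. This is only an illustration: the function name and parameters are hypothetical, and the underscore as separator is an assumption, since this training only states that spaces are not allowed.

```python
import re


def make_dataset_name(uploader: str, purpose: str, version: str = "") -> str:
    """Join the name components into a dataset name without spaces."""
    parts = [uploader, purpose, version]
    # Drop empty components and replace each run of whitespace with one underscore.
    cleaned = [re.sub(r"\s+", "_", part.strip()) for part in parts if part.strip()]
    return "_".join(cleaned)


# Example call; the resulting name contains no spaces.
example_name = make_dataset_name("Leiden University", "sandbox training", "v1")
```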


2.1.3 Processing the dataset 

The processing of the dataset will start. Processing has completed when you see a checkmark appear below the upload icon.

You will already see the Dataset ID of the dataset you just uploaded in two locations:

  1. In the top right corner, on the same height as the name you gave the dataset.

  2. Pre-filled below “Enter the id of a dataset to track”.

Write down this number; you will need it for exercise 3.

 

 


2.1.4 Review the dataset (basic review)

Review any results that are already shown while the dataset is being processed. You will be asked to review the results of the entire processed dataset in exercise 3.1. Look for the processing steps that have

  1. No errors, indicated in green.

  2. Non-critical errors, indicated in yellow.

  3. Critical errors, indicated in red.

Click on the “view details ()” link next to a processing step to see the names of the records that have errors and details about those errors.

 

For educational reasons, several mistakes are included in this set. This test set is based on an original dataset of Leiden University Libraries which does not have these mistakes.


2.1.5 Test your knowledge

Test what you have learned by answering the following questions.

 

 


2.2 Upload a file on a web server in EDM format

This scenario is for testing your data; it is not used for ingestion.

2.2.1 Create a new dataset

  1. Open the Sandbox if you do not have it open yet.

  2. Create a new dataset by clicking on the “create new dataset” link.


2.2.2 Specify and upload the dataset

 

  1. Enter a name for the dataset. Spaces are not allowed in names of datasets.

  2. Select a random Country and random Language from the drop down menus.

  3. Select “HTTP upload” as the Harvest Protocol if this is not selected.

  4. Copy and paste the following URL in the URL text box: http://ftp.eanadev.org/uploads/Sandbox_Training_Files_2.2.zip

  5. Leave the checkbox “Records are not provided in the EDM (external) format” unchecked.

  6. Click on the “Submit” button to start your upload.


2.2.3 Processing the dataset

The processing of the dataset will start. Processing has completed when you see a checkmark appear below the upload icon.

 

You will already see the Dataset ID of the dataset you just uploaded in two locations:

  1. In the top right corner, on the same height as the name you gave the dataset

  2. Pre-filled below “Enter the id of a dataset to track”

Write down this number; you might need it for the next exercises.


2.2.4 Review the dataset

Review any results that are already shown while the dataset is being processed. Look for the processing steps that have

  1. No errors, indicated in green.

  2. Non-critical errors, indicated in yellow.

  3. Critical errors, indicated in red.


2.3 Upload a local zip file not in EDM format

You need the files BnF_test_Sandbox.zip and BnF_Manuscripts.xsl for this exercise.

2.3.1 Create a new dataset

  1. Open the Sandbox if you do not have it open yet.

  2. Create a new dataset by clicking on the “create new dataset” link.


2.3.2 Specify and upload the dataset

 

  1. Enter a name for the dataset. See 2.1.2 for more detail if needed.

  2. Select a random Country and random Language from the drop down menus.

  3. Under Harvest Protocol, select “File upload” if this is not selected.

  4. Click on the button left of “No file chosen” and select the sample file named BnF_test_Sandbox.zip that you downloaded in 1.5 of this training.

  5. Click on the checkbox “Records are not provided in the EDM (external) format”. This dataset is not in EDM format yet, so it has to be converted during the upload. A new upload option will appear.

  6. Click on the button left of “No file chosen” and select the sample file named BnF_Manuscripts.xsl that you downloaded in 1.5 of this training.

  7. Click on the “Submit” button to start your upload.


2.3.3 Processing the dataset

The processing of the dataset will start. Processing has completed when you see a checkmark appear below the upload icon.

You can see a confirmation of the XSL transformation right below the dataset name. In the list of the processing steps you will also see the step “transform to EDM external” and the number of records that have been transformed.

Write down the dataset number; you might need it for the next exercises.

 


2.3.4 Review the dataset

Review any results that are already shown while the dataset is being processed. Look for the processing steps that have

  1. No errors, indicated in green.

  2. Non-critical errors, indicated in yellow.

  3. Critical errors, indicated in red.


3 Track a dataset and preview in Europeana

In this exercise you will learn how to track a dataset that was previously uploaded and review some quality indicators for it. You will do this with the dataset uploaded in exercise 2.1, but you can also repeat the exercises below with the datasets from exercise 2.2 and/or 2.3.

Use case

Processing a dataset can take some time. This depends on the size of your dataset. Smaller datasets take less time to process. You do not have to keep the browser window open once a dataset has been uploaded. You can check if the data has been processed at a later moment. It is also possible that you want someone else to have a look at your dataset in the Sandbox.

The Sandbox allows you to get a preview of what your dataset would look like on Europeana.eu and review the problem patterns identified for it. You will also be able to gain insight into the Content Tiers and Metadata Tiers of individual records.

You will practice this in the following exercises.

3.1 Track a dataset


3.1.1 Enter the dataset ID

  1. Open the Sandbox in a new window.

  2. Enter the ID of the dataset of exercise 2.1.3 in the text box on the homepage.

  3. The “track” button will change from grey to green and will be active.

  4. Click on the “track” button

  5. Review the results of this dataset (also see 2.1.4).


3.1.2 Preview the dataset on Europeana

 

Click on the “View published records” link in the top right corner, below the dataset ID, to open a preview of the dataset in Europeana. This preview will open in a new tab or window. Please keep the Sandbox tab open too for the next exercise.

Click on any item in the preview for a more detailed look at the record.


3.2 View a record report

3.2.1 Open a record report

  1. Copy the record ID of the item you previewed in exercise 3.1.2. The record ID can be found in the URL of the item. In the URL this is the part after “item” and consists of the dataset ID and record ID. Make sure to include the first forward slash, resulting in “/dataset_id/record_id”.

  2. Go back to the Sandbox and paste the record ID in the “Enter the id of a record” field.

  3. The “report” button will change from grey to green when a valid id has been entered. 

  4. Click on the “Tier Report” button to generate a report.
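Step 1 above can also be sketched in code. The function below takes an item URL and returns the part after “item” in the form “/dataset_id/record_id”. The function name and the example URL are hypothetical and only illustrate the URL shape described in step 1.

```python
from urllib.parse import urlparse


def record_id_from_item_url(url: str) -> str:
    """Return "/dataset_id/record_id" taken from the path of an item URL."""
    # Split the URL path into its non-empty segments; the query string is ignored.
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if "item" not in segments:
        raise ValueError("the URL does not contain an 'item' segment")
    start = segments.index("item") + 1
    id_parts = segments[start:start + 2]
    if len(id_parts) != 2:
        raise ValueError("expected a dataset ID and a record ID after 'item'")
    # Include the leading forward slash, as the record field expects.
    return "/" + "/".join(id_parts)


# Hypothetical URL, used only to show the expected shape.
example_id = record_id_from_item_url("https://example.org/en/item/123/record_abc")
```

In this example, `example_id` holds the string `/123/record_abc`, which is the value to paste into the “Enter the id of a record” field.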

 


3.2.2 Review a record report

  1. Review the Content Tier Breakdown for the record. This is shown by default.

  2. Click on the link “Metadata Tier” in the summary or the Metadata Tier icon to review the Metadata Tier Breakdown for the record.

  3. Review the dimensions in the Metadata Tier Breakdown by clicking on their respective icons:

    1. Language Dimension

    2. Enabling Elements Dimension

    3. Contextual Classes Dimension



 

 

 

 


3.2.3 Compare records before and after enrichment

This exercise allows you to gain insight into the enrichments that have been made to a record.

In the same record report as in 3.2.2:

  1. Click on the link “record before processing (as provided)”. This will open the record in a new tab or window. View the page source in your browser to see the record in more detail.

  2. Click on the link “record after processing (as published)”. This will open the record in a new tab or window.

You can now see the differences between both records and the enrichments that have taken place.
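If you save both versions of the record as text, you can also compare them line by line with a script. The sketch below uses Python's standard `difflib` module. The function names are hypothetical. A line-based comparison is only a rough aid, because processing may also reorder or reformat the XML.

```python
import difflib


def diff_records(before_xml: str, after_xml: str) -> list[str]:
    """Return a unified diff between the provided and the published record."""
    return list(
        difflib.unified_diff(
            before_xml.splitlines(),
            after_xml.splitlines(),
            fromfile="as provided",
            tofile="as published",
            lineterm="",
        )
    )


def added_lines(before_xml: str, after_xml: str) -> list[str]:
    """Return only the lines that appear in the published record but not in the provided one."""
    # Skip the two header lines ("--- as provided" and "+++ as published").
    body = diff_records(before_xml, after_xml)[2:]
    return [line[1:] for line in body if line.startswith("+")]
```

The lines returned by `added_lines` are candidates for enrichments, although formatting changes can show up there as well.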


3.3 Test your knowledge


3.4 Problem patterns

 

This exercise allows you to gain insight into the problem patterns that have been identified for a dataset.

When reviewing a dataset’s progress (section 3.1.1 above) you will also gain access to the problem patterns listing.

Click on the “Issues (Overview)” button, below the dataset tracking report, to open a listing of the problem patterns that were identified for this dataset.

The listing contains problem patterns (e.g. P9 – very short description). Underneath each pattern is a sample of records where the pattern was encountered, along with the offending values.

Click on the problem pattern title of P9 “Very short description”.

You now see a detailed description of a problem pattern.

The various problem patterns that the application can detect are described in detail in the Metis Sandbox User Guide.

 


3.5 Test your knowledge

 


4. End of training and satisfaction survey

Thank you for taking the time to complete this training. Could you share your satisfaction by filling out the survey below? For any questions about this training or to share your feedback in detail, please contact Sebastiaan ter Burg. More information about the process to publish your data can be found on Europeana Pro.