Table of Contents | ||||
---|---|---|---|---|
|
Goal of the Metis Sandbox
The Metis Sandbox is a test environment for your datasets. It consists of two tools:
to get insight in your records and
preview what your records will look like on Europeana.eu.
Where to find the sandbox
The Sandbox can be accessed through https://metis-sandbox.europeana.eu/ .
How to prepare your dataset
A dataset for the Metis Sandbox can currently not exceed 1,000 records. If your dataset is bigger than that, you’ll see an error message. [A future update will display a message that the first 1,000 records will be used in the data archive instead of showing an error.]
Your dataset needs to meet the requirements of the Europeana Data Model external. Your dataset should be on your computer or on a web address as a zip file. The zip file will be unzipped entirely after uploading.
Note |
---|
The Metis Sandbox is a testing environment that is under development. The Metis Sandbox also gets cleaned up at least once every two weeks. If you look for the dataset after this period, a warning message is displayed, e.g. “Dataset not found” with the detail “Http failure response for https://metis-sandbox-rest.eanadev.org/dataset/35: 400 OK”. As a result there is no clear indication how long a dataset stays available after uploading. Datasets that are deleted from the Metis Sandbox will need to be uploaded again for testing. |
General interface elements
There are several different methods to interact with the sandbox. Below is a list of the general interface elements and their uses.
Step indicators
...
The step indicators indicate the user’s location within the Sandbox. They behave as tab headers: clicking on a step will navigate to the corresponding section.
Navigation buttons
Navigating between the steps can also be done by using the navigation buttons.
...
The Previous step button: allows you to go back one page
...
The Next step button: allows you to go to the next page
Track and submit buttons
...
The Track and Submit buttons allow you to track or submit a dataset. These buttons are grey (disabled) when required information is missing in the form. The buttons become green (and clickable) when you enter correct information into the field(s) and the form can be submitted.
...
Links
...
Navigation Buttons
...
Background colour: green
Border: light / green
Shape: round
...
Default state: the step has not been completed yet (green background), the step is not active (light / green border) and the data is not submittable (round shape)
...
Border: heavy / yellow
...
Active state: the step is active (heavy / yellow border) and the content for that step is being displayed
...
Background colour: orange
...
Valid state: the form information for this step has been correctly specified (but is not yet submittable)
...
Shape: square
...
Submittable state: the form information for this step has been correctly specified and the data is submittable
...
Indicator: spinner
...
Submitted state: the form information for this step is being submitted
...
Indicator: tick
Background: green
Shape: round
...
Processed state: the form information for this step has been submitted and processed (tick). The step is not active (green border)
...
Indicator: tick
Background: orange
Border: heavy / white
Shape: square
...
Processed and submittable state: the form information for this step has been submitted and processed (tick). The step contains valid information (orange background) that can be submitted (square shape). The step is active (heavy border)
Home screen
The home screen gives you the option to track an existing dataset or create a new dataset. The options in the home screen are:
The menu toggle makes the different steps of the sandbox visible.
The Dataset ID bar can be used to enter the ID of a previously uploaded dataset
The Track button allows you to track the dataset entered in the dataset ID bar
The Create new dataset link can be used to start uploading a new dataset
Dataset ID
Each uploaded dataset gets a unique ID (see point 6.1 below). You will need to remember or save this ID yourself to be able to get back to the dataset later. To track the progress of a dataset through the different steps, enter its ID in the text field of the homepage and click on the “track” button. You can also use the ID to query the published Metis Sandbox data in a different browser tab or window.
Create a new dataset
Click on “create a new dataset” on the Homepage or the Dataset Processing page. This will take you to the Dataset Name page.
...
The dataset name is for you to identify the dataset yourself. A name does not have to be unique, because a unique ID is assigned to each dataset by the Sandbox. It is advised to choose a name that will make it easy for you to remember which dataset you’ve created and any details of that dataset. Spaces are not allowed in the name of the dataset.
Enter a name in the text field and click on the “Next step” button.
...
In the “Dataset details” screen you need to add the details of the dataset. You will need to indicate the country and the language of the dataset from a drop down menu. The drop down menu is searchable: click on the down arrow and type the first letters of the language or country to jump to the country or language in the list. Click on the “Next step” button to go to the next step.
Configure the Data Source
...
There are two options to upload your records to the sandbox [TBC: OAI-PMH upload is not enabled currently. This functionality will support harvesting of the records, comparable to current functionality in Metis.]:
File upload: upload a file from your computer
HTTP upload: add the URL of the zip file with the records on a web server
This field is mandatory. Zip Files are unzipped by the Sandbox. [MacOS has the tendency to add files to zip files, these extra files are automatically skipped. ]
...
Select the radio button left of “File upload”. A “Browse” button will appear.
Click on the “Browse” button. A pop up will appear to search for the file on your computer.
Select the file you wish to use. Click on OK in the pop up window. The Submit button will be enabled once a file has been selected.
Click the “Submit” button to start uploading the file.
...
The upload time depends on the size of the file and the speed of your internet connection.
[TBC: February 2022 update: Additional developments are in progress, such as imposing a limit of 1 terabyte per upload.]
Use HTTP Upload
Select the radio button next to “HTTP upload”. A text bar will appear.
Paste the entire URL of the zip file in the text bar, including http://.
Click on the “Submit” button to start uploading the file.
Processing and analysing
...
Processing dataset
A tracking screen will appear after uploading. The tracking screen displays
The name of the dataset
The tracking number
The date and time when the dataset is submitted in the Sandbox
The details of the dataset (language and country)
The steps and results of the data ingestion process used by Europeana
Record counters
Errors in the data ingestion process and details of the errors
Number of records that are considered for ingestion
...
The numbers show the successful numbers and total numbers of each step. The steps are:
Harvest (H): how many records of your total dataset have been imported successfully
Validate (EDM external) (Vi): how many records passed the validation of the external EDM [Conversion will be added soon for XML files with records that do not comply with the required EDM format before validation.]
Transform (T): how many records have been transformed from the external EDM to the internal EDM [The internal EDM consists of the external EDM and information about the dataset that is added during this import process.]
Normalise (N): how many records have been normalised. Normalisation acts on individual values in the data and could include deletion of double spaces and of duplicate values.
Enrich (E): How many records have been successfully enriched with the information of processing the data.
Process media (M): for how many records the linked media could be found
Preview (Pr): how many records were made available in the preview of a copy of the Europeana website (see chapter 7)
Publish (Pu): how many records were published in a copy of the Europeana website (see chapter 7)
The colours of each step indicate how successful this step was:
...
Green: step completed without errors, all records are considered for ingestion
Yellow: non critical warning. Problems with the records have been detected, but the records will still be considered for ingestion.
Red: critical warning. Incomplete records have been detected, these records will no longer be considered for ingestion.
View errors
...
Click on “view detail(s)” to see the details of the error message(s).
...
These messages are generated by… [TBC: Incomplete sentence?]
Example above: a record is missing a title or description. EDM documentation is available to look up details mentioned in error messages. These error messages come from the library parsing tool.
Copy the error messages and/or make a screenshot. Keep the error messages and the dataset close.
Reviewing steps
You can view the information that you’ve entered in the previous steps by clicking on the icons of the steps or on the “previous step” button.
...
You can only view the information, it is not possible to make changes. You can go forward again by clicking on the corresponding icons or by using the forward arrow.
Preview records in Europeana
Click on “view preview” to view your data in a copy of the Europeana website. It can take up to 15 minutes for a preview to be generated. Please wait if your data is not showing yet.
...
Viewing Tier 0 items
It is possible that not all your items are shown in this view. Records with media Tier 0 are hidden by default. You can make these records visible by clicking on “More filters”, scroll down, click the button “Show only items not meeting our publishing criteria” and click to confirm this filter.
...
...
1 Goal of the Metis Sandbox
The Metis Sandbox is a test environment for your data. It consists of a set of tools with which you can:
simulate ingesting and running the Metis workflow on your data,
see what your records would look like on the actual Europeana.eu portal,
get insight into the quality of your records.
...
2 Where to find the sandbox
The Sandbox can be accessed through https://metis-sandbox.europeana.eu/ .
...
3 How to prepare your dataset
It is advised to make sure that your dataset can be used with the Metis Sandbox. Some things to keep in mind:
A dataset for the Metis Sandbox can currently not exceed 1,000 records. If your dataset is larger than that then you will see a warning message indicating that only the first 1,000 records will be processed.
A dataset should contain one record at minimum. If your dataset is empty, you will see an error message.
The records in your dataset need to meet the requirements of the Europeana Data Model (EDM) external. More information on the EDM can be found on https://pro.europeana.eu/page/edm-documentation . If records are found that do not conform to this schema, you will see error messages. Note that you may choose to provide an XSLT file with the dataset, which the Metis Sandbox will use to try to transform your records into the correct format before validating them against the EDM external specifications.
A dataset must be either uploaded as a zip file or it can be sent via HTTP (i.e. zip file download) or OAI protocols.
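Before uploading, it can help to check the size of a zip file against the limits listed above. The following is a minimal sketch, not part of the Metis Sandbox itself. It assumes that every record is stored as a separate `.xml` file inside the zip, and it ignores the `__MACOSX/` helper entries that macOS tends to add. The function names `count_records` and `describe_size` are hypothetical.

```python
import io
import zipfile

MAX_RECORDS = 1000  # record limit stated in this guide


def count_records(source) -> int:
    """Count the .xml entries in a zip file (path or file-like object).

    Assumption: one record per XML file. Entries under __MACOSX/ are
    helper files added by macOS and are not counted.
    """
    with zipfile.ZipFile(source) as archive:
        return sum(
            1
            for name in archive.namelist()
            if name.lower().endswith(".xml")
            and not name.startswith("__MACOSX/")
        )


def describe_size(record_count: int) -> str:
    """Describe which message to expect for a given number of records."""
    if record_count == 0:
        return "error: the dataset is empty"
    if record_count > MAX_RECORDS:
        return f"warning: only {MAX_RECORDS} records will be processed"
    return "ok"


if __name__ == "__main__":
    # Build a small zip in memory to demonstrate the check.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("record_1.xml", "<record/>")
        archive.writestr("record_2.xml", "<record/>")
    buffer.seek(0)
    print(describe_size(count_records(buffer)))
```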
Note |
---|
Scheduled and unscheduled clean up. Additionally, during system maintenance or the release of a new Metis Sandbox version, data may be removed at any time. Where possible, these events will be announced beforehand. Datasets that are deleted from the Metis Sandbox will need to be uploaded again if you wish to access the tests and reports. |
...
4 General interface elements
There are several different methods to interact with the sandbox. Below is a list of the general interface elements and their uses.
4.1 Page header elements
The page header contains two navigational elements, both of which are visible at all times.
...
The ‘hamburger icon’ (the three horizontal lines) on the left opens the side panel with external links and the theme selection.
Furthermore, clicking on the Europeana logo brings you back to the welcome screen at any time.
...
4.2 Buttons
Buttons are used to upload a dataset.
...
...
4.3 Input fields
Input fields are the white boxes where information can or must be entered by you. The description with the input field states what information should or can be entered. Input fields that are required to have a value can be recognised by an asterisk*.
...
Invalid or missing entries will result in an error message being displayed below or next to the input field.
...
4.4 Page Indicators
Page indicators are shown at the top of the page. They behave as tab headers: clicking on an orb will navigate to the corresponding page. The number of page indicators can vary depending on your use of the Sandbox.
The active page is shaded orange with a yellow border (in the image below, the “Upload Dataset” is the active page). This is also indicated by the page title to the left of the orbs.
...
There are page indicators for the following pages (going clockwise from the orange item):
Upload Dataset
Track Dataset
Problem Patterns – dataset overview
Problem Patterns – record report
Tier calculations – record report
In addition, a page indicator can display a page’s state. In the example below the page indicator for the Record Report shows that:
Data has loaded
The values in the id fields (supplied by you and always available) reflect the data being displayed, i.e. the form is “clean”
...
A cog indicates that the page is busy. An example of this is when a new dataset is still being processed.
...
Note that the page indicator (orb) for Problem Patterns can appear twice: once for viewing problem patterns of a dataset and once for viewing problem patterns and individual records.
...
4.5 Links
Links are used to navigate between pages or to open popups in the Sandbox. There are different types of links used in the Sandbox interface.
For example:
The “track a new dataset” link takes you to the corresponding page.
...
The view detail links open up a popup that displays the details of an error.
...
Links with a warning sign open up a pop up with more information.
...
Links with a light bulb take you to the page with more information.
...
Underlined links switch between the views of the tiers.
...
External links have an icon.
...
Some links, when hovered, show a small “copy” button which if clicked will copy the link (the URL) to your clipboard:
...
Links can be greyed when required information is missing. The image below shows that the Track and Issues links are greyed out because there is no information in the input field left of the links.
...
...
4.6 Drop-down menus
Drop-down menus allow you to make a selection from a list of predetermined values.
...
...
5 The Basics: arriving at the Metis Sandbox
5.1 The Welcome screen
The default view, the screen you land on when navigating to the tool, is the Welcome screen.
...
You can click ‘GET STARTED’ to navigate to the Home screen (see section below).
The page indicators are already active, so you may for instance use the upload icon (the left-most one in the example above) to take a shortcut and navigate directly to the dataset upload form.
...
5.2 The page header and side panel
These two page elements are present and functional at any time, in any Metis Sandbox page you may find yourself in.
Most useful is the ‘hamburger icon’:
...
This icon opens the side panel. This panel contains three external links and a theme related option:
...
The available links are:
A link to training material, that can be used to try out some of the Metis Sandbox functionality in a more controlled setting.
A link to the feedback page, that also contains a helpdesk functionality. You can register a bug here, ask for support or suggest a new feature/improvement.
You are strongly encouraged to use the feedback page in case you find a bug, if you need support or if you come up with an idea for a new feature or the improvement of an existing one. The Europeana Foundation is committed to keep improving the Metis Sandbox for its users.
A link to the User Guide (which is the document you’re currently reading).
Additionally, you will find an option to switch (toggle) between the two available themes.
...
5.3 The Home screen
This screen allows you to start accessing the Metis Sandbox functionality. Here you can track an existing dataset, request information about a record within that dataset or create a new dataset. It looks like this:
...
A. Page Indicator: indicates that "Dataset Processing" is the current step. Once other steps become available then clicking this will return you to this step.
B. Dataset Id Input: used to enter the id of a previously uploaded dataset.
C. Record Id Input: used to enter the id of a record within the specified dataset. It enables when a dataset id is entered.
D. Create New Dataset Link: enables and navigates to the “Upload a new Dataset” functionality (see below).
E. Track link. This link enables when a dataset id is entered and, when clicked, takes you to the “Dataset Processing” functionality (see below) for the dataset with this dataset id.
F. Issues (Overview) link. This link enables when a dataset id is entered and, when clicked, takes you to the “Problem Patterns” functionality (see below) for the dataset with this dataset id.
G. Issues (Record) link. This link enables when a record id is entered and, when clicked, takes you to the “Problem Patterns” functionality (see below) for the record with this record id.
H. Tier Report link. This link enables when a record id is entered and, when clicked, takes you to the “Record Report” functionality (see below) for the record with this record id.
When you type a dataset ID or a record ID, a green link will appear in the input field. If you click it, you will be taken to the dataset or record preview as it would look on Europeana.
...
...
6 Upload a new dataset
To create a new dataset click on the “create a new dataset” link at the bottom of the home screen (D in the image above). This will take you to the “Upload Dataset” form.
6.1 The Upload Form
The “Upload Dataset” view looks like this:
...
A. Step Indicator: clicking this will take you to the “Dataset Processing” step.
B. The dataset name input field. A dataset name is valid if it contains only letters, digits and the underscore character (‘_’).
C. The dataset country drop-down.
D. The dataset language drop-down.
E. The harvest protocol radio button set.
F. The zip file input. This appears because “file upload” is the selected protocol. If the selected protocol is changed to “OAI-PMH upload” or “HTTP upload” then an alternative field (or set of fields) will appear here.
G. Step size field.
H. An (optional) checkbox to specify that you want the Metis Sandbox Server to transform your dataset using XSLT. If selected then a file input will appear below it allowing you to upload an XSL file.
I. The “Submit” button: enables when all the (obligatory) fields have been completed.
J. Step Indicator (inactive): indicates that "Upload Dataset" is the current step. If you switch to another step then clicking this will return you to this step.
Enter a descriptive name for your dataset in the input field below “Name”. Only letters, digits and the underscore character (‘_’) are supported. You can select the country and language of the dataset with the dropdown menus.
The next step is to determine the “Harvest protocol”: how you will upload your dataset. This is described in detail below. The “Submit” button at the bottom left will be enabled when all information is filled in and valid.
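The naming rule above (only letters, digits and the underscore character) can be checked before you fill in the form. This is a minimal sketch of such a check. The function name is hypothetical, and restricting letters to the ASCII range is an assumption: this guide does not say whether accented letters are accepted.

```python
import re

# Letters, digits and underscores only, as stated in this guide.
# Assumption: "letters" means ASCII letters (A-Z, a-z).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if the whole name consists of allowed characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("museum_photos_2024", "museum photos", "museum-photos"):
        print(candidate, is_valid_dataset_name(candidate))
```

Names with spaces or hyphens fail this check, so replacing those characters with underscores is usually the simplest fix.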
...
6.2 The Harvest Protocol
There are three ways to upload your datasets to the sandbox:
File upload: upload an archive (e.g. a zip file)
OAI-PMH upload: Ingestion with OAI-PMH
HTTP upload: ingestion via a hosted archive (e.g. a zip file) on a server through HTTP or HTTPS
6.2.1 Zip File
The “File upload” protocol is selected by default. This option allows you to upload an archive file with a dataset that is stored locally. The supported archive types are .zip, .tar and .tar.gz archives.
...
Note that, even though it is not currently possible to upload multiple archive files, you can still achieve the same result by wrapping all your archives in one new zip file. The application fully supports nested archives (i.e. zip files of zip files).
6.2.2 OAI-PMH
To set the harvest protocol to OAI-PMH, you should enter values for the harvest URL, the metadata format, and optionally a setSpec value. For more details on these, please see the OAI-PMH specification.
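To see how these three values relate to each other, it may help to look at the request that an OAI-PMH harvester typically sends. The sketch below builds a `ListRecords` request URL using the standard OAI-PMH arguments `verb`, `metadataPrefix` and `set`. How the Metis Sandbox issues its requests internally is not described in this guide, so treat this only as a way to test your own endpoint in a browser. The function name and the example endpoint are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def build_list_records_url(
    harvest_url: str,
    metadata_format: str,
    set_spec: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL from the form values."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_format}
    if set_spec:
        # The setSpec is optional; leave it out to harvest all records.
        params["set"] = set_spec
    return f"{harvest_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and set name.
    print(build_list_records_url("https://example.org/oai", "edm", "photos"))
```

If opening the resulting URL in a browser returns an XML response with records, the same values should be suitable for the upload form.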
...
6.2.3 HTTP(S) upload
You can also specify an archive that is accessible with a URL. Set the harvest protocol to “HTTP upload” to be able to enter a value for the URL. The URL should be the (HTTP or HTTPS) download location of an archive (.zip, .tar or .tar.gz file) that contains the dataset records.
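A quick sanity check of the URL can catch typing mistakes before you submit the form. The sketch below is a heuristic only: it assumes that the path of the download URL ends in one of the supported extensions, which need not be true for every valid download location. The function name is hypothetical.

```python
from urllib.parse import urlparse

SUPPORTED_SUFFIXES = (".zip", ".tar", ".tar.gz")


def looks_like_archive_url(url: str) -> bool:
    """Heuristic check: HTTP(S) scheme, a host, and a supported extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    return parsed.path.lower().endswith(SUPPORTED_SUFFIXES)


if __name__ == "__main__":
    # Hypothetical URLs for illustration.
    for url in (
        "https://example.org/data/records.tar.gz",
        "ftp://example.org/data/records.zip",
    ):
        print(url, looks_like_archive_url(url))
```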
...
6.3 XSL Transformation to EDM (Optional)
It is possible to transform the records in the dataset to the EDM format, using XSLT before any further processing. Check the option “Records are not provided in the EDM (external) format”. An additional file input will appear for an XSL file to be specified.
...
6.4 The step size
This field allows you to influence the sampling behaviour.
...
A step size of n tells the Metis Sandbox to select every nth record for processing. This value must be a strictly positive whole number (i.e. 1 or larger). The default value is 1.
Info |
---|
If your dataset contains more than 1,000 records, the Metis Sandbox does not process them all. Instead it takes a sample of 1,000 records. By default (with a step size of 1), the first 1,000 records that are encountered in the dataset are selected for processing. But if your dataset is larger and made up of several batches of slightly different records, this may not yield a representative record sample of the dataset. The step size field may be used to achieve a more representative sample. |
For instance, with a step size of 3, the records in position 3, 6, 9, 12, …, 3000 will be selected (or fewer, if the dataset is smaller than 3,000 records).
A good rule of thumb for choosing the step size is the following:
If your dataset contains at most 1,000 records, leave the default value of 1. All records will be processed.
If your dataset contains more than 1,000 records, but they are quite homogeneous, leave the default value of 1. The first 1,000 records will be processed.
If your dataset contains more than 1,000 records and they vary in composition and/or structure, select a step size as follows. Take the dataset size, divide by 1,000 and round down. For instance: if your dataset has 123,456 records, a good value for the step size would be 123. This way you ensure that 1,000 records are selected with maximum spreading.
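The rule of thumb and the sampling behaviour described above can be written down in a few lines. This is a minimal sketch based only on the description in this section, not the actual Metis Sandbox implementation. Positions are counted from 1, matching the example with a step size of 3, and both function names are hypothetical.

```python
MAX_SAMPLE = 1000  # maximum number of records processed, as stated above


def suggested_step_size(dataset_size: int) -> int:
    """Rule of thumb: dataset size divided by 1,000, rounded down (at least 1)."""
    return max(1, dataset_size // MAX_SAMPLE)


def sampled_positions(dataset_size: int, step_size: int) -> list[int]:
    """Return the 1-based positions of the records that would be selected."""
    if step_size < 1:
        raise ValueError("step size must be a whole number of 1 or larger")
    # Every nth position, capped at MAX_SAMPLE records.
    positions = range(step_size, dataset_size + 1, step_size)
    return list(positions[:MAX_SAMPLE])


if __name__ == "__main__":
    size = 123_456
    step = suggested_step_size(size)
    selected = sampled_positions(size, step)
    print(step, len(selected), selected[0], selected[-1])
```

For homogeneous datasets the default step size of 1 remains the simpler choice, because the first 1,000 records are then already representative.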
...
6.5 The Generated Dataset ID
The “Submit” button will become enabled once you have filled all fields. Click the “Submit” button to upload your dataset. You will be redirected to the “Dataset Processing” page, where you can see the data being processed in real-time.
A unique dataset id is generated for your upload and displayed at the top-right of the “Dataset Processing” page. Remember or save this ID to be able to get back to the dataset in the future (i.e. from the home screen, see above).
...
7 Dataset processing
Enter a dataset ID in the home screen (the “Dataset Processing” page) and click the ‘Track’ link to track (monitor) the processing of an uploaded dataset, or to see the results after it finishes processing. The “Track” button for the dataset id field is disabled when the field value is empty. This button will be enabled when you type in a valid dataset id.
...
Invalid ids will show a warning, and the submit buttons will be disabled again.
...
A record id can only be entered when a valid dataset id has been entered. The links next to the record field are greyed out when the field is empty or when an invalid value has been entered. The links will be enabled once you enter a valid record id.
...
See “record provider IDs and Europeana IDs” (below) for more information about record ids and record provider ids.
7.1 The Data Processing View
A submitted dataset id will bring up the dataset processing view. It will also change the page’s url to reflect the id of the dataset processing being displayed. The dataset processing view looks like the picture below.
...
A. The dataset name. The tick after the dataset name indicates that processing is complete
B. An (optional) flag indicating whether the dataset was xsl-transformed.
C. The processing date, preceded by an (optional) flag indicating that not all records in the dataset were processed.
D. The country and language of the dataset selected when the dataset was uploaded.
E. The processing steps performed on the dataset (they correspond to the list of items just below, element F).
F. The details of the processing steps performed on the dataset.
G. The (optional) warning indicating that not all records in the dataset were processed. See “step size” above for more information.
H. The (not enabled) record id field.
I. The dataset ID of the current dataset.
J. A link to the dataset preview as it would look on Europeana.
K. The tier statistics tab opener.
L. The tier-zero indicator.
The tick after the dataset name indicates that processing is complete, and the generated dataset id is shown at the top-right.
The main (white) panel shows a list of processing steps, detailing how many records were processed during each, and an (optional) warning indicating that not all records in the dataset were processed. Clicking this warning, if present, will show additional information about the import.
The dataset id will also be filled in at the bottom of the screen, enabling the “record id” field.
To track the data processing of a different dataset just replace the value in the dataset id field with another id and click the “track” button.
...
7.2 The Metis workflow
The data goes through nine steps as part of the processing workflow. These steps are:
Harvest (H): how many dataset records have been successfully imported
Transformation to EDM (Te): How many records have been transformed to the external EDM format (optional step)
Validation External (Ve): how many records passed EDM validation
Transformation (T): how many records have been transformed from the external EDM format to the internal EDM format
Validation Internal (Vi): how many records have passed internal validation
Normalisation (N): how many records have been normalised. Normalisation acts on individual values in the data and could include the deletion of redundant whitespace or of duplicate values
Enrichment (E): how many records have been successfully enriched
Media Processing (M): how many records have had their associated media processed
Publish (Pu): how many records have been published, i.e. uploaded to the Sandbox preview environment (which is a copy of the ‘real’ Europeana website, but does not share the same data) (see chapter 7)
The colours of each step indicate how successful this step was:
Green: (success) - the step completed without errors, and all records are considered suitable for ingestion
Yellow: (non-critical warning) - problems with the records have been detected, but the records could still be processed.
Red: (critical warning) - more serious problems with the records have been detected, and (some of) these records could not continue their path through the pipeline. These should no longer be considered for ingestion (in their current form).
...
7.3 The Data Processing Errors Window
Shown below is an example of a dataset that processed with many errors:
...
A. A link to the errors window
B. The bold font of the number indicates that this is another link to the errors window
C. No report is available for this error, so the number does not have a bold font and there is no link to the errors window
Errors are flagged by red numbers in the panel, and if an error report is available, by the “view detail” links in the right-hand column. The red number indicates the number of records affected (one in this case) and this number is repeated (parenthesised) in the “view detail” link. The red number also serves as a link to the error report, if available. In the screenshot above an error report is available for all processing steps apart from the last.
Clicking a link to the errors report will open a pop-up window, allowing you to see the error detail.
...
...
7.4 View the published records
Click on “view published records” (item J in the image in 7.1) to view your final data in a copy of the Europeana website. This link is shown in the top-right of the submitted “Dataset Processing” page UI, underneath the generated dataset id. This will show the dataset records as published on the Sandbox Preview environment.
Note |
---|
15 minute delay for data publication Please note that it can take up to 15 minutes after the publish step finishes for the data to become available on the website. Please wait if your data is not showing yet. |
It may, for example, appear like the image below.
...
Note |
---|
Tier 0 records hidden by default |
...
...
7.5 Tier Statistics
Once a dataset has been processed it’s possible to view its tier statistics to help assess the dataset’s quality. The dataset processing tab will look something like this once a dataset has been processed:
...
A. The tier statistics tab opener
When you click the tier statistics tab opener, you will see a tab that looks like this:
...
A. The pie chart gives an overview of the statistics - shown by the content tier dimension (by default).
B. If you click the column headers, you toggle the column sort order and change the data dimension of the pie chart to that header’s default.
C. The second row of clickable column headers allow specific data dimensions to be set and sorted on.
D. The search input allows you to filter the record data by (part of the) record id.
E. The data grid shows the record data in a panel that you can scroll through. The fields are record id, content tier, content tier license, metadata tier (aggregate value), metadata tier (language dimension), metadata tier (enabling elements dimension) and metadata tier (contextual classes dimension). If you click on a record id, you will be taken to the tier calculation report for that record (see below).
F. Page navigation is enabled where necessary.
G. Here you can select the number of rows shown at a time in the table.
H. Here you can jump to a specified page by entering a (valid) page number.
I. The dataset floor row gives the lowest tier value present in the dataset (and the value you probably wish to look at to improve the quality of your data).
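The idea behind the dataset floor row can be illustrated with a short sketch that takes the lowest value per tier column. The record structure, the field names and the function name are hypothetical. The tier scales used here (content tiers as numbers, metadata tiers ordered 0, A, B, C) are an assumption and are not defined in this guide.

```python
# Assumed ordering of metadata tier values, from lowest to highest.
METADATA_TIER_ORDER = "0ABC"


def dataset_floor(records: list[dict]) -> tuple[int, str]:
    """Return the lowest content tier and the lowest metadata tier."""
    content_floor = min(record["content_tier"] for record in records)
    metadata_floor = min(
        (record["metadata_tier"] for record in records),
        key=METADATA_TIER_ORDER.index,
    )
    return content_floor, metadata_floor


if __name__ == "__main__":
    # Hypothetical tier values for three records.
    sample = [
        {"record_id": "/35/a", "content_tier": 3, "metadata_tier": "B"},
        {"record_id": "/35/b", "content_tier": 1, "metadata_tier": "C"},
        {"record_id": "/35/c", "content_tier": 4, "metadata_tier": "A"},
    ]
    print(dataset_floor(sample))
```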
7.6 Filtering Tier Statistics
Clicking a pie-slice (or its corresponding legend item) will filter the data down to that value. A click on the value "3" in the pie, for example, will restrict the grid to showing only records that have a content tier value of "3".
...
A. The active filter. Clicking the active pie-slice will remove the applied filter.
B. The active filter's legend item. Clicks on legend items are equivalent to clicks on pie-slices.
C. Orange column headers indicate the active filter.
D. A new summary row appears below the data grid indicating aggregate values for the filtered data.
E. The pagination updates to reflect the filtered data.
F. Only records with a content-tier value of "3" are visible in the grid.
7.7 Sorting Filtered Tier Statistics
When dataset tier statistic data is filtered by content tier you can sort it by one of the other dimensions by clicking its column header. Usually clicking a column header changes the pie chart dimension and sorts on that column, but when a filter is active the sort will be applied within the data dimension that has been filtered on.
Here we see data that was filtered by content tier (value 3) and sorted by metadata tier (aggregate value).
...
A. Clicking this column-header will not change the dimension (it will remain “content tier”), but it will apply the sort (by metadata tier) within that dimension.
B. As before, the specific type of metadata tier sort (aggregate value) is clarified with an arrow-head indicator in the second sub-header row.
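The filter-then-sort behaviour described in this section can be pictured with a short sketch. The `TierRow` class, its field names and the sample values are hypothetical and chosen only for illustration; this is not the Sandbox's own implementation.

```python
from dataclasses import dataclass


@dataclass
class TierRow:
    # Hypothetical row model: one entry per record in the grid.
    record_id: str
    content_tier: int
    metadata_tier: str


def filter_then_sort(rows, content_tier, descending=False):
    """Keep rows with the given content tier, then sort them by metadata tier."""
    filtered = [row for row in rows if row.content_tier == content_tier]
    # The sort only reorders the filtered rows; the filter itself is unchanged.
    return sorted(filtered, key=lambda row: row.metadata_tier, reverse=descending)


sample = [
    TierRow("rec_1", 3, "C"),
    TierRow("rec_2", 1, "A"),
    TierRow("rec_3", 3, "A"),
    TierRow("rec_4", 3, "B"),
]

visible = filter_then_sort(sample, content_tier=3)
```

Here `visible` contains only the three records with content tier 3, ordered by their metadata tier, which mirrors a grid filtered on content tier and sorted on the metadata tier column.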
...
8 The tier calculation report
You can view a tier calculation report by clicking on a record id in the tier statistics grid (see above). Alternatively, you can view the report by entering both the id of a dataset and the id of a record within this dataset (see below).
8.1 Record Provider Ids and Europeana Ids
Every processed record has both a Provider id and a Europeana id.
A Europeana id begins with a forward slash followed by the record’s dataset id, another forward slash and then a further sequence of (non-whitespace) characters. You can find the Europeana ID of a specific record by clicking the dataset preview link and finding and inspecting the records there.
A record’s Provider id, on the other hand, can be any sequence of (non-whitespace) characters, and is the value that can be found in the ‘rdf:about’ attribute of the ‘providedCHO’ section of your record.
You can search for a record using either of these record ids, so the “Report” button will enable itself when any sequence of non-whitespace characters has been entered into the record id field. If, however, the UI detects that you’ve entered an id that matches the format of a valid Europeana record id, then it will show a line connecting the record id with the dataset id, as shown here:
...
A. The record id begins with a slash followed by the dataset id, so the id fields are shown as connected.
B. You can now open the record report by clicking the button labelled “Tier Report”.
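The two id rules described in this section can be expressed as a pair of checks. This is a minimal sketch based only on the format described above; the function names are hypothetical, and the Sandbox's actual validation may differ.

```python
import re


def can_request_report(record_id: str) -> bool:
    """True when the field holds one or more non-whitespace characters."""
    return re.fullmatch(r"\S+", record_id) is not None


def looks_like_europeana_id(record_id: str, dataset_id: str) -> bool:
    """True when the id has the shape /<dataset id>/<non-whitespace characters>."""
    pattern = "/" + re.escape(dataset_id) + r"/\S+"
    return re.fullmatch(pattern, record_id) is not None
```

For example, with dataset id `35`, the value `/35/abc_123` would match the Europeana id format, while `abc_123` would still be accepted as a Provider id.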
...
8.2 The Record Report
The record report - or Tier Report - is divided into two main sections:
the content tier section
the metadata tier section
You can navigate between these sections by clicking the corresponding navigation orbs. The computed value of each tier is shown within its navigation orb at the bottom. Each computed value is a single character; in the case of the content tier it is numeric.
In the illustration below the computed values are “3” (for the content tier) and “A” (for the metadata tier).
...
A. Page Indicator: the inactive "Dataset Processing" orb indicates that this page is not active and, if clicked, will bring you to the dataset processing page.
B. The Record Report summary: top-level information about this record as well as record download and viewing links.
C. Tier Navigation Orbs: you can toggle between the content tier and the metadata tier report from here.
D. Content Tier Information: data about the record's content tier.
E. Media Navigation Orbs: you can navigate multiple media items from here.
F. Processing Errors: record processing error information appears here.
G. Page Indicator: indicates that "Record Report" is the current page (via its orange colour) and that the form below is “clean” (via its tick icon).
...
8.3 Content Tier Media Information
The media information appears under the content tier breakdown section. If there are 5 or fewer items, then a navigation orb corresponding to each item will appear. The icon of each navigation orb illustrates the type of media item, as shown below.
...
If there are more than 5 media items available in the record report then the navigation orbs will be replaced with navigation arrows, an editable field and a spinner allowing you to browse the items or jump directly to a specific one, as shown below.
...
...
8.4 Metadata Tier Information
You can see the record report’s metadata tier information by clicking on the metadata tier navigation orb. Metadata tier information is split into three sub-sections:
Language dimension
Enabling Elements Dimension
Contextual Classes Dimension
These, like the main sections of the report, are navigable by clicking on the corresponding navigation orb.
Active language dimension
...
Active enabling elements dimension
...
Active contextual classes dimension
...
...
9 Problem patterns
You can view problem patterns for both a dataset and for a record. The dataset id and record id fields each have a (secondary) link labelled “Issues”.
...
Clicking “Issues (Overview)” (A), next to the dataset id input field, will open a problem viewer page for the whole dataset. Clicking “Issues (Record)” (B) will open a problem viewer page for an individual record.
9.1 Dataset / Overview
The problem pattern viewer for datasets shows all the problem types that occur within a given dataset.
...
A key is shown (P1, P2, P3 etc.) together with a list of records in which that problem pattern was found. The little arrows at the top-right corner may be used to navigate between the different problem patterns.
The record-references behave as (internal) links to the separate instance of the problem pattern viewer used for records (with the exception of the references for P1, as they are not displayable for individual records).
The problem pattern report can be downloaded using the “export as pdf” link.
The 8 problem patterns that are in use now are:
Key | Title | Description |
---|---|---|
P1 | Systematic use of the same title. | Check across all records if there are any duplicate titles, ignoring letter (upper or lower) case. |
P2 | Equal title and description fields. | Check whether there is a title - description pair for which the values are equal, ignoring letter (upper or lower) case. |
P3 | Near-Identical title and description fields. | Determine whether there is a title - description pair for which the values are too similar (or if one contains the other). We do this ignoring the letter case. |
P5 | Unrecognisable title. | Apply heuristics to determine whether a title is not human-readable. We check whether there are at most 5 characters that are not either alphanumeric or simple spaces. We also check whether the value fully contains a dc:identifier value. |
P6 | Non-meaningful title. | Check whether the record has a title of 2 characters or less as a rough heuristic of whether a title is meaningful. |
P7 | Missing description fields. | Check whether the record is lacking a description (or only has empty descriptions). |
P9 | Very short description. | Check whether the record has a description of 50 characters or less. |
P12 | Extremely long titles. | Check whether the record has a title of more than 70 characters. |
Info |
---|
You can see that the numbering of the problem patterns is not consecutive. There are more problem patterns identified than in use at the moment. The number of problem patterns might change in the future. |
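To make the thresholds in the table above concrete, here is a minimal sketch of a few of the checks (P1, P2, P6, P7, P9 and P12). It assumes a record with a single title and a list of descriptions, counts length in Python characters, and ignores empty descriptions for P9; these are assumptions for illustration, not the Sandbox's implementation. P3 and P5 are left out because their similarity and readability heuristics are not specified in enough detail here.

```python
from collections import Counter


def find_record_patterns(title, descriptions):
    """Return the keys of the (sketched) problem patterns found in one record."""
    found = set()
    non_empty = [d for d in descriptions if d.strip()]
    # P2: a description equal to the title, ignoring letter case.
    if any(d.casefold() == title.casefold() for d in non_empty):
        found.add("P2")
    # P6: title of 2 characters or less.
    if len(title) <= 2:
        found.add("P6")
    # P7: no description, or only empty descriptions.
    if not non_empty:
        found.add("P7")
    # P9: a description of 50 characters or less.
    if any(len(d) <= 50 for d in non_empty):
        found.add("P9")
    # P12: title of more than 70 characters.
    if len(title) > 70:
        found.add("P12")
    return found


def find_duplicate_titles(titles):
    """P1: titles used by more than one record, ignoring letter case."""
    counts = Counter(t.casefold() for t in titles)
    return {t for t, n in counts.items() if n > 1}
```

A record titled "Vase" whose only description is "vase" would be flagged for P2 and P9 in this sketch, since the two values are equal apart from case and the description is shorter than 50 characters.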
Click on the title of a specific problem pattern to see a description.
...
9.2 Record
The problem pattern viewer for records shows all the types of problem patterns that occur within a single record.
...
Note that two of the page indicators in the image above show the same icon - one for each instance of the problem pattern viewer.
If you click on the “</>” button to the right of the problem pattern viewer, a panel expands that provides access to download links for the record.
...
...
10 Tier Zero Records
In the track tab of the dataset processing page, you will be warned if your dataset contains any records that have a “tier zero” rating for either the content tier or the metadata tier.
...
One or two indicators will be shown on the right side of the screen whenever a “tier zero” record was detected. The first is for records with content tier zero (the orb with stars), the second for records with metadata tier zero (the orb with a gauge). Only one may appear, or both, as appropriate.
...
Click the warning indicators to see the tier-zero warning panel. This panel will show links to at most 10 sample records that were detected as having content or metadata tier 0.
...
These links open the Record Report (see above) for the clicked record, opening the relevant subsection of the report according to whether the tier zero warning pertained to the content-tier or the metadata-tier. The small yellow triangular warning icons will be visible until the warnings have been reviewed. Only one warning is present in the image above, because the content tier zero records have already been viewed.
...
11 Troubleshooting
Dataset not found
...
The Sandbox is emptied at least once every two weeks. It is very likely that the dataset has been removed as part of this clean-up, in which case it needs to be uploaded again.