Introduction
The International Image Interoperability Framework (IIIF) is a set of standards that were initially designed to offer a standardized and scalable way to describe how digital objects composed of one or more images can be displayed to an end-user. This gives developers and users the flexibility to respectively develop and use different IIIF-compatible viewers according to their needs and preferences. Since its first release, it has grown to cover other types of media such as sound and video, and soon also 3D.
The Europeana offer for IIIF covers the Presentation API (versions 2.1 & 3) and the Content Search API (version 1), and makes use of the IIIF Image API when made available by content providers. Additionally, Europeana’s IIIF offer includes support for full-text (such as transcriptions, translations of transcriptions, captions and subtitles) via the IIIF Fulltext API, using an EDM extension for full-text. These APIs follow the specifications defined by the IIIF consortium.
Before starting to use these APIs, we recommend reading the Registering for an API key page and the Terms of Use. If you want to get started with these APIs, go directly to the Getting Started section or try them out on the Console. If you want to get regular updates about the Europeana API, provide feedback and discuss it with other developers, we suggest joining the Europeana API discussion group at Google Groups.
IIIF datasets in Europeana: A Scholar’s delight
The recently launched Europeana Media player brings Europeana into a new era of International Image Interoperability Framework (IIIF) compatible, interoperable and unified playout of audiovisual heritage material online.
IIIF & Europeana Working Group
This dedicated IIIF & Europeana Working Group follows the first of the proposals from the Task Force 'Preparing Europeana for IIIF involvement.'
Impact Assessment report: EuropeanaTech and IIIF
This assessment looked at the impact of EuropeanaTech’s members, steering group and the Europeana Initiative’s work on IIIF - read a summary of the research and download the full report.
Getting Started
Retrieving a manifest
A manifest describes the information needed for a viewer to display a digital object to the end-user, such as basic metadata (a title and description) and the sequence of content that makes up the digital object. The manifest is not meant to present all the descriptive metadata associated with a given digital object, but just the bare minimum for a user to grasp what it is about. If you wish to access the full metadata for an item, see the Record API. The manifest also offers links to the Record API using the “seeAlso” field.
Presently, both versions 2.1 and 3 of the IIIF specification are supported. The novelty of version 3 is that it also covers audio and video besides images.
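As a rough illustration of the "bare minimum" a manifest carries, the sketch below pulls the label, the number of views and the “seeAlso” links out of a manifest shaped like IIIF Presentation API 2.1 (`label`, `sequences`, `canvases`, `seeAlso`). The sample dictionary is invented for illustration and is not a real Europeana response; the function names are hypothetical.

```python
from typing import Any


def as_list(value: Any) -> list:
    """Normalise a JSON-LD value that may be missing, single or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarise_manifest(manifest: dict) -> dict:
    """Collect the label, the number of canvases and the seeAlso links."""
    canvases = [
        canvas
        for sequence in as_list(manifest.get("sequences"))
        for canvas in as_list(sequence.get("canvases"))
    ]
    see_also = [
        # seeAlso entries may be plain URL strings or objects with an "@id"
        entry if isinstance(entry, str) else entry.get("@id")
        for entry in as_list(manifest.get("seeAlso"))
    ]
    return {
        "label": manifest.get("label"),
        "canvas_count": len(canvases),
        "see_also": see_also,
    }


# Hypothetical, heavily trimmed manifest used only to exercise the function.
SAMPLE_MANIFEST = {
    "@type": "sc:Manifest",
    "label": "Example newspaper issue",
    "seeAlso": [{"@id": "https://example.org/record/1.json"}],
    "sequences": [
        {
            "canvases": [
                {"@id": "https://example.org/canvas/p1"},
                {"@id": "https://example.org/canvas/p2"},
            ]
        }
    ],
}

if __name__ == "__main__":
    print(summarise_manifest(SAMPLE_MANIFEST))
```

A version 3 manifest is laid out differently (for example, it has no `sequences`), so this sketch only applies to the 2.1 shape.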
Request
Code Block |
---|
https://iiif.europeana.eu/presentation/[RECORD_ID]/manifest |
Parameter | Description |
---|---|
RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
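The request pattern above can be sketched as a small helper that turns a record identifier into a manifest URL. The function name is hypothetical; the `wskey` query parameter is taken from the example request further down this page, and whether a key is required for every call is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

MANIFEST_BASE = "https://iiif.europeana.eu/presentation"


def manifest_url(record_id: str, api_key: Optional[str] = None) -> str:
    """Build the manifest URL for a record ID of the form /DATASET_ID/LOCAL_ID."""
    parts = record_id.strip("/").split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError("record_id must look like /DATASET_ID/LOCAL_ID")
    dataset_id, local_id = parts
    url = f"{MANIFEST_BASE}/{dataset_id}/{local_id}/manifest"
    if api_key:
        # Append the API key as the wskey query parameter
        url += "?" + urlencode({"wskey": api_key})
    return url
```

Calling `manifest_url("/9200356/BibliographicResource_3000118390149", "YOUR_KEY")` reproduces the URL used in the v2.1 example request.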
Response
Example: Requesting a manifest in v2.1
Request:
Code Block |
---|
https://iiif.europeana.eu/presentation/9200356/BibliographicResource_3000118390149/manifest?wskey=YOUR_KEY |
Retrieving full-text
The term full-text refers to the combination of the textual representation of the digital object and the positions where parts of that text appear in the original image (represented as annotations). In the EDM profile, the textual representation of the digital object is referred to as the Full-text Resource, while the relations between the segments of the text and the coordinates in the image are referred to as Annotations.
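In IIIF, the link between a text segment and a region of the image is commonly written as an `#xywh=x,y,w,h` media fragment on the annotation target. Assuming Europeana annotations follow that convention, the sketch below extracts the pixel region from such a target; the function name and the sample URL in the usage note are made up for illustration.

```python
from typing import NamedTuple, Optional
from urllib.parse import urldefrag


class Region(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def parse_xywh(target: str) -> Optional[Region]:
    """Return the region encoded in a '#xywh=x,y,w,h' fragment, or None."""
    _, fragment = urldefrag(target)
    if not fragment.startswith("xywh="):
        return None
    values = fragment[len("xywh="):].split(",")
    if len(values) != 4:
        return None
    try:
        x, y, width, height = (int(value) for value in values)
    except ValueError:
        # Non-integer coordinates are not handled by this sketch
        return None
    return Region(x, y, width, height)
```

For example, `parse_xywh("https://example.org/canvas/p1#xywh=10,20,300,40")` yields a region starting at (10, 20) that is 300 pixels wide and 40 pixels high.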
Annotation Pages
An Annotation Page contains all the annotations that make up the full-text of a Page (i.e. an image). It is referenced by the Manifest and can be accessed via the following request.
Request
Code Block |
---|
https://iiif.europeana.eu/presentation/[RECORD_ID]/annopage/[PAGE_ID] |
Parameter | Description |
---|---|
RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
PAGE_ID | The number of the page within the record (for instance, 1 in the example request below). |
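Judging by the example request below (which ends in `annopage/1`), an Annotation Page appears to be addressed by appending `annopage/` and a page number to the record path. The helper below is a minimal sketch under that assumption; the function name is hypothetical, and treating page numbers as integers starting at 1 is a guess based on that single example.

```python
ANNOPAGE_BASE = "https://iiif.europeana.eu/presentation"


def annopage_url(record_id: str, page: int) -> str:
    """Build an Annotation Page URL for /DATASET_ID/LOCAL_ID and a page number."""
    record_path = record_id.strip("/")
    if record_path.count("/") != 1:
        raise ValueError("record_id must look like /DATASET_ID/LOCAL_ID")
    if page < 1:
        # Assumption: pages are numbered from 1, as in the documented example
        raise ValueError("page must be 1 or greater")
    return f"{ANNOPAGE_BASE}/{record_path}/annopage/{page}"
```

With the record from the example, `annopage_url("/9200396/BibliographicResource_3000118436165", 1)` gives the same URL as the example request.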
Response
The response is a JSON-LD structure.
Example: Requesting an Annotation Page in v2.1.
Request:
Code Block |
---|
https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/annopage/1 |
Fulltext Resource
The edm:FullTextResource represents the transcription of a single page of a Newspaper. A full-text resource can be accessed separately from the Annotation List that it is associated with, using the following method.
Request
Code Block |
---|
https://www.europeana.eu/api/fulltext/[RECORD_ID]/[FULLTEXT_ID] |
Parameter | Description |
---|---|
RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
FULLTEXT_ID | The identifier of the full text resource. |
Response
The response is a JSON-LD structure.
Example: Requesting a full-text resource.
Request:
Code Block |
---|
https://www.europeana.eu/api/fulltext/9200396/BibliographicResource_3000118435063/8ebb67ccf9f8a1dcc2ea119c60954111 |
Searching on full-text
The full-text can also be searched using a separate Search API while the Newspapers Thematic Collection is in MVP. It supports the same functionality as the main API but under the following endpoint and with the addition of 2 search fields and 1 profile as described below. For more information on the other methods, see the Search API documentation.
Request
Code Block |
---|
https://newspapers.eanadev.org/api/v2/search.json |
Fields | Datatype | Description |
---|---|---|
fulltext | Text | Allows searching on the transcribed text (i.e. full-text) of the item. |
issued | Date | A true date field reflecting the date of the Newspapers Issue. |
Profile | Description |
---|---|
hits | Displays the mentions in the transcribed text where the search keyword was found. |
Parameter | Datatype | Description |
---|---|---|
hit.fl (optional) | List (String) | A comma- or space-separated list of fields from which hit highlighting should be generated. A wildcard of “*” (asterisk) can be used to match multiple fields, such as “fulltext.*” or even “*” to highlight on all fields where highlighting is possible. If omitted, defaults to “*”. |
hit.selectors (optional) | Number | Specifies the maximum number of highlighted selectors (i.e. snippets in Solr) to generate per result (i.e. record). If omitted, defaults to 1. Any number of selectors from 1 up to this value may be generated, with an upper limit of 10. |
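The parameters above can be combined into a request URL along the following lines. This is a sketch rather than an official client: the function name is hypothetical, and the `query`, `profile` and `wskey` parameters are taken from the example request on this page. The check on `hit.selectors` mirrors the limit of 10 stated in the table.

```python
from typing import Optional
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://newspapers.eanadev.org/api/v2/search.json"
MAX_SELECTORS = 10  # upper limit stated in the parameter table


def fulltext_search_url(
    query: str,
    api_key: str,
    hit_fields: Optional[str] = None,
    hit_selectors: Optional[int] = None,
) -> str:
    """Build a full-text search URL that requests the 'hits' profile."""
    params = {"query": query, "profile": "hits", "wskey": api_key}
    if hit_fields is not None:
        params["hit.fl"] = hit_fields
    if hit_selectors is not None:
        if not 1 <= hit_selectors <= MAX_SELECTORS:
            raise ValueError(f"hit_selectors must be between 1 and {MAX_SELECTORS}")
        params["hit.selectors"] = str(hit_selectors)
    # urlencode percent-encodes reserved characters such as "*"
    return SEARCH_ENDPOINT + "?" + urlencode(params)
```

Calling `fulltext_search_url("paris", "APIKEY")` reproduces the example request shown below the table.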
Example: Searching on full-text and showing hit highlighting.
Request:
Code Block |
---|
https://newspapers.eanadev.org/api/v2/search.json?query=paris&profile=hits&wskey=APIKEY |
Accessing images in high resolution: downloading data
To foster the reuse of the data that is published in Europeana as part of the Newspapers Thematic Collections, we make both the metadata and the full-text available for bulk download as compressed zip files. The metadata is available under CC0, in the same way as all the metadata exposed via the API (see Terms of Use), while the full-text is available under the Public Domain Mark.
List of datasets
The table below lists all the datasets that are published and available for download. If you are looking for the complete text of a Newspaper, we suggest using option (4) rather than option (3), where the transcription is partitioned per page.
Because the files are very big and can take many hours to download, as an alternative to downloading directly via the browser, you can log in to the FTP server at "download.europeana.eu" with username "anonymous". This will allow you to resume if the download gets stuck.
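Resuming an interrupted FTP transfer can be sketched with Python's standard `ftplib`. The host name comes from the paragraph above; the remote path is a placeholder that you would need to look up on the server, since this page does not list file paths, and the function names are hypothetical.

```python
import os
from ftplib import FTP

FTP_HOST = "download.europeana.eu"


def resume_offset(local_path: str) -> int:
    """Number of bytes already on disk, i.e. where the transfer should restart."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def download_with_resume(remote_path: str, local_path: str) -> None:
    """Fetch remote_path over anonymous FTP, appending to a partial local file."""
    offset = resume_offset(local_path)
    with FTP(FTP_HOST) as ftp:
        ftp.login()  # no arguments: logs in as "anonymous"
        with open(local_path, "ab") as out:
            # rest=offset asks the server to skip the bytes we already have
            ftp.retrbinary(f"RETR {remote_path}", out.write, rest=offset or None)
```

Running `download_with_resume` again after a dropped connection continues from the size of the partial file instead of starting over, provided the server supports restarting transfers.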
Dataset number | Metadata (1) | Full-text (ALTO) (2) | Page level full-text (EDM) (3) | Issue level full-text (EDM) (4) |
---|---|---|---|---|
9200300 | (229M) (MD5) | (63G) (MD5) | (116G) (MD5) | (113G) (MD5) |
9200301 | (37M) (MD5) | (13G) (MD5) | (20G) (MD5) | (20G) (MD5) |
9200338 | (213M) (MD5) | (158G) (MD5) | (278G) (MD5) | (277G) (MD5) |
9200339 | (39M) (MD5) | (11G) (MD5) | (21G) (MD5) | (17G) (MD5) |
9200355 | (212M) (MD5) | (97G) (MD5) | (159G) (MD5) | (157G) (MD5) |
9200356 | (137M) (MD5) | (40G) (MD5) | (17G) (MD5) | (17G) (MD5) |
9200357 | (23M) (MD5) | (5G) (MD5) | (9G) (MD5) | (9G) (MD5) |
9200396 | (4M) (MD5) | (849M) (MD5) | (2G) (MD5) | (1G) (MD5) |
Legend:
(1) The original metadata in EDM XML format before being ingested into Europeana. There are slight differences between this data and the published data. For more information see the /wiki/spaces/EF/pages/2385313809.
(2) The full-text encoded using ALTO (Analyzed Layout and Text Object) as it was delivered to Europeana. ALTO is an open XML Schema meant to describe text coming from OCR and the layout information of pages for digitized material. For more information see the official documentation page at the Library of Congress.
(3) The full-text encoded using the EDM profile for IIIF full-text after being preprocessed for publication in Europeana. Note that, as opposed to the format used by the API (i.e. JSON-LD), the data is in RDF/XML, as this is the format used for ingestion into Europeana.
(4) Very similar to (3) but with the full-text represented at the Issue level. This means that the edm:FullTextResource will convey the complete transcription of the Newspaper.
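Each download in the table is paired with an MD5 entry, which presumably holds the checksum of the corresponding zip file. Assuming the published value is a hexadecimal digest (optionally followed by a file name, as `md5sum` writes it), a downloaded file could be verified as sketched below; both function names are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file without loading it into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str, published: str) -> bool:
    """Compare a file against published checksum text (assumed: hex digest first)."""
    tokens = published.split()
    if not tokens:
        return False
    return md5_of_file(path) == tokens[0].lower()
```

Reading in chunks matters here because several of the archives are over 100 GB.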
Dataset structure
In each compressed zip file, there is typically one file per item (i.e. metadata or issue level full-text) or per page (i.e. ALTO and page level full-text), with the following structure:
Item | DATASET_ID/LOCAL_ID.xml |
Page | DATASET_ID/LOCAL_ID/PAGE_ID.xml |
That structure can be translated into links to the Europeana Collection portal where the item can be displayed or into the several APIs described on this page.
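As one way of doing that translation, the sketch below maps an entry name from a zip file to the record identifier and the manifest URL described earlier on this page. The function name is hypothetical. How a PAGE_ID relates to an Annotation Page number is not stated here, so the page identifier is only returned as-is.

```python
from pathlib import PurePosixPath

PRESENTATION_BASE = "https://iiif.europeana.eu/presentation"


def links_for_entry(entry_name: str) -> dict:
    """Map 'DATASET_ID/LOCAL_ID.xml' or 'DATASET_ID/LOCAL_ID/PAGE_ID.xml' to links."""
    path = PurePosixPath(entry_name)
    if path.suffix != ".xml":
        raise ValueError("expected an .xml entry")
    parts = path.with_suffix("").parts
    if len(parts) == 2:
        dataset_id, local_id = parts
        page_id = None
    elif len(parts) == 3:
        dataset_id, local_id, page_id = parts
    else:
        raise ValueError("entry does not match the documented structure")
    return {
        "record_id": f"/{dataset_id}/{local_id}",
        "manifest": f"{PRESENTATION_BASE}/{dataset_id}/{local_id}/manifest",
        "page_id": page_id,  # None for item-level files
    }
```

Entry names can be listed with `zipfile.ZipFile(path).namelist()` and passed to this function one by one.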
Changelog
As mentioned in the introduction, the IIIF APIs are made up of several distinct APIs, each one with its own project in GitHub and changelog as listed below.
API | Last version | Description |
---|---|---|
IIIF Manifest API | | Supports only the retrieval of manifests. |
IIIF Full-text API | | Supports the retrieval of full-text, both the Annotation Pages and Full-text resources. |