Introduction

The International Image Interoperability Framework (IIIF) is a set of standards that were initially designed to offer a standardized and scalable way to describe how digital objects composed of one or more images can be displayed to an end-user. This gives developers and users the flexibility to respectively develop and use different IIIF compatible viewers according to their needs and preferences. Since its first release, it has grown to cover other types of media such as sound and video, and soon also 3D.

The Europeana offer for IIIF covers the Presentation API (versions 2.1 & 3) and the Content Search API (version 1), and makes use of the IIIF Image API when made available by content providers. Additionally, Europeana’s IIIF offer also includes support for full-text (such as transcriptions, translations of transcriptions, captions and subtitles) via the IIIF Fulltext API using an EDM extension for full-text. These APIs follow the specifications defined by the IIIF consortium.

Before starting to use these APIs, we recommend registering for an API key and reading the Terms of Use. If you want to get started with these APIs, go directly to the Getting Started section or try them out directly on the Console. If you want to get regular updates about the Europeana API, provide feedback and discuss it with other developers, we suggest joining the Europeana API discussion group at Google Groups.

IIIF datasets in Europeana: A Scholar’s delight

The recently launched Europeana Media player brings Europeana into a new era of International Image Interoperability Framework (IIIF) compatible, interoperable and unified playout of audiovisual heritage material online. 

IIIF & Europeana Working Group

This dedicated IIIF & Europeana Working Group follows the first of the proposals from the Task Force 'Preparing Europeana for IIIF involvement.'

Impact Assessment report: EuropeanaTech and IIIF

This assessment looked at the impact of EuropeanaTech’s members, steering group and the Europeana Initiative’s work on IIIF - read a summary of the research and download the full report. 

Getting Started

Retrieving a manifest

A manifest describes the information needed for a viewer to display a digital object to the end-user, such as basic metadata (a title and description) and the sequence of content that makes up the digital object. The manifest is not meant to present all the descriptive metadata associated with a given digital object but just the bare minimum for a user to grasp what it is about. If you wish to access the full metadata for an item, see the Record API. The manifest also offers links to the Record API using the “seeAlso” field.

Presently, both versions 2.1 & 3 of the IIIF specification are supported. The novelty of version 3 is that it also covers audio and video besides images.

Request

https://iiif.europeana.eu/presentation/[RECORD_ID]/manifest

| Parameter | Description |
|---|---|
| RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
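As a minimal sketch of how the request URL above can be assembled, the Python helper below joins a record identifier onto the endpoint. The function name `manifest_url` and its `base` parameter are illustrative assumptions, not part of the API; an API key, where required, would still need to be appended as in the example request further down.

```python
def manifest_url(record_id: str,
                 base: str = "https://iiif.europeana.eu/presentation") -> str:
    """Build the manifest URL for a Europeana record identifier.

    `manifest_url` is a hypothetical helper name used for illustration only.
    """
    # Record identifiers have the form "/DATASET_ID/LOCAL_ID". Splitting on "/"
    # and dropping empty parts tolerates a missing leading slash.
    parts = [part for part in record_id.split("/") if part]
    if len(parts) != 2:
        raise ValueError(f"expected '/DATASET_ID/LOCAL_ID', got {record_id!r}")
    dataset_id, local_id = parts
    return f"{base}/{dataset_id}/{local_id}/manifest"
```

For instance, passing "/9200396/BibliographicResource_3000118436165" yields the manifest URL that appears as the "@id" in the example response.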

Response

| Parameter | Datatype | Description |
|---|---|---|
| @context | String (URL) | The URL of the JSON-LD context (always with the value "http://iiif.io/api/presentation/2/context.json"). |
| @id | String (URI) | The canonical identifier of the Manifest. |
| @type | String | The type of the resource. Always set to "sc:Manifest". |
| label | Array (LangObject) | The title(s) of the Item. |
| description | Array (LangObject) | The description(s) of the Item. |
| metadata | Metadata | A short list of metadata values. |
| thumbnail | Image | The thumbnail as defined in the edm:preview of the record. |
| navDate | String (xsd:dateTime) | The issue date of the Newspaper Item. |
| attribution | String | A human readable label that must be displayed when the item is displayed or used, presenting the copyright or ownership statements and an acknowledgement of the owning and/or publishing institution. |
| license | String (URI) | One of the rights statements defined for use in Europeana. It defines the copyright, usage and access rights that apply to this digital object. |
| logo | String (URL) | An image depicting the Europeana logo. |
| seeAlso | Array (Dataset) | |
| sequences | Array (Sequence) | An array containing only one sequence. |

Sequence

| Parameter | Datatype | Description |
|---|---|---|
| @id | String (URI) | The canonical identifier of the Sequence. |
| @type | String | The type of the resource. Always set to "sc:Sequence". |
| label | String | A label for the sequence. Always set to "Current Page Order". |
| startCanvas | String (URI) | The URI of the first canvas to be displayed. Typically the first page. |
| canvases | Array (Canvas) | An ordered list of Canvases. |

Canvas

| Parameter | Datatype | Description |
|---|---|---|
| @id | String (URI) | The canonical identifier of the Canvas. |
| @type | String | The type of the resource. Always set to "sc:Canvas". |
| label | String | A label for the Canvas. |
| height | Number | The height of the canvas which corresponds to the height of the image. |
| width | Number | The width of the canvas which corresponds to the width of the image. |
| attribution | String | A human readable label that must be displayed when the item is displayed or used, presenting the copyright or ownership statements and an acknowledgement of the owning and/or publishing institution. |
| license | String (URI) | One of the rights statements defined for use in Europeana. It defines the copyright, usage and access rights that apply to this digital object. |
| images | Array (Annotation) | A list of one annotation that represents the projection of the image into the canvas where it is displayed. |
| otherContent | Array (Fulltext) | A list of one reference to the Annotation Page that holds all the full-text for that Newspaper page. |

Annotation (modelling construct used for painting media into a Canvas)

| Parameter | Datatype | Description |
|---|---|---|
| @id | String (URI) | The canonical identifier of the Annotation. |
| @type | String | The type of the resource. Always set to "oa:Annotation". |
| motivation | String | A motivation for the annotation, in this case declaring that the image will be projected (i.e. "sc:painting") on to the Canvas. |
| resource | Object (AnnotationBody) | The image resource being projected into the Canvas. |
| on | String (URI) | The identifier of the Canvas on which the image will be projected. |

AnnotationBody

| Parameter | Datatype | Description |
|---|---|---|
| @id | String (URL) | The URL of the image. |
| @type | String | The type of the resource. Always set to "dctypes:Image". |
| format | String | The mimetype of the resource. |
| service (optional) | Object (Service) | The Service hosting and delivering the IIIF resource, when applicable. |

Service

| Parameter | Datatype | Description |
|---|---|---|
| @context | String (URL) | The URL of the JSON-LD context (always with the value "http://iiif.io/api/image/2/context.json"). |
| @id | String (URL) | The URL of the IIIF service hosting/serving the resource. |
| profile | String (URI) | The URI of the version supported by the service. |

Metadata

| Parameter | Datatype | Description |
|---|---|---|
| label | String | The name of a metadata property. One of: date, format, relation, type, language, source. |
| value | Array (LangObject) | The value of the metadata property. |

Image

| Parameter | Datatype | Description |
|---|---|---|
| @id | String | The URL of the image, i.e. thumbnail. |
| @type | String | The type of the resource. Always set to "dctypes:Image". |
| width | Number | The width of the image. |
| height | Number | The height of the image. |

Dataset

| Parameter | Datatype | Description |
|---|---|---|
| @id | String | The URL to the metadata in a specific format. |
| format | String | The mimetype of the format. |
| profile | URL | The URL of the profile. Always set to "http://www.europeana.eu/schemas/edm/". |

Example: Requesting a manifest in v2.1

Request:

https://iiif.europeana.eu/presentation/9200356/BibliographicResource_3000118390149/manifest?wskey=YOUR_KEY

Example response:

{
  "@context": "http://iiif.io/api/presentation/2/context.json",
  "@id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/manifest",
  "@type": "sc:Manifest",
  "label": [
    {
      "@value": "Journal historique et littéraire - 1788-09-01"
    }
  ],
  "metadata": [ ... ],
  "thumbnail": {
    "@type": "dctypes:Image",
    "@id": "https://api.europeana.eu/api/v2/thumbnail-by-url.json?uri=https%3A%2F%2Fiiif.europeana.eu%2Fimage%2FQNYVL2Z2FVHRGMK2UNXIP4DUUOTOVA2ND3WIPJQF6V23SO2CJ5UA%2Fpresentation_images%2Fee0edfa0-0220-11e6-a696-fa163e2dd531%2Fnode-3%2Fimage%2FBNL%2FJournal_historique_et_litt%C3%A9raire%2F1788%2F09%2F01%2F00001%2Ffull%2Ffull%2F0%2Fdefault.jpg&type=TEXT"
  },
  "navDate": "1788-09-01T00:00:00Z",
  "attribution": "Journal historique et littéraire - 1788-09-01 - https://www.europeana.eu/portal/record/9200396/BibliographicResource_3000118436165.html. National Library of Luxembourg. Public Domain - http://creativecommons.org/publicdomain/mark/1.0/",
  "license": "http://creativecommons.org/publicdomain/mark/1.0/",
  "logo": "https://style.europeana.eu/images/europeana-logo-default.png",
  "seeAlso": [ ... ],
  "sequences": [
    {
      "@id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/sequence/s1",
      "@type": "sc:Sequence",
      "label": "Current Page Order",
      "startCanvas": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/canvas/p1",
      "canvases": [
        {
          "@type": "sc:Canvas",
          "@id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/canvas/p1",
          "label": "p. 1",
          "height": 1024,
          "width": 686,
          "attribution": "Journal historique et littéraire - 1788-09-01 - https://www.europeana.eu/portal/record/9200396/BibliographicResource_3000118436165.html. National Library of Luxembourg. Public Domain - http://creativecommons.org/publicdomain/mark/1.0/",
          "images": [
            {
              "@type": "oa:Annotation",
              "@id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/annotation/p1",
              "motivation": "sc:painting",
              "resource": {
                "@type": "dctypes:Image",
                "@id": "https://iiif.europeana.eu/image/QNYVL2Z2FVHRGMK2UNXIP4DUUOTOVA2ND3WIPJQF6V23SO2CJ5UA/presentation_images/ee0edfa0-0220-11e6-a696-fa163e2dd531/node-3/image/BNL/Journal_historique_et_littéraire/1788/09/01/00001/full/full/0/default.jpg",
                "service": {
                  "@id": "https://iiif.europeana.eu/image/QNYVL2Z2FVHRGMK2UNXIP4DUUOTOVA2ND3WIPJQF6V23SO2CJ5UA/presentation_images/ee0edfa0-0220-11e6-a696-fa163e2dd531/node-3/image/BNL/Journal_historique_et_littéraire/1788/09/01/00001",
                  "profile": "http://iiif.io/api/image/2/level1.json",
                  "@context": "http://iiif.io/api/image/2/context.json"
                }
              },
              "on": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/canvas/p1"
            }
          ],
          "otherContent": [
            {
              "@id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/annopage/1"
            }
          ]
        },
        ...
      ]
    },
    ...
  ]
}
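
To illustrate how the fields described above fit together, the following Python sketch walks an already parsed v2.1 manifest and collects, for each canvas, the image URL, the optional IIIF Image service and the Annotation Page reference. The function name `list_pages` and the keys of the returned dictionaries are assumptions made for this example; the sketch also assumes each canvas carries at most one image and one "otherContent" entry, as stated in the field descriptions.

```python
from typing import Any, Dict, List, Optional


def list_pages(manifest: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Return one summary per canvas of a Presentation API 2.1 manifest.

    `list_pages` and the output keys are hypothetical names for illustration.
    """
    pages = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            # "images" holds the painting annotation; its "resource" is the image.
            images = canvas.get("images") or [{}]
            resource = images[0].get("resource", {})
            # "otherContent" references the Annotation Page with the full-text.
            other = canvas.get("otherContent") or [{}]
            pages.append({
                "canvas": canvas.get("@id"),
                "label": canvas.get("label"),
                "image": resource.get("@id"),
                # The service is optional, so this may be None.
                "image_service": resource.get("service", {}).get("@id"),
                "annotation_page": other[0].get("@id"),
            })
    return pages
```

A caller would typically obtain `manifest` by fetching the manifest URL and decoding the body with `json.loads`; missing fields simply come back as `None`.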

Retrieving full-text

The term full-text refers to the combination of the textual representation of the digital object and the positions where parts of the text appear in the original image (represented as annotations). In the EDM profile, the textual representation of the digital object is referred to as Full-text Resource while the relations between the segments of the text and the coordinates in the image are referred to as Annotations.

Annotation Pages

An Annotation Page contains all the annotations that make up the full-text of a Page (ie. image). It is referred to by the Manifest and can be accessed via the following request.

Request

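https://iiif.europeana.eu/presentation/[RECORD_ID]/annopage/[PAGE_ID]

| Parameter | Description |
|---|---|
| RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
| PAGE_ID | The identifier of the page within the record, as it appears in the Annotation Page URL referenced by the Manifest (for example "1"). |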

Response

The response is a JSON-LD structure composed of the following fields:

| Parameter | Datatype | Description |
|---|---|---|
| @context | Array of String (URL) | The URLs of the JSON-LD contexts (always with the values "http://iiif.io/api/presentation/3/context.json" and "https://www.europeana.eu/schemas/context/edm.jsonld"). |
| id | String (URI) | The canonical identifier of the Annotation Page. |
| type | String | The type of the resource. Always set to "AnnotationPage". |
| items | Array (Annotation) | An array containing all the Annotations that are part of this page. |

Annotation

| Parameter | Datatype | Description |
|---|---|---|
| id | String (URI) | The canonical identifier of the Annotation. |
| type | String | The type of the resource. Always set to "Annotation". |
| motivation | String | The motivation of the annotation, see reference for more information. Always set to "transcribing". |
| dcType | String | Represents the granularity level of the Annotation, reflecting levels such as: Page, Block, Line and Word. |
| body | Object (Body) | The reference to the transcribed text. |
| target | Array (String) | A target can represent an image or just a part of it that is being annotated. Most annotations at a level of granularity lower than Page point to the specific coordinates where the text is found on the image, using the Media Fragments specification. |

Body

| Parameter | Datatype | Description |
|---|---|---|
| id | String (URL) | The URL of the transcribed text (i.e. full-text resource) in case of a Page level annotation, or a segment of the transcribed text, in which case the URI Fragment Identifiers for the text/plain Media Type specification is used. |
| language (optional) | String | The language of the segment of the transcription text being annotated. It is represented as an ISO 639 language code. |

Example: Requesting an Annotation Page in v2.1.

Request:

https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/annopage/1

Example response:

{
  "@context": [ "http://iiif.io/api/presentation/3/context.json", "https://www.europeana.eu/schemas/context/edm.jsonld" ],
  "id": "https://iiif.europeana.eu/presentation/9200396/BibliographicResource_3000118436165/annopage/1",
  "type": "AnnotationPage",
  "items": [
    // full text annotation with no language
    {
      "id": "http://data.europeana.eu/annotation/9200356/BibliographicResource_3000100331503/a4cbbc7a0dc6b056c7bc0",
      "type": "Annotation",
      "motivation": "transcribing",
      "dcType": "Block",
      "body": {
        "id": "http://data.europeana.eu/fulltext/9200356/BibliographicResource_3000100331503/XPTO#char=0,10"
      },
      "target": [ 
        "https://iiif.europeana.eu/presentation/9200356/BibliographicResource_3000100331503/canvas/p1#xywh=13,0,16,10"
      ]
    }
    , ... 
  ]
}
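
The body and target identifiers in an annotation carry their position information in the URI fragment ("char=" for the text, "xywh=" for the image region). The Python sketch below separates these parts; `split_fragment` and `describe` are hypothetical helper names, and the code assumes plain integer values as in the example above (the Media Fragments specification also allows prefixed forms such as "pixel:" or "percent:", which are not handled here).

```python
from typing import Any, Dict, Tuple
from urllib.parse import urldefrag


def split_fragment(uri: str) -> Tuple[str, str, Tuple[int, ...]]:
    """Split 'base#key=a,b,...' into (base, key, (a, b, ...))."""
    base, fragment = urldefrag(uri)
    key, _, raw = fragment.partition("=")
    # Assumes plain comma-separated integers, as in the example response.
    numbers = tuple(int(part) for part in raw.split(",")) if raw else ()
    return base, key, numbers


def describe(annotation: Dict[str, Any]) -> Dict[str, Any]:
    """Summarise one full-text annotation of an Annotation Page."""
    text_url, _, char_range = split_fragment(annotation["body"]["id"])
    regions = []
    for target in annotation.get("target", []):
        canvas, key, numbers = split_fragment(target)
        # Page level targets have no fragment, so the region is None.
        regions.append({"canvas": canvas,
                        "xywh": numbers if key == "xywh" else None})
    return {
        "level": annotation.get("dcType"),
        "text": text_url,
        "chars": char_range or None,
        "regions": regions,
    }
```

Applied to the Block annotation in the example, `describe` would report the character range (0, 10) and the region (13, 0, 16, 10) on canvas p1.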

Fulltext Resource

The edm:FullTextResource represents the transcription of a single page of a Newspaper. A full-text resource can be accessed separately from the Annotation Page that it is associated with using the following method.

Request

https://www.europeana.eu/api/fulltext/[RECORD_ID]/[FULLTEXT_ID]

| Parameter | Description |
|---|---|
| RECORD_ID | The identifier of the record which is composed of the dataset identifier plus a local identifier within the dataset in the form of "/DATASET_ID/LOCAL_ID", for more detail see Europeana ID. |
| FULLTEXT_ID | The identifier of the full text resource. |

Response

The response is a JSON-LD structure composed of the following fields:

| Parameter | Datatype | Description |
|---|---|---|
| @context | String (URL) | The URL of the JSON-LD context (always with the value "https://www.europeana.eu/schemas/context/edm.jsonld"). |
| id | String (URI) | The canonical identifier of the full-text resource. |
| type | String | The type of the resource. Always set to "FullTextResource". |
| language | String | The predominant language of the transcription text represented as an ISO 639 language code. Parts of the text may be written in different languages. When that is the case, the language information will be indicated as part of the full-text Annotations. |
| value | String | The transcription text. |

Example: Requesting a full-text resource.

Request:

https://www.europeana.eu/api/fulltext/9200396/BibliographicResource_3000118435063/8ebb67ccf9f8a1dcc2ea119c60954111

Example response:

{
  "@context": "https://www.europeana.eu/schemas/context/edm.jsonld",
  "id": "http://data.europeana.eu/fulltext/9200396/BibliographicResource_3000118435063/8ebb67ccf9f8a1dcc2ea119c60954111",
  "type": "FullTextResource",
  "language": "nl",
  "value": "… De ondergeteekende sedert veie jaren drukker van het met Uit. Oetober 11. vervallene Nieuw A. H. _ E. Blad, heeft de eer te berigten, dat hij, bewogen met het lot van eene menigte huisge zinnen die daardoor plotseling bij den naderenden winter hun be staan hebben verloren, besloten heeft tot de uitgaaf eener nieuwe courant onder de benaming van: Het Amsterdamsche Handels- en Effectenblad. en dat hij daartoe de voorloopige medewerking heeft verkregen van belangstellenden, die van oordeel zijn, dat het bestaan van een dagblad als het gewezen Nieuw A. H. _- E. 81. voor het algemeen … "
}
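
An annotation body that ends in a "#char=start,end" fragment selects part of the "value" of a full-text resource. As a sketch, and assuming the character positions follow the text/plain fragment identifier scheme (positions counted between characters, starting at 0), the selected segment corresponds to a plain Python slice. The function name `annotated_text` is an assumption for this example.

```python
def annotated_text(full_text_value: str, body_id: str) -> str:
    """Return the part of a full-text `value` selected by an annotation body id.

    `annotated_text` is a hypothetical helper; it only handles the "char="
    scheme and ignores any integrity checks appended after ";".
    """
    _, _, fragment = body_id.partition("#")
    if not fragment.startswith("char="):
        # No character fragment (e.g. a Page level annotation): whole text.
        return full_text_value
    spec = fragment[len("char="):].split(";")[0]
    if "," not in spec:
        # A single position denotes an empty range between two characters.
        return ""
    start_text, end_text = spec.split(",", 1)
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else len(full_text_value)
    return full_text_value[start:end]
```

With a value beginning "De ondergeteekende" and a body ending in "#char=0,10", the function returns the first ten characters of the transcription.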

Searching on full-text

The full-text can also be searched using a separate Search API while the Newspapers Thematic Collection is in MVP. It supports the same functionality as the main API but under the following endpoint and with the addition of 2 search fields and 1 profile as described below. For more information on the other methods, see the Search API documentation.

Request

https://newspapers.eanadev.org/api/v2/search.json

| Fields | Datatype | Description |
|---|---|---|
| fulltext | Text | Allows searching on the transcribed text (i.e. full-text) of the item. |
| issued | Date | A true date field reflecting the date of the Newspapers Issue. |

| Profile | Description |
|---|---|
| hits | Displays the mentions in the transcribed text where the search keyword was found. |

| Parameter | Datatype | Description |
|---|---|---|
| hit.fl (optional) | List (String) | A comma- or space-separated list of fields from which hit highlighting should be generated. A wildcard of “*” (asterisk) can be used to match multiple fields, such as “fulltext.*” or even “*” to highlight on all fields where highlighting is possible. If omitted, defaults to “*”. |
| hit.selectors (optional) | Number | Specifies the maximum number of highlighted selectors (i.e. snippets in Solr) to generate per result (i.e. record). If omitted, defaults to 1. It is possible for any number of selectors from 1 to this value to be generated, up to a limit of 10. |
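
The profile and parameters above can be combined into a request URL as in the following Python sketch. The function name `fulltext_search_url` and its arguments are assumptions for illustration; the check on `hit_selectors` reflects one reading of the limit of 10 mentioned above, and the server remains the authority on what it accepts.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://newspapers.eanadev.org/api/v2/search.json"


def fulltext_search_url(query: str,
                        api_key: str,
                        hit_fields: Optional[Iterable[str]] = None,
                        hit_selectors: Optional[int] = None) -> str:
    """Build a search URL that requests hit highlighting (profile=hits)."""
    params = {"query": query, "profile": "hits", "wskey": api_key}
    if hit_fields:
        # hit.fl takes a comma-separated list of field names or wildcards.
        params["hit.fl"] = ",".join(hit_fields)
    if hit_selectors is not None:
        # Assumed valid range, based on the documented limit of 10.
        if not 1 <= hit_selectors <= 10:
            raise ValueError("hit_selectors must be between 1 and 10")
        params["hit.selectors"] = hit_selectors
    # urlencode percent-encodes reserved characters such as "*".
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

Calling it with only a query and a key produces a URL of the same shape as the example request below.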

Example: Searching on full-text and showing hit highlighting.

Request:

https://newspapers.eanadev.org/api/v2/search.json?query=paris&profile=hits&wskey=APIKEY

Example response:

{
  "apikey": "api2demo",
  "success": true,
  "requestNumber": 999,
  "itemsCount": 12,
  "totalResults": 439085,
  "items": [ ... ],
  "hits": [
    {
      "scope": "/9200303/BibliographicResource_3000059897585",
      "selectors": [
        {
          "field": "rdf:value",
          "exact": "pāri",
          "prefix": "Vi sai mūsu lielajai valstij partijas XVIII kon ferencei par godu gājis ",
          "suffix": " varens sociālis tiskās sacensības vilnis. "
        }
      ]
    },
    ...
  ]
}
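
Each selector in "hits" splits a snippet into "prefix", "exact" and "suffix". As a small sketch, the Python function below stitches these back into a readable line per hit, marking the matched term. The name `hit_snippets` and the bracket markers are choices made for this example only.

```python
from typing import Any, Dict, List, Tuple


def hit_snippets(response: Dict[str, Any],
                 mark_open: str = "[",
                 mark_close: str = "]") -> List[Tuple[str, str]]:
    """Return (record scope, snippet) pairs from a search response with hits."""
    snippets = []
    for hit in response.get("hits", []):
        for selector in hit.get("selectors", []):
            # Rebuild the snippet with the matched text wrapped in markers.
            text = "".join([
                selector.get("prefix", ""),
                mark_open,
                selector.get("exact", ""),
                mark_close,
                selector.get("suffix", ""),
            ])
            snippets.append((hit.get("scope"), text))
    return snippets
```

The "scope" value is the record identifier, so it can be used to fetch the corresponding manifest or record.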

Accessing images in high resolution: downloading data

To foster the reuse of the data that is published in Europeana as part of the Newspapers Thematic Collections, we make both the metadata and the full-text available for bulk download as compressed zip files. The metadata is available as CC0 the same way as all the metadata exposed via the API (see Terms of Use) while the full-text is available as Public Domain Mark.

List of datasets

The table below lists all the datasets that are published and available for download. If you are looking for the complete text of a Newspaper then we suggest using option (4), as opposed to option (3) where the transcription is partitioned per page.

Given that the files are very big and can take many hours to download, as an alternative to downloading directly via the browser, you can log in to the FTP server at "download.europeana.eu" with username "anonymous". This will allow you to resume if the download gets stuck.

| Dataset number | Metadata (1) | Full-text (ALTO) (2) | Page level full-text (EDM) (3) | Issue level full-text (EDM) (4) |
|---|---|---|---|---|
| 9200300 | download (229M) (MD5) | download (63G) (MD5) | download (116G) (MD5) | download (113G) (MD5) |
| 9200301 | download (37M) (MD5) | download (13G) (MD5) | download (20G) (MD5) | download (20G) (MD5) |
| 9200338 | download (213M) (MD5) | download (158G) (MD5) | download (278G) (MD5) | download (277G) (MD5) |
| 9200339 | download (39M) (MD5) | download (11G) (MD5) | download (21G) (MD5) | download (17G) (MD5) |
| 9200355 | download (212M) (MD5) | download (97G) (MD5) | download (159G) (MD5) | download (157G) (MD5) |
| 9200356 | download (137M) (MD5) | download (40G) (MD5) | download (17G) (MD5) | download (17G) (MD5) |
| 9200357 | download (23M) (MD5) | download (5G) (MD5) | download (9G) (MD5) | download (9G) (MD5) |
| 9200396 | download (4M) (MD5) | download (849M) (MD5) | download (2G) (MD5) | download (1G) (MD5) |

Legend:

  1. The original metadata in EDM XML format before being ingested into Europeana. There are slight differences between this data and the one published. For more information see the /wiki/spaces/EF/pages/2385313809.

  2. The full-text encoded using ALTO (Analyzed Layout and Text Object) as it was delivered to Europeana. The ALTO is an open XML Schema meant to describe text coming from OCR and layout information of pages for digitized material. For more information see the official documentation page at the Library of Congress.

  3. The full-text encoded using the EDM profile for IIIF full-text after being preprocessed for publication in Europeana. Note that, as opposed to the format used by the API (i.e. JSON-LD), the data is in RDF/XML as it is the format used for ingestion into Europeana.

  4. Very similar to (3) but with the full-text represented at the Issue level. This means that the edm:FullTextResource will convey the complete transcription of the Newspaper.
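
Since each download is listed with an MD5 checksum and the files can be very large, it may be worth verifying a file after downloading it. The Python sketch below computes the MD5 digest in chunks so the file is never loaded into memory at once. The function names are illustrative, and the exact layout of the published MD5 files is not described here, so the expected digest is assumed to be passed in as a hexadecimal string.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

A mismatch usually means the download was truncated or corrupted and should be resumed or repeated.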

Dataset structure

In each compressed zip file, there is typically one file per item (i.e. metadata or issue level full-text) or per page (i.e. ALTO and page level full-text) with the following structure:

Item: DATASET_ID/LOCAL_ID.xml

Page: DATASET_ID/LOCAL_ID/PAGE_ID.xml


That structure can be translated into links to the Europeana Collection portal where the item can be displayed or into the several APIs described on this page.
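
As a sketch of that translation, the Python functions below turn a file path from one of the archives into a record identifier (and page identifier, if any) and then into a manifest URL and a portal link. The function names are hypothetical. The manifest URL follows the request pattern documented above, while the portal URL pattern is taken from the attribution string in the example manifest and may differ from the portal's current addressing.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple


def parse_member(path: str) -> Tuple[str, Optional[str]]:
    """Map 'DATASET_ID/LOCAL_ID[/PAGE_ID].xml' to (record_id, page_id)."""
    if not path.endswith(".xml"):
        raise ValueError(f"not an XML member: {path!r}")
    parts = PurePosixPath(path[: -len(".xml")]).parts
    if len(parts) == 2:
        dataset_id, local_id = parts
        page_id = None
    elif len(parts) == 3:
        dataset_id, local_id, page_id = parts
    else:
        raise ValueError(f"unexpected path layout: {path!r}")
    return f"/{dataset_id}/{local_id}", page_id


def manifest_link(record_id: str) -> str:
    """Manifest URL, following the request pattern on this page."""
    return f"https://iiif.europeana.eu/presentation{record_id}/manifest"


def portal_link(record_id: str) -> str:
    """Portal URL, assumed from the attribution in the example manifest."""
    return f"https://www.europeana.eu/portal/record{record_id}.html"
```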

Changelog

As mentioned in the introduction, the IIIF APIs are made up of several distinct APIs, each one with its own project in GitHub and changelog as listed below.

| API | Last version | Description |
|---|---|---|
| IIIF Manifest API | 0.3-alpha (2018-11-27) | Supports only the retrieval of manifests. |
| IIIF Full-text API | 0.5 (2018-11-12) | Supports the retrieval of full-text, both the Annotation Pages and Full-text resources. |