Search API Documentation
The Search API provides a way to search for metadata records and media on the Europeana repository. For example, you would use the Search API to get a response to the query: give me all the results for the word "Vermeer". Additionally, it provides an alternative method using the OpenSearch RSS protocol for easier integration with external services.
The Search API is the easiest API to use and understand. It interacts with Europeana's data in much the same way as the Europeana website does. You can search for keywords, and the API will return all records that match that keyword. You can refine your search with more advanced filters and advanced query syntax. You can choose to only return objects with certain copyright statements, or you can choose to return the results in a language of your choice. This means that with the Search API, you can get a response to the query: 'Give me all objects by Vermeer that are openly licensed and have high-resolution images.'
Before starting to use this API, we recommend reading the overview of the Europeana Data Model, registering for an API key, and reading the Terms of Use. If you want to get started with this API, go to the Getting Started section or try some calls using our Swagger Console.
Getting Started
Request
Every call to the Search API is an HTTPS request using the following base URL:
On top of this base URL, you need two required parameters to make a successful Search API request: a query and an API key. To supply these required parameters, append "query=" and "wskey=" to that URL, using a question mark "?" to separate the parameters from the base URL and an ampersand "&" to separate parameters from each other:
api.europeana.eu/record/v2/search.json?query=Vermeer
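As a minimal sketch of the rule above, the following Python snippet assembles a request URL from the two required parameters. The helper name `build_search_url` and the placeholder key `YOUR_API_KEY` are illustrative assumptions, not part of the API; the base URL is the one used in the examples throughout this page.

```python
from urllib.parse import urlencode

# Base URL as used in the examples on this page.
BASE_URL = "https://api.europeana.eu/record/v2/search.json"


def build_search_url(query, api_key, **extra):
    """Return a Search API URL with the two required parameters plus any extras.

    `build_search_url` is a hypothetical helper written for this example.
    """
    params = {"wskey": api_key, "query": query}
    params.update(extra)
    # urlencode percent-encodes values and joins the parameters with "&".
    return BASE_URL + "?" + urlencode(params, doseq=True)


# "YOUR_API_KEY" is a placeholder; substitute the key you registered for.
print(build_search_url("Vermeer", "YOUR_API_KEY"))
```

Building the query string with `urlencode` rather than by hand avoids encoding mistakes once queries contain spaces, quotes or brackets.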
Below you’ll find a table with the other standard parameters you can use in your API Search request:
Response
A response from the Search API is always formatted in JSON and will contain fields that present information about the handling of the request, while the concrete information about the record is presented in the "items" field (see Metadata Sets).
Error Responses
An error occurring during processing of an API method is reported by (1) a relevant HTTP status code, (2) a value of the success field and (3) a meaningful error message in the error field. The following table shows the fields appearing within an error response:
The following kinds of error codes can be returned by the Search API:
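A client could check the three error signals described in this section roughly as follows. This is a sketch under stated assumptions: the `success`, `error` and `items` field names come from this page, while the exception class, the helper name, the sample bodies and the 401 status used in the usage note are hypothetical.

```python
import json


class SearchApiError(Exception):
    """Hypothetical exception raised when a response reports a failure."""


def items_or_raise(status_code, body_text):
    """Return the `items` list, or raise if the response signals an error."""
    payload = json.loads(body_text)
    # Signals described above: HTTP status code, `success` flag, `error` message.
    if status_code >= 400 or payload.get("success") is False:
        message = payload.get("error", "no error message supplied")
        raise SearchApiError(f"HTTP {status_code}: {message}")
    return payload.get("items", [])


# Hypothetical response bodies, for illustration only.
ok_body = '{"success": true, "items": [{"id": "/123/abc"}]}'
bad_body = '{"success": false, "error": "Invalid API key"}'

print(items_or_raise(200, ok_body))
```

Calling `items_or_raise(401, bad_body)` would raise `SearchApiError` carrying the message from the `error` field.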
Query, Filter, and Faceting Fields
Search Fields outside EDM
In addition to the fields defined in EDM, a handful of other administrative fields can also be used to search.
Language-specific Search Fields
In EDM, most properties that accept a Literal may be language-tagged, meaning the value carries a tag that describes the language of the text using the ISO 639-2 standard. To allow language-specific searches on such properties, the Search API defines a field for each language variation that appears in our repository, while keeping the base field with the values in all language variations. Unlike the base field, which typically has datatype Text (some fields may also be defined as String), the language-specific fields are always of type String to allow faceting on the complete value (with no tokenization); see "Datatypes for Search Fields" below for more details. If a language-specific field is part of a metadata set, it can also be output in the response (see "Language-specific Result Fields" under the "Metadata Sets" heading).
The following table shows the base and language-specific search fields for the dc:creator property:
Search Field | Search Datatype | Result Field |
---|---|---|
proxy_dc_creator | Text | dcCreator |
proxy_dc_creator.* | String | dcCreatorLangAware |
Search Fields defined in EDM
EDM defines an extensive list of classes and properties. In the Search API only a subset of these, corresponding to the ones found to be the most commonly used, can be used to search in the repository. These fields are listed in this section.
The ML (i.e. multilingual) column of the table below marks the fields that have multilingual variations. To learn more about the type of information that these fields should hold, please refer to the EDM Definition.
Aggregated Fields
Europeana aggregates its data from cultural institutions that can use diverse, fine-grained systems and methodologies. As a result, a link between for example an object and a person may be stored in different specialized fields. To provide simpler views on this data, Europeana has introduced several general Aggregated Fields, such as: title, who, what, when, and where. In these fields, we gather together information from different record fields to make the discovery of objects easier. Title, for example, aggregates data from the dc:title and dcterms:alternative fields which are part of Dublin Core, a popular general standard for describing different types of resources.
Media Search
The Search API lets you search on and retrieve not only the metadata added by curators; it also offers powerful features based on technical metadata. Technical metadata is metadata extracted from the media files (such as images and videos) associated with records, for example the width and height of an image. This allows you to search for and filter Europeana records by media information, for instance to only search for records which have extra large images, high-quality audio files, or images that match a particular colour. Besides searching and filtering, faceting is also possible using technical metadata and is part of the default facets provided by the facet profile.
A Europeana metadata record can contain a reference to zero, one or more media files. This means that when a search is made on a technical metadata property or facet (such as image size), a record is returned if at least one of the media files present in the record matches the search query. The following table lists the fields that relate to the metadata extracted from the media resources:
Colour palette
From all records with images, the six most prominent colours are extracted. These colours are then mapped to one of the 120 colours that can be found in the listing here. To search for records where one of the images matches a particular colour, use the colour palette parameter, which can be provided multiple times. The value must be a hex RGB code, such as #8A2BE2 or #FFE4C4.
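The sketch below shows one way to repeat the colour palette parameter and encode the `#` character, which would otherwise be read as the start of a URL fragment. The parameter name `colourpalette` is an assumption (this page only refers to it as "the colour palette parameter"), and the helper name and API key placeholder are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.europeana.eu/record/v2/search.json"


def colour_search_url(query, api_key, colours):
    """Build a search URL that repeats the colour parameter once per colour."""
    params = [("wskey", api_key), ("query", query)]
    # Assumption: the parameter is spelled "colourpalette".
    params += [("colourpalette", colour) for colour in colours]
    # urlencode turns "#" into "%23" so it is not treated as a URL fragment.
    return BASE_URL + "?" + urlencode(params)


print(colour_search_url("Vermeer", "YOUR_API_KEY", ["#8A2BE2", "#FFE4C4"]))
```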
Datatypes for Search Fields
The following datatypes are defined for the search fields used for querying, filtering and faceting.
Reusability
The possible values of the reusability parameter are shown in the following Table:
Profiles
A profile typically determines how extensive the response will be, either by dictating the metadata fields that will be present (i.e. minimal, standard and rich) or by appending additional data elements such as facets or breadcrumbs. Most profiles can be combined, with the exception of the metadata profiles or combined profiles such as rich. The following table lists the profiles supported by the API:
Breadcrumbs
A collection of search queries that were applied to your call.
Metadata Sets
Each item in a search result is represented by a subset of the fields from the corresponding metadata record. The extent of the fields that are present is determined by the Profile chosen.
Result Fields outside EDM
In addition to the fields defined in EDM, a handful of other fields were defined for administrative reasons that are output in the response.
Language-specific Result Fields
Just as there are separate language-specific fields for searching, there is also a way to distinguish language-specific values in the response. Such fields always end with the suffix "LangAware" and are represented as a LangMap. To preserve backwards compatibility, we have not changed the original fields. This means that fields such as title, description and creator now appear twice in the search response: once with their original field name (dcTitle) and once as a multilingual labelled list (dcTitleLangAware). In the future, we will replace the single-value fields with the correct multilingual ones.
The following table shows the base and language-specific result fields for the dc:creator property:
Result Field | Result Datatype | Search Field |
---|---|---|
dcCreator | Array (String) | proxy_dc_creator |
dcCreatorLangAware | LangMap | proxy_dc_creator.* |
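To illustrate how a client might read a "LangAware" field, here is a small sketch. It assumes that a LangMap is a JSON object keyed by language code whose values are arrays of strings; the sample item and the helper name `pick_values` are hypothetical.

```python
def pick_values(lang_map, preferred=("en",)):
    """Return the values for the first preferred language that has any.

    Falls back to the first non-empty language present, then to an empty list.
    Assumes a LangMap looks like {"en": ["..."], "nl": ["..."]}.
    """
    for lang in preferred:
        if lang_map.get(lang):
            return lang_map[lang]
    for values in lang_map.values():
        if values:
            return values
    return []


# Hypothetical search result item, for illustration only.
item = {
    "dcCreator": ["Johannes Vermeer"],
    "dcCreatorLangAware": {"nl": ["Johannes Vermeer"], "en": ["Vermeer, Johannes"]},
}

print(pick_values(item["dcCreatorLangAware"], preferred=("en", "nl")))
```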
Result Fields
The table below lists all the fields that are output by the search divided per profile (metadata set).
JSON Datatypes
The JSON output of this API uses the following datatypes:
Faceting
The number of records that Europeana contains is very large and still growing, so we need efficient ways to let our users discover what they need easily. One such technique is a faceted indexing system that classifies each record along multiple dimensions. The facets, seen on the side of europeana.eu, can be useful for filtering search results and can also be used by API clients. If you search for the keyword "paris" and look at the TYPE facet, it tells you how many items exist within your search result, grouped by TYPE (such as IMAGE, VIDEO etc.). All search fields can also be faceted on.
When you search within your result set for a specific facet, the other items in your facet would still exist (if you search for TYPE:IMAGE, then you can still see how many results there are for TYPE:VIDEO etc.). This last functionality, called multi-facets, is not supported for the Technical Metadata fields.
Requesting Facets
Facets can be requested by setting either the facets or the portal profile with the profile parameter. By default, a predefined set of facets is returned, matching the facets seen on the side of europeana.eu, which correspond to the following search fields:
TYPE, LANGUAGE, COMPLETENESS, CONTRIBUTOR, COUNTRY, DATA_PROVIDER, PROVIDER, RIGHTS, UGC, YEAR, COLOURPALETTE, MIME_TYPE, REUSABILITY, IMAGE_SIZE, SOUND_DURATION, VIDEO_DURATION, TEXT_FULLTEXT, LANDINGPAGE, MEDIA, THUMBNAIL, IMAGE_ASPECTRATIO, IMAGE_COLOUR, VIDEO_HD, SOUND_HQ
Facet objects in the Response
When requested, facets appear in the response within the facets field as an Array of Facet objects, which are composed of the following fields:
Individual Facets
It is also possible to select which facets to retrieve beyond (or instead of) the default facet set, via the facet parameter.
Parameter | Datatype | Description |
---|---|---|
facet | String | A name of an individual field or a comma separated list of fields |
The value of the parameter can be "DEFAULT" (a shortcut for the default facet set) or any search field. As a reminder, search fields with datatype Text are indexed as tokenized terms, which implies that facet values and counts reflect those terms rather than the whole value (i.e. the phrase), as they do for the remaining datatypes. This is why the language-specific search fields were added with type String: so that faceting can be done on the complete values. These are the fields actually used by the Europeana Collections Portal to display the facet values on the side.
We have aligned the logic for faceting across all fields in the API output to be consistent. Previously, faceting on the 'default' facets (such as TYPE, or RIGHTS) would use a different logic than faceting on custom fields (such as proxy_dc_creator). The difference is that now all other values in a list of facet values are returned (multi-facet).
Multiple Individual Facets
A client can request one or more facets in a single query. This can be done by either duplicating the facet parameter or by combining all the fields needed for faceting as a comma-separated String.
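The two styles described above can be sketched as follows. The `profile` and `facet` parameter names come from this page; the helper name, the `repeat` flag and the API key placeholder are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.europeana.eu/record/v2/search.json"


def facet_search_url(query, api_key, facets, repeat=False):
    """Request individual facets, either comma-separated or as repeated parameters."""
    params = [("wskey", api_key), ("query", query), ("profile", "facets")]
    if repeat:
        # Style 1: duplicate the facet parameter, once per field.
        params += [("facet", name) for name in facets]
    else:
        # Style 2: one facet parameter holding a comma-separated list.
        params.append(("facet", ",".join(facets)))
    return BASE_URL + "?" + urlencode(params)


print(facet_search_url("paris", "YOUR_API_KEY", ["TYPE", "RIGHTS"]))
print(facet_search_url("paris", "YOUR_API_KEY", ["TYPE", "RIGHTS"], repeat=True))
```

Note that `urlencode` writes the comma as `%2C`, which a server decodes back to a comma.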
Offset and limit for Facets
A client can request how many facet values to retrieve, and which should be the first one. These parameters can be used to page over all facet values without requesting too many facet values at a time. The table below explains these two parameters. The FACET_NAME constant stands for the field for which the limit applies.
Pagination
The Search API offers two ways of paginating through the result set: basic and cursor-based pagination. Basic pagination is suitable for smaller or user-facing browsing applications and allows iteration over the first 1000 results using the start parameter. For larger and/or harvesting applications, the API offers cursor-based pagination, which allows quick iteration over the entire result set.
Pagination | Capabilities | Implementation |
---|---|---|
Basic | Lets you jump to a specific offset/page (start=X). | Use the start parameter to set the search result offset; the default value is 1. |
Cursor-based | Quickly iterate over the entire result set. | Set the cursor parameter to * to start cursor-based pagination at page 1. |
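A cursor-based harvesting loop might look like the sketch below. It starts with `cursor=*` as described in the table; the assumption that each response carries the next cursor value in a `nextCursor` field (absent on the last page) is not stated on this page, and `fetch_page` is a hypothetical callable that performs the HTTP request and returns the parsed JSON.

```python
def iterate_items(fetch_page):
    """Yield every item in a result set using cursor-based pagination.

    `fetch_page(cursor)` is a hypothetical callable returning the parsed JSON
    response for the given cursor value.
    """
    cursor = "*"  # "*" starts cursor-based pagination at page 1
    while cursor:
        page = fetch_page(cursor)
        yield from page.get("items", [])
        # Assumption: the response exposes the next cursor as "nextCursor"
        # and omits it once the result set is exhausted.
        cursor = page.get("nextCursor")


# Stand-in for real HTTP calls so the loop can be tried offline.
FAKE_PAGES = {
    "*": {"items": ["record-1", "record-2"], "nextCursor": "abc"},
    "abc": {"items": ["record-3"]},
}

print(list(iterate_items(FAKE_PAGES.__getitem__)))
```

Passing the fetch function in keeps the paging logic independent of any particular HTTP library.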
Query Syntax
Europeana uses the Apache Solr platform to index its data, so Apache Lucene Query Syntax is inherently supported by the Search API, although the Solr eDisMax query parser is the one currently used by default in the search engine. Advanced users are encouraged to use the Lucene and Apache Solr guides to get the most out of the Europeana repository. For others, we supply a basic guide for querying Europeana.
Basic and phrase search
To look for records that contain a search term in one of the data fields, provide the term as a query parameter:
Syntax: "Mona Lisa"
https://api.europeana.eu/record/v2/search.json?query="Mona Lisa"
Note that, as in many other search applications, omitting the quotes will result in searching for records that contain the term Mona and the term Lisa, but not necessarily both of them together or in that order. You can allow a number of other words in between by adding that number after the quotes, preceded by a tilde. For example, searching for "Peter Rubens"~1 will return objects about Peter Rubens but also about Peter Paul Rubens.
Search by fields
If you want to limit your search to a specific data field you should provide the name of the field using the following syntax. Use parentheses ( ) to group the keywords to search for in that field. For example, to look for objects whose creator is Leonardo da Vinci:
Syntax: who:("Leonardo da Vinci")
https://api.europeana.eu/record/v2/search.json?query=who:("Leonardo da Vinci")
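The example URL above contains spaces, quotes and parentheses, which should be percent-encoded before the request is sent. The sketch below builds the field query and encodes it; `field_phrase` is a hypothetical helper, and escaping embedded double quotes with a backslash follows standard Lucene syntax.

```python
from urllib.parse import quote


def field_phrase(field, phrase):
    """Build field:("phrase"), escaping backslashes and embedded double quotes."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:("{escaped}")'


query = field_phrase("who", "Leonardo da Vinci")
print(query)
# safe="" makes quote() encode every reserved character, including ":" and "(".
print(quote(query, safe=""))
```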
Boolean Search
To combine several terms in one search one can use boolean operators AND, OR, and NOT (note the case-sensitivity). Use parentheses to group logical conditions. Note that two consecutive terms without any boolean operator in between default to the AND operator.
Syntax: mona AND lisa
https://api.europeana.eu/record/v2/search.json?query=mona+AND+lisa
Boolean operators can also be combined with the search by fields. The following example searches for objects whose location is in Paris or in London:
Syntax: where:(Paris OR London)
https://api.europeana.eu/record/v2/search.json?query=where:(Paris+OR+London)
The boolean NOT operator excludes results that contain the specified word/s after it. For example, looking for objects which contain the term Lisa but do not contain the term Mona is done by the following:
Syntax: lisa NOT mona
https://api.europeana.eu/record/v2/search.json?query=lisa+NOT+mona
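Boolean queries can also be composed programmatically. The following sketch joins alternatives with OR inside a field group and shows how the result is encoded in a query string; the helper name `any_of` is hypothetical.

```python
from urllib.parse import urlencode


def any_of(field, values):
    """Build field:(a OR b OR ...). The operator must be upper case."""
    return f"{field}:({' OR '.join(values)})"


query = any_of("where", ["Paris", "London"])
print(query)
# urlencode writes spaces as "+", matching the example URLs in this section.
print(urlencode({"query": query}))
```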
Wildcard search
If you are not sure of the spelling of the search terms, you can use the wildcards * and ?. These work in all words, but not as the first letter of a word.
Wildcard - * - will find words with any number of letters in the place of the asterisk, for example ca* will find cat, cap, cane, cable, and canary.
Wildcard - ? - a single letter wildcard, for example ca?e will find cane, care, case etc.
You can use the tilde symbol - ~ - to find results with a similar spelling. For example, searching for Nicolas~ will also match the words Nicholaus, Nicolaas, Nikolaus, Nicola and Nicolai.
Syntax: Nicolas~
https://api.europeana.eu/record/v2/search.json?query=Nicolas~
Range search
To execute range queries, the range operator should be used. This example will search for objects whose field values fall between a and z:
Syntax: [a TO z]
https://api.europeana.eu/record/v2/search.json?query=[a TO z]
As well as for textual fields, the range operator can be used for numeric values, date ranges, or geographical areas, as shown below. Make sure you URL-encode these queries before putting them in a browser, since square brackets cannot be part of a URL without being encoded first!
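For instance, Python's standard library can perform the encoding of a range query. This is only a sketch of the encoding step, not of a complete request.

```python
from urllib.parse import quote

raw_query = "[a TO z]"
# safe="" ensures the square brackets and spaces are all percent-encoded.
encoded_query = quote(raw_query, safe="")
print(encoded_query)  # %5Ba%20TO%20z%5D
```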
Geographical Bounding Box Search
To search for objects by their geographic location you should specify the bounding box of the area. You need to use the range operator and the pl_wgs84_pos_lat (latitude position) and pl_wgs84_pos_long (longitude position) field. The following example will bring all the objects found between the latitude of 45° and 47° and between the longitude of 7° and 8°:
Syntax: pl_wgs84_pos_lat:[45 TO 47] AND pl_wgs84_pos_long:[7 TO 8]
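A bounding box query can be generated from its four edges, as in the sketch below. The function name and argument order are hypothetical choices; the field names are the ones given above. The simple validation rejects boxes that cross the 180° meridian, which this sketch does not attempt to handle.

```python
def bounding_box_query(south, north, west, east):
    """Build a latitude/longitude range query for a bounding box."""
    if south > north or west > east:
        raise ValueError("expected south <= north and west <= east")
    return (
        f"pl_wgs84_pos_lat:[{south} TO {north}] "
        f"AND pl_wgs84_pos_long:[{west} TO {east}]"
    )


print(bounding_box_query(45, 47, 7, 8))
```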
Timestamp Search
You can also search for objects by date. Currently, full-fledged date search is supported only for the fields storing the creation (timestamp_created) and update (timestamp_update) dates of the objects in our database, which are available in two formats: the UNIX epoch timestamp and the ISO 8601 formatted date. To search for objects created or updated on a given date, use the following queries:
Syntax: timestamp_created:"2013-03-16T20:26:27.168Z"
https://api.europeana.eu/record/v2/search.json?query=timestamp_created:"2013-03-16T20:26:27.168Z"
Syntax: timestamp_update:"2013-03-16T20:26:27.168Z"
https://api.europeana.eu/record/v2/search.json?query=timestamp_update:"2013-03-16T20:26:27.168Z"
Searching for date range (as [date1 TO date2]):
Syntax: timestamp_created:[2013-11-01T00:00:00.000Z TO 2013-12-01T00:00:00.000Z]
Syntax: timestamp_update:[2013-11-01T00:00:00.000Z TO 2013-12-01T00:00:00.000Z]
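The timestamps in these queries follow the ISO 8601 pattern with millisecond precision and a trailing Z for UTC. A sketch for producing them from Python `datetime` objects follows; both helper names are hypothetical, and the datetimes passed in are assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def solr_timestamp(moment):
    """Format a timezone-aware datetime as e.g. 2013-11-01T00:00:00.000Z."""
    moment = moment.astimezone(timezone.utc)
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def timestamp_range(field, start, end):
    """Build field:[start TO end] for timestamp_created or timestamp_update."""
    return f"{field}:[{solr_timestamp(start)} TO {solr_timestamp(end)}]"


start = datetime(2013, 11, 1, tzinfo=timezone.utc)
end = datetime(2013, 12, 1, tzinfo=timezone.utc)
print(timestamp_range("timestamp_created", start, end))
```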
Date mathematics
With date mathematics you can formulate questions such as "in the last two months" or "in the previous week". The basic operations and their symbols are addition (+), subtraction (-) and rounding (/). Some examples:
now: NOW
tomorrow: NOW+1DAY
one week before now: NOW-1WEEK
the start of current hour: /HOUR
the start of current year: /YEAR
The date units are: YEAR, YEARS, MONTH, MONTHS, DAY, DAYS, DATE, HOUR, HOURS, MINUTE, MINUTES, SECOND, SECONDS, MILLI, MILLIS, MILLISECOND, MILLISECONDS (the plural, singular, and abbreviated forms refer to the same unit).
Let's see how to apply it in Europeana's context.
From xxx up until now
Syntax: timestamp_created:[xxx TO NOW]
From xxx up until yesterday
Syntax: timestamp_created:[xxx TO NOW-1DAY]
Changes in the last two months
Syntax: timestamp_created:[NOW-2MONTH/DAY TO NOW/DAY]
https://api.europeana.eu/record/v2/search.json?query=timestamp_created:[NOW-2MONTH/DAY TO NOW/DAY]
You can find more about date mathematics in Solr's API documentation.
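A "changed in the last N units" query can be generated with a small helper such as the one sketched here. The function name and its defaults are hypothetical; only a subset of the singular unit names listed above is accepted, to keep the example short.

```python
ALLOWED_UNITS = {"YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND"}


def changed_in_last(field, amount, unit, round_to="DAY"):
    """Build field:[NOW-<amount><unit>/<round_to> TO NOW/<round_to>]."""
    if unit not in ALLOWED_UNITS or round_to not in ALLOWED_UNITS:
        raise ValueError("unsupported date unit")
    if amount < 1:
        raise ValueError("amount must be a positive integer")
    return f"{field}:[NOW-{amount}{unit}/{round_to} TO NOW/{round_to}]"


print(changed_in_last("timestamp_created", 2, "MONTH"))
```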
Query Refinements
So far we have dealt with examples where there was only one query parameter. Sometimes it is useful to split a query into a variable and a constant part. For instance, for an application that accesses only objects located in London, it is possible to have the constant part of the query pre-selecting London-based objects and the variable part selecting objects within this pre-selection.
This can be done using the refinement parameter qf which is appended to the request, besides the query parameter. This example looks for objects which contain the term Westminster and their location is in London:
Syntax: query=Westminster & qf=where:London
https://api.europeana.eu/record/v2/search.json?query=Westminster&qf=where:London
Currently, you can also filter the results by distance using the distance function in the qf parameter. This example looks for objects with the words world war that are located (the object itself or the spatial topic of the resource) within 200 km of the point with latitude 47 and longitude 12.
Syntax: query=world+war & qf=distance(location,47,12,200)
https://api.europeana.eu/record/v2/search.json?query=world+war&qf=distance(location,47,12,200)
We can also use more specific fields instead of location: currentLocation (with coordinates from edm:currentLocation), and coverageLocation (with coordinates from dcterms:spatial and dc:coverage). For example, qf=distance(currentLocation,47,12,200) will filter the results to those actually located within 200 km of the coordinates indicated.
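The sketch below combines a variable query with constant refinements. It assumes that the qf parameter may be repeated to apply several refinements at once, which this page does not state explicitly; the helper names and the API key placeholder are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.europeana.eu/record/v2/search.json"


def distance_filter(lat, lon, km, field="location"):
    """Build the distance(...) refinement; field may also be
    currentLocation or coverageLocation."""
    return f"distance({field},{lat},{lon},{km})"


def refined_search_url(query, api_key, refinements):
    """Build a search URL with one qf parameter per refinement (assumed repeatable)."""
    params = [("wskey", api_key), ("query", query)]
    params += [("qf", refinement) for refinement in refinements]
    return BASE_URL + "?" + urlencode(params)


print(refined_search_url("Westminster", "YOUR_API_KEY", ["where:London"]))
print(refined_search_url("world war", "YOUR_API_KEY", [distance_filter(47, 12, 200)]))
```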
Sorting
The search results are, by default, ranked by relevance according to their similarity with the contents of the query parameter. It is possible, however, to use the sort parameter to arrange them according to one or more fields, in ascending or descending order. This example looks for objects containing the words mona and lisa, but sorts them according to the field YEAR in ascending order:
Syntax: query=mona+lisa & sort=YEAR+asc
https://api.europeana.eu/record/v2/search.json?query=mona+lisa&sort=YEAR+asc
When we refine by distance (i.e., qf=distance(...)), we can also include distance+asc or distance+desc in the sorting parameter in order to rank the results by the distance to the coordinates.
Syntax: query=world+war & qf=distance(location,47,12,200) & sort=distance+asc
Refinement and sorting parameters can be concatenated. Each such parameter and the mandatory query parameter contributes a breadcrumb object if breadcrumbs are specified in the search profile.
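A sort value is a field name followed by a direction. The sketch below validates the direction and shows the encoded query string; `sort_value` is a hypothetical helper.

```python
from urllib.parse import urlencode


def sort_value(field, order="asc"):
    """Return '<field> <order>', where order is 'asc' or 'desc'."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return f"{field} {order}"


# urlencode writes the space as "+", giving sort=YEAR+asc as in the example above.
print(urlencode({"query": "mona lisa", "sort": sort_value("YEAR")}))
```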
Open Search
This method provides a basic search function following the OpenSearch specification and returns the results in XML (RSS) format. It does not support facet search or profiles. The parameter names differ from those of the other API methods because they match the OpenSearch standard. The OpenSearch response elements can be used by search engines to augment existing XML formats with search-related metadata. The signature of the method is as follows:
https://api.europeana.eu/record/opensearch.rss?searchTerms=TERMS&count=COUNT&startIndex=START
The following parameters are supported by this method:
For the response, see OpenSearch specification.
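An OpenSearch request URL can be assembled from the signature shown above, as in this sketch. The helper name and the default values for `count` and `start_index` are hypothetical, and whether an API key is also required for this method is not covered here.

```python
from urllib.parse import urlencode

OPENSEARCH_URL = "https://api.europeana.eu/record/opensearch.rss"


def opensearch_url(terms, count=12, start_index=1):
    """Build an OpenSearch RSS request URL (default values are illustrative)."""
    params = {"searchTerms": terms, "count": count, "startIndex": start_index}
    return OPENSEARCH_URL + "?" + urlencode(params)


print(opensearch_url("Vermeer", count=5))
```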
Libraries and Plugins
Apart from the console, there are many other ways you can interact with the API. On the libraries and plugins page, you can find libraries that allow you to develop applications with the API in your programming language of choice. Plugins make it easy to integrate the Europeana API into existing applications, such as WordPress or Google Docs.
Deprecation Information
The following will be deprecated as of the given date. Ensure that your API clients are updated accordingly:
Date | Deprecation Details |
---|---|
January 2018 | As the API has supported HTTPS for a while now, we will start to redirect all non-HTTPS traffic for the API to HTTPS. Ensure your applications follow redirects if needed, or adjust the URL to use HTTPS. |
Roadmap and Changelog
We deploy new versions of the portal and API quite regularly, but not all new versions result in changes in the interface. The current version of the Search API is 2.9.0 (2019-07-15). To see the changes made for this version and also all previous releases, see the API changelog in the project GitHub.
Swagger Console
The Console can be used to try out API calls for the Record and Search APIs. To perform a Search API query, select the ‘/record/v2/search.json’ method under the ‘Search’ header in the console. Once you’ve opened this API call method, don’t forget to click the ‘Try it Out’ button in the top right of the method to be able to edit the query parameters and execute an API call.