Evaluation

Evaluation of search, covering both technical aspects such as ranking and more general considerations such as user satisfaction and usability, is a significant concern for Europeana.

In the DSI projects, the first evaluation efforts came from the MS30 Search improvement plan, with progress reported in the MS31 Search improvement report.

The first attempts to define an overall strategy for the evaluation of search in Europeana were made during DSI-2. They are presented in DSI-2 D6.3 Search Improvement Report.

In DSI-3, progress was reported in the various iterations of deliverables C.2 'usage patterns' and C.3 'data access patterns': C.3 M4, C.2 M8, C.2 M12. Other interesting reports (not produced by the R&D team) can be found in C.2 M4, C.3 M8 and C.3 M12 (e.g., top 10 searches).

In DSI-4, progress was reported in deliverable C.2 Users and Usage Report M5.

In 2015 the company 904Labs created a ground truth from the information collected in our logs. The code to run performance evaluation based on this ground truth can be found on GitHub. Additional internal discussions about the creation of the ground truth:

Currently, search performance is evaluated on the basis of user behaviour as collected in our log system, with clicks used as the criterion for relevance; the code is also on GitHub. First, the user interactions are downloaded with this software. Second, the data is analyzed: previously with ad-hoc Python code, and since 2019 with a combination of ad-hoc Java code and TREC-EVAL, the standard software for Information Retrieval evaluation. The complete procedure to reproduce this process is available here.
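The click-based procedure described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Europeana pipeline: the log records, record identifiers, rankings, and all function names are hypothetical, and treating every clicked record as relevant is a simplifying assumption. The sketch writes judgments and rankings in the plain-text qrels and run formats that trec_eval reads, and computes mean reciprocal rank directly only to show what such a measure captures.

```python
from collections import defaultdict

# Hypothetical click log: (query, clicked record id). These values are
# illustrative only and do not reflect the real Europeana log schema.
click_log = [
    ("mona lisa", "/rec/1"),
    ("mona lisa", "/rec/3"),
    ("mona lisa", "/rec/1"),
    ("vermeer", "/rec/7"),
]

# Hypothetical ranked results returned by the search engine per query.
rankings = {
    "mona lisa": ["/rec/2", "/rec/1", "/rec/3"],
    "vermeer": ["/rec/7", "/rec/8"],
}


def build_qrels(log):
    """Assumption: a record clicked at least once for a query is relevant."""
    qrels = defaultdict(set)
    for query, record in log:
        qrels[query].add(record)
    return dict(qrels)


def qrels_lines(qrels, query_ids):
    """Format judgments as trec_eval qrels: query_id iteration doc_id relevance."""
    return [
        f"{query_ids[q]} 0 {doc} 1"
        for q in sorted(qrels)
        for doc in sorted(qrels[q])
    ]


def run_lines(rankings, query_ids, tag="baseline"):
    """Format rankings as a trec_eval run: query_id Q0 doc_id rank score tag."""
    lines = []
    for q in sorted(rankings):
        docs = rankings[q]
        for rank, doc in enumerate(docs, start=1):
            score = len(docs) - rank + 1  # synthetic score, decreasing with rank
            lines.append(f"{query_ids[q]} Q0 {doc} {rank} {score} {tag}")
    return lines


def mean_reciprocal_rank(qrels, rankings):
    """Average over queries of 1 / rank of the first relevant result (0 if none)."""
    total = 0.0
    for q, relevant in qrels.items():
        for rank, doc in enumerate(rankings.get(q, []), start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(qrels) if qrels else 0.0


qrels = build_qrels(click_log)
query_ids = {q: str(i) for i, q in enumerate(sorted(qrels), start=1)}
mrr = mean_reciprocal_rank(qrels, rankings)
```

In practice the lines returned by `qrels_lines` and `run_lines` would be written to two files and passed to trec_eval, which computes the measures itself. Clicks are a noisy relevance signal (for example, they are biased towards top-ranked results), so a real setup may filter or weight them differently.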

Several studies have been conducted to describe the users, usage, and information-seeking behaviour of Europeana users, with the purpose of improving the performance and/or usability of the search functionality:

Some related reports are focused exclusively on the usability of the Europeana Website in general:

In this document we can find information about current and past internal evaluations of our services.

Additional general resources: