Evaluation
Evaluation, covering both technical aspects of search such as ranking and more general considerations such as user satisfaction and usability, is a significant concern for Europeana.
In the DSI projects, the first attempts came from MS30 Search Improvement Plan, with progress reported in MS31 Search Improvement Report.
The first attempt to define an overall strategy for evaluating search in Europeana was made during DSI-2; it is presented in DSI-2 D6.3 Search Improvement Report.
In DSI-3, progress was reported in the successive iterations of deliverables C.2 'usage patterns' and C.3 'data access patterns': C.3 M4, C.2 M8, C.2 M12. Other reports of interest (not produced by the R&D team) can be found in C.2 M4, C.3 M8 and C.3 M12 (e.g., top 10 searches).
In DSI-4, progress was reported in the deliverable C.2 Users and Usage Report M5.
In 2015 the company 904Labs created a ground truth from the information collected in our logs. The code to run performance evaluations based on this ground truth can be found on GitHub. Additional internal discussions about the creation of the ground truth:
- Statistical In-Depth Analysis of Europeana’s Query Logs
- Progress Report for Query Analysis
- Bots distribution
- List of query reformulations
- Search evaluation on sampled queries and inferred relevance assessments
- Top 20 queries distributed per language
- Minutes of calls: https://basecamp.com/1768384/projects/5774755/documents
- Testing of the service
Currently, search performance is evaluated on the basis of user behaviour (clicks as the criterion for relevance) as collected in our log system; the code is also on GitHub. First, user interactions are downloaded with this software. Second, the data is analysed: previously with ad-hoc Python code, and currently (since 2019) with a combination of ad-hoc Java code and the standard Information Retrieval evaluation tool trec_eval. The complete procedure to reproduce this process is available here.
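The click-based step can be illustrated with a minimal sketch: derive a relevance-judgement (qrels) file in the standard TREC format from a click log, treating a clicked result as relevant for the query that produced it. This is not the actual Europeana or 904Labs code; the log fields and file layout below are assumptions for illustration only.

```python
# Minimal sketch (hypothetical field names): turn a click log into TREC qrels,
# where a click counts as a relevance judgement of 1, a skip as 0.
import csv
import io

# Hypothetical click-log excerpt: query_id, document_id, clicked (1/0)
click_log = io.StringIO(
    "q1,doc_a,1\n"
    "q1,doc_b,0\n"
    "q2,doc_c,1\n"
)

qrels_lines = []
for query_id, doc_id, clicked in csv.reader(click_log):
    # TREC qrels format: <topic> <iteration> <docno> <relevance>
    qrels_lines.append(f"{query_id} 0 {doc_id} {clicked}")

print("\n".join(qrels_lines))
```

A file written this way can then be passed to trec_eval together with a ranked run file to compute the standard metrics (e.g. MAP, nDCG).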
Several studies have been conducted to describe the users, usage, and information-seeking behaviour on Europeana, with the purpose of improving the performance and/or usability of the search functionality:
- Europeana 2012-2013: usage and performance update
- http://ciber-research.eu/download/20130623-Europeana_2013_usage_and_performance_update.pdf
- http://ciber-research.eu/Europeana/EuropeanaConnect_D3.1.3_LogAnalysisReport-0.99.p
- Improving Europeana Search Experience Using Query Logs
- http://link.springer.com/chapter/10.1007%2F978-3-642-24469-8_39
- http://miles.isti.cnr.it/~nardini/wp-content/uploads/2011/06/tpdl2011.pdf
- Qualitative Analysis of Search Patterns and Success in The European Library, by Humboldt, looking at search success, query reformulations, etc.: https://www.researchgate.net/publication/280924943_How_We_Are_Searching_Cultural_Heritage_A_Qualitative_Analysis_of_Search_Patterns_and_Success_in_The_European_Library
Some related reports focus exclusively on the usability of the Europeana website in general:
- Evaluation Report of the Usability of the Europeana Website: http://pro.europeana.eu/c/document_library/get_file?uuid=ae1d74de-29c1-463c-887e-a6bc6ee0ed7a&groupId=10602
- User Centric Evaluation of the Europeana Digital Library: http://link.springer.com/chapter/10.1007%2F978-3-642-13654-2_19
- Europeana Connect, analysing users’ attitudes and needs and opening new ways of discovering cultural heritage in Europeana: see the results
- [Internal document] Chenchen Sheng's report on Design for User Engagement in Europeana Collection
This document collects information about current and past internal evaluations of our services.
Additional general resources: