This page gathers material and results for an experiment in the context of the EuropeanaTech task force on enrichment and evaluation.
The focus is mainly on the method: we want to define and encourage a unified approach to evaluation in our community, and perhaps also to unify APIs (since the same script should be able to run the different enrichment services). We do not want to limit the scope to a particular country, nor even to a particular type of enrichment tool (it could be a concept or person detection service).
Current participants: Daniel, Dimitris, Hugo, Nuno, Vladimir, Aitor, Rainer
One line for each enrichment (i.e. link): <ProvidedCHO>;<property>;<target>;<confidence>;<source_text> Meaning:
Example: http://data.theeuropeanlibrary.org/BibliographicResource/2000085482942;dcterms:spatial;http://dbpedia.org/resource/Prague;0.9;Praha
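As an illustration only, here is a minimal Python sketch of how a script could read one such line, following the semicolon-separated example above. The names `Enrichment` and `parse_enrichment` are hypothetical and not part of any participant's tooling. The sketch assumes that the first three fields never contain a semicolon, and that the confidence and source text may be missing (as in result sets delivered without source_text).

```python
from typing import NamedTuple, Optional


class Enrichment(NamedTuple):
    """One enrichment (link) as described by the line format above."""
    cho: str                     # <ProvidedCHO>: URI of the enriched object
    prop: str                    # <property>: e.g. dcterms:spatial
    target: str                  # <target>: URI of the linked entity
    confidence: Optional[float]  # <confidence>: None when absent
    source_text: str             # <source_text>: empty string when absent


def parse_enrichment(line: str) -> Enrichment:
    """Parse one semicolon-separated enrichment line (hypothetical helper)."""
    # Split on the first four semicolons only, so that a source text
    # which itself contains ';' is kept intact in the last field.
    parts = line.rstrip("\r\n").split(";", 4)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(parts)}: {line!r}")
    # Assumption: confidence and source_text are optional, so pad them.
    parts += [""] * (5 - len(parts))
    cho, prop, target, conf, text = (p.strip() for p in parts)
    return Enrichment(cho, prop, target, float(conf) if conf else None, text)


if __name__ == "__main__":
    example = (
        "http://data.theeuropeanlibrary.org/BibliographicResource/2000085482942;"
        "dcterms:spatial;http://dbpedia.org/resource/Prague;0.9;Praha"
    )
    e = parse_enrichment(example)
    print(e.prop, e.target, e.confidence, e.source_text)
```

Keeping the parser tolerant of missing trailing fields means the same script can read every result file listed below, whatever a given service was able to export.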
Europeana | file:enrich.europeana.csv in all.zip | |
TEL | file:enrich.tel.csv in all.zip | |
LoCloud | file:loCloud.zip in all.zip | Two results: one using the English background link service and another using the vocabulary match service. There is no source_text in any record; is it possible to fix this? |
Pelagios (Simon Rainer) | file:pelagios-wikidata.csv.zip in all.zip. Coreferenced to GeoNames and DBpedia: file:enrich.pelagios.coref.csv in all.zip | |
SILK (Daniel) | file:dct_spatial_dbpedia.csv in all.zip | For now, this includes only dct:spatial enrichments to DBpedia, produced with Silk. |
Ontotext | file:ontotext.tar.gz in all.zip. Coreferenced to DBpedia: file:enrich.ontotext.v1.coref.zip in all.zip, file:enrich.ontotext.v2.coref.zip in all.zip | Two versions, both using Ontotext's concept extractor. Both versions return rich results for English (general concepts, not limited by type), but are limited to Person and Place for other languages. The versions differ in how text is classified as "English" or "other language": by the record language or by the language tag of literals. |
Agreements between results: