Evaluation for wikidata
Evaluator: Automated Evaluation
Evaluated on: 2026-01-06
⚠️ Automated Evaluation: This evaluation was generated automatically using an AI-based system. It is distinct from manual evaluations curated by human experts. Please review findings carefully and report any inaccuracies.
Evaluation Criteria: This evaluation uses the KG-Registry evaluation rubric as described in Cortes et al. (2025). The rubric assesses knowledge graphs across multiple dimensions including access, provenance, documentation, maintenance, and fitness for purpose.
Access Level and Types
| Question | Answer | Comment |
|---|---|---|
| Access to data outside of the knowledge graph | Y | Wikidata web portal at wikidata.org provides a browsing and editing interface for 119+ million data items |
| API or online access to the knowledge graph | Y | SPARQL query service at query.wikidata.org supports complex semantic queries; RESTful action API available |
| Multiple access options available | Y | Five documented access methods: web portal, SPARQL endpoint, query editor, bulk dumps, and REST API |
| Source code availability | Y | MediaWiki-based platform; source code publicly available through Wikimedia organization repositories |
| Downloadable knowledge graph | Y | Complete database dumps in JSON, RDF/XML, and TTL formats available as compressed archives at dumps.wikimedia.org |
Section Score: 5/5
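As an illustration of the SPARQL access route noted in the table, the following is a minimal sketch of how a request URL for the query service might be assembled. The `/sparql` path, the `format=json` parameter, the helper name `build_sparql_url`, and the example query (instances of Q146) are assumptions chosen for illustration and are not part of this evaluation. No network request is made.

```python
from urllib.parse import urlencode

# Assumed endpoint path on the query service host named in the evaluation.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_sparql_url(query: str) -> str:
    """Return a GET URL that submits `query` and asks for JSON results."""
    params = {"query": query, "format": "json"}
    return SPARQL_ENDPOINT + "?" + urlencode(params)


# Illustrative query only: up to five items that are instances (P31) of Q146.
EXAMPLE_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q146 .
}
LIMIT 5
"""

url = build_sparql_url(EXAMPLE_QUERY)
print(url)
```

The resulting URL can be passed to any HTTP client. Automated clients would normally also send a descriptive User-Agent header.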
Provenance of Nodes and Edges
| Question | Answer | Comment |
|---|---|---|
| Source list provided | Y | Data sourced from Wikipedia articles, Wikimedia projects, and external linked open data sources |
| Source versions information | N | No explicit versioning of upstream Wikipedia or external data sources; continuous updates without version tracking |
| Import dependencies | Y | Explicit relationships documented for Wikipedia article mappings and interlinking with VIAF, GND, and other identifiers |
| Node and edge sources | Y | Each item traceable to Wikimedia source; edit history provides attribution of knowledge statements |
| Edges deduplication | N | While the community identifies duplicates, no formal deduplication algorithm is documented; merging is handled ad hoc |
| Triples source details | N | RDF export schema partially documented but source attribution for individual statements not explicit |
| Edge type schema | N | Extensive property vocabulary used but mapping to standard ontologies (RDF, OWL) not fully formalized |
Section Score: 3/7
Documented standards, schema, construction
| Question | Answer | Comment |
|---|---|---|
| Biological usable data | Y | Wikidata includes extensive biomedical data: genes, proteins, diseases, drugs, and biological pathways |
| Resolvable IDs | Y | Wikidata IDs (Q-identifiers) are stable and resolvable; cross-references to standard identifiers (UniProt, NCBI Gene, etc.) |
| Construction documentation | N | While edit history is transparent, formal KG construction methodology is not documented; construction follows a wiki-based, community-editing approach |
| Transformation documentation | N | RDF/TTL dump generation process not formally documented; no data quality control procedures published |
| Schema used | N | Uses Wikidata property model; mapping to RDF schema and standard ontologies incomplete |
Section Score: 2/5
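To make the "Resolvable IDs" finding concrete, here is a small sketch of how a Q-identifier might be turned into a concept URI and a machine-readable data URL. The URL patterns, the helper names `entity_uri` and `entity_data_url`, and the example identifier Q42 are assumptions for illustration and are not taken from this evaluation.

```python
import re

# Q-identifiers are the letter Q followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def _check_qid(qid: str) -> str:
    """Raise ValueError unless `qid` looks like a Wikidata item identifier."""
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a Q-identifier: {qid!r}")
    return qid


def entity_uri(qid: str) -> str:
    """Concept URI for an item (assumed pattern)."""
    return f"http://www.wikidata.org/entity/{_check_qid(qid)}"


def entity_data_url(qid: str, fmt: str = "json") -> str:
    """URL for a serialized copy of an item, e.g. fmt='json' or 'ttl' (assumed pattern)."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{_check_qid(qid)}.{fmt}"


print(entity_uri("Q42"))
print(entity_data_url("Q42", "ttl"))
```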
Update frequency and versioning
| Question | Answer | Comment |
|---|---|---|
| Stable versions | Y | Database dumps published regularly with dates; snapshot releases available for reproducibility |
| Public tracker information | N | Phabricator system used for tracking but not specifically scoped to Wikidata KG development |
| Knowledge graph contact information | Y | Wikimedia Foundation provides support; contact available through wikidata.org/wiki/Wikidata:Contact |
| Updated annually | Y | Continuously updated knowledge base; new dumps published regularly (weekly/monthly snapshot releases) |
| Prior versions access | Y | Historical dumps available through archive; complete edit history accessible for any item or statement |
Section Score: 4/5
Evaluation - Metrics and Fitness for Purpose
| Question | Answer | Comment |
|---|---|---|
| Use case provided | Y | Clear use cases: central structured data repository for Wikipedia, machine-readable linked data access, integration hub |
| Evaluation against other models | N | No formal comparison with other general-purpose knowledge graphs (DBpedia, YAGO); relative completeness not quantified |
| Defined scope | Y | Scope well-defined: comprehensive general knowledge base with 119+ million items covering all domains |
| Multiple evaluation methods | N | No systematic evaluation framework published; quality assessment relies on community review and edit history |
| Accuracy metrics | N | No reported accuracy metrics, precision/recall, or data quality benchmarks; community-driven validation |
Section Score: 2/5
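The section scores reported so far can be reproduced with a short tally, sketched below under the assumption that each section score is simply the number of "Y" answers over the number of questions. The combined total is shown only as arithmetic over these five sections; it excludes License Information, and the rubric may weight or aggregate sections differently.

```python
# Answers transcribed from the tables in this evaluation, in row order.
ANSWERS = {
    "Access Level and Types": "YYYYY",
    "Provenance of Nodes and Edges": "YNYYNNN",
    "Documented standards, schema, construction": "YYNNN",
    "Update frequency and versioning": "YNYYY",
    "Evaluation - Metrics and Fitness for Purpose": "YNYNN",
}


def section_score(answers: str) -> tuple[int, int]:
    """Return (number of Y answers, number of questions) for one section."""
    return answers.count("Y"), len(answers)


scores = {name: section_score(answers) for name, answers in ANSWERS.items()}
total_yes = sum(yes for yes, _ in scores.values())
total_questions = sum(count for _, count in scores.values())

for name, (yes, count) in scores.items():
    print(f"{name}: {yes}/{count}")
print(f"Combined (excluding License Information): {total_yes}/{total_questions}")
```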
License Information
| Question | Answer | Comment |
|---|---|---|
| License |