Evaluation for Hetionet
Evaluator: Not specified
Evaluated on: 2025-08-14
This is a manual evaluation intended to identify potential barriers to reuse.
Access Level and Types
Question | Answer | Comment |
---|---|---|
Access to data outside of the knowledge graph | Y | Paths, degree-weighted path counts (DWPCs), prediction probabilities, and network support breakdowns for compound–disease pairs are accessible via the Neo4j Browser and its guides |
API or online access to the knowledge graph | Y | Fully hosted on a public Neo4j instance that supports Cypher queries and provides guides and tutorials: https://neo4j.het.io/browser/ |
Multiple access options available | Y | Downloadable as JSON, Neo4j DB, and TSV; also queryable online in the Neo4j Browser; source code and intermediate datasets on GitHub, Zenodo, and Figshare |
Source code availability | Y | Source code and scripts are public via hetio and GitHub, as linked in the paper: https://github.com/elifesciences-publications/hetionet |
Downloadable knowledge graph | Y | Multiple export formats (JSON, Neo4j dump, TSV) |
Section Score: 5/5
Provenance of Nodes and Edges
Question | Answer | Comment |
---|---|---|
Source list provided | Y | 29 sources documented; each node/edge carries source information in its properties; full list with versions in the paper: https://elifesciences.org/articles/26726 |
Source versions information | Y | Versions noted: e.g., DrugBank v4.2, SIDER v4.1, LINCS L1000 (Oct 2015), Pathway Commons (with date) |
Import dependencies | Y | Input ontologies and databases fully listed with versions; intermediate resources also described (e.g., STARGEO, PharmacotherapyDB) |
Node and edge sources | Y | Node/edge properties include URLs, source, license, confidence scores (for applicable edges) |
Edge deduplication | Y | Redundant pathways merged; multiple studies supporting the same edge consolidated; non-informative gene sets removed |
Triples source details | Y | Explicit per metaedge: e.g., binding affinities (≤1 μM), co-occurrence p-values (MEDLINE), gene interaction specifics |
Edge type schema | Y | Clear metagraph with 11 node types & 24 metaedges; each with documented origin & justification |
Section Score: 7/7
Documented standards, schema, construction
Question | Answer | Comment |
---|---|---|
Biological usable data | Y | Uses standard biomedical IDs: Entrez, UMLS, MeSH, DO, Uberon |
Resolvable IDs | Y | Entrez Gene, DOID, MeSH IDs, InChIKeys used for easy cross-referencing |
Construction documentation | Y | Extensive: paper + Thinklab logs + GitHub issues + detailed guides |
Transformation documentation | Y | Explained pruning (e.g., filtering Uberon terms, merging pathways, restricting GO terms by size) |
Schema used | Y | Metagraph is the explicit schema; node and edge types clearly defined |
Section Score: 5/5
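As a rough illustration of what "resolvable" means for the identifier types listed above, the following sketch builds lookup URLs from identifiers. It assumes the identifiers.org compact-identifier URL scheme. The mapping name `IDENTIFIERS_ORG_PREFIX`, the helper `resolvable_url`, and the short type keys are hypothetical, and Hetionet itself may store identifiers in a different form.

```python
# Hypothetical mapping from an identifier type to its identifiers.org prefix.
# Assumption: identifiers are resolved through identifiers.org compact URLs.
IDENTIFIERS_ORG_PREFIX = {
    "entrez_gene": "ncbigene",
    "doid": "DOID",
    "mesh": "mesh",
    "inchikey": "inchikey",
}


def resolvable_url(id_type, local_id):
    """Build an identifiers.org URL for one identifier (illustrative helper)."""
    prefix = IDENTIFIERS_ORG_PREFIX[id_type]
    return f"https://identifiers.org/{prefix}:{local_id}"


# Example identifiers chosen only for illustration.
print(resolvable_url("entrez_gene", "7157"))
print(resolvable_url("mesh", "D009369"))
```

An unknown identifier type raises `KeyError`, which makes unsupported namespaces visible instead of silently producing a broken link.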
Update frequency and versioning
Question | Answer | Comment |
---|---|---|
Stable versions | Y | Partial: labeled “v1.0”, but no formal version history beyond the initial release |
Public tracker information | Y | Partial: Thinklab is now static; issues can be filed on GitHub |
Knowledge graph contact information | Y | Daniel Himmelstein and team, contactable via GitHub, Thinklab archives, paper |
Updated annually | N | Only v1.0 publicly released so far |
Prior versions access | N | Earlier versions are mentioned, but no archived downloads are listed |
Section Score: 3/5
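The section scores in this evaluation appear to be the number of Y answers over the number of questions, with "Partial" comments still counted as Y. A minimal sketch of that tally follows. The helper name `section_score` is hypothetical, and the counting rule is inferred from the tables, not stated by the evaluator.

```python
def section_score(answers):
    """Return (yes_count, total) for a list of 'Y'/'N' answers.

    Assumption: a 'Y' counts as one point even when the comment says 'Partial'.
    """
    normalized = [answer.strip().upper() for answer in answers]
    yes_count = sum(answer == "Y" for answer in normalized)
    return yes_count, len(normalized)


# Answers from the "Update frequency and versioning" table, in row order.
versioning_answers = ["Y", "Y", "Y", "N", "N"]
yes_count, total = section_score(versioning_answers)
print(f"Section Score: {yes_count}/{total}")
```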
Evaluation - Metrics and Fitness for Purpose
Question | Answer | Comment |
---|---|---|
Use case provided | Y | Nicotine dependence (bupropion), epilepsy predictions (acamprosate) |
Evaluation against other models | Y | Compared to PREDICT, Guney et al., Cheng et al.; used baselines & permutation |
Defined scope | Y | Designed for systematic drug repurposing + broader knowledge integration |
Multiple evaluation methods | Y | DWPC + AUROC + permutation + cross-validation + external test sets (DrugCentral, ClinicalTrials.gov) |
Accuracy metrics | Y | Probability scores, cross-validated elastic net, path-level contribution breakdowns, AUROC |
Section Score: 5/5
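For readers unfamiliar with the DWPC metric mentioned above, the sketch below computes a degree-weighted path count on a toy graph for a single metapath (Compound, binds, Gene, associates, Disease). It follows the usual definition as I understand it: each path contributes the product of its metaedge-specific node degrees, each raised to the power -w, and the DWPC is the sum over all matching paths. The function name `dwpc_cbgad`, the toy edges, and the default damping exponent `w=0.4` are assumptions for illustration, not the project's implementation.

```python
from collections import defaultdict


def dwpc_cbgad(binds, associates, compound, disease, w=0.4):
    """Toy DWPC for the metapath Compound-binds-Gene-associates-Disease.

    binds: iterable of (compound, gene) edges.
    associates: iterable of (gene, disease) edges.
    w: damping exponent (assumed default; w=0 gives the plain path count).
    """
    # Build one adjacency map per direction of each metaedge.
    compound_to_genes, gene_to_compounds = defaultdict(set), defaultdict(set)
    for c, g in binds:
        compound_to_genes[c].add(g)
        gene_to_compounds[g].add(c)
    gene_to_diseases, disease_to_genes = defaultdict(set), defaultdict(set)
    for g, d in associates:
        gene_to_diseases[g].add(d)
        disease_to_genes[d].add(g)

    total = 0.0
    for gene in compound_to_genes.get(compound, set()):
        if disease not in gene_to_diseases.get(gene, set()):
            continue  # this gene does not complete a path to the disease
        # Metaedge-specific degrees of the nodes along each edge of the path.
        degrees = [
            len(compound_to_genes[compound]),
            len(gene_to_compounds[gene]),
            len(gene_to_diseases[gene]),
            len(disease_to_genes[disease]),
        ]
        path_degree_product = 1.0
        for degree in degrees:
            path_degree_product *= degree ** -w
        total += path_degree_product
    return total


# Toy edges, invented purely for illustration.
toy_binds = [("C1", "G1"), ("C1", "G2"), ("C2", "G1")]
toy_associates = [("G1", "D1"), ("G2", "D1"), ("G2", "D2")]
print(dwpc_cbgad(toy_binds, toy_associates, "C1", "D1"))
```

Setting `w=0` disables the degree weighting, so the result reduces to the number of paths between the two nodes. Larger values of `w` down-weight paths that pass through highly connected nodes.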
License Information
Question | Answer | Comment |
---|---|---|
License | CC0 1.0 | |