Evaluation for petagraph
Evaluator: Not specified
Evaluated on: 2025-08-26
This is a manual evaluation intended to identify potential barriers to reuse.
Access Level and Types
Question | Answer | Comment |
---|---|---|
Access to data outside of the knowledge graph | Y | Openly licensed portions of the graph are available from https://osf.io/6jtc9/files/osfstorage |
API or online access to the knowledge graph | Y | Partially. A Neo4j dump can be downloaded if you have a UMLS license key, and the GitHub repository has instructions for building a Neo4j database from sources. There is no hosted instance; you need to host it yourself. |
Multiple access options available | Y | Partially. The completed graph can only be downloaded as a Neo4j dump. The components used to build the graph can also be downloaded, but that arguably falls short of a full yes. |
Source code availability | Y | ETL of upstream sources lives in https://github.com/x-atlas-consortia/ubkg-etl/ |
Downloadable knowledge graph | Y | With a UMLS API key |
Section Score: 5/5
Provenance of Nodes and Edges
Question | Answer | Comment |
---|---|---|
Source list provided | Y | Methods section of paper https://www.nature.com/articles/s41597-024-04070-w#Sec2 |
Source versions information | Y | Partially; some source versions are listed in https://github.com/TaylorResearchLab/Petagraph/tree/main/Scientific_Data_2024 |
Import dependencies | Y | ubkg-etl has requirements.txt; Petagraph has requirements-test.txt |
Node and edge sources | Y | Nodes and edges either come from existing ontologies or are added from one file per data source that lists the edges and nodes being added. |
Edges deduplication | Y | Bidirectional edges are used only for Concept–Concept relationships; all other edges are unidirectional. Redundancies are minimized using binning and source normalization. |
Triples source details | Y | Methods section of the paper |
Edge type schema | Y | Custom schema defined in paper |
Section Score: 7/7
Documented standards, schema, construction
Question | Answer | Comment |
---|---|---|
Biological usable data | Y | CSV files |
Resolvable IDs | Y | Concept CUIs with codes corresponding to common external IDs |
Construction documentation | Y | Well-documented GitHub repository |
Transformation documentation | Y | Guidelines for formatting ontologies are in the user guide; the methods explain the preprocessing steps |
Schema used | Y | Starts with the ontologies and standards in the UBKG and adds omics data according to the schema defined in their paper |
Section Score: 5/5
Update frequency and versioning
Question | Answer | Comment |
---|---|---|
Stable versions | N | Lists the date last updated but no version number |
Public tracker information | N | |
Knowledge graph contact information | Y | Contributors are listed at https://osf.io/6jtc9/ |
Updated annually | Y | Frequent small updates |
Prior versions access | Y | Recent activity documents the date of every change and the person who made it, but prior versions are not accessible: https://osf.io/6jtc9/ |
Section Score: 3/5
Evaluation - Metrics and Fitness for Purpose
Question | Answer | Comment |
---|---|---|
Use case provided | Y | Several in paper |
Evaluation against other models | N | |
Defined scope | Y | Integrating/analyzing multiomics datasets |
Multiple evaluation methods | Y | Link prediction tasks (AUROC, precision-recall), top tissues associated with a disease, and shortest-path analysis of subgraphs |
Accuracy metrics | Y | AUC-ROC, precision-recall curves, common neighbors vs. random, structural metrics compared to randomized graphs |
Section Score: 4/5
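The section scores above appear to equal the number of Y answers divided by the number of questions in each section. That rule is inferred from the tables rather than stated by the evaluator. A minimal Python sketch of the tally under that assumption follows; the helper name `section_score` is hypothetical, and the answers are transcribed by hand from the tables.

```python
def section_score(answers):
    """Return 'yes_count/total' for one section's list of Y/N answers."""
    # Assumption: a section's score is simply the count of "Y" answers.
    yes = sum(1 for a in answers if a.strip().upper() == "Y")
    return f"{yes}/{len(answers)}"


# Answers transcribed from the tables in this evaluation, in row order.
sections = {
    "Access Level and Types": ["Y", "Y", "Y", "Y", "Y"],
    "Provenance of Nodes and Edges": ["Y", "Y", "Y", "Y", "Y", "Y", "Y"],
    "Documented standards, schema, construction": ["Y", "Y", "Y", "Y", "Y"],
    "Update frequency and versioning": ["N", "N", "Y", "Y", "Y"],
    "Evaluation - Metrics and Fitness for Purpose": ["Y", "N", "Y", "Y", "Y"],
}

scores = {name: section_score(answers) for name, answers in sections.items()}

for name, score in scores.items():
    print(f"{name}: {score}")
```

Under this assumption the computed values match the reported section scores (5/5, 7/7, 5/5, 3/5, 4/5). Answers qualified as partial in the comments still count as a full Y in this tally.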
License Information
Question | Answer | Comment |
---|---|---|
License | CC BY-NC-ND 4.0 | |