Evaluation for Petagraph

Evaluator: Not specified

Evaluated on: 2025-08-26

This is a manual evaluation intended to identify potential barriers to reuse.


Access Level and Types

Question | Answer | Comment
Access to data outside of the knowledge graph | Y | Open-licensed portions of the graph are available from https://osf.io/6jtc9/files/osfstorage
API or online access to the knowledge graph | Y(ish) | A Neo4j dump can be downloaded with a UMLS license key, and the GitHub repository has instructions for building a Neo4j database from sources. Qualified because the graph is not hosted anywhere; you need to host it yourself.
Multiple access options available | Y(ish) | The completed graph can only be downloaded as a Neo4j dump. The components used to build the graph can also be downloaded, but that does not quite amount to a full yes.
Source code availability | Y | ETL of upstream sources lives in https://github.com/x-atlas-consortia/ubkg-etl/
Downloadable knowledge graph | Y | Requires a UMLS API key

Section Score: 5/5

Provenance of Nodes and Edges

Question | Answer | Comment
Source list provided | Y | Methods section of the paper: https://www.nature.com/articles/s41597-024-04070-w#Sec2
Source versions information | Y(ish) | Some source versions are listed in https://github.com/TaylorResearchLab/Petagraph/tree/main/Scientific_Data_2024
Import dependencies | Y | UBKG ETL has requirements.txt; Petagraph has requirements-test.txt
Node and edge sources | Y | Nodes and edges either come from existing ontologies or are listed in one file per data source containing the nodes and edges being added.
Edges deduplication | Y | Bidirectional edges are used only for Concept–Concept relationships; other edges are unidirectional. Redundancies are minimized using binning and source normalization.
Triples source details | Y | Methods section of the paper
Edge type schema | Y | Custom schema defined in the paper

Section Score: 7/7

Documented standards, schema, construction

Question | Answer | Comment
Biological usable data | Y | CSV files
Resolvable IDs | Y | Concept CUIs with codes corresponding to common external IDs
Construction documentation | Y | Well-documented GitHub repository
Transformation documentation | Y | Guidelines for formatting ontologies are in the user guide; the methods explain the preprocessing steps
Schema used | Y | Starts with the ontologies and standards in the UBKG and adds omics data according to the schema defined in their paper

Section Score: 5/5

Update frequency and versioning

Question | Answer | Comment
Stable versions | N | Lists the date last updated but no version number
Public tracker information | N |
Knowledge graph contact information | Y | Contributors listed at https://osf.io/6jtc9/
Updated annually | Y | Frequent small updates
Prior versions access | Y | The recent activity log documents the date of every change and the person who made it, but prior versions are not accessible: https://osf.io/6jtc9/

Section Score: 3/5

Evaluation - Metrics and Fitness for Purpose

Question | Answer | Comment
Use case provided | Y | Several in the paper
Evaluation against other models | N |
Defined scope | Y | Integrating and analyzing multiomics datasets
Multiple evaluation methods | Y | Link prediction tasks (auROC, precision-recall), top tissues associated with a disease, and shortest-path analysis of subgraphs
Accuracy metrics | Y | AUC-ROC, precision-recall curves, common neighbors vs. random, structural metrics compared to randomized graphs

Section Score: 4/5
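
The section scores in this evaluation are consistent with counting each Y answer, including qualified Y(ish) answers, as one point out of the number of questions in the section. The evaluation does not state its scoring rule, so that rule is an assumption, and the `answers` dictionary and `section_score` helper below are hypothetical names used only for illustration. A minimal sketch under those assumptions:

```python
# Answers transcribed from the tables in this evaluation.
# Assumption: any answer starting with "Y" (including "Y(ish)") earns one point.
answers = {
    "Access Level and Types": ["Y", "Y(ish)", "Y(ish)", "Y", "Y"],
    "Provenance of Nodes and Edges": ["Y", "Y(ish)", "Y", "Y", "Y", "Y", "Y"],
    "Documented standards, schema, construction": ["Y", "Y", "Y", "Y", "Y"],
    "Update frequency and versioning": ["N", "N", "Y", "Y", "Y"],
    "Evaluation - Metrics and Fitness for Purpose": ["Y", "N", "Y", "Y", "Y"],
}


def section_score(section_answers):
    """Return (points, total) for one section under the assumed rule."""
    points = sum(
        1 for answer in section_answers if answer.strip().upper().startswith("Y")
    )
    return points, len(section_answers)


for section, section_answers in answers.items():
    points, total = section_score(section_answers)
    print(f"{section}: {points}/{total}")
```

Under this rule a Y(ish) answer counts the same as an unqualified Y, which is why the qualified answers do not lower the scores reported in this evaluation.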

License Information

Question | Answer | Comment
License | CC BY-NC-ND 4.0 |