Evaluation for rtx-kg2

Evaluator: Not specified

Evaluated on: 2025-08-26

This is a manual evaluation intended to identify potential barriers to reuse.


Access Level and Types

Question | Answer | Comment
Access to data outside of the knowledge graph | Y | RTX-KG2 provides various biomedical information through queries. It is also used as a backend to support ARAX's path reasoning and path ranking (https://github.com/RTXteam/RTX).
API or online access to the knowledge graph | Y | Accessible through API queries and Neo4j.
Multiple access options available | Y | Multiple access options, including downloadable versions, an API (SmartAPI), and a web browser user interface (which does not seem to be working currently).
Source code availability | Y | GitHub (https://github.com/RTXteam/RTX-KG2)
Downloadable knowledge graph | Y | Downloadable versions are available on GitHub.

Section Score: 5/5

Provenance of Nodes and Edges

Question | Answer | Comment
Source list provided | Y | 70 sources are listed in Table 1 (UMLS, SemMedDB, ChEMBL, DrugBank, Reactome, SMPDB, and 64 additional knowledge sources).
Source versions information | Y | It documents the versions of the upstream sources used (https://github.com/RTXteam/RTX-KG2/blob/master/docs/kg2-versions.md).
Import dependencies | Y | Listed in the requirements file.
Node and edge sources | Y | Each node's ID contains source information, and each edge contains its primary knowledge source.
Edges deduplication | Y | It provides a pre-canonicalized graph version (RTX-KG2pre, with semantically duplicated concepts) and a canonicalized version (RTX-KG2c, without semantically duplicated concepts).
Triples source details | Y | In the final output KG, each edge includes the source that created that triple.
Edge type schema | Y | It uses the Biolink Model as the schema standard for both nodes and edges.

Section Score: 7/7
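
The comment on node and edge sources can be illustrated with a minimal sketch. It assumes node IDs are CURIEs of the form PREFIX:LOCAL_ID, as used with the Biolink Model, and that each edge carries a primary_knowledge_source property. The helper name curie_prefix, the sample identifiers, and the edge record are hypothetical and are not taken from RTX-KG2 output.

```python
def curie_prefix(node_id: str) -> str:
    """Return the prefix of a CURIE-style identifier (text before the first colon)."""
    prefix, sep, local_id = node_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a CURIE: {node_id!r}")
    return prefix


# Hypothetical edge record, for illustration only (not real KG2 data).
edge = {
    "subject": "DRUGBANK:DB_EXAMPLE",
    "predicate": "biolink:related_to",
    "object": "MONDO:EXAMPLE",
    "primary_knowledge_source": "infores:example-source",
}

# The node-level source is read from the identifier prefix,
# the edge-level source from the edge's own property.
subject_source = curie_prefix(edge["subject"])
object_source = curie_prefix(edge["object"])
edge_source = edge["primary_knowledge_source"]
```

Reading the node source from the identifier prefix only works when identifiers keep their original namespace, which may not hold for every node after canonicalization.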

Documented standards, schema, construction

Question | Answer | Comment
Biological usable data | Y | It is used for other biological applications such as answering translational science questions, drug repositioning, identifying new therapeutic targets, and understanding drug mechanisms.
Resolvable IDs | Y | It uses resolvable IDs for the entities.
Construction documentation | Y | Its GitHub repo has clear, step-by-step documentation on construction.
Transformation documentation | Y | In the Appendix.
Schema used | Y | Biolink Model, with an extract-transform-load (ETL) approach for construction.

Section Score: 5/5

Update frequency and versioning

Question | Answer | Comment
Stable versions | Y | It uses semantic versioning (e.g., KG2.7.3).
Public tracker information | Y | It provides a public tracker for requests and bug reports on its GitHub repo.
Knowledge graph contact information | Y | It provides contact information for the KG2 Team.
Updated annually | Y | Once per month (mentioned in the Discussion).
Prior versions access | Y | Prior versions are accessible (https://github.com/ncats/translator-lfs-artifacts/blob/main/README.md) with documented changes (https://github.com/RTXteam/RTX-KG2/blob/master/docs/kg2-versions.md).

Section Score: 5/5
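
Because version labels follow a dotted numeric pattern, they can be compared numerically, not as strings. The following is a minimal sketch, assuming a label has the form KG followed by three dot-separated integers, as in the KG2.7.3 example above. The function name parse_kg_version and every label other than KG2.7.3 are hypothetical.

```python
import re
from typing import Tuple


def parse_kg_version(label: str) -> Tuple[int, int, int]:
    """Parse a label such as 'KG2.7.3' into a (major, minor, patch) tuple."""
    match = re.fullmatch(r"KG(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized version label: {label!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


# Only 'KG2.7.3' appears in the evaluation; the other labels are made up.
labels = ["KG2.7.3", "KG2.10.0", "KG2.8.4"]

# Numeric ordering places 10 after 8; plain string sorting would not.
ordered = sorted(labels, key=parse_kg_version)
latest = ordered[-1]
```

Sorting on the parsed tuple avoids the common pitfall where a string comparison ranks a two-digit minor version ahead of a single-digit one.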

Evaluation - Metrics and Fitness for Purpose

Question | Answer | Comment
Use case provided | Y | It is currently used by multiple Translator reasoning agents such as ARAX (Autonomous Relay Agent X).
Evaluation against other models | Y | It is compared to four other KGs (Hetionet, SPOKE, the SRI Reference Knowledge Graph, and ROBOKOP).
Defined scope | Y | It is part of the NCATS Biomedical Data Translator project to support automated biomedical reasoning and question answering. It aims to create a semantically standardized, computable, and interoperable biomedical KG that supports translational reasoning and biomedical discovery.
Multiple evaluation methods | Y | It is evaluated not only against other KGs but also through the tools that use it, such as ARAX, mediKanren, BioThings Explorer, and ARAGORN.
Accuracy metrics | Y | The nodes and edges contain evidence, provenance, and other information for measuring accuracy and confidence.

Section Score: 5/5

License Information

Question | Answer | Comment
License | CC BY 4.0