rtx-kg2

RTX-KG2 is a knowledge graph. It is part of the Translator collection.

RTX-KG2 is a comprehensive biomedical knowledge graph that integrates information from over 80 structured knowledge sources into a semantically standardized model, supporting translational biomedicine and the ARAX biomedical reasoning system.

Domains

health, biomedical, biological systems, genomics, pharmacology

License

CC BY 4.0

Homepage

rtx-kg2

Repository

GitHub

Infores ID

infores:rtx-kg2

FAIRsharing ID

Unknown

Products

From this Resource
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| rtx-kg2.graph.nodes | RTX-KG2.10.1c KGX JSONL Nodes | kg2c-2.10.1-v1.0-nodes.jsonl.gz (359.1 MB) | GraphProduct | kgx-jsonl | Nodes for KGX distribution of the RTX... |
| rtx-kg2.graph.edges | RTX-KG2.10.1c KGX JSONL Edges | kg2c-2.10.1-v1.0-edges.jsonl.gz (1.7 GB) | GraphProduct | kgx-jsonl | Edges for KGX distribution of the RTX... |
| rtx-kg2.code | Code for building RTX-KG2 | RTX-KG2 | ProcessProduct | | Code for building RTX-KG2, in Python |
| rtx-kg2.neo4j | RTX-KG2 Neo4j | arax.ncats.io | ProgrammingInterface | | Neo4j distribution of the RTX-KG2 as ... |
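
The node and edge products above are gzipped JSON Lines files, so they can be streamed one record at a time instead of being loaded fully into memory. The following is a minimal sketch using only the Python standard library. It assumes the file has already been downloaded locally; the helper name `iter_jsonl_gz` is hypothetical and is not part of any RTX-KG2 tooling.

```python
import gzip
import json
from typing import Iterator


def iter_jsonl_gz(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a gzipped JSONL file."""
    # "rt" opens the compressed stream in text mode so lines decode as UTF-8.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                yield json.loads(line)


# Example usage (assumes the nodes file sits in the working directory):
# for node in iter_jsonl_gz("kg2c-2.10.1-v1.0-nodes.jsonl.gz"):
#     print(node)
#     break
```

Streaming matters mostly for the edges file, which is 1.7 GB compressed and would be much larger once decompressed.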

Details

RTX-KG2: A Semantically Standardized Knowledge Graph for Translational Biomedicine

RTX-KG2 is the second-generation knowledge graph for the ARAX biomedical reasoning system, developed by the Reasoning Tool X (RTX) team. It integrates information from over 80 biomedical knowledge sources, including ontologies, drug databases, gene-disease associations, protein interactions, and more.

Key Features

  • Comprehensive Integration: Combines data from diverse sources including ChEMBL, DrugBank, DisGeNET, UMLS, Gene Ontology, and many others
  • Semantic Standardization: All entities and relationships are mapped to the Biolink Model framework
  • Rich Connectivity: Contains millions of nodes and edges representing biomedical entities and their relationships
  • Multiple Formats: Available in JSON, KGX format, and as a Neo4j graph database
  • Supporting Translational Medicine: Designed to enable advanced reasoning across complex biomedical knowledge
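
Because entities are mapped to the Biolink Model, one quick way to get a feel for the graph is to tally nodes by Biolink category. The sketch below assumes each node record carries a `category` property holding either a single string or a list of strings, as is common in KGX files; the exact field layout of the RTX-KG2 distribution should be checked against the actual data. The sample records use made-up `EX:` identifiers and are purely illustrative.

```python
from collections import Counter
from typing import Iterable


def count_categories(nodes: Iterable[dict]) -> Counter:
    """Count how many node records fall under each Biolink category."""
    counts: Counter = Counter()
    for node in nodes:
        category = node.get("category", [])
        # Assumption: the value may be a bare string or a list of strings.
        if isinstance(category, str):
            category = [category]
        counts.update(category)
    return counts


# Toy records with hypothetical identifiers, not real RTX-KG2 content.
sample_nodes = [
    {"id": "EX:1", "category": ["biolink:Drug"]},
    {"id": "EX:2", "category": "biolink:Disease"},
    {"id": "EX:3", "category": ["biolink:Drug", "biolink:SmallMolecule"]},
]

category_counts = count_categories(sample_nodes)
print(category_counts["biolink:Drug"])  # prints 2
```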

Knowledge Sources

RTX-KG2 integrates data from numerous biomedical knowledge sources, including but not limited to:

Data Sources

  • ChEMBL: A manually curated database of bioactive molecules with drug-like properties
  • DrugBank: A comprehensive database containing detailed drug and drug target information
  • KEGG: The Kyoto Encyclopedia of Genes and Genomes
  • Reactome: A curated pathway database
  • DisGeNET: A database of gene-disease associations
  • SemMedDB: A repository of semantic predications extracted from biomedical literature
  • UMLS: The Unified Medical Language System
  • UniProtKB: Universal Protein Resource Knowledgebase
  • HMDB: Human Metabolome Database
  • IntAct: Molecular interaction database
  • NCBIGene: NCBI’s gene database
  • SMPDB: Small Molecule Pathway Database

Ontologies

  • Gene Ontology (GO): Comprehensive compendium of gene and gene product attributes
  • MONDO: Monarch Disease Ontology
  • Human Phenotype Ontology (HP): Standardized vocabulary of phenotypic abnormalities
  • ChEBI: Chemical Entities of Biological Interest
  • UBERON: Integrated cross-species anatomy ontology
  • NCBITaxon: NCBI Taxonomy
  • Disease Ontology (DO): Standardized ontology for human disease

Applications

RTX-KG2 serves as the foundation for biomedical reasoning systems, drug repurposing research, and knowledge discovery in the NCATS Biomedical Data Translator project. The knowledge graph enables complex question answering about drugs, diseases, genes, and their relationships.
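
As a rough illustration of the simplest kind of question such a graph can answer, the sketch below indexes edges by subject and predicate so that a one-hop query such as "what does this drug treat?" becomes a dictionary lookup. It assumes edge records expose `subject`, `predicate`, and `object` properties, as in the KGX edge format. The toy edges and `EX:` identifiers are hypothetical, and this is not how ARAX itself performs reasoning.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def index_edges(edges: Iterable[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Map each (subject, predicate) pair to the list of objects it points to."""
    index: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for edge in edges:
        index[(edge["subject"], edge["predicate"])].append(edge["object"])
    return index


# Toy edges with hypothetical identifiers, not real RTX-KG2 content.
sample_edges = [
    {"subject": "EX:drug1", "predicate": "biolink:treats", "object": "EX:disease1"},
    {"subject": "EX:drug1", "predicate": "biolink:treats", "object": "EX:disease2"},
    {"subject": "EX:drug1", "predicate": "biolink:interacts_with", "object": "EX:gene1"},
]

edge_index = index_edges(sample_edges)
treated = edge_index.get(("EX:drug1", "biolink:treats"), [])
print(treated)  # prints ['EX:disease1', 'EX:disease2']
```

An in-memory index like this only suits small subsets of the graph; for the full edge set, the Neo4j distribution or a disk-backed store is likely the more practical option.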


Created: March 09, 2025 | Last modified: October 30, 2025