Evaluation for kg-monarch

Evaluator: Not specified

Evaluated on: 2025-08-14

This is a manual evaluation intended to identify potential barriers to reuse.


Access Level and Types

Question | Answer | Comment
Access to data outside of the knowledge graph | Y | The Biolink Model mapping schema, which provides semantic mappings and ontological relationships, is accessible, as are downloadable graph dumps in SciGraph or KGX formats.
API or online access to the knowledge graph | Y | Accessible through Monarch's own website, its API, Neo4j, and the downloadable graph.
Multiple access options available | Y | Monarch provides multiple download formats, such as KGX TSV, KGX JSON Lines, and a Neo4j dump.
Source code availability | Y | The source code for building the KG is available on GitHub.
Downloadable knowledge graph | Y | KGX TSV, KGX JSON Lines, Neo4j dump, etc.

Section Score: 5/5
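
As a rough illustration of working with the downloadable KGX TSV files listed above, the sketch below parses a few edge rows with Python's standard csv module. The sample rows and the helpers `load_edges` and `edges_with_predicate` are invented for illustration. Only the `subject`, `predicate`, and `object` column names follow the KGX edge format, and the real files are expected to carry more columns than shown here.

```python
import csv
import io

# Tiny in-memory stand-in for a KGX edges TSV file (illustrative rows only,
# not copied from a Monarch release).
SAMPLE_EDGES_TSV = (
    "subject\tpredicate\tobject\n"
    "HGNC:1100\tbiolink:gene_associated_with_condition\tMONDO:0007254\n"
    "MONDO:0007254\tbiolink:has_phenotype\tHP:0003002\n"
)


def load_edges(handle):
    """Read KGX-style edge rows from a tab-separated file handle."""
    return list(csv.DictReader(handle, delimiter="\t"))


def edges_with_predicate(edges, predicate):
    """Keep only the edges whose predicate matches exactly."""
    return [edge for edge in edges if edge["predicate"] == predicate]


edges = load_edges(io.StringIO(SAMPLE_EDGES_TSV))
phenotype_edges = edges_with_predicate(edges, "biolink:has_phenotype")
for edge in phenotype_edges:
    print(edge["subject"], edge["predicate"], edge["object"])
```

For a real download, the same helper could be given an open file handle instead of the in-memory sample.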

Provenance of Nodes and Edges

Question | Answer | Comment
Source list provided | Y | 33 heterogeneous data sources, including HPOA, CTD, OMIM, Orphanet, WormBase, FlyBase, MGI, dictyBase, Xenbase, SGD, RGD, PomBase, ZFIN, NCBI, HGNC, Panther, BGeeDB, Reactome, and STRING.
Source versions information | Y | Version information for the sources used is provided: https://data.monarchinitiative.org/monarch-kg-dev/latest/index.html
Import dependencies | Y | Dependencies are declared through the build infrastructure in the monarch-ingest and kg-hub GitHub repositories. Specifically, Monarch uses Poetry for dependency management.
Node and edge sources | Y | Nodes and edges carry their most upstream source and knowledge provider information.
Edges deduplication | Y | Duplicate nodes and edges are handled during construction of the KG.
Triples source details | Y | The triples created are captured at multiple stages of the KG ingest pipeline, including the Biolink predicate mappings and the output.
Edge type schema | Y | Biolink Model. All relationships are mapped to Biolink predicates.

Section Score: 7/7

Documented standards, schema, construction

Question | Answer | Comment
Biological usable data | Y | Monarch KG is both human-readable and usable. Monarch provides its data in tabular formats that are easy to understand. Monarch KG data and the Monarch Initiative website can also be used in bioinformatics and biomedical analysis, machine learning, NLP, LLMs, visualization and exploration, etc.
Resolvable IDs | Y | Supports resolvable, externally linked identifiers. Monarch KG uses CURIE-style IDs that follow identifier conventions and can be mapped to URLs for resolution. It also includes an xref column containing a list of external identifiers.
Construction documentation | Y | Monarch documents its KG construction thoroughly, including data ingestion, transformation, schema usage, and KG merging.
Transformation documentation | Y | Monarch documents its data transforms, including excluded nodes and dangling edges.
Schema used | Y | Monarch uses a documented schema, the Biolink Model, and its monarch-ingest pipeline for construction.

Section Score: 5/5
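
The Resolvable IDs row above describes CURIE-style identifiers that can be mapped to URLs. A minimal sketch of that mapping follows. The `PREFIX_MAP` entries, the `expand_curie` and `split_xrefs` helpers, and the example identifiers are assumptions made for illustration, not Monarch's own resolver, and the pipe delimiter for the xref column is assumed rather than taken from this evaluation.

```python
# Illustrative prefix map; a real one would come from a maintained registry.
PREFIX_MAP = {
    "MONDO": "http://purl.obolibrary.org/obo/MONDO_",
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "HGNC": "https://identifiers.org/hgnc:",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Turn 'PREFIX:local_id' into a URL, or return None for unknown prefixes."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        return None
    return prefix_map[prefix] + local_id


def split_xrefs(xref_field, delimiter="|"):
    """Split a multivalued xref cell into a list of CURIEs (delimiter assumed)."""
    return [value for value in xref_field.split(delimiter) if value]


print(expand_curie("MONDO:0005148"))
print(split_xrefs("OMIM:125853|DOID:9352"))
```

Returning None for an unknown prefix keeps unresolvable identifiers visible to the caller instead of silently producing a malformed URL.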

Update frequency and versioning

Question | Answer | Comment
Stable versions | Y |
Public tracker information | Y | Monarch provides a public tracker for feature requests and bug reports on its GitHub repository.
Knowledge graph contact information | Y | The contact is the Monarch Initiative. Contact information is provided.
Updated annually | Y | The KG is updated every month.
Prior versions access | Y | Prior versions are accessible, and a QC dashboard shows the changes between any two versions: https://qc.monarchinitiative.org/#monarch?dataset=Development&kgVersion=2025-07-09

Section Score: 5/5

Evaluation - Metrics and Fitness for Purpose

Question | Answer | Comment
Use case provided | Y | Monarch KG is used in Exomiser, a tool for annotating variants, and has a ChatGPT plugin that uses RAG to support information search. It is also integrated into the GRAPE library for graph analysis and machine learning.
Evaluation against other models | N |
Defined scope | Y | Monarch aims to harmonize data across fields (gene, disease, and phenotype, across species) to facilitate the discovery of disease mechanisms and aid disease diagnosis.
Multiple evaluation methods | Y | Monarch created the Monarch Quality Control (QC) Dashboard for quality metrics specific to Monarch. In addition, it developed the Phenotypic Inference Evaluation Framework (PhEval) to evaluate analyses by its tool Exomiser.
Accuracy metrics | Y | Partially, but not explicitly. The edge data contains columns such as provided_by, publications, and has_evidence that can be used to gauge accuracy or confidence.

Section Score: 4/5
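
Because the Accuracy metrics row points to the provided_by, publications, and has_evidence columns rather than an explicit score, one rough proxy is to measure how often those columns are populated. The sketch below does that for a handful of invented rows. The `field_coverage` helper and the sample data are hypothetical, and this is not a metric that Monarch itself reports.

```python
import csv
import io

# Illustrative edge rows; an empty cell means the field was not populated.
SAMPLE_EDGES_TSV = (
    "subject\tpredicate\tobject\tprovided_by\tpublications\thas_evidence\n"
    "A:1\tbiolink:related_to\tB:1\tsource_x\tPMID:1\tECO:0000269\n"
    "A:2\tbiolink:related_to\tB:2\tsource_x\t\tECO:0000269\n"
    "A:3\tbiolink:related_to\tB:3\tsource_y\t\t\n"
    "A:4\tbiolink:related_to\tB:4\tsource_y\tPMID:2|PMID:3\t\n"
)

EVIDENCE_COLUMNS = ("provided_by", "publications", "has_evidence")


def field_coverage(rows, columns=EVIDENCE_COLUMNS):
    """Return the fraction of rows with a non-empty value for each column."""
    rows = list(rows)
    if not rows:
        return {column: 0.0 for column in columns}
    return {
        column: sum(1 for row in rows if (row.get(column) or "").strip()) / len(rows)
        for column in columns
    }


rows = csv.DictReader(io.StringIO(SAMPLE_EDGES_TSV), delimiter="\t")
coverage = field_coverage(rows)
for column, fraction in coverage.items():
    print(f"{column}: {fraction:.0%}")
```

Coverage of this kind only indicates how much supporting metadata is present; it does not by itself establish that an edge is correct.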

License Information

Question | Answer | Comment
License | CC BY 4.0 |