Evaluation for kg-monarch
Evaluator: Not specified
Evaluated on: 2025-08-14
This is a manual evaluation intended to identify potential barriers to reuse.
Access Level and Types
Question | Answer | Comment |
---|---|---|
Access to data outside of the knowledge graph | Y | The Biolink Model mapping schema, which provides semantic mappings and ontological relationships, is accessible, as are downloadable graph dumps in SciGraph or KGX formats. |
API or online access to the knowledge graph | Y | Accessible through Monarch's own website, API, Neo4j, and downloadable graph. |
Multiple access options available | Y | Monarch provides multiple ways to download the data, such as KGX TSV, KGX JSON Lines, Neo4j dump, etc. |
Source code availability | Y | The source code for building the KG is available on GitHub. |
Downloadable knowledge graph | Y | KGX TSV, KGX JSON lines, Neo4j Dump, etc. |
Section Score: 5/5
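As a rough illustration of working with the KGX TSV download listed above, the sketch below parses a small in-memory edge table with Python's standard csv module. The column names (subject, predicate, object) follow the general KGX edge convention, but the sample rows and identifiers are illustrative assumptions, not an excerpt from an actual Monarch release; real files carry many more columns.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in a KGX-style TSV layout (assumed for illustration).
SAMPLE_EDGES_TSV = (
    "subject\tpredicate\tobject\n"
    "HGNC:1100\tbiolink:gene_associated_with_condition\tMONDO:0007254\n"
    "MONDO:0007254\tbiolink:has_phenotype\tHP:0003002\n"
    "HGNC:1101\tbiolink:gene_associated_with_condition\tMONDO:0007254\n"
)


def read_edges(tsv_text):
    """Parse tab-separated edge rows into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def count_predicates(edges):
    """Count how many edges use each predicate."""
    return Counter(edge["predicate"] for edge in edges)


edges = read_edges(SAMPLE_EDGES_TSV)
predicate_counts = count_predicates(edges)
```

For a downloaded file, the same functions would apply after reading the file contents into a string, assuming the file uses the same tab-separated layout.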
Provenance of Nodes and Edges
Question | Answer | Comment |
---|---|---|
Source list provided | Y | 33 heterogeneous data sources including HPOA, CTD, OMIM, Orphanet, WormBase, FlyBase, MGI, dictyBase, Xenbase, SGD, RGD, PomBase, ZFIN, NCBI, HGNC, Panther, BGeeDB, Reactome, STRING. |
Source versions information | Y | Provides the version info of the sources used: https://data.monarchinitiative.org/monarch-kg-dev/latest/index.html |
Import dependencies | Y | Dependencies are declared through the build infrastructure in the monarch-ingest and kg-hub GitHub repos. Specifically, Monarch uses Poetry for dependency management. |
Node and edge sources | Y | Nodes and edges contain the most upstream source and knowledge provider information. |
Edges deduplication | Y | The KG construction process accounts for the management of duplicate edges and nodes. |
Triples source details | Y | The triples created are captured at multiple stages of the KG ingest pipeline, including the Biolink predicate mappings and the output. |
Edge type schema | Y | Biolink model. All relationships are mapped to Biolink predicates. |
Section Score: 7/7
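To make the deduplication and provenance criteria above more concrete, here is a minimal sketch of one possible strategy: edges that share the same subject, predicate, and object are merged, and their sources are kept as a combined list. This is an assumption for illustration, not Monarch's documented procedure; the field name primary_knowledge_source and the infores-style values are likewise assumed.

```python
def deduplicate_edges(edges):
    """Merge edges with the same (subject, predicate, object) triple.

    The sources of merged edges are unioned so provenance is not lost.
    This is a hypothetical strategy, not Monarch's actual pipeline logic.
    """
    merged = {}
    for edge in edges:
        key = (edge["subject"], edge["predicate"], edge["object"])
        merged.setdefault(key, set()).add(edge["primary_knowledge_source"])
    return [
        {"subject": s, "predicate": p, "object": o, "sources": sorted(sources)}
        for (s, p, o), sources in merged.items()
    ]


# Illustrative input: the first two edges assert the same triple from two sources.
raw_edges = [
    {"subject": "HGNC:1100", "predicate": "biolink:gene_associated_with_condition",
     "object": "MONDO:0007254", "primary_knowledge_source": "infores:omim"},
    {"subject": "HGNC:1100", "predicate": "biolink:gene_associated_with_condition",
     "object": "MONDO:0007254", "primary_knowledge_source": "infores:orphanet"},
    {"subject": "MONDO:0007254", "predicate": "biolink:has_phenotype",
     "object": "HP:0003002", "primary_knowledge_source": "infores:hpo-annotations"},
]

unique_edges = deduplicate_edges(raw_edges)
```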
Documented standards, schema, construction
Question | Answer | Comment |
---|---|---|
Biological usable data | Y | Monarch KG is both human-readable and usable. Monarch provides its data in tabular formats that are easy to understand. Monarch KG data and the Monarch Initiative website can also be used in bioinformatics and biomedical analysis, machine learning, NLP, LLMs, visualization, and exploration. |
Resolvable IDs | Y | Supports resolvable and externally linked identifiers. Monarch KG uses CURIE-style IDs that follow identifier conventions and can be mapped to URLs for resolution. It also includes an xref column containing a list of external identifiers. |
Construction documentation | Y | Monarch documents its KG construction extensively, including data ingestion, transformation, schema usage, and KG merging. |
Transformation documentation | Y | Monarch documented data transforms including the excluded nodes and dangling edges. |
Schema used | Y | Monarch uses a documented schema, Biolink Model, and its monarch-ingest pipeline for construction. |
Section Score: 5/5
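The resolvable-ID criterion above can be illustrated with a short sketch that expands CURIE-style identifiers into URLs and splits an xref field into individual identifiers. The prefix map below is a small hand-written assumption (OBO PURLs and identifiers.org patterns), not Monarch's own prefix registry, and the pipe delimiter for the xref column is assumed from the usual KGX TSV convention for list-valued fields.

```python
# Hypothetical prefix map for illustration; a real pipeline would load a full registry.
PREFIX_MAP = {
    "MONDO": "http://purl.obolibrary.org/obo/MONDO_",
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "HGNC": "https://identifiers.org/hgnc:",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand a CURIE such as 'MONDO:0007254' into a URL using the prefix map."""
    prefix, separator, local_id = curie.partition(":")
    if not separator or prefix not in prefix_map:
        raise ValueError(f"Cannot expand CURIE: {curie!r}")
    return prefix_map[prefix] + local_id


def split_xrefs(xref_field, delimiter="|"):
    """Split a delimited xref field into a list, dropping empty entries."""
    return [xref for xref in xref_field.split(delimiter) if xref]


mondo_url = expand_curie("MONDO:0007254")
xrefs = split_xrefs("OMIM:114480|DOID:1612")
```

Raising an error for unknown prefixes, rather than returning the CURIE unchanged, makes unresolvable identifiers visible during a check like this one.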
Update frequency and versioning
Question | Answer | Comment |
---|---|---|
Stable versions | Y | |
Public tracker information | Y | Monarch provides a public tracker for feature requests and bug reports on its GitHub repo. |
Knowledge graph contact information | Y | The contact is Monarch Initiative. Contact information is provided. |
Updated annually | Y | The KG is updated every month. |
Prior versions access | Y | Prior versions are accessible, and a QC dashboard shows the changes between any two versions: https://qc.monarchinitiative.org/#monarch?dataset=Development&kgVersion=2025-07-09 |
Section Score: 5/5
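As a loose illustration of what comparing two KG versions involves, the sketch below computes which edge triples were added or removed between two releases. It is not how the QC dashboard is implemented; the function names and sample edges are hypothetical.

```python
def edge_key(edge):
    """Identify an edge by its (subject, predicate, object) triple."""
    return (edge["subject"], edge["predicate"], edge["object"])


def diff_edge_sets(old_edges, new_edges):
    """Return the triples added and removed between two versions."""
    old_keys = {edge_key(edge) for edge in old_edges}
    new_keys = {edge_key(edge) for edge in new_edges}
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
    }


# Illustrative edges standing in for two releases of the graph.
old_version = [
    {"subject": "HGNC:1100", "predicate": "biolink:gene_associated_with_condition",
     "object": "MONDO:0007254"},
    {"subject": "MONDO:0007254", "predicate": "biolink:has_phenotype",
     "object": "HP:0003002"},
]
new_version = [
    {"subject": "HGNC:1100", "predicate": "biolink:gene_associated_with_condition",
     "object": "MONDO:0007254"},
    {"subject": "HGNC:1101", "predicate": "biolink:gene_associated_with_condition",
     "object": "MONDO:0007254"},
]

changes = diff_edge_sets(old_version, new_version)
```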
Evaluation - Metrics and Fitness for Purpose
Question | Answer | Comment |
---|---|---|
Use case provided | Y | Monarch KG is used in Exomiser, a tool for annotating variants, and has a ChatGPT plugin that supports information search through RAG. It is also integrated into the GRAPE library for graph analysis and machine learning. |
Evaluation against other models | N | |
Defined scope | Y | Monarch aims to harmonize data across fields (gene, disease, and phenotype across species) to facilitate the discovery of disease mechanisms and aid disease diagnosis. |
Multiple evaluation methods | Y | Monarch created the Monarch Quality Control (QC) Dashboard for quality metrics that are specific to Monarch. In addition, it developed the Phenotypic Inference Evaluation Framework (PhEval) to evaluate the analysis of its tool Exomiser. |
Accuracy metrics | Y | Partially, but not explicitly. The edge data contains columns such as provided_by, publications, and has_evidence that can be used to gauge accuracy or confidence. |
Section Score: 4/5
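Since accuracy is only indirectly supported through columns such as provided_by, publications, and has_evidence, one simple proxy is the fraction of edges in which each column is populated. The sketch below computes that coverage over a few made-up edges; the values are placeholders, and coverage indicates how complete the supporting information is, not how accurate the edges are.

```python
def field_coverage(edges, field):
    """Return the fraction of edges with a non-empty value for the given field."""
    if not edges:
        return 0.0
    populated = sum(1 for edge in edges if edge.get(field))
    return populated / len(edges)


# Illustrative edges; identifiers and values are placeholders, not real Monarch data.
sample_edges = [
    {"provided_by": "source_a", "publications": "PMID:0000001", "has_evidence": "ECO:0000269"},
    {"provided_by": "source_a", "publications": "PMID:0000002", "has_evidence": ""},
    {"provided_by": "source_b", "publications": "", "has_evidence": ""},
    {"provided_by": "source_b", "publications": "", "has_evidence": ""},
]

coverage = {
    field: field_coverage(sample_edges, field)
    for field in ("provided_by", "publications", "has_evidence")
}
```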
License Information
Question | Answer | Comment |
---|---|---|
License | CC BY 4.0 | |