SemMedDB is a Data Source.
SemMedDB is a repository of semantic predications (subject-predicate-object triples) extracted from biomedical literature by SemRep, a natural language processing system. It contains over 130 million semantic predications extracted from more than 37 million PubMed citations, supporting biomedical knowledge discovery, literature-based discovery, and clinical applications.
biomedical, literature, health, clinical, drug discovery, genomics, pharmacology
Unknown
infores:semmeddb
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| semmeddb.semrep.tool | SemRep NLP System | SemRep_download.html | ProcessProduct | ❔ | The SemRep natural language processin... |
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| rtx-kg2.graph.nodes | RTX-KG2.10.1c KGX JSONL Nodes | kg2c-2.10.1-v1.0-nodes.jsonl.gz (359.1 MB) | GraphProduct | kgx-jsonl | Nodes for KGX distribution of the RTX... |
| rtx-kg2.graph.edges | RTX-KG2.10.1c KGX JSONL Edges | kg2c-2.10.1-v1.0-edges.jsonl.gz (1.7 GB) | GraphProduct | kgx-jsonl | Edges for KGX distribution of the RTX... |
| rtx-kg2.neo4j | RTX-KG2 Neo4j | arax.ncats.io | ProgrammingInterface | ❔ | Neo4j distribution of the RTX-KG2 as ... |
| epigraphdb.graph | EpiGraphDB Graph Database | graph-database | GraphProduct | neo4j | Integrated graph knowledge base combi... |
| translator.semmeddb.graph | Translator SemMedDB KGX Graph | latest | GraphProduct | kgx-jsonl | KGX JSONL graph package for SemMedDB ... |
| translator.translator_kg.graph | Translator Aggregate KGX Graph | latest | GraphProduct | kgx-jsonl | Aggregated KGX JSONL graph package co... |
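Several of the products above are distributed as gzip-compressed KGX JSONL files, in which each line is one JSON object. The following is a minimal sketch of how edges attributed to SemMedDB might be pulled out of such a file. It assumes each edge record carries `subject`, `predicate`, and `object` keys and records its provenance under a `primary_knowledge_source` key (either a string or a list of strings); these field names are assumptions and should be checked against the actual file. The helper names `iter_semmeddb_edges` and `open_edges` are hypothetical.

```python
import gzip
import json
from typing import Iterator, TextIO


def iter_semmeddb_edges(lines: TextIO, source: str = "infores:semmeddb") -> Iterator[dict]:
    """Yield edge records whose primary knowledge source matches `source`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        edge = json.loads(line)
        # Assumption: provenance lives under "primary_knowledge_source"
        # and may be a single string or a list of strings.
        value = edge.get("primary_knowledge_source")
        sources = value if isinstance(value, list) else [value]
        if source in sources:
            yield edge


def open_edges(path: str) -> TextIO:
    """Open a gzip-compressed JSONL file as a text stream."""
    return gzip.open(path, "rt", encoding="utf-8")


# Example usage with the edges file named in the table above:
# with open_edges("kg2c-2.10.1-v1.0-edges.jsonl.gz") as fh:
#     for edge in iter_semmeddb_edges(fh):
#         print(edge["subject"], edge["predicate"], edge["object"])
```

Streaming line by line avoids loading the whole file into memory, which matters for an edges file of this size.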
The Semantic MEDLINE Database (SemMedDB) is a large-scale repository of semantic predications (subject-predicate-object triples) extracted from the biomedical literature by the SemRep natural language processing system. It provides a structured representation of biomedical knowledge contained in PubMed citations, where concepts are normalized to the Unified Medical Language System (UMLS) Metathesaurus, and their relationships are based on the UMLS Semantic Network.
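To make the shape of a semantic predication concrete, here is a small illustrative sketch of a subject-predicate-object triple whose arguments are UMLS concepts. The class and field names (`Concept`, `Predication`, `cui`, `semantic_type`, `pmid`) are hypothetical and do not reflect the actual SemMedDB table schema. The sample values are for illustration only, and the PMID is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    cui: str            # UMLS Metathesaurus concept identifier
    name: str           # preferred concept name
    semantic_type: str  # UMLS Semantic Network type abbreviation


@dataclass(frozen=True)
class Predication:
    subject: Concept
    predicate: str      # relation drawn from the UMLS Semantic Network
    object_: Concept
    pmid: str           # PubMed citation the predication was extracted from

    def as_triple(self) -> tuple[str, str, str]:
        """Return the predication as a (subject CUI, predicate, object CUI) triple."""
        return (self.subject.cui, self.predicate, self.object_.cui)


# Illustrative values only; the PMID is a placeholder, not a real citation.
example = Predication(
    subject=Concept("C0004057", "Aspirin", "phsu"),
    predicate="TREATS",
    object_=Concept("C0018681", "Headache", "sosy"),
    pmid="00000000",
)
```

Keeping the citation identifier alongside the triple reflects the fact that each predication is tied to the PubMed text it was extracted from.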
SemMedDB version 43 (VER43_R) is the final update to the database (as of May 2024), containing data extracted from MEDLINE BASELINE 2022 with PubMed update files through May 8, 2024. The resource is being deprecated and will no longer be maintained as of December 31, 2024, though an archived version will remain available through the Internet Archive.
SemMedDB has the following main tables:
SemMedDB has been used for numerous biomedical knowledge discovery applications including:
SemMedDB is available for download from the National Library of Medicine. A UMLS Terminology Services (UTS) account is required to access the downloads.
SemRep is the underlying natural language processing system that extracts semantic predications for SemMedDB. It combines syntactic and semantic principles with structured biomedical domain knowledge contained in the Unified Medical Language System (UMLS) to extract semantic relations from biomedical text. SemRep has been developed at the U.S. National Library of Medicine.
These tools will no longer be maintained as of December 31, 2024. An archived copy of the webpage can be found at the Internet Archive. The Indexing Initiative GitHub repository is under development. Contact NLM Customer Service if you have questions.
Created: May 30, 2025 | Last modified: January 23, 2026