MedGen is an Aggregator.
NCBI's portal to information about conditions, phenotypes, and findings in humans related to medical genetics. Aggregates and organizes data from multiple authoritative sources including UMLS, OMIM, HPO, Mondo, Orphanet, GeneReviews, PharmGKB, and community submissions to GTR and ClinVar. Each concept is assigned a distinct Concept Unique Identifier (CUI) and integrated with related information from clinical resources, genetic testing registries, medical literature, and molecular resources.
genomics, clinical, health, biomedical
Unknown
infores:medgen
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| medgen.portal | MedGen Portal | medgen | GraphicalInterface | http | Main web portal for searching and bro... |
| medgen.search | Advanced Search | advanced | GraphicalInterface | http | Advanced search interface with suppor... |
| medgen.mgconso | MGCONSO (Concept Names) | MGCONSO.RRF.gz (15.1 MB) | Product | txt | Rich Release Format (RRF) file contai... |
| medgen.mgdef | MGDEF (Definitions) | MGDEF.RRF.gz (4.8 MB) | Product | txt | Rich Release Format (RRF) file contai... |
| medgen.mgrel | MGREL (Relationships) | MGREL.RRF.gz (14.9 MB) | MappingProduct | txt | Rich Release Format (RRF) file contai... |
| medgen.mgsat | MGSAT (Attributes) | MGSAT.RRF.gz (11.2 MB) | Product | txt | Rich Release Format (RRF) file contai... |
| medgen.mgsty | MGSTY (Semantic Types) | MGSTY.RRF.gz (1.6 MB) | Product | txt | Rich Release Format (RRF) file contai... |
| medgen.names | NAMES | NAMES.RRF.gz (3.0 MB) | Product | txt | Rich Release Format (RRF) file contai... |
| medgen.id-mappings | MedGen ID Mappings | MedGenIDMappings.txt.gz (5.7 MB) | MappingProduct | txt | Mappings between MedGen CUIs and exte... |
| medgen.hpo-mapping | MedGen HPO Mapping | MedGen_HPO_Mapping.txt.gz (380.5 KB) | MappingProduct | txt | Mappings between MedGen and Human Phe... |
| medgen.hpo-omim-mapping | MedGen HPO OMIM Mapping | MedGen_HPO_OMIM_Mapping.txt.gz (3.9 MB) | MappingProduct | txt | Combined mappings between MedGen, HPO... |
| medgen.pubmed-links | MedGen PubMed Links | medgen_pubmed_lnk.txt.gz (228.8 MB) | Product | txt | Links between MedGen concepts and Pub... |
| medgen.cui-history | MedGen CUI History | MedGen_CUI_history.txt (86.6 KB) | Product | txt | History file tracking changes to MedG... |
| medgen.uid-cui-history | MedGen UID CUI History | MedGen_UID_CUI_history.txt (56.5 MB) | Product | txt | History of mappings between MedGen UI... |
| medgen.hpo-history | HPO CUI History | HPO_CUI_history.txt (1.2 MB) | Product | txt | History file tracking changes to HPO ... |
| medgen.mondo-history | Mondo CUI History | MONDO_CUI_history.txt (989.1 KB) | Product | txt | History file tracking changes to Mond... |
| medgen.ordo-history | ORDO CUI History | ORDO_CUI_history.txt (1.1 MB) | Product | txt | History file tracking changes to Orph... |
| medgen.sources | MedGen Sources | MedGen_Sources.txt (10.1 KB) | Product | txt | Information about source databases an... |
| medgen.merged | MERGED (Merged CUIs) | MERGED.RRF.gz (46.5 KB) | Product | txt | Merged CUI mappings showing concept c... |
| medgen.csv | CSV Data Files | csv | Product | csv | CSV format data files directory with ... |
| medgen.help | MedGen Documentation | overview | DocumentationProduct | http | Documentation, help pages, and user g... |
| medgen.faq | FAQ | faq | DocumentationProduct | http | Frequently asked questions about MedGen |
| medgen.readme | README | README.txt (16.9 KB) | DocumentationProduct | txt | README file with detailed information... |
| medgen.presentations | MedGen Presentations | presentations | DocumentationProduct | http | Directory of MedGen presentations rel... |
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| pheknowlator.graph | PheKnowLator graph | knowledge_graphs?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&inv=1&invt=Ab5_1Q&project=pheknowlator | GraphProduct | owl | PheKnowLator graph files, including s... |
| ncbigene.mim2gene_medgen | MIM to Gene MedGen Mapping | mim2gene_medgen (932.6 KB) | MappingProduct | tsv | MIM to Gene and MedGen mapping data c... |
| unibiomap.links | UniBioMap Graph Links | unibiomap.links.csv (1.3 GB) | GraphProduct | csv | Core UniBioMap graph edges file. |
| unibiomap.auxs | UniBioMap Graph Auxiliaries | unibiomap.auxs.tsv (563.9 MB) | GraphProduct | tsv | Auxiliary UniBioMap graph annotations... |
| unibiomap.pred | UniBioMap Predicted Graph | unibiomap.pred.csv (2.3 GB) | GraphProduct | csv | Predicted UniBioMap graph edges with ... |
| unibiomap.pred.full | UniBioMap Predicted Graph (Full) | unibiomap.pred.full.csv (5.9 GB) | GraphProduct | csv | Full unfiltered UniBioMap predicted g... |
MedGen is NCBI’s portal to information about conditions, phenotypes, and findings in humans related to medical genetics. It serves as a comprehensive aggregator that organizes and integrates information from multiple authoritative sources, providing a unified view of genetic conditions and their relationships.
MedGen includes diseases such as Mendelian disorders, multi-factorial disorders, chronic disease susceptibilities, somatic phenotypes, and pharmacogenetic responses. The database also includes infectious disease terms to support submitters of the NIH Genetic Testing Registry (GTR) and clinicians looking for tests and terms related to infectious agents in human samples.
Terms from GTR and ClinVar submissions, the Unified Medical Language System (UMLS), Online Mendelian Inheritance in Man (OMIM), the Human Phenotype Ontology (HPO), the Mondo Disease Ontology, the Orphanet Rare Disease Ontology (ORDO, for rare disease terms), and other sources are integrated into unique concepts. Each concept is assigned a distinct identifier, the Concept Unique Identifier (CUI), and a preferred name.
The core content of a concept record includes:
MedGen aggregates data from multiple authoritative sources with varying update frequencies:
| Source | Update Frequency | Primary Data Types | Coverage |
|---|---|---|---|
| UMLS | 2x/year | CUI, descriptions, name and ID | 96% |
| HPO | Monthly | Name and ID, Phenotype:Disease relationships | 8% |
| Mondo | Monthly | Name, ID; Orphanet and GARD | 10% |
| Orphanet (ORDO) | Monthly | Name, ID, Mode of Inheritance | 4% |
| OMIM | Daily | Name, ID, Gene:Disease, description | 5% |
| GeneReviews | Weekly | Name, Gene:Disease, description | 0.4% |
| PharmGKB | Monthly | Drug name, ID, Drug:Gene relationships | 0.5% |
| MedGen (internal) | Daily | Name, CN CUI, Drug:Disease, Disease:Disease | 2% |
| NCBI Gene | Daily | Gene symbol, chromosome location | N/A |
Note: A MedGen term can be mapped to one or more external identifiers or sources, so the coverage percentages overlap and sum to more than 100%. Percentages are averaged from data analyzed in 2024.
MedGen primarily uses CUIs from the UMLS dataset. When no UMLS CUI matches a MedGen record, an NCBI-generated CUI is assigned instead. These begin with “CN” to distinguish them from UMLS-provided CUIs, which begin with “C” followed by digits. CN-type CUIs may be created from submissions to the NIH Genetic Testing Registry or ClinVar that do not match a record in UMLS.
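The prefix rule above can be checked mechanically. The following is a minimal sketch, assuming UMLS CUIs are "C" followed by seven digits and NCBI-generated CUIs are "CN" followed by digits; the function name `cui_origin` and the example identifiers are hypothetical and are not taken from MedGen data.

```python
import re

# Assumption: UMLS CUIs look like "C" + seven digits.
UMLS_CUI = re.compile(r"^C\d{7}$")
# Assumption: NCBI-generated CUIs look like "CN" + one or more digits.
NCBI_CUI = re.compile(r"^CN\d+$")


def cui_origin(cui: str) -> str:
    """Classify a CUI as 'ncbi', 'umls', or 'unknown' by its prefix."""
    cui = cui.strip()
    # Test the "CN" form first, since it also starts with "C".
    if NCBI_CUI.match(cui):
        return "ncbi"
    if UMLS_CUI.match(cui):
        return "umls"
    return "unknown"


if __name__ == "__main__":
    for example in ("C0000001", "CN000001", "HP:0000001"):
        print(example, cui_origin(example))
```

Checking the "CN" pattern before the "C" pattern keeps the two classes from being confused if the patterns are later loosened.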
MedGen provides standardized terminology for pharmacogenomics, describing the interaction between an individual’s genetic code and medication response. Using data from PharmGKB, MedGen creates disease records describing abnormal responses to drugs driven by genetic or environmental factors. These terms are created and maintained by MedGen with links to expert clinical recommendations, FDA-approved drug labels, and PharmGKB pages.
MedGen employs both automated and manual curation processes:
When new versions of source data become available, automated pipelines download the data, make local copies, and process them. The relevant subset is extracted, existing terms are updated as needed, and new terms are added. Terms are mapped by identifier, by concept preferred name, or by both, depending on the data sources involved.
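To make the identifier-or-name mapping concrete, here is an illustrative sketch, not MedGen's actual pipeline. It assumes a source term is matched first by a shared external identifier and then by a case- and whitespace-normalized preferred name; the `Concept` class, `match_term` function, and all identifiers in it are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Concept:
    """Hypothetical in-memory view of an existing concept record."""
    cui: str
    preferred_name: str
    external_ids: Set[str] = field(default_factory=set)


def normalize(name: str) -> str:
    """Case-fold and collapse whitespace so trivially different names compare equal."""
    return " ".join(name.casefold().split())


def match_term(source_id: str, source_name: str,
               concepts: List[Concept]) -> Optional[Concept]:
    """Return the first concept matching by identifier, else by name, else None."""
    # Pass 1: an exact external-identifier match is treated as strongest.
    for concept in concepts:
        if source_id in concept.external_ids:
            return concept
    # Pass 2: fall back to the normalized preferred name.
    wanted = normalize(source_name)
    for concept in concepts:
        if normalize(concept.preferred_name) == wanted:
            return concept
    # No match: the term would be a candidate for a new record.
    return None
```

A term that returns `None` here corresponds to the "new terms are added" case; ambiguous or conflicting matches are the kind of situation the manual curation step would handle.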
The manual curation process follows a standard decision tree to evaluate source terms and potential matches in MedGen. When source data is unclear, curators contact the source to clarify term scope and meaning. MedGen terms are curated to align with sources by splitting or merging terms or creating novel MedGen records as needed.
MedGen integrates with several NCBI resources:
MedGen provides extensive downloadable data through its FTP site, including:
The RRF, mapping, and PubMed link files are distributed with gzip compression (.gz) to reduce download size; the history files, MedGen_Sources.txt, and README.txt are listed above as uncompressed text.
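A gzip-compressed RRF file can be read without unpacking it to disk. The sketch below assumes the RRF convention of pipe-delimited fields with a trailing pipe, and a first line that names the columns and starts with "#". The three-column sample is made up for illustration; the real column layout of each file should be taken from README.txt.

```python
import gzip
import io
from typing import Dict, Iterable, Iterator


def read_rrf(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row of a pipe-delimited RRF text stream."""
    header = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("|")
        # Assumption: every row ends with a trailing pipe, which
        # produces one empty field at the end after splitting.
        if fields[-1] == "":
            fields = fields[:-1]
        if header is None:
            # Assumption: the first line is a header prefixed with "#".
            fields[0] = fields[0].lstrip("#")
            header = fields
            continue
        yield dict(zip(header, fields))


# Hypothetical sample data, compressed in memory to stand in for a .gz file.
sample = (
    "#CUI|STR|SAB|\n"
    "C0000001|Example condition|MSH|\n"
    "CN000001|Another example||\n"
)
compressed = gzip.compress(sample.encode("utf-8"))

# gzip.open accepts a path or a file object; "rt" decodes to text lines.
with gzip.open(io.BytesIO(compressed), mode="rt", encoding="utf-8") as handle:
    rows = list(read_rrf(handle))

for row in rows:
    print(row["CUI"], row["STR"])
```

For a downloaded file, replacing `io.BytesIO(compressed)` with the local path (for example a copy of MGCONSO.RRF.gz) streams rows one at a time, which matters for the larger files.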
Created: July 17, 2025 | Last modified: February 26, 2026