msigdb is a Data Source.

The Molecular Signatures Database (MSigDB) is a comprehensive collection of tens of thousands of annotated gene sets for use with Gene Set Enrichment Analysis (GSEA) software. MSigDB includes curated gene sets from pathway databases, gene ontology annotations, hallmark gene sets, immunologic signatures, regulatory target sets, and cell type-specific signatures derived from single-cell sequencing studies. Gene sets are available for both human and mouse; the current public site lists Human MSigDB v2026.1.Hs and Mouse MSigDB v2026.1.Mm.

Homepage

msigdb

Repository

Unknown

Infores ID

infores:msigdb

FAIRsharing ID

Unknown

Product Summary

Products

From this Resource
ID | Name | URL | Category | Format | Description
msigdb.browser | MSigDB Web Browser | msigdb | GraphicalInterface | http | Web interface for browsing, searching...
msigdb.downloads.human | MSigDB Human Gene Sets Downloads | downloads.jsp#msigdb | Product | mixed | Downloadable gene set files in GMT, X...
msigdb.downloads.mouse | MSigDB Mouse Gene Sets Downloads | downloads.jsp#msigdb | Product | mixed | Downloadable gene set files for mouse...
msigdb.collections.human | MSigDB Human Collections Documentation | collections.jsp | DocumentationProduct | http | Human MSigDB collection descriptions ...
msigdb.collections.mouse | MSigDB Mouse Collections Documentation | collections.jsp | DocumentationProduct | http | Mouse MSigDB collection descriptions ...
msigdb.investigate | MSigDB Investigate Tool | annotate.jsp | GraphicalInterface | http | Interactive tool for computing overla...
msigdb.gene_families | MSigDB Gene Families Tool | gene_families.jsp | GraphicalInterface | http | Tool for categorizing gene set member...
From other Resources
ID | Name | URL | Category | Format | Relation | Description
ubkg.neo4j | UBKG Neo4j Docker Distribution | ubkg-downloads.xconsortia.org | GraphProduct | | had primary source | Turnkey neo4j distributions that depl...
ubkg.csv | UBKG Ontology CSV Files | ubkg-downloads.xconsortia.org | GraphProduct | csv | had primary source | Ontology CSV files that can be import...
obo-db-ingest.msigdb.tsv | msigdb Nodes TSV | msigdb.tsv (3.0 MB) | Product | tsv | had primary source | msigdb Nodes TSV
prokn.msigdb.msigdb.chr_band_contains_gene.gene.edges | ProKN MSIGDB Chromosome Band Edges | DDKG_MSIGDB.MSigDB.CHR_BAND_CONTAINS_GENE.Gene.edges.csv (4.9 MB) | GraphProduct | csv | had primary source | MSIGDB chromosome band contains gene ...
prokn.msigdb.msigdb.has_marker_gene.gene.edges | ProKN MSIGDB Marker Gene Edges | DDKG_MSIGDB.MSigDB.HAS_MARKER_GENE.Gene.edges.csv (23.7 MB) | GraphProduct | csv | had primary source | MSIGDB marker gene edges
prokn.msigdb.msigdb.has_signature_gene.gene.edges | ProKN MSIGDB Signature Gene Edges | DDKG_MSIGDB.MSigDB.HAS_SIGNATURE_GENE.Gene.edges.csv (1.6 MB) | GraphProduct | csv | had primary source | MSIGDB signature gene edges
prokn.msigdb.msigdb.pathway_associated_with_gene.gene.edges | ProKN MSIGDB Pathway Gene Edges | DDKG_MSIGDB.MSigDB.PATHWAY_ASSOCIATED_WITH_GENE.Gene.edges.csv (120.1 MB) | GraphProduct | csv | had primary source | MSIGDB pathway associated with gene e...
prokn.msigdb.msigdb.targets_expression_of_gene.gene.edges | ProKN MSIGDB Targets Expression Edges | DDKG_MSIGDB.MSigDB.TARGETS_EXPRESSION_OF_GENE.Gene.edges.csv (149.2 MB) | GraphProduct | csv | had primary source | MSIGDB targets expression of gene edges
cfde-gse.graph | CFDE-GSE Knowledge Graph | | GraphProduct | neo4j | had primary source | Neo4j knowledge graph containing inte...
cfde-gse.genesets | CFDE Gene Set Collections | downloads | Product | | had primary source | Standardized gene set collections fro...
pathwaycommons.biopax | Integrated BioPAX Model | pc-biopax.owl.gz (1.6 GB) | Product | biopax | was derived from | PC v14 integrated BioPAX Level 3 unif...
harmonizome.downloads | Harmonizome Downloads | download | Product | mixed | was derived from | Harmonizome 3.0 processed dataset dow...
harmonizome.kg-neo4j | Harmonizome Knowledge Graph Neo4j Database | harmonizome-kg.maayanlab.cloud | GraphProduct | neo4j | was derived from | Neo4j knowledge graph serialization o...
pathwaycommons.downloads | Pathway Commons Data Downloads | v14 | Product | mixed | was derived from | Download directory for Pathway Common...
pathwaycommons.sif | SIF Network Format | pc-hgnc.sif.gz (9.4 MB) | Product | sif | was derived from | PC v14 Simple Interaction Format netw...
pathwaycommons.gmt | GMT Gene Set Format | pc-hgnc.gmt.gz (256.4 KB) | Product | | was derived from | PC v14 Gene Matrix Transposed gene se...
pathwaycommons.txt | Extended SIF TXT Format | pc-hgnc.txt.gz (110.3 MB) | Product | txt | was derived from | PC v14 tab-delimited extended SIF nod...
biobtree.api | BioBTree REST API | api | ProgrammingInterface | http | had primary source | REST API for searching identifiers an...

Details

Molecular Signatures Database (MSigDB)

Overview

The Molecular Signatures Database (MSigDB) is a comprehensive resource of tens of thousands of annotated gene sets designed for use with Gene Set Enrichment Analysis (GSEA) software. Developed as a joint project between UC San Diego and the Broad Institute, MSigDB provides systematic gene set collections that enable researchers to interpret genome-wide expression profiles and identify coordinate changes in gene expression.

Collections

Human Collections (v2026.1.Hs)

  • Hallmark Gene Sets: Coherently expressed signatures representing well-defined biological states or processes
  • Curated Gene Sets: From pathway databases, PubMed publications, and domain experts
  • Regulatory Target Gene Sets: Potential targets of microRNAs and transcription factors, grouped by microRNA seed sequences and transcription factor binding sites
  • Computational Gene Sets: Mined from large cancer-oriented expression datasets
  • Gene Ontology Gene Sets: Genes annotated by the same ontology term
  • Oncogenic Signature Gene Sets: From cancer gene perturbations
  • Immunologic Signature Gene Sets: Cell states and perturbations in the immune system
  • Cell Type Signature Gene Sets: Cluster markers from single-cell sequencing studies
  • Positional Gene Sets: Chromosome cytogenetic band locations

Mouse Collections (v2026.1.Mm)

  • Mouse-ortholog versions of hallmark gene sets
  • Curated gene sets from pathways and literature
  • Gene ontology annotations
  • Immunologic signatures
  • Cell type signatures from single-cell studies
  • Regulatory target gene sets
  • Positional gene sets

Features

  • Browse and search gene sets by name, keyword, or collection
  • Examine individual gene set annotations and member genes
  • Compute overlaps between custom gene sets and MSigDB collections
  • Categorize genes by gene families
  • View expression profiles in public compendia
  • Integration with NDEx biological network repository
  • Download gene sets in multiple formats (GMT, XML, etc.)
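
The GMT download format and the overlap feature listed above can be illustrated with a short sketch. GMT is a tab-delimited text format in which each line holds a gene set name, a description field, and then the member genes. The function names, the demo set names, and the inline sample text below are hypothetical, chosen only for illustration; this is a plain overlap count and does not reproduce whatever statistics the MSigDB web tools report.

```python
import io


def parse_gmt(handle):
    """Read GMT lines into a dict mapping gene set name -> set of gene symbols.

    Each GMT line is tab-delimited: name, description, then member genes.
    Lines with fewer than three fields are skipped.
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


def rank_overlaps(query_genes, gene_sets):
    """Return (set name, overlap size, shared genes) sorted by overlap size.

    Gene sets that share no genes with the query are omitted.
    """
    query = set(query_genes)
    rows = []
    for name, members in gene_sets.items():
        shared = query & members
        if shared:
            rows.append((name, len(shared), sorted(shared)))
    # Largest overlap first; ties broken alphabetically by set name.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


# Hypothetical two-line GMT content standing in for a downloaded file.
EXAMPLE_GMT = (
    "DEMO_SET_A\tplaceholder description\tTP53\tMDM2\tCDKN1A\n"
    "DEMO_SET_B\tplaceholder description\tMYC\tTP53\n"
)

demo_sets = parse_gmt(io.StringIO(EXAMPLE_GMT))
demo_hits = rank_overlaps(["TP53", "MDM2"], demo_sets)
```

For a real download, the same `parse_gmt` function can be given an open file handle instead of the `io.StringIO` wrapper.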

Access

  • Free registration required for downloads and web tools
  • Registration is used to track usage for funding agency reports
  • No charge for academic and non-commercial use

Use with GSEA

MSigDB gene sets are designed for direct use with GSEA (Gene Set Enrichment Analysis) software to identify whether predefined sets of genes show statistically significant, concordant differences between two biological states.
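
The running-sum idea behind this kind of test can be sketched in a few lines. The code below is a simplified illustration of an enrichment score in the style described in the GSEA literature, not the GSEA software's own implementation: it omits permutation testing, normalization, and multiple-testing correction. The function name, the weighting exponent `p`, and the toy gene names and scores are all assumptions made for the example.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Simplified running-sum enrichment score.

    ranked_genes: gene names ordered from most to least associated
                  with the first biological state.
    scores:       ranking metric for each gene, in the same order.
    gene_set:     set of gene names to test.
    p:            exponent applied to |score| for genes in the set;
                  p=0 gives every hit equal weight.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_total = len(ranked_genes)
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    hit_weights = [abs(s) ** p if h else 0.0 for s, h in zip(scores, hits)]
    weight_sum = sum(hit_weights)
    if weight_sum == 0:
        raise ValueError("all genes in the set have zero weight")
    miss_step = 1.0 / (n_total - n_hits)

    running = 0.0
    best = 0.0
    for is_hit, weight in zip(hits, hit_weights):
        # Step up at genes in the set, step down at genes outside it.
        if is_hit:
            running += weight / weight_sum
        else:
            running -= miss_step
        # Keep the running-sum value that lies farthest from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (hypothetical gene names and scores).
RANKED = ["G1", "G2", "G3", "G4"]
SCORES = [4.0, 3.0, 2.0, 1.0]

top_heavy = enrichment_score(RANKED, SCORES, {"G1", "G2"}, p=0.0)
bottom_heavy = enrichment_score(RANKED, SCORES, {"G3", "G4"}, p=0.0)
```

A positive score indicates that the set's genes concentrate near the top of the ranked list and a negative score that they concentrate near the bottom. Assessing significance would require a null distribution, which this sketch leaves out.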

Funding

MSigDB is currently funded by NCI’s Informatics Technology for Cancer Research (ITCR) program.

Community Contributions

MSigDB welcomes suggestions and contributions of new gene sets from the research community.


Created: October 30, 2025 | Last modified: June 02, 2026