cfde-gse

is a Knowledge Graph.

Common Fund Data Ecosystem Gene Set Enrichment (CFDE-GSE) is a knowledge graph that integrates gene set collections from multiple Common Fund programs to enable cross-program gene set enrichment analysis, functional annotation, and systems-level understanding of biological processes across diverse biomedical research domains.

Domains

biomedical, genomics, systems biology, biological systems

License

CC BY 4.0

Homepage

cfde-gse

Repository

GitHub

Infores ID

Unknown

FAIRsharing ID

Unknown

Product Summary

Products

From this Resource
ID Name URL Category Format Description
cfde-gse.portal CFDE Gene Set Enrichment Portal gse GraphicalInterface http Interactive web interface for cross-p...
cfde-gse.api CFDE-GSE API api ProgrammingInterface http RESTful API for programmatic access t...
cfde-gse.graph CFDE-GSE Knowledge Graph GraphProduct neo4j Neo4j knowledge graph containing inte...
cfde-gse.genesets CFDE Gene Set Collections downloads Product Standardized gene set collections fro...

Details

Common Fund Data Ecosystem Gene Set Enrichment (CFDE-GSE)

CFDE-GSE is a comprehensive knowledge graph and analysis platform that integrates gene set collections from multiple NIH Common Fund programs, enabling researchers to perform cross-program enrichment analysis and discover biological connections across diverse research domains and datasets.

Key Features

Cross-Program Integration

  • Gene sets from 15+ Common Fund programs including GTEx, HuBMAP, LINCS, and others
  • Standardized gene identifiers and annotations across all programs
  • Cross-reference mappings between program-specific terminologies
  • Unified metadata schema for consistent data access

Comprehensive Gene Set Collections

  • Over 100,000 curated gene sets from Common Fund studies
  • Cell type-specific gene signatures from single-cell analyses
  • Tissue-specific expression profiles from bulk RNA-seq
  • Drug perturbation signatures from LINCS
  • Disease-associated gene sets from clinical studies

Advanced Enrichment Analysis

  • Statistical enrichment analysis with multiple correction methods
  • Cross-program comparative enrichment capabilities
  • Network-based enrichment using gene-gene relationships
  • Time-course and dose-response enrichment patterns

Data Sources

Common Fund Programs

  • GTEx: Tissue-specific gene expression signatures
  • HuBMAP: Cell type-specific markers and spatial signatures
  • LINCS: Drug and genetic perturbation signatures
  • 4D Nucleome: Chromatin structure and regulatory signatures
  • Gabriella Miller Kids First: Pediatric disease gene sets
  • Metabolomics: Metabolic pathway gene associations
  • Protein Data Bank: Structural biology gene signatures
  • Stimulating Peripheral Activity to Relieve Conditions (SPARC): Organ system gene sets

External References

  • Gene Ontology biological process and molecular function terms
  • KEGG pathway annotations
  • Reactome pathway database
  • MSigDB hallmark and curated gene sets
  • DisGeNET disease-gene associations

Data Types

Expression Signatures

  • Differential expression results from RNA-seq studies
  • Cell type marker genes from single-cell RNA-seq
  • Tissue-enriched genes from bulk expression profiles
  • Development stage-specific expression patterns

Functional Annotations

  • Pathway membership and functional classifications
  • Protein-protein interaction network modules
  • Regulatory network target gene sets
  • Metabolic pathway enzyme gene groups

Perturbation Responses

  • Drug treatment response signatures
  • Genetic knockout/knockdown effects
  • Environmental stress response patterns
  • Disease state-associated expression changes

Clinical Associations

  • Disease biomarker gene sets
  • Treatment response predictive signatures
  • Prognostic gene expression profiles
  • Pharmacogenomic gene variants

Applications

Cross-Program Discovery

  • Identify shared biological themes across Common Fund studies
  • Discover novel connections between different research domains
  • Validate findings across independent datasets and methodologies
  • Generate hypotheses for collaborative research initiatives

Functional Interpretation

  • Annotate gene lists from omics experiments with Common Fund signatures
  • Identify relevant cell types and tissues for experimental findings
  • Connect molecular signatures to physiological processes and diseases
  • Prioritize genes based on multi-program evidence

Data Integration

  • Harmonize gene expression data across Common Fund programs
  • Enable meta-analyses spanning multiple research initiatives
  • Facilitate data reuse and cross-validation studies
  • Support FAIR data principles implementation

Hypothesis Generation

  • Discover unexpected biological connections through enrichment patterns
  • Identify understudied genes with consistent signatures across programs
  • Generate testable hypotheses for mechanism studies
  • Guide experimental design based on existing Common Fund data

Technical Implementation

CFDE-GSE is implemented as a Neo4j graph database with standardized nodes for genes, gene sets, studies, and biological annotations. The platform includes web-based tools for interactive enrichment analysis and programmatic APIs for computational workflows. All data processing follows CFDE metadata standards and includes comprehensive provenance tracking.

Automated Evaluation

Is this information incorrect or incomplete? Request an update.

Created: September 23, 2025 | Last modified: September 23, 2025