ChEA-KG
ChEA-KG is a knowledge graph that integrates chromatin immunoprecipitation sequencing (ChIP-seq) data from the ChEA database with transcription factor binding information, target gene annotations, and regulatory network data. It is intended to support analysis of transcriptional regulation across cell types, tissues, and experimental conditions.
Key Features
Comprehensive ChIP-seq Integration
- Over 100,000 ChIP-seq experiments from public repositories
- Transcription factor binding sites mapped across the human genome
- Cell type and tissue-specific regulatory landscapes
- Condition-specific transcriptional programs
Multi-Species Coverage
- Human transcription factor binding data with extensive coverage
- Mouse regulatory networks for comparative analysis
- Cross-species transcription factor ortholog mapping
- Conservation analysis of regulatory relationships
Regulatory Network Construction
- Direct transcription factor-target gene relationships
- Co-regulatory transcription factor modules
- Hierarchical regulatory cascades and feedback loops
- Tissue-specific regulatory network topologies
Data Sources
ChIP-seq Repositories
- ENCODE Project chromatin immunoprecipitation data
- GEO ChIP-seq datasets with standardized processing
- Roadmap Epigenomics Project regulatory landscapes
- Cistrome DB curated ChIP-seq experiments
Transcription Factor Annotations
- UniProt transcription factor functional classifications
- Gene Ontology transcriptional regulation terms
- Transcription factor family classifications (TFClass)
- DNA binding domain structural information
Genomic Annotations
- RefSeq gene models and transcript isoforms
- GENCODE comprehensive gene annotations
- Regulatory element annotations from ENCODE
- Chromatin state segmentations across cell types
Expression Data Integration
- GTEx tissue-specific gene expression profiles
- Single-cell RNA-seq expression atlases
- Perturbation experiments with transcription factor modulation
- Time-course expression studies of regulatory dynamics
Data Types
Binding Data
- Peak coordinates from ChIP-seq experiments
- Binding strength and confidence scores
- Motif occurrence within binding regions
- Chromatin accessibility at binding sites
Regulatory Relationships
- Direct transcription factor-target gene pairs
- Binding distance to transcription start sites
- Correlation between binding and target gene expression
- Context-specific regulatory interactions
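As an illustration of the "binding distance to transcription start sites" item above, the following is a minimal Python sketch of a strand-aware distance calculation. The Peak and Gene records, their field names, the 0-based coordinate convention, and the 50,000 bp cutoff are assumptions made for this example and are not taken from ChEA-KG itself.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Peak:
    """Hypothetical ChIP-seq peak record."""
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    score: float  # binding strength or confidence (scale is an assumption)


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record."""
    symbol: str
    chrom: str
    tss: int      # 0-based coordinate of the transcription start site
    strand: str   # "+" or "-"


def signed_distance_to_tss(peak: Peak, gene: Gene) -> Optional[int]:
    """Distance from the peak midpoint to the TSS, oriented by strand.

    Negative values are upstream of the TSS, positive values downstream.
    Returns None when the peak and gene are on different chromosomes.
    """
    if peak.chrom != gene.chrom:
        return None
    midpoint = (peak.start + peak.end) // 2
    offset = midpoint - gene.tss
    # On the minus strand, higher coordinates are upstream, so flip the sign.
    return offset if gene.strand == "+" else -offset


def nearest_gene(peak: Peak, genes: Iterable[Gene],
                 max_distance: int = 50_000) -> Optional[Gene]:
    """Return the gene whose TSS is closest to the peak, within max_distance."""
    best, best_dist = None, None
    for gene in genes:
        dist = signed_distance_to_tss(peak, gene)
        if dist is None or abs(dist) > max_distance:
            continue
        if best_dist is None or abs(dist) < best_dist:
            best, best_dist = gene, abs(dist)
    return best


# Toy data with placeholder names, not real annotations.
peak = Peak("chr1", 900, 1000, 5.0)
genes = [
    Gene("GENE_PLUS", "chr1", 1000, "+"),
    Gene("GENE_MINUS", "chr1", 1200, "-"),
    Gene("GENE_OTHER", "chr2", 950, "+"),
]
for gene in genes:
    print(gene.symbol, signed_distance_to_tss(peak, gene))
```

Assigning each peak to the nearest TSS is only one of several possible conventions. Window-based or score-weighted assignments are common alternatives, and the source does not state which one ChEA-KG uses.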
Functional Annotations
- Gene Ontology enrichment for target gene sets
- Pathway enrichment analysis for regulatory modules
- Disease associations of transcription factor networks
- Drug target information for regulatory proteins
Comparative Data
- Cross-cell type binding conservation
- Species-specific transcription factor binding patterns
- Evolution of regulatory network architectures
- Regulatory divergence between related cell types
Applications
Regulatory Network Analysis
- Reconstruction of cell type-specific regulatory networks
- Identification of master transcription factors and regulatory hubs
- Analysis of transcriptional regulatory cascades
- Network-based prediction of gene expression changes
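One simple way to approach the identification of regulatory hubs listed above is to rank transcription factors by the number of distinct target genes they regulate (out-degree). The sketch below assumes the network is available as a list of (transcription factor, target gene) pairs. The function name rank_hubs and the toy edge list are hypothetical, and out-degree is only one of several centrality measures that could be used.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def rank_hubs(edges: Iterable[Tuple[str, str]],
              top_n: int = 3) -> List[Tuple[str, int]]:
    """Rank transcription factors by their number of distinct targets."""
    targets = defaultdict(set)
    for tf, gene in edges:
        targets[tf].add(gene)  # sets ignore duplicate TF-gene pairs
    # Sort by descending target count, then by name for a stable order.
    ranked = sorted(targets.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(tf, len(genes)) for tf, genes in ranked[:top_n]]


# Toy edge list with placeholder names, not real regulatory data.
edges = [
    ("TF_A", "G1"), ("TF_A", "G2"), ("TF_A", "G2"),
    ("TF_B", "G1"),
    ("TF_C", "G1"), ("TF_C", "G3"), ("TF_C", "G4"),
]
print(rank_hubs(edges, top_n=2))
```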
Functional Genomics
- Transcription factor enrichment analysis for gene lists
- Prediction of upstream regulators from expression signatures
- Integration of ChIP-seq and RNA-seq data for mechanism discovery
- Identification of regulatory biomarkers for disease states
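Transcription factor enrichment analysis for a gene list is often framed as an overlap test between the query genes and each transcription factor's target set. The sketch below uses a one-sided hypergeometric test written with the Python standard library. The source does not specify which statistic ChEA-KG uses, so the choice of test, the function names, and the background size parameter are all assumptions for illustration.

```python
from math import comb
from typing import Dict, List, Set, Tuple


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrich(query: Set[str], tf_targets: Dict[str, Set[str]],
           background_size: int) -> List[Tuple[str, int, float]]:
    """Return (TF, overlap size, p-value) tuples sorted by p-value.

    background_size is assumed to count every gene that could appear in
    either the query or a target set.
    """
    results = []
    for tf, targets in tf_targets.items():
        overlap = len(query & targets)
        p = hypergeom_sf(overlap, background_size, len(targets), len(query))
        results.append((tf, overlap, p))
    return sorted(results, key=lambda row: (row[2], row[0]))


# Toy inputs with placeholder names, not real target sets.
query = {"G1", "G2", "G3"}
tf_targets = {"TF_A": {"G1", "G2", "G3"}, "TF_B": {"G9"}}
for tf, overlap, p in enrich(query, tf_targets, background_size=20):
    print(tf, overlap, p)
```

When many transcription factors are tested at once, the resulting p-values would normally be corrected for multiple testing. That step is omitted here for brevity.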
Drug Discovery
- Target identification through regulatory network analysis
- Mechanism of action prediction for transcriptional modulators
- Drug repurposing based on transcriptional signatures
- Biomarker discovery for drug response prediction
Systems Biology
- Multi-omics integration incorporating transcriptional regulation
- Modeling of regulatory network dynamics
- Prediction of cellular responses to perturbations
- Understanding of transcriptional control in development and disease
Technical Implementation
ChEA-KG is implemented as a Neo4j graph database with nodes representing transcription factors, genes, cell types, and experimental conditions. Relationships capture binding events, regulatory interactions, and functional associations. Each relationship carries a confidence score based on experimental evidence quality and reproducibility across studies.
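To make the node and relationship model more concrete, the following Python sketch builds a parameterized Cypher statement for recording one binding event. The node labels (TranscriptionFactor, Gene), the relationship type BINDS, the property names, and the [0, 1] confidence range are all hypothetical; the actual ChEA-KG schema is not given in the source. For brevity the cell type is stored as a relationship property here, although the description above treats cell types as nodes.

```python
from typing import Any, Dict, Tuple

# Hypothetical schema: labels, relationship type, and properties are
# illustrative assumptions, not the published ChEA-KG schema.
BINDING_QUERY = """
MERGE (tf:TranscriptionFactor {symbol: $tf})
MERGE (g:Gene {symbol: $gene})
MERGE (tf)-[r:BINDS {cell_type: $cell_type}]->(g)
SET r.confidence = $confidence
""".strip()


def binding_statement(tf: str, gene: str, cell_type: str,
                      confidence: float) -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher statement and its parameters for one binding event."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie in [0, 1]")
    params = {
        "tf": tf,
        "gene": gene,
        "cell_type": cell_type,
        "confidence": confidence,
    }
    return BINDING_QUERY, params


# Placeholder values, not real data.
query, params = binding_statement("TF_A", "GENE_1", "cell_type_x", 0.9)
print(query)
print(params)
```

Keeping the statement parameterized, rather than formatting values into the query text, lets the same statement be reused for every binding event and avoids quoting problems. The returned pair could then be passed to a Neo4j client of choice.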
Automated Evaluation