RefSeq is a Data Source.
The NCBI Reference Sequence Database (RefSeq) provides a comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein sequences for naturally occurring molecules of the central dogma.
genomics, biomedical, biological systems
Warning: No license entered
Unknown
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| ncbigene.gene_refseq_uniprotkb_collab | Gene RefSeq UniProtKB Collaboration Data | gene_refseq_uniprotkb_collab.gz (1.1 GB) | MappingProduct | tsv | Gene to RefSeq/UniProtKB collaboratio... |
| ncbigene.gene2refseq | Gene to RefSeq Mapping | gene2refseq.gz (1.9 GB) | MappingProduct | tsv | Gene to RefSeq mapping data providing... |
| clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | Neo4j database dump of the Clinical K... |
| cancer-genome-interpreter.clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | Neo4j database dump of the Clinical K... |
| genecards.gene.annotations ⚠ | GeneCards Gene Annotations | www.genecards.org | Product | http | Integrated gene annotation data aggre... |
| string.protein.links | STRING Protein Links | protein.links.v12.0.txt.gz (128.7 GB) | GraphProduct | txt | protein network data (full network, s... |
| string.protein.links.detailed | STRING Protein Links Detailed | protein.links.detailed.v12.0.txt.gz (189.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.links.full | STRING Protein Links Full | protein.links.full.v12.0.txt.gz (199.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.physical.links | STRING Protein Physical Links | protein.physical.links.v12.0.txt.gz (11.1 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.detailed | STRING Protein Physical Links Detailed | protein.physical.links.detailed.v12.0.txt.gz (13.8 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.full | STRING Protein Physical Links Full | protein.physical.links.full.v12.0.txt.gz (14.5 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.cog.links | STRING COG Links | COG.links.v12.0.txt.gz (176.8 MB) | GraphProduct | txt | association scores between orthologou... |
| string.cog.links.detailed | STRING COG Links Detailed | COG.links.detailed.v12.0.txt.gz (238.7 MB) | GraphProduct | txt | association scores (incl. subscores p... |
| string.database | STRING Database Network Schema | network_schema.v12.0.sql.gz (262.2 GB) | GraphProduct | ❔ | full database, part II: the networks ... |
| ckg.graph | CKG Graph Database Dump | 1 | GraphProduct | neo4j | Graph database dump and additional re... |
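Several of the products listed above (for example `gene2refseq.gz`) are gzip-compressed, tab-separated files of a gigabyte or more, so streaming them row by row is usually safer than loading them whole. The following is a minimal sketch using only the Python standard library. It assumes the first line of the file is a header row, possibly prefixed with `#`; the helper names `iter_gzipped_tsv` and `preview` are hypothetical, and the column names are read from the file rather than assumed.

```python
import csv
import gzip
from itertools import islice


def iter_gzipped_tsv(path):
    """Yield one dict per data row of a gzip-compressed, tab-separated file."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        header = next(reader, None)
        if header is None:
            return  # empty file: nothing to yield
        # Assumption: the first line is a header, optionally starting with '#'.
        header[0] = header[0].lstrip("#")
        for row in reader:
            if row:  # skip blank lines
                yield dict(zip(header, row))


def preview(path, n=5):
    """Return the first n rows without reading the rest of the file."""
    return list(islice(iter_gzipped_tsv(path), n))
```

Calling `preview("gene2refseq.gz")` on a locally downloaded copy would show the column names the file actually uses before any further processing is written against them.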
RefSeq standards serve as a foundation for functional annotation of genomes and provide stable reference points for mutation analysis, gene expression studies, and polymorphism discovery.
RefSeq provides reference sequence standards for naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins. The database includes:
When using RefSeq data, please cite:
For questions, updates, or collaborations:
Created: July 17, 2025 | Last modified: August 07, 2025