RefSeq is a Data Source.
The NCBI Reference Sequence Database (RefSeq) provides a comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein sequences for naturally occurring molecules of the central dogma.
Warning: No license entered
| ID | Name | URL | Category | Format | Relation | Description |
|---|---|---|---|---|---|---|
| ncbigene.gene_refseq_uniprotkb_collab | Gene RefSeq UniProtKB Collaboration Data | gene_refseq_uniprotkb_collab.gz (1.1 GB) | MappingProduct | tsv | had primary source | Gene to RefSeq/UniProtKB collaboratio... |
| ncbigene.gene2refseq | Gene to RefSeq Mapping | gene2refseq.gz (1.9 GB) | MappingProduct | tsv | had primary source | Gene to RefSeq mapping data providing... |
| clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | had primary source | Neo4j database dump of the Clinical K... |
| cancer-genome-interpreter.clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | had primary source | Neo4j database dump of the Clinical K... |
| genecards.gene.annotations ⚠ | GeneCards Gene Annotations | www.genecards.org | Product | http | had primary source | Integrated gene annotation data aggre... |
| string.protein.links | STRING Protein Links | protein.links.v12.0.txt.gz (128.7 GB) | GraphProduct | txt | had primary source | protein network data (full network, s... |
| string.protein.links.detailed | STRING Protein Links Detailed | protein.links.detailed.v12.0.txt.gz (189.6 GB) | GraphProduct | txt | had primary source | protein network data (full network, i... |
| string.protein.links.full | STRING Protein Links Full | protein.links.full.v12.0.txt.gz (199.6 GB) | GraphProduct | txt | had primary source | protein network data (full network, i... |
| string.protein.physical.links | STRING Protein Physical Links | protein.physical.links.v12.0.txt.gz (11.1 GB) | GraphProduct | txt | had primary source | protein network data (physical subnet... |
| string.protein.physical.links.detailed | STRING Protein Physical Links Detailed | protein.physical.links.detailed.v12.0.txt.gz (13.8 GB) | GraphProduct | txt | had primary source | protein network data (physical subnet... |
| string.protein.physical.links.full | STRING Protein Physical Links Full | protein.physical.links.full.v12.0.txt.gz (14.5 GB) | GraphProduct | txt | had primary source | protein network data (physical subnet... |
| string.cog.links | STRING COG Links | COG.links.v12.0.txt.gz (176.8 MB) | GraphProduct | txt | had primary source | association scores between orthologou... |
| string.cog.links.detailed | STRING COG Links Detailed | COG.links.detailed.v12.0.txt.gz (238.7 MB) | GraphProduct | txt | had primary source | association scores (incl. subscores p... |
| string.database | STRING Database Network Schema | network_schema.v12.0.sql.gz (262.2 GB) | GraphProduct | ❔ | had primary source | full database, part II: the networks ... |
| ckg.graph | CKG Graph Database Dump | 1 | GraphProduct | neo4j | had primary source | Graph database dump and additional re... |
| oma.mapping.refseq ⚠ | OMA to RefSeq Mapping | oma-refseq.txt.gz | MappingProduct | tsv | had primary source | Mapping of OMA identifiers to RefSeq ... |
| pharmebinet.json | PharMeBINet JSON Release | content (1.8 GB) | GraphProduct | json | was derived from | PharMeBINet V2 JSON release published... |
| pharmebinet.tsv | PharMeBINet TSV Release | content (1.8 GB) | GraphProduct | tsv | was derived from | PharMeBINet V2 TSV release published ... |
| pharmebinet.graphml | PharMeBINet GraphML Release | content (1.9 GB) | GraphProduct | mixed | was derived from | PharMeBINet V2 GraphML release publis... |
| pharmebinet.neo4j | PharMeBINet Neo4j Database | content (3.6 GB) | GraphProduct | neo4j | was derived from | PharMeBINet V2 Neo4j database release... |
| pharmebinet.neo4j.dump | PharMeBINet Neo4j Dump | content (3.4 GB) | GraphProduct | neo4j | was derived from | PharMeBINet V2 Neo4j dump release pub... |
| biobtree.api | BioBTree REST API | api | ProgrammingInterface | http | had primary source | REST API for searching identifiers an... |
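Several of the mapping products listed above are gzip-compressed, tab-separated files of a gigabyte or more, so reading them row by row is usually safer than loading them whole. The following is a minimal sketch of one way to do that with the Python standard library. The function name `iter_gzip_tsv`, the assumption that the first line is a header (optionally prefixed with `#`), and the three-column sample are all illustrative and not taken from the registry entry; check the real file's header before relying on any column name.

```python
import csv
import gzip
import io
from typing import Dict, Iterator


def iter_gzip_tsv(source) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row of a gzip-compressed TSV file.

    `source` may be a path or a binary file object. The first line is
    assumed to be a header; a leading '#' on it is stripped (assumption).
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as handle:
        header = handle.readline().rstrip("\r\n").lstrip("#").split("\t")
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for fields in reader:
            if fields:  # skip blank lines
                yield dict(zip(header, fields))


# Tiny in-memory stand-in for a real download (hypothetical columns and values).
sample = io.BytesIO()
with gzip.open(sample, mode="wt", encoding="utf-8", newline="") as out:
    out.write("#tax_id\tGeneID\tstatus\n")
    out.write("9606\t1\tVALIDATED\n")
sample.seek(0)

rows = list(iter_gzip_tsv(sample))
```

Because the function is a generator, a caller can filter rows (for example by taxon) without holding the whole file in memory; passing a local path such as a downloaded `gene2refseq.gz` instead of the in-memory sample should work the same way.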
The NCBI Reference Sequence Database (RefSeq) provides a comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein sequences. RefSeq standards serve as a foundation for functional annotation of genomes and provide stable reference points for mutation analysis, gene expression studies, and polymorphism discovery.
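The split into genomic, transcript, and protein records is visible in the two-letter prefix of a RefSeq accession. The sketch below maps a handful of common prefixes to those three categories. The prefix table is deliberately non-exhaustive, the helper name `classify_refseq_accession` is hypothetical, and the pattern only handles the simple `XX_digits[.version]` form, so accessions with letters after the underscore (such as `NZ_` records) return `None`.

```python
import re
from typing import Optional

# Non-exhaustive mapping from accession prefix to broad molecule category.
REFSEQ_PREFIX_CATEGORY = {
    "NC": "genomic",     # complete genomic molecules, e.g. chromosomes
    "NG": "genomic",     # genomic regions
    "NT": "genomic",     # contigs / scaffolds
    "NW": "genomic",     # contigs / scaffolds
    "NM": "transcript",  # mRNA
    "NR": "transcript",  # non-coding RNA
    "XM": "transcript",  # predicted mRNA
    "XR": "transcript",  # predicted non-coding RNA
    "NP": "protein",
    "XP": "protein",     # predicted protein
    "YP": "protein",
    "WP": "protein",     # non-redundant protein
}

# Two uppercase letters, underscore, digits, optional ".version".
_ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def classify_refseq_accession(accession: str) -> Optional[str]:
    """Return 'genomic', 'transcript', or 'protein', or None if unrecognized."""
    match = _ACCESSION_RE.match(accession.strip())
    if match is None:
        return None
    return REFSEQ_PREFIX_CATEGORY.get(match.group(1))


examples = ["NC_000001.11", "NM_000546.6", "NP_000537.3", "P04637"]
categories = [classify_refseq_accession(acc) for acc in examples]
```

The last example is a UniProtKB-style identifier, included only to show that non-RefSeq strings fall through to `None` rather than being misclassified.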
RefSeq provides reference sequence standards for naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins. The database includes:
When using RefSeq data, please cite:
For questions, updates, or collaborations:
Created: July 17, 2025 | Last modified: August 07, 2025