STRING is a Data Source.
It is part of the BER collection.
STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations derived from computational prediction, knowledge transfer between organisms, and interactions aggregated from other primary databases.
Unknown
infores:string
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| string.protein.links | STRING Protein Links | protein.links.v12.0.txt.gz (128.7 GB) | GraphProduct | txt | protein network data (full network, s... |
| string.protein.links.detailed | STRING Protein Links Detailed | protein.links.detailed.v12.0.txt.gz (189.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.links.full | STRING Protein Links Full | protein.links.full.v12.0.txt.gz (199.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.physical.links | STRING Protein Physical Links | protein.physical.links.v12.0.txt.gz (11.1 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.detailed | STRING Protein Physical Links Detailed | protein.physical.links.detailed.v12.0.txt.gz (13.8 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.full | STRING Protein Physical Links Full | protein.physical.links.full.v12.0.txt.gz (14.5 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.cog.links | STRING COG Links | COG.links.v12.0.txt.gz (176.8 MB) | GraphProduct | txt | association scores between orthologou... |
| string.cog.links.detailed | STRING COG Links Detailed | COG.links.detailed.v12.0.txt.gz (238.7 MB) | GraphProduct | txt | association scores (incl. subscores p... |
| string.protein.info | STRING Protein Info | protein.info.v12.0.txt.gz (1.2 GB) | Product | txt | list of STRING proteins incl. their d... |
| string.protein.sequences | STRING Protein Sequences | protein.sequences.v12.0.fa.gz (12.1 GB) | Product | fasta | sequences of the proteins in STRING (... |
| string.protein.aliases | STRING Protein Aliases | protein.aliases.v12.0.txt.gz (3.2 GB) | Product | txt | aliases for STRING proteins: locus na... |
| string.protein.homology | STRING Protein Homology | protein.homology.v12.0.txt.gz (17.4 GB) | Product | txt | SW alignment scores between proteins ... |
| string.protein.enrichment.terms | STRING Protein Enrichment Terms | protein.enrichment.terms.v12.0.txt.gz (22.0 GB) | Product | txt | list of terms associated with protein... |
| string.clusters.proteins | STRING Clusters Proteins | clusters.proteins.v12.0.txt.gz (13.1 GB) | Product | txt | hierarchical STRING clusters and thei... |
| string.clusters.info | STRING Clusters Info | clusters.info.v12.0.txt.gz (207.8 MB) | Product | txt | hierarchical STRING clusters annotations |
| string.clusters.tree | STRING Clusters Tree | clusters.tree.v12.0.txt.gz (55.3 MB) | Product | txt | hierarchical STRING clusters tree (re... |
| string.protein.network.embeddings | STRING Protein Network Embeddings | protein.network.embeddings.v12.0.h5 (17.9 GB) | Product | hdf5 | cross-species (aligned) eukaryotic pr... |
| string.protein.sequence.embeddings | STRING Protein Sequence Embeddings | protein.sequence.embeddings.v12.0.h5 (38.3 GB) | Product | hdf5 | ProtT5 eukaryotic protein sequence em... |
| string.protein.orthology | STRING Protein Orthology | protein.orthology.v12.0.txt.gz (2.0 GB) | Product | txt | hierarchical eggNOG orthologous group... |
| string.cog.mappings | STRING COG Mappings | COG.mappings.v12.0.txt.gz (720.3 MB) | Product | txt | LCA orthologous groups (COGs,NOGs,KOG... |
| string.species | STRING Species List | species.v12.0.txt (999.2 KB) | Product | txt | organisms in STRING |
| string.species.tree | STRING Species Tree | species.tree.v12.0.txt (92.8 MB) | Product | txt | STRING tree of species |
| string.database.schema | STRING Database Schema | database.schema.v12.0.pdf (51.8 KB) | DocumentationProduct | ❔ | STRING database schema |
| string.database.items | STRING Database Items Schema | items_schema.v12.0.sql.gz (42.4 GB) | Product | ❔ | full database, part I: the players (p... |
| string.database | STRING Database Network Schema | network_schema.v12.0.sql.gz (262.2 GB) | GraphProduct | ❔ | full database, part II: the networks ... |
| string.database.evidence | STRING Database Evidence Schema | evidence_schema.v12.0.sql.gz (55.4 GB) | Product | ❔ | full database, part III: interaction ... |
| string.api | STRING REST API | help?subpage=api | ProgrammingInterface | ❔ | RESTful API for programmatic access t... |
| string.web | STRING Web Interface | string-db.org | GraphicalInterface | ❔ | Web interface for searching, visualiz... |
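
The link files listed above (for example `protein.links.v12.0.txt.gz`) are large, so streaming them line by line is usually more practical than loading them whole. The sketch below assumes the commonly seen layout of a whitespace-delimited header row that includes a `combined_score` column holding an integer score. That layout is an assumption, not something stated on this page, and should be checked against the downloaded file. The function name `iter_links`, the `min_score` parameter, and the protein identifiers in the sample are hypothetical and used only for illustration.

```python
import gzip
import io


def iter_links(handle, min_score=0):
    """Yield (protein1, protein2, combined_score) from an open text handle.

    Assumes a header row naming the columns and a 'combined_score' column
    of integers; the first two columns are taken as the interacting pair.
    """
    header = handle.readline().split()
    score_idx = header.index("combined_score")
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        score = int(fields[score_idx])
        if score >= min_score:
            yield fields[0], fields[1], score


# Tiny in-memory stand-in for a real download; identifiers are made up.
sample = (
    "protein1 protein2 combined_score\n"
    "9606.PROT_A 9606.PROT_B 490\n"
    "9606.PROT_A 9606.PROT_C 900\n"
    "9606.PROT_B 9606.PROT_C 715\n"
)
compressed = gzip.compress(sample.encode("utf-8"))

# gzip.open accepts a file object, so the same code works for a path on disk.
with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as handle:
    strong = list(iter_links(handle, min_score=700))

print(strong)
```

Because the score column is located by name rather than by position, the same function should also work on the "detailed" and "full" variants, provided they keep a `combined_score` column in their header.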
| ID | Name | URL | Category | Format | Relation | Description |
|---|---|---|---|---|---|---|
| ubkg.neo4j | UBKG Neo4j Docker Distribution | ubkg-downloads.xconsortia.org | GraphProduct | ❔ | had primary source | Turnkey neo4j distributions that depl... |
| ubkg.csv | UBKG Ontology CSV Files | ubkg-downloads.xconsortia.org | GraphProduct | csv | had primary source | Ontology CSV files that can be import... |
| spoke.graph | SPOKE Graph | data-tools | GraphProduct | http | had primary source | The SPOKE knowledge graph containing ... |
| bioteque.embeddings | Bioteque Embeddings | embeddings | Product | ❔ | had primary source | Network embeddings of the Bioteque gr... |
| clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | had primary source | Neo4j database dump of the Clinical K... |
| kg-monarch.graph | KGX Distribution of KG-Monarch | monarch-kg.tar.gz (220.2 MB) | GraphProduct | kgx | had primary source | KGX Distribution of KG-Monarch |
| kg-monarch.graph.jsonl | KGX JSON-L Distribution of KG-Monarch | monarch-kg.jsonl.tar.gz (301.0 MB) | GraphProduct | kgx-jsonl | had primary source | KGX JSON-Lines Distribution of KG-Mon... |
| kg-monarch.graph.rdf | RDF Distribution of KG-Monarch | monarch-kg.nt.gz (838.5 MB) | GraphProduct | rdfxml | had primary source | RDF Distribution of KG-Monarch |
| kg-monarch.graph.neo4j | Neo4j Dump of KG-Monarch | monarch-kg.neo4j.dump (1.3 GB) | GraphProduct | ❔ | had primary source | Neo4j Dump of KG-Monarch |
| kg-monarch.graph.duckdb | DuckDB database of KG-Monarch | monarch-kg.duckdb (6.4 GB) | GraphProduct | ❔ | had primary source | DuckDB database of KG-Monarch |
| automat.stringdb | stringdb_automat | 4ca5a0ce557e2c18 | GraphProduct | kgx-jsonl | had primary source | STRING-DB Automat |
| pheknowlator.graph | PheKnowLator graph | knowledge_graphs?pageState=(%22StorageObjectListTable%22:(%22f%22:%22%255B%255D%22))&inv=1&invt=Ab5_1Q&project=pheknowlator | GraphProduct | owl | had primary source | PheKnowLator graph files, including s... |
| kg-monarch.graph.jsonl.edges | KGX JSON-L Distribution of KG-Monarch Edges | monarch-kg_edges.jsonl (14.2 GB) | GraphProduct | kgx-jsonl | had primary source | KGX JSON-Lines Distribution of KG-Mon... |
| kg-monarch.graph.jsonl.nodes | KGX JSON-L Distribution of KG-Monarch Nodes | monarch-kg_nodes.jsonl (1.1 GB) | GraphProduct | kgx-jsonl | had primary source | KGX JSON-Lines Distribution of KG-Mon... |
| kg-monarch.graph.neo4j.edges | Neo4j Dump of KG-Monarch Edges | monarch-kg_edges.neo4j.csv (4.1 GB) | GraphProduct | neo4j | had primary source | Neo4j Dump of KG-Monarch Edges |
| kg-monarch.graph.neo4j.nodes | Neo4j Dump of KG-Monarch Nodes | monarch-kg_nodes.neo4j.csv (333.4 MB) | GraphProduct | neo4j | had primary source | Neo4j Dump of KG-Monarch Nodes |
| epigraphdb.graph | EpiGraphDB Graph Database | graph-database | GraphProduct | neo4j | had primary source | Integrated graph knowledge base combi... |
| aop-db.data | AOP-DB Data | adverse-outcome-pathway-database-aop-db-version-2 | Product | ❔ | had primary source | The EPA has developed the Adverse Out... |
| cancer-genome-interpreter.clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | had primary source | Neo4j database dump of the Clinical K... |
| genecards.protein.interactions ⚠ | GeneCards Protein Interactions | www.genecards.org | Product | http | had primary source | Protein interaction data aggregated f... |
| unibiomap.links | UniBioMap Graph Links | unibiomap.links.csv (1.3 GB) | GraphProduct | csv | had primary source | Core UniBioMap graph edges file. |
| unibiomap.auxs | UniBioMap Graph Auxiliaries | unibiomap.auxs.tsv (563.9 MB) | GraphProduct | tsv | had primary source | Auxiliary UniBioMap graph annotations... |
| unibiomap.pred | UniBioMap Predicted Graph | unibiomap.pred.csv (2.3 GB) | GraphProduct | csv | had primary source | Predicted UniBioMap graph edges with ... |
| unibiomap.pred.full | UniBioMap Predicted Graph (Full) | unibiomap.pred.full.csv (5.9 GB) | GraphProduct | csv | had primary source | Full unfiltered UniBioMap predicted g... |
| ckg.graph | CKG Graph Database Dump | 1 | GraphProduct | neo4j | had primary source | Graph database dump and additional re... |
| drkg.graph | DRKG graph | drkg.tar.gz (206.6 MB) | GraphProduct | ❔ | had primary source | DRKG graph files, including a TSV of ... |
| petagraph.graph | Petagraph Knowledge Graph (Neo4J) | ubkg-downloads.xconsortia.org | GraphProduct | ❔ | had primary source | A comprehensive multi-omics biomedica... |
| tcrd.database_download ⚠ | TCRD Database Downloads | download | Product | mysql | was influenced by | MySQL database dump files containing ... |
| tcrd.api | Pharos API | api | ProgrammingInterface | http | was influenced by | RESTful API providing programmatic ac... |
| tcrd.documentation ⚠ | TCRD Documentation | tcrd | DocumentationProduct | http | was influenced by | Comprehensive documentation describin... |
| imodulondb.browser | iModulonDB Web Interface | imodulondb.org | GraphicalInterface | http | was informed by | Interactive web interface for browsin... |
| imodulondb.datasets | iModulonDB Datasets | imodulondb.org | Product | mixed | was informed by | Downloadable transcriptomic datasets ... |
| kg-predict.gpkg | GP-KG Knowledge Graph Data | GP_KG.txt (46.2 MB) | GraphProduct | tsv | was derived from | GP-KG tab-delimited knowledge graph c... |
| cardiokg.neo4j | CardioKG Neo4j graph construction scripts | Building%20KG | GraphProduct | neo4j | used | Neo4j construction artifacts for Card... |
| biobtree.api | BioBTree REST API | api | ProgrammingInterface | http | had primary source | REST API for searching identifiers an... |
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a database of known and predicted protein-protein interactions. The database contains information from numerous sources, including experimental repositories, computational prediction methods, and public text collections.
The STRING database currently covers:
STRING integrates and scores interactions from five main sources:
STRING data is available through multiple formats:
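
As one illustration of programmatic access, a request URL for the STRING REST API could be assembled as sketched below. The path pattern (`/api/<format>/<method>`), the `network` method, and the `identifiers` and `species` parameters are assumptions based on the public API help page and should be verified there before use. The helper name `build_network_url` is hypothetical. No request is sent; the code only builds the URL.

```python
from urllib.parse import urlencode

# Assumed base path of the STRING API; confirm against the API help page.
BASE = "https://string-db.org/api"


def build_network_url(identifiers, species, output_format="tsv"):
    """Return a URL for an assumed 'network' API call.

    identifiers: list of protein names or IDs, joined with carriage returns
                 (encoded as %0D), which is the assumed separator.
    species:     NCBI taxonomy identifier, e.g. 9606 for human.
    """
    params = urlencode({
        "identifiers": "\r".join(identifiers),
        "species": species,
    })
    return f"{BASE}/{output_format}/network?{params}"


url = build_network_url(["TP53", "MDM2"], 9606)
print(url)
```

The resulting URL can then be passed to any HTTP client. For bulk analyses, the downloadable files are likely a better fit than repeated API calls.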
All data in STRING is freely available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
When using STRING, please cite:
Created: June 04, 2025 | Last modified: February 20, 2026