DISEASES is an Aggregator.
DISEASES is a weekly updated database that integrates evidence on disease-gene associations from automatic text mining, manually curated literature, cancer mutation data, and genome-wide association studies. It provides confidence scores to facilitate comparison of different types and sources of evidence.
health, genomics, biomedical, literature
Unknown
infores:diseases
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| diseases.portal | DISEASES Web Search | diseases.jensenlab.org | GraphicalInterface | http | Web search interface for querying hum... |
| diseases.textmining-full | Text Mining Channel (Full) | human_disease_textmining_full.tsv (1.8 GB) | Product | tsv | Disease-gene associations from text m... |
| diseases.textmining-filtered | Text Mining Channel (Filtered) | human_disease_textmining_filtered.tsv (46.4 MB) | Product | tsv | Disease-gene associations from text m... |
| diseases.knowledge-full | Knowledge Channel (Full) | human_disease_knowledge_full.tsv (6.6 MB) | Product | tsv | Disease-gene associations from manual... |
| diseases.knowledge-filtered | Knowledge Channel (Filtered) | human_disease_knowledge_filtered.tsv (588.0 KB) | Product | tsv | Disease-gene associations from manual... |
| diseases.experiments-full | Experiments Channel (Full) | human_disease_experiments_full.tsv (25.7 MB) | Product | tsv | Disease-gene associations from experi... |
| diseases.experiments-filtered | Experiments Channel (Filtered) | human_disease_experiments_filtered.tsv (2.4 MB) | Product | tsv | Disease-gene associations from experi... |
| diseases.integrated-full | Integrated Channel (Full) | human_disease_integrated_full.tsv (618.3 MB) | Product | tsv | Experimental integrated channel combi... |
| diseases.dictionary | DISEASES Dictionary | diseases_dictionary.tar.gz (15.2 MB) | Product | ❔ | Dictionary of human gene and disease ... |
| amyco.annotations | AmyCo Curated Annotations | ❔ | Product | ❔ | Manually curated disease-gene associa... |
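
The TSV products listed above can be read with a few lines of Python. The following is a minimal sketch, not an official loader. It assumes that each file is tab-separated with no header row, and that the first four columns are gene identifier, gene name, disease identifier, and disease name; the source does not spell out the column layout, so check it against a downloaded file. Any remaining columns are kept uninterpreted in a hypothetical `extra` field. The sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Association(NamedTuple):
    """One disease-gene association row (column layout is an assumption)."""
    gene_id: str
    gene_name: str
    disease_id: str
    disease_name: str
    extra: tuple  # channel-specific columns, left uninterpreted


def read_associations(handle) -> Iterator[Association]:
    """Yield associations from an open, tab-separated text handle."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            # Skip blank or malformed lines rather than guessing their meaning.
            continue
        yield Association(row[0], row[1], row[2], row[3], tuple(row[4:]))


# Made-up rows standing in for a real file such as
# human_disease_knowledge_filtered.tsv.
SAMPLE = (
    "GENE0001\tGENEA\tDOID:0000001\texample disease\t4.2\n"
    "GENE0002\tGENEB\tDOID:0000002\tanother disease\t1.0\n"
)

rows = list(read_associations(io.StringIO(SAMPLE)))
for assoc in rows:
    print(assoc.gene_name, assoc.disease_name, assoc.extra)
```

For a real file, the same function can be passed `open(path, newline="", encoding="utf-8")`. Reading row by row keeps memory use flat, which matters for the larger full files.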
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | Neo4j database dump of the Clinical K... |
| cancer-genome-interpreter.clinicalkg.graph | CKG Graph Dump | 1 | GraphProduct | mixed | Neo4j database dump of the Clinical K... |
| spoke.graph | SPOKE Graph | ❔ | GraphProduct | ❔ | The SPOKE knowledge graph containing ... |
| translator.diseases.graph | Translator DISEASES KGX Graph | latest | GraphProduct | kgx-jsonl | KGX JSONL graph package for DISEASES ... |
| string.protein.links | STRING Protein Links | protein.links.v12.0.txt.gz (128.7 GB) | GraphProduct | txt | protein network data (full network, s... |
| string.protein.links.detailed | STRING Protein Links Detailed | protein.links.detailed.v12.0.txt.gz (189.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.links.full | STRING Protein Links Full | protein.links.full.v12.0.txt.gz (199.6 GB) | GraphProduct | txt | protein network data (full network, i... |
| string.protein.physical.links | STRING Protein Physical Links | protein.physical.links.v12.0.txt.gz (11.1 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.detailed | STRING Protein Physical Links Detailed | protein.physical.links.detailed.v12.0.txt.gz (13.8 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.protein.physical.links.full | STRING Protein Physical Links Full | protein.physical.links.full.v12.0.txt.gz (14.5 GB) | GraphProduct | txt | protein network data (physical subnet... |
| string.cog.links | STRING COG Links | COG.links.v12.0.txt.gz (176.8 MB) | GraphProduct | txt | association scores between orthologou... |
| string.cog.links.detailed | STRING COG Links Detailed | COG.links.detailed.v12.0.txt.gz (238.7 MB) | GraphProduct | txt | association scores (incl. subscores p... |
| string.database | STRING Database Network Schema | network_schema.v12.0.sql.gz (262.2 GB) | GraphProduct | ❔ | full database, part II: the networks ... |
| translator.translator_kg.graph | Translator Aggregate KGX Graph | latest | GraphProduct | kgx-jsonl | Aggregated KGX JSONL graph package co... |
| ckg.graph | CKG Graph Database Dump | 1 | GraphProduct | neo4j | Graph database dump and additional re... |
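
Two of the graph products above are listed with the `kgx-jsonl` format. As a rough sketch of how such a file might be inspected, the snippet below counts edges per predicate. It assumes the usual KGX JSON Lines convention of one JSON object per line with `subject`, `predicate`, and `object` keys; the identifiers in the sample are placeholders, not real records from any of the listed graphs.

```python
import json
from collections import Counter
from typing import Iterable


def count_predicates(lines: Iterable[str]) -> Counter:
    """Count edges per predicate in an iterable of JSON Lines strings."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        edge = json.loads(line)
        counts[edge.get("predicate", "<missing>")] += 1
    return counts


# Placeholder edges for illustration only.
SAMPLE_EDGES = [
    '{"subject": "EXAMPLE:gene1", "predicate": "biolink:gene_associated_with_condition", "object": "EXAMPLE:disease1"}',
    '{"subject": "EXAMPLE:gene2", "predicate": "biolink:gene_associated_with_condition", "object": "EXAMPLE:disease1"}',
    '{"subject": "EXAMPLE:gene3", "object": "EXAMPLE:disease2"}',
]

predicate_counts = count_predicates(SAMPLE_EDGES)
for predicate, n in predicate_counts.most_common():
    print(predicate, n)
```

An open file handle can be passed in place of the list, since iterating over a text file yields one line at a time.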
DISEASES is a comprehensive database that integrates disease-gene associations from multiple evidence sources. Maintained by the JensenLab and currently hosted at the Swiss Institute of Bioinformatics (University of Zurich), it provides weekly updated data combining automatic text mining, manually curated knowledge, cancer mutation data, and genome-wide association studies. The resource assigns unified confidence scores to facilitate comparison across different types of evidence.
DISEASES integrates disease-gene associations through four main channels:
All downloadable files contain:
Each channel is available in two versions:
- Full datasets: complete associations from the database
- Filtered datasets: non-redundant associations shown in the web interface
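
The file names in the product table follow a regular pattern, `human_disease_<channel>_<version>.tsv`. A small helper can build the name for a given channel and version. This is a sketch based only on the names shown in that table; the function name `dataset_filename` is hypothetical, and because the table lists the integrated channel only in its full version, the helper rejects the filtered variant.

```python
CHANNELS = ("textmining", "knowledge", "experiments", "integrated")
VERSIONS = ("full", "filtered")


def dataset_filename(channel: str, version: str) -> str:
    """Build a DISEASES file name from the pattern seen in the product table."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel!r}")
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    if channel == "integrated" and version == "filtered":
        # The product table lists only a full file for the integrated channel.
        raise ValueError("no filtered file is listed for the integrated channel")
    return f"human_disease_{channel}_{version}.tsv"


print(dataset_filename("textmining", "filtered"))
print(dataset_filename("knowledge", "full"))
```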
Additional resources:
Current Maintainer: Qingyao Huang (Swiss Institute of Bioinformatics, University of Zurich)
Original Developers:
Affiliation: Novo Nordisk Foundation Center for Protein Research
Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0)
Primary Publication: Grissa, D., Junge, A., Oprea, T. I., & Jensen, L. J. (2022). DISEASES 2.0: a weekly updated database of disease–gene associations from text mining and data integration. Database, 2022, baac019. https://doi.org/10.1093/database/baac019 (PMID: 35348650)
Original Publication: Pletscher-Frankild, S., Pallejà, A., Tsafou, K., Binder, J. X., & Jensen, L. J. (2015). DISEASES: Text mining and data integration of disease–gene associations. Methods, 74, 83-89. (PMID: 25484339)
Created: June 04, 2025 | Last modified: January 30, 2026