orthodb

OrthoDB is a Data Source.

OrthoDB is a comprehensive database of orthologous protein-coding genes across multiple species, providing evolutionary and functional annotations of orthologous groups.

Domains

biological systems, organisms

License

CC-BY-4.0

Homepage

orthodb

Repository

GitHub

Infores ID

Unknown

FAIRsharing ID

Unknown

Product Summary

Products

From this Resource
ID | Name | URL | Category | Format | Description
--- | --- | --- | --- | --- | ---
orthodb.site | OrthoDB Web Interface | www.orthodb.org | GraphicalInterface | | Web interface for exploring OrthoDB data
orthodb.api.sparql | OrthoDB SPARQL Endpoint | sparql.orthodb.org | ProgrammingInterface | | SPARQL endpoint for querying OrthoDB ...
orthodb.api.rest | OrthoDB API | api | ProgrammingInterface | | RESTful API for programmatic access t...
orthodb.species | OrthoDB Species Data | odb12v1_species.tab.gz (629.3 KB) | Product | tsv | Tab-separated file with species infor...
orthodb.ogs | OrthoDB Orthologous Groups | odb12v1_OGs.tab.gz (138.5 MB) | Product | tsv | Tab-separated file with information a...
orthodb.og2genes | OrthoDB OG to Genes Mapping | odb12v1_OG2genes.tab.gz (4.6 GB) | MappingProduct | tsv | Tab-separated file mapping orthologou...
orthodb.genes | OrthoDB Genes Data | odb12v1_genes.tab.gz (4.5 GB) | Product | tsv | Tab-separated file with gene informat...
orthodb.gene_xrefs | OrthoDB Gene Cross-references | odb12v1_gene_xrefs.tab.gz (4.4 GB) | MappingProduct | tsv | Tab-separated file with gene cross-re...
orthodb.og_xrefs | OrthoDB OG Functional Annotations | odb12v1_OG_xrefs.tab.gz (333.1 MB) | Product | tsv | Tab-separated file with orthologous g...
orthodb.aa_fasta | OrthoDB Protein Sequences | odb12v1_aa_fasta.gz (36.3 GB) | Product | fasta | FASTA-formatted amino acid sequences ...
orthodb.cds_fasta | OrthoDB CDS Sequences | odb12v1_cds_fasta.gz (53.6 GB) | Product | fasta | FASTA-formatted coding sequences for ...
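
Several of the tab-separated products listed here are multiple gigabytes when compressed, so it can help to stream them rather than load them whole. The following is a minimal sketch, not an official OrthoDB client. It assumes that odb12v1_OG2genes.tab.gz has no header row and holds two columns (orthologous group ID, then gene ID); that layout is an assumption and should be checked against the README that accompanies the downloads. The function name `genes_by_group` and the `max_rows` parameter are hypothetical.

```python
import csv
import gzip
from collections import defaultdict
from itertools import islice


def genes_by_group(path, max_rows=None):
    """Stream a gzipped two-column TSV and collect gene IDs per group ID.

    Assumed layout (not confirmed by this page): column 1 is the
    orthologous group ID, column 2 is the gene ID, with no header row.
    """
    groups = defaultdict(list)
    # Text mode ("rt") decompresses lazily, so rows are read one at a time.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # max_rows=None reads everything; a small value is useful for a preview.
        for row in islice(reader, max_rows):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            og_id, gene_id = row[0], row[1]
            groups[og_id].append(gene_id)
    return dict(groups)
```

Calling `genes_by_group("odb12v1_OG2genes.tab.gz", max_rows=100000)` would preview the first rows only. Collecting the full mapping in one dictionary may need a lot of memory, so a real pipeline might filter by group ID while streaming instead.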

Details

OrthoDB is a comprehensive database of orthologous protein-coding genes across multiple species with a hierarchical catalog of orthologs. It provides evolutionary and functional annotations of orthologous groups at various taxonomic levels, covering Eukaryotes, Prokaryotes, and Viruses.

The database covers more than 31,000 species and approximately 162 million genes in total. The species break down as follows:

  • More than 5,800 Eukaryotes
  • More than 18,100 Prokaryotes
  • More than 7,900 Viruses

OrthoDB offers several key features:

  • Hierarchical orthology classification across the tree of life
  • Integration with functional information from GO, InterPro, and COG
  • Full amino acid and coding sequence data
  • Gene and protein cross-references to major databases
  • Orthologous group hierarchies based on phylogenetic relationships

The database is widely used in comparative genomics, molecular evolution studies, functional annotation, and gene family evolution analysis. It serves as a foundation for the popular BUSCO tool (Benchmarking Universal Single-Copy Orthologs) for genome assembly and annotation assessment.

OrthoDB is maintained by the Zdobnov Lab at the University of Geneva and the Swiss Institute of Bioinformatics (SIB), with regular updates that incorporate new genome sequences and improve data quality. Access is provided through a web interface, a SPARQL endpoint for semantic web queries, a RESTful API for programmatic access, and bulk downloads in tab-separated and FASTA formats.
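
As an illustration of the SPARQL access route, the sketch below builds a GET request URL in the style of the SPARQL 1.1 Protocol, which carries the query text in a `query` parameter. The host sparql.orthodb.org comes from the product listing; the `/sparql` path, the helper names `build_sparql_url` and `run_query`, and the probe query are assumptions made for illustration. The probe deliberately avoids any OrthoDB-specific vocabulary, since the endpoint's schema is not described on this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the host is taken from the product listing; the "/sparql"
# path is hypothetical and should be checked against the endpoint's docs.
ENDPOINT = "https://sparql.orthodb.org/sparql"

# A vocabulary-agnostic probe: return any five triples.
PROBE = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_sparql_url(query, endpoint=ENDPOINT):
    """Return a GET URL with the query URL-encoded in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and parse a SPARQL JSON results document.

    Requires network access; not called at import time.
    """
    request = Request(
        build_sparql_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

If the endpoint path assumption holds, `run_query(PROBE)` would return a dictionary whose `results` and `bindings` entries hold the matched triples. Building the URL separately from sending it keeps the encoding step easy to inspect without touching the network.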


Created: May 07, 2025 | Last modified: December 07, 2025