homologene

is a Data Source.

HomoloGene was NCBI's database of homologs (genes with common ancestry) from completely sequenced eukaryotic genomes. The database organized genes into homology groups representing putative orthologs and paralogs across multiple species. HomoloGene was retired in January 2024 and replaced by the NCBI Orthologs dataset, which is now accessible through NCBI Datasets and the Gene database. The last HomoloGene build (build 68) was produced in 2014 and is no longer aligned with current data in NCBI RefSeq and Gene. Historical data from build 68 remains available on FTP for archival purposes only.

Domains

biological systems, genomics

License

Warning: No license entered

Homepage

homologene

Repository

Unknown

Infores ID

infores:homologene

FAIRsharing ID

Unknown

Product Summary

Products

From this Resource
ID | Name | URL | Category | Format | Description
homologene.data | HomoloGene Data File (Build 68 - Archive) | homologene.data (13.2 MB) | Product | tsv | Tab-delimited file containing HomoloG...
homologene.xml | HomoloGene XML Data (Build 68 - Archive) | homologene.xml.gz (167.7 MB) | Product | xml | XML dump of the HomoloGene build 68 c...
homologene.ftp_archive | HomoloGene FTP Archive | HomoloGene Documentation | Product | http | Complete FTP archive of all HomoloGen...

Details

HomoloGene

Overview

HomoloGene was NCBI’s system for automatically detecting homologs (genes with common ancestry) among the annotated genes of completely sequenced eukaryotic genomes. The database organized genes into homology groups representing putative orthologs (genes in different species that evolved from a common ancestral gene) and paralogs (genes related by duplication within a genome).

Retirement Notice

HomoloGene was officially retired in January 2024. The website has been redirected to the NCBI Orthologs site, which provides updated ortholog information through NCBI Datasets and the Gene database. The last HomoloGene build (build 68) was produced in 2014 and is significantly outdated compared to current genomic annotations.

Replacement Resource

Users should now use the NCBI Orthologs dataset, which provides:

  • Updated ortholog annotations aligned with current RefSeq and Gene data
  • Access through NCBI Datasets API and web interface
  • Integration with NCBI Gene pages
  • More comprehensive species coverage

Historical Data Content

HomoloGene build 68 (2014) contained:

  • Homology groups for 20 eukaryotic species
  • Putative orthologs and paralogs
  • Protein alignments and similarity scores
  • Gene and protein cross-references
  • Distance analysis results

Data Files (Archive Only)

Historical HomoloGene data files from build 68 include:

  • homologene.data: Tab-delimited file with HID, taxonomy ID, gene ID, gene symbol, protein GI, and protein accession
  • homologene.xml.gz: Gzipped XML file with complete homology group information
  • build_inputs/: Supporting files including protein sequences and taxonomic information
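For anyone reproducing an older analysis from the archived file, the following is a minimal Python sketch of reading homologene.data and collecting records by homology group. It assumes the six fields appear in the order listed above, with no header row. The names HomologeneRecord, parse_homologene, and group_by_hid are hypothetical, and the sample rows are made up for illustration rather than taken from build 68.

```python
import io
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple


class HomologeneRecord(NamedTuple):
    """One row of homologene.data (field order is an assumption)."""
    hid: int                 # homology group ID
    taxid: int               # NCBI taxonomy ID
    gene_id: int             # NCBI Gene ID
    symbol: str              # gene symbol
    protein_gi: str          # protein GI, kept as text
    protein_accession: str   # protein accession


def parse_homologene(lines: Iterable[str]) -> Iterator[HomologeneRecord]:
    """Yield one record per non-empty tab-delimited line."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Raises ValueError if a line has fewer than six fields.
        hid, taxid, gene_id, symbol, gi, acc = line.split("\t")[:6]
        yield HomologeneRecord(int(hid), int(taxid), int(gene_id), symbol, gi, acc)


def group_by_hid(records: Iterable[HomologeneRecord]) -> Dict[int, List[HomologeneRecord]]:
    """Collect records into homology groups keyed by HID."""
    groups: Dict[int, List[HomologeneRecord]] = defaultdict(list)
    for rec in records:
        groups[rec.hid].append(rec)
    return dict(groups)


# Made-up rows for illustration only; not real build 68 content.
SAMPLE = (
    "1\t9606\t1001\tGENEA\t111\tNP_000001.1\n"
    "1\t10090\t2001\tGenea\t222\tNP_000002.1\n"
    "2\t9606\t1002\tGENEB\t333\tNP_000003.1\n"
)

groups = group_by_hid(parse_homologene(io.StringIO(SAMPLE)))
for hid, members in sorted(groups.items()):
    print(hid, [(m.taxid, m.symbol) for m in members])
```

To run this against the real archive, pass an open file handle instead of the io.StringIO sample, for example open("homologene.data", encoding="utf-8").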

Information Resource ID

This resource has the Information Resource identifier: infores:homologene

Important Notes

  • This resource is RETIRED and should not be used for current research
  • Historical data is retained for archival and reproducibility purposes
  • Data is from 2014 and does not reflect current gene annotations
  • For current ortholog information, use NCBI Orthologs via NCBI Datasets or Gene

Created: November 04, 2025 | Last modified: November 04, 2025