interpro

InterPro is a Data Source.

InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.

Domains

biological systems, proteomics, genomics, drug discovery

License

CC0-1.0

Homepage

interpro

Repository

www.ebi.ac.uk

Infores ID

infores:interpro

FAIRsharing ID

Unknown

Product Summary

Products

From this Resource
ID | Name | URL | Category | Format | Description
interpro.web | InterPro Web Interface | interpro | GraphicalInterface | http | Web interface for browsing and search...
interpro.api | InterPro API | api | ProgrammingInterface | | RESTful API for programmatic access t...
interpro.interproscan | InterProScan | interproscan-5.74-105.0-64-bit.tar.gz (6.7 GB) | ProcessProduct | java | Software package for scanning protein...
interpro.entry_list | InterPro Entry List | entry.list (2.5 MB) | DataModelProduct | tsv | Complete list of InterPro entries wit...
interpro.xml | InterPro XML | interpro.xml.gz (38.6 MB) | DataModelProduct | xml | Complete InterPro database in XML for...
interpro.match_complete | InterPro Match Complete | match_complete.xml.gz (83.1 GB) | MappingProduct | xml | Complete set of matches between prote...
interpro.uniparc_match | UniParc Match | uniparc_match.tar.gz (308.5 GB) | MappingProduct | xml | InterPro matches for UniParc protein ...
interpro.protein2ipr | Protein to InterPro Mappings | protein2ipr.dat.gz (19.5 GB) | MappingProduct | tsv | Mappings of protein sequences to Inte...
interpro.parent_child_tree | Parent-Child Tree | ParentChildTreeFile.txt (612.9 KB) | DataModelProduct | tsv | Hierarchical relationships between In...
interpro.interpro2go | InterPro to GO Mappings | interpro2go (2.9 MB) | MappingProduct | tsv | Mappings between InterPro entries and...
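
As a minimal sketch of working with the entry list product, the following groups InterPro accessions by entry type. It assumes that entry.list is tab-separated with a header row naming the columns ENTRY_AC, ENTRY_TYPE and ENTRY_NAME; that layout is an assumption, not something stated in the table above, and the sample rows are illustrative only.

```python
import csv
import io
from collections import defaultdict

# Illustrative sample; the real entry.list is ASSUMED to share this layout
# (tab-separated, header row: ENTRY_AC, ENTRY_TYPE, ENTRY_NAME).
SAMPLE = (
    "ENTRY_AC\tENTRY_TYPE\tENTRY_NAME\n"
    "IPR000001\tDomain\tKringle\n"
    "IPR000003\tFamily\tRetinoid X receptor/HR-2C\n"
    "IPR000008\tDomain\tC2 domain\n"
)


def group_entries_by_type(handle):
    """Map each entry type to the list of accessions of that type."""
    # QUOTE_NONE keeps quote characters in entry names as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    grouped = defaultdict(list)
    for row in reader:
        grouped[row["ENTRY_TYPE"]].append(row["ENTRY_AC"])
    return dict(grouped)


if __name__ == "__main__":
    print(group_entries_by_type(io.StringIO(SAMPLE)))
```

To run this against a downloaded file, pass an open text handle (for example `open("entry.list", encoding="utf-8")`) in place of the in-memory sample.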
From other Resources
ID | Name | URL | Category | Format | Description
spoke.graph | SPOKE Graph | | GraphProduct | | The SPOKE knowledge graph containing ...
bioteque.embeddings | Bioteque Embeddings | embeddings | Product | | Network embeddings of the Bioteque gr...
mechreponet.kg | MechRepoNet Knowledge Graph | publication | | | The MechRepoNet knowledge graph in it...
drugmechdb.graph | DrugMechDB Graph Dataset | zenodo.8139357 | GraphProduct | mixed | Curated mechanistic drug–disease path...
obo-db-ingest.interpro.obo | interpro OBO | interpro.obo (1.1 MB) | Product | obo | interpro OBO
obo-db-ingest.interpro.owl | interpro OWL | interpro.owl (1.3 MB) | Product | owl | interpro OWL
obo-db-ingest.interpro.json | interpro OBO Graph JSON | interpro.json (1.1 MB) | Product | json | interpro OBO Graph JSON
goa.mapping-files | GO Mapping Files | external2go | MappingProduct | txt | Files containing transitive assignmen...
kinace.portal | KinAce Web Portal | kinace.kinametrix.com | GraphicalInterface | http | Interactive web interface for explori...
string.protein.links | STRING Protein Links | protein.links.v12.0.txt.gz (128.7 GB) | GraphProduct | txt | protein network data (full network, s...
string.protein.links.detailed | STRING Protein Links Detailed | protein.links.detailed.v12.0.txt.gz (189.6 GB) | GraphProduct | txt | protein network data (full network, i...
string.protein.links.full | STRING Protein Links Full | protein.links.full.v12.0.txt.gz (199.6 GB) | GraphProduct | txt | protein network data (full network, i...
string.protein.physical.links | STRING Protein Physical Links | protein.physical.links.v12.0.txt.gz (11.1 GB) | GraphProduct | txt | protein network data (physical subnet...
string.protein.physical.links.detailed | STRING Protein Physical Links Detailed | protein.physical.links.detailed.v12.0.txt.gz (13.8 GB) | GraphProduct | txt | protein network data (physical subnet...
string.protein.physical.links.full | STRING Protein Physical Links Full | protein.physical.links.full.v12.0.txt.gz (14.5 GB) | GraphProduct | txt | protein network data (physical subnet...
string.cog.links | STRING COG Links | COG.links.v12.0.txt.gz (176.8 MB) | GraphProduct | txt | association scores between orthologou...
string.cog.links.detailed | STRING COG Links Detailed | COG.links.detailed.v12.0.txt.gz (238.7 MB) | GraphProduct | txt | association scores (incl. subscores p...
string.database | STRING Database Network Schema | network_schema.v12.0.sql.gz (262.2 GB) | GraphProduct | | full database, part II: the networks ...
obo-db-ingest.interpro.tsv | interpro Nodes TSV | interpro.tsv (681.5 KB) | Product | tsv | interpro Nodes TSV

Details


Overview

InterPro is a comprehensive database hosted at the European Bioinformatics Institute (EMBL-EBI) that provides functional analysis of proteins by classifying them into families and predicting the presence of domains and important sites. It integrates predictive protein signatures from multiple partner databases into a unified resource.

Database Components

InterPro integrates signatures from several member databases, including:

  • Pfam: Protein families represented by multiple sequence alignments and hidden Markov models
  • PROSITE: Patterns and profiles for protein families and domains
  • SMART: Identification and annotation of genetically mobile domains
  • PRINTS: Fingerprints for protein sequence classification
  • PANTHER: Protein families classified by function
  • CDD (Conserved Domain Database): Ancient conserved protein domains
  • PIRSF: Hierarchical classification of complete proteins
  • SUPERFAMILY: Structural and functional annotation based on SCOP superfamilies
  • CATH-Gene3D: Protein domain assignments for genomes
  • TIGRFAMs: Protein families based on hidden Markov models
  • HAMAP: High-quality automated annotations of microbial proteins

Features and Applications

InterPro provides:

  • Comprehensive protein annotation
  • Hierarchical classification of proteins
  • Functional and structural insights
  • GO (Gene Ontology) term mappings
  • Pathway associations
  • Taxonomic distribution information
  • Cross-references to other biological databases
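
The GO term mappings are distributed as the interpro2go product. A minimal sketch of reading such a file follows. It assumes the external2go-style line layout `InterPro:<accession> <entry name> > GO:<term name> ; GO:<id>`, with comment lines starting with `!`. The function name and the sample lines are illustrative, and the layout should be checked against the downloaded file.

```python
import re

# ASSUMED line layout (external2go style):
#   InterPro:<accession> <entry name> > GO:<term name> ; GO:<id>
LINE_RE = re.compile(
    r"^InterPro:(?P<ipr>IPR\d+)\s+(?P<name>.*?)\s+>\s+"
    r"GO:(?P<term>.*?)\s+;\s+(?P<go>GO:\d+)\s*$"
)


def parse_interpro2go(lines):
    """Return {InterPro accession: [(GO id, GO term name), ...]}."""
    mappings = {}
    for line in lines:
        # Skip blank lines and '!' comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # line does not fit the assumed layout
        mappings.setdefault(match.group("ipr"), []).append(
            (match.group("go"), match.group("term"))
        )
    return mappings


if __name__ == "__main__":
    sample = [
        "!illustrative sample, not real release data\n",
        "InterPro:IPR000003 Retinoid X receptor/HR-2C > GO:DNA binding ; GO:0003677\n",
        "InterPro:IPR000003 Retinoid X receptor/HR-2C > GO:nucleus ; GO:0005634\n",
    ]
    print(parse_interpro2go(sample))
```

Lines that do not match the assumed pattern are skipped silently here; a stricter reader might collect them for inspection instead.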

Tools and Access

The primary tool for using InterPro signatures is InterProScan, which lets users scan their protein sequences against all of InterPro's signatures at once. InterPro data can be accessed via:

  • Web interface for interactive browsing and searching
  • Programmatic access via REST API
  • Downloadable datasets in various formats (XML, TSV, OBO, OWL, JSON)
  • InterProScan software package for local installation and high-throughput analysis
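
Programmatic access over the REST API might look like the sketch below. The base URL `https://www.ebi.ac.uk/interpro/api` and the `/entry/<source>/<accession>` path pattern are assumptions not given on this page, and the helper names `entry_url` and `fetch_entry` are hypothetical. Consult the InterPro API documentation before relying on them.

```python
import json
import urllib.request

# ASSUMED base URL and path pattern; verify against the InterPro API docs.
BASE_URL = "https://www.ebi.ac.uk/interpro/api"


def entry_url(accession, source="interpro"):
    """Build the (assumed) URL for one entry in a given source database."""
    return f"{BASE_URL}/entry/{source}/{accession}"


def fetch_entry(accession, source="interpro", timeout=30):
    """Fetch one entry as parsed JSON. Requires network access."""
    request = urllib.request.Request(
        entry_url(accession, source),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds the URL; call fetch_entry(...) to perform the request.
    print(entry_url("IPR000001"))
```

The URL construction is kept separate from the network call so that it can be checked offline. The structure of the returned JSON is not described here and should be inspected before any fields are used.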

InterPro is widely used in genome annotation projects, comparative genomics studies, structural biology research, and functional characterization of proteins across all taxonomic groups.


Created: March 09, 2025 | Last modified: February 18, 2026