PFOCR is a Data Source.
Pathway Figure OCR (PFOCR) is a resource that extracts biological pathway information from figures in scientific publications using optical character recognition (OCR) and machine learning. PFOCR automatically identifies pathway diagrams in published literature, extracts gene and protein names from pathway figures, and creates structured pathway data. The resource enables discovery of pathway knowledge that exists only in figure format and is not captured in article text or structured databases.
infores:pfocr
Unknown
| ID | Name | URL | Category | Format | Description |
|---|---|---|---|---|---|
| pfocr.web | PFOCR Web Interface | pfocr.wikipathways.org | GraphicalInterface | http | Web interface for searching and brows... |
| pfocr.database_repository | PFOCR Database Repository | pfocr-database | Product | http | GitHub repository containing the Jeky... |
| pfocr.search_json | PFOCR Search JSON | search.json (53.3 MB) | Product | json | Search metadata JSON used by the PFOC... |
| pfocr.figure_info_json | PFOCR Figure Information JSON | getFigureInfo.json (56.4 MB) | Product | json | JSON file containing all PFOCR figure... |
| pfocr.gmt | PFOCR GMT Gene Sets | current | Product | txt | Current GMT release of PFOCR pathway ... |
| pfocr.api | PFOCR API | help.html#download | ProgrammingInterface | http | JSON endpoints and help documentation... |
| ID | Name | URL | Category | Format | Relation | Description |
|---|---|---|---|---|---|---|
| harmonizome.downloads | Harmonizome Downloads | download | Product | mixed | was derived from | Harmonizome 3.0 processed dataset dow... |
| harmonizome.kg-neo4j | Harmonizome Knowledge Graph Neo4j Database | harmonizome-kg.maayanlab.cloud | GraphProduct | neo4j | was derived from | Neo4j knowledge graph serialization o... |
PFOCR addresses the challenge that much pathway knowledge exists only in published figures and is not captured in article abstracts or structured databases. By automatically processing pathway diagrams, PFOCR makes this “hidden” knowledge discoverable and machine-readable.
Search and browse pathway information extracted from literature figures with visualization of source figures and extracted data.
Structured pathway data extracted from literature figures, including gene/protein interactions and pathway relationships in machine-readable formats.
Programmatic access to PFOCR data for integration with pathway analysis tools and knowledge graphs.
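As an illustration of how the GMT gene sets listed among the products might be consumed by a pathway analysis tool, the following is a minimal sketch of a GMT parser. It relies only on the general GMT convention (one gene set per line, tab-separated: set name, description, then gene identifiers). The function name `parse_gmt`, the sample rows, and the identifiers in them are hypothetical and are not taken from an actual PFOCR release; the identifier type used in the real files should be checked against the PFOCR documentation.

```python
from typing import Dict, List, Tuple


def parse_gmt(text: str) -> Dict[str, Tuple[str, List[str]]]:
    """Parse GMT-formatted text into {set_name: (description, [genes])}.

    Assumes the generic GMT layout: each line is tab-separated, with the
    set name first, a description second, and gene identifiers after that.
    """
    gene_sets: Dict[str, Tuple[str, List[str]]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip lines with no description column
        name, description, *genes = fields
        # Drop empty strings left by trailing tabs
        gene_sets[name] = (description, [g for g in genes if g])
    return gene_sets


# Hypothetical sample rows, made up for illustration only
SAMPLE_GMT = (
    "EXAMPLE_FIGURE_1\tExample pathway figure\t7157\t4193\t1026\n"
    "EXAMPLE_FIGURE_2\tAnother example figure\t207\t5728\n"
)

sets = parse_gmt(SAMPLE_GMT)
for name, (description, genes) in sets.items():
    print(name, description, len(genes))
```

To use this with a real release, read the downloaded GMT file as text (for example with `pathlib.Path(...).read_text()`) and pass the result to `parse_gmt`.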
This resource has the Information Resource identifier: infores:pfocr
Database repository: https://github.com/wikipathways/pfocr-database
Created: November 05, 2025 | Last modified: June 02, 2026