vectology

Vectology is a general-purpose resource.

Vectology is a software platform and API for exploring relationships among biomedical variables using sentence embedding models derived from biomedical literature. It converts brief variable descriptions into vector representations, enabling similarity search, recommendation, and relational insight without manual ontology annotation.

Domains

biomedical, genomics, health, investigations

License

Warning: No license entered

Homepage

vectology

Repository

Unknown

Infores ID

Unknown

FAIRsharing ID

Unknown

Products

From this Resource
ID | Name | URL | Category | Format | Description
vectology.api | Vectology API | vectology-api.mrcieu.ac.uk | ProgrammingInterface | - | Public API providing access to senten...
vectology.docs | Vectology Project Page | vectology | DocumentationProduct | http | Project information page including de...
From other Resources
ID | Name | URL | Category | Format | Description
epigraphdb.graph | EpiGraphDB Graph Database | graph-database | GraphProduct | neo4j | Integrated graph knowledge base combi...

Details

Overview

Vectology provides a data-driven alternative to manual expert annotation of short biomedical variable descriptions. Using precomputed sentence embedding models trained on biomedical literature, it maps each variable description to a dense vector. Vector similarity operations enable identification of conceptually related variables, recommendation, and exploration of relationships between sets of variables.
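
As a rough illustration of the vector similarity operations mentioned above, the sketch below ranks a few variables by cosine similarity to a query vector. The three-dimensional vectors are made up for the example, and the function names are hypothetical; real embeddings would come from a sentence embedding model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors, purely illustrative (not produced by any real model).
toy_vectors = {
    "body mass index": [0.9, 0.1, 0.0],
    "waist circumference": [0.8, 0.3, 0.1],
    "years of schooling": [0.0, 0.2, 0.9],
}

ranking = rank_by_similarity(toy_vectors["body mass index"], toy_vectors)
for name, score in ranking:
    print(f"{score:.3f}  {name}")
```

With these toy values the query ranks itself first, "waist circumference" second, and "years of schooling" last, which is the kind of ordering a recommendation step would build on.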

Methodology

  1. Ingest variable text descriptions from biomedical datasets.
  2. Generate sentence embeddings using pretrained biomedical language models.
  3. Store vector representations to support similarity and distance queries.
  4. Provide API endpoints and a web UI for searching, comparing, and recommending related variables.
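
The four steps above can be sketched end to end in a few lines. This is a minimal sketch under loud assumptions: `toy_embed` is a normalised bag-of-words stand-in for a pretrained biomedical sentence embedding model (it only captures shared tokens, not meaning), and `VariableIndex`, `ingest`, and `search` are hypothetical names, not the Vectology API, whose endpoints are not documented on this page.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding (assumption): unit-length sparse bag-of-words vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def dot(u, v):
    """Dot product of two sparse vectors stored as token -> weight dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


class VariableIndex:
    """Hypothetical in-memory store of (description, vector) pairs."""

    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.items = []

    def ingest(self, descriptions):
        # Steps 1-3: take the text, embed it, keep the vector.
        for description in descriptions:
            self.items.append((description, self.embed(description)))

    def search(self, query, top_k=3):
        # Step 4: vectors are unit length, so the dot product is the cosine.
        q = self.embed(query)
        scored = [(d, dot(q, v)) for d, v in self.items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


index = VariableIndex()
index.ingest([
    "Systolic blood pressure, automated reading",
    "Diastolic blood pressure, automated reading",
    "Age completed full time education",
    "Body mass index (BMI)",
])
hits = index.search("systolic blood pressure", top_k=2)
for description, score in hits:
    print(f"{score:.3f}  {description}")
```

Swapping `toy_embed` for a call to a real sentence embedding model would leave the rest of the structure unchanged, which is why the embedding function is passed in as a parameter.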

Use Cases

  • Rapidly find conceptually similar phenotypic or exposure variables.
  • Recommend additional variables for inclusion in analyses.
  • Cluster or visualize variable sets in embedding space.
  • Support ontology mapping or curation triage by highlighting nearest neighbors.
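
For the clustering use case, one simple option (an assumption for illustration, not necessarily what Vectology does) is to link any two variables whose cosine similarity exceeds a threshold and report the connected groups, which amounts to single-linkage grouping. The vectors, the `group_by_threshold` name, and the 0.8 threshold below are all hypothetical.

```python
import math
from itertools import combinations


def normalise(vec):
    """Scale a vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def group_by_threshold(vectors, threshold=0.8):
    """Group names whose cosine similarity is >= threshold, transitively."""
    names = list(vectors)
    unit = {name: normalise(vectors[name]) for name in names}
    parent = {name: name for name in names}  # union-find forest

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        similarity = sum(x * y for x, y in zip(unit[a], unit[b]))
        if similarity >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Order groups by the position of their first member in the input.
    return sorted(groups.values(), key=lambda g: names.index(g[0]))


# Made-up vectors, purely illustrative.
toy_vectors = {
    "body mass index": [0.9, 0.1, 0.0],
    "waist circumference": [0.8, 0.3, 0.1],
    "years of schooling": [0.0, 0.2, 0.9],
    "age left education": [0.1, 0.1, 0.8],
}

groups = group_by_threshold(toy_vectors, threshold=0.8)
for group in groups:
    print(group)
```

On these toy values the two adiposity variables fall into one group and the two education variables into another. With real embeddings the threshold would need tuning, and a dedicated clustering library may be a better fit for large variable sets.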

Abstract (from methods paper)

Many biomedical data sets contain variables that are identified by simple, and often short, descriptions. Traditionally these would either be manually annotated and/or assigned to ontologies using expert knowledge, facilitating interactions with other data sets and gaining an understanding of where these variables lie in the biomedical knowledge space. An alternative approach is to utilise sentence embedding methods and convert these variables into vectors, calculated from precomputed models derived from biomedical literature. This provides a data-driven alternative to manual expert annotation, automatically harnessing the expert knowledge captured in the existing literature. These vectors, representing the biomedical space embodied by each specific piece of text, enable us to apply methods for exploring relationships between variables in vector space, notably comparing distances between vectors. From here, it is possible to recommend a set of variables as the most conceptually similar to a given piece of text or existing vector, whilst also gaining insight into how a group of variables are related. Vectology is made available via an API (http://vectology-api.mrcieu.ac.uk/) and basic usage can be explored via a web application (http://vectology.mrcieu.ac.uk).

Citation

Elsworth B, Liu Y, Gaunt TR. Vectology – exploring biomedical variable relationships using sentence embedding and vectors. MRC Integrative Epidemiology Unit, University of Bristol. (Manuscript PDF, DSRS Turing 2019 proceedings excerpt.)

Contact

Feedback or issues can be submitted via the project page or by emailing the listed maintainers (ben.elsworth@bristol.ac.uk, yi6240.liu@bristol.ac.uk, Tom.Gaunt@bristol.ac.uk).

Created: September 03, 2025 | Last modified: September 03, 2025