kg_microbe.transform_utils.traits package

Submodules

kg_microbe.transform_utils.traits.traits module

Transform the traits data from NCBI and GTDB.

class kg_microbe.transform_utils.traits.traits.TraitsTransform(input_dir: str, output_dir: str, nlp=True)

Bases: Transform

Ingest traits dataset (NCBI/GTDB).

Ingests and transforms the file https://github.com/bacteria-archaea-traits/bacteria-archaea-traits/blob/master/output/condensed_traits_NCBI.csv and extracts the following columns:

  • tax_id

  • org_name

  • metabolism

  • pathways

  • shape

  • carbon_substrates

  • cell_shape

  • isolation_source

Also implements:
  • OGER, to run NLP via the ‘nlp_utils’ module, and

  • ROBOT, via the ‘robot_utils’ module.
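
The column extraction described above can be illustrated with a minimal, standard-library sketch. It is not the code TraitsTransform actually runs: the helper name extract_trait_columns, the constant TRAIT_COLUMNS, and the sample rows are assumptions made for illustration, and only the column names come from the list above.

```python
import csv
import io

# Column names taken from the list in the class description above.
TRAIT_COLUMNS = [
    "tax_id",
    "org_name",
    "metabolism",
    "pathways",
    "shape",
    "carbon_substrates",
    "cell_shape",
    "isolation_source",
]


def extract_trait_columns(csv_text, columns=TRAIT_COLUMNS):
    """Yield one dict per CSV row, restricted to the requested columns.

    Hypothetical helper, not part of kg_microbe.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Columns missing from the file are reported as None.
        yield {col: row.get(col) for col in columns}


# Made-up sample input for illustration; not taken from the real dataset.
SAMPLE = (
    "tax_id,org_name,metabolism,gram_stain\n"
    "562,Escherichia coli,facultative,negative\n"
)

rows = list(extract_trait_columns(SAMPLE))
```

Columns outside the list (here the sample's gram_stain) are dropped, and listed columns absent from the input appear with the value None.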

run(data_file: Optional[str] = None)

Call method and perform needed transformations for trait data (NCBI/GTDB).

Parameters

data_file – Input file name.
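
A hedged usage sketch, based only on the signatures shown above. The wrapper run_traits_transform is a hypothetical convenience function and not part of kg_microbe; it assumes kg_microbe is installed and that the directories passed in exist. The import is deferred so the wrapper can be defined without the package present.

```python
from typing import Optional


def run_traits_transform(
    input_dir: str,
    output_dir: str,
    data_file: Optional[str] = None,
    nlp: bool = False,
):
    """Construct TraitsTransform and call run() (hypothetical wrapper)."""
    # Deferred import: only needed when the wrapper is actually called.
    from kg_microbe.transform_utils.traits.traits import TraitsTransform

    transform = TraitsTransform(input_dir=input_dir, output_dir=output_dir, nlp=nlp)
    # data_file=None mirrors the documented default of run().
    transform.run(data_file=data_file)
    return transform
```

The class defaults to nlp=True; the wrapper defaults to False here only so that a first run can skip the NLP step if that is desired. That choice is an assumption of this sketch.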

Module contents

Initialize the traits transform.

class kg_microbe.transform_utils.traits.TraitsTransform(input_dir: str, output_dir: str, nlp=True)

Bases: Transform

Ingest traits dataset (NCBI/GTDB).

Ingests and transforms the file https://github.com/bacteria-archaea-traits/bacteria-archaea-traits/blob/master/output/condensed_traits_NCBI.csv and extracts the following columns:

  • tax_id

  • org_name

  • metabolism

  • pathways

  • shape

  • carbon_substrates

  • cell_shape

  • isolation_source

Also implements:
  • OGER, to run NLP via the ‘nlp_utils’ module, and

  • ROBOT, via the ‘robot_utils’ module.

run(data_file: Optional[str] = None)

Call method and perform needed transformations for trait data (NCBI/GTDB).

Parameters

data_file – Input file name.