kg_bacdive.transform_utils.traits package
Submodules
kg_bacdive.transform_utils.traits.traits module
Transform the traits data from NCBI and GTDB.
- class kg_bacdive.transform_utils.traits.traits.TraitsTransform(input_dir, output_dir, nlp=True)
Bases: Transform
Ingest traits dataset (NCBI/GTDB).
Essentially, this class ingests and transforms the file https://github.com/bacteria-archaea-traits/bacteria-archaea-traits/blob/master/output/condensed_traits_NCBI.csv and extracts the following columns:
tax_id
org_name
metabolism
pathways
shape
carbon_substrates
cell_shape
isolation_source
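The column selection described above can be illustrated with a minimal sketch. This is not the package's implementation: it uses only the standard-library `csv` module, the helper name `extract_columns` is hypothetical, and the sample row is made up purely for illustration. The column names are copied from the list above; whether the real CSV contains every one of them is not verified here.

```python
import csv
import io

# Column names copied from the list above (not checked against the real file).
TRAIT_COLUMNS = [
    "tax_id", "org_name", "metabolism", "pathways",
    "shape", "carbon_substrates", "cell_shape", "isolation_source",
]


def extract_columns(handle, columns=TRAIT_COLUMNS):
    """Yield one dict per CSV row, restricted to the requested columns.

    Hypothetical helper for illustration only.
    """
    reader = csv.DictReader(handle)
    for row in reader:
        # Columns absent from the file become None instead of raising KeyError.
        yield {name: row.get(name) for name in columns}


# Tiny made-up sample standing in for the real traits file.
sample = io.StringIO(
    "tax_id,org_name,metabolism,extra\n"
    "1,Example organism,aerobic,ignored\n"
)
rows = list(extract_columns(sample))
```

Columns not in the requested list (here `extra`) are dropped, and requested columns missing from the input are filled with `None`.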
- Also implements:
OAK to run NLP via the ‘ner_utils’ module and
ROBOT via the ‘robot_utils’ module.
- run(data_file=None)
Run the transform: perform the needed transformations on the trait data (NCBI/GTDB).
- Parameters:
data_file (Union[Path, None, str]) – Input file name.
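Because `data_file` accepts a `Path`, a `str`, or `None`, a caller-side helper may need to normalize it. The sketch below shows one way that could look. It rests on assumptions: the helper `resolve_data_file` and its `default_name` parameter are hypothetical, and whether `run` actually falls back to a default file under `input_dir` when `data_file` is `None` is not stated in this documentation.

```python
from pathlib import Path
from typing import Union


def resolve_data_file(
    data_file: Union[Path, None, str],
    input_dir: Union[Path, str],
    default_name: str,
) -> Path:
    """Return a Path for the input file (hypothetical helper).

    Assumption: None means "use default_name inside input_dir".
    """
    if data_file is None:
        return Path(input_dir) / default_name
    # Path() accepts both str and Path, so both cases are handled here.
    return Path(data_file)
```

For example, `resolve_data_file(None, "data/raw", "condensed_traits_NCBI.csv")` yields a path inside `data/raw`, while an explicit `str` or `Path` argument is returned as a `Path` unchanged.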
Module contents
Traits transform.
- class kg_bacdive.transform_utils.traits.TraitsTransform(input_dir, output_dir, nlp=True)
Bases: Transform
Ingest traits dataset (NCBI/GTDB).
Essentially, this class ingests and transforms the file https://github.com/bacteria-archaea-traits/bacteria-archaea-traits/blob/master/output/condensed_traits_NCBI.csv and extracts the following columns:
tax_id
org_name
metabolism
pathways
shape
carbon_substrates
cell_shape
isolation_source
- Also implements:
OAK to run NLP via the ‘ner_utils’ module and
ROBOT via the ‘robot_utils’ module.
- run(data_file=None)
Run the transform: perform the needed transformations on the trait data (NCBI/GTDB).
- Parameters:
data_file (Union[Path, None, str]) – Input file name.