kg_covid_19.transform_utils.pharmgkb package
Submodules
kg_covid_19.transform_utils.pharmgkb.pharmgkb module
-
exception kg_covid_19.transform_utils.pharmgkb.pharmgkb.CantFindPharmGKBKey
Bases: Exception
-
class kg_covid_19.transform_utils.pharmgkb.pharmgkb.PharmGKB(input_dir: str = None, output_dir: str = None)
Bases: kg_covid_19.transform_utils.transform.Transform
-
get_uniprot_id(this_id: str, pharmgkb_prefix: str = 'PHARMGKB')
-
make_id_mapping_file(map_file: str, sep: str = '\t', pharmgkb_id_col: str = 'PharmGKB Accession Id', id_key: str = 'Cross-references', id_sep: str = ',', id_key_val_sep: str = ':') → dict
Parse gene or drug ID mappings for PharmGKB IDs. This works for both the genes.tsv and drugs.tsv files.
- Parameters
map_file – genes.tsv or drugs.tsv file containing the mappings
pharmgkb_id_col – column containing the PharmGKB ID, to be used as the key for the map [PharmGKB Accession Id]
sep – separator between columns [\t]
id_key – name of the column that contains the cross-referenced IDs [Cross-references]
id_sep – separator between ID key:val pairs [,]
id_key_val_sep – separator between the key and val of each pair [:]
- Returns
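The parameters above describe a two-level split: columns are separated by sep, and the cross-reference column holds key:val pairs separated by id_sep. The following is a minimal sketch of that idea, not the package's implementation. The function name sketch_id_mapping, the shape of the returned dict, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def sketch_id_mapping(handle, sep="\t",
                      pharmgkb_id_col="PharmGKB Accession Id",
                      id_key="Cross-references",
                      id_sep=",", id_key_val_sep=":"):
    """Hypothetical stand-in: map each PharmGKB ID to {source: xref id}."""
    id_map = defaultdict(dict)
    reader = csv.DictReader(handle, delimiter=sep)
    for row in reader:
        # Split the cross-reference cell into individual key:val pairs.
        for pair in (row.get(id_key) or "").split(id_sep):
            pair = pair.strip().strip('"')
            if id_key_val_sep not in pair:
                continue  # skip empty or malformed entries
            # Split only on the first separator so the value may contain it.
            source, xref = pair.split(id_key_val_sep, 1)
            id_map[row[pharmgkb_id_col]][source] = xref
    return dict(id_map)


# Made-up rows in the assumed layout; these are not real PharmGKB records.
sample = io.StringIO(
    "PharmGKB Accession Id\tName\tCross-references\n"
    "PA0001\tGENE1\tSourceA:111,SourceB:X222\n"
    "PA0002\tGENE2\t\n"
)
mapping = sketch_id_mapping(sample)
```

Rows with an empty cross-reference cell produce no entry in this sketch; whether the real method behaves the same way is not stated in the documentation above.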
-
make_pharmgkb_chemical_node(fh: TextIO, chem_id: str, name: str, biolink_type: str) → None
Write out a node for a chemical.
- Parameters
fh – file handle to write the chemical node to
chem_id – chemical ID
name – chemical name
biolink_type – Biolink type for Chemical
- Returns
None
-
make_pharmgkb_edge(fh: TextIO, line_data: dict) → None
-
make_pharmgkb_gene_node(fh: TextIO, this_id: str, name: str, biolink_type: str) → None
Write out a node for a gene.
- Parameters
fh – file handle to write the gene node to
this_id – PharmGKB gene ID
name – gene name
biolink_type – Biolink type for Gene (from make_gene_id_mapping_file())
- Returns
None
-
make_preferred_drug_id(pharmgkb_id: str, drug_id_map: dict, preferred_ids: dict = {'CHEMBL': 'CHEMBL', 'ChEBI:CHEBI': 'CHEBI', 'DrugBank': 'DRUGBANK', 'PubChem Compound:': 'PUBCHEM'}, pharmgkb_prefix: str = 'PHARMGKB') → str
Given a PharmGKB drug ID, convert it to a cross-referenced ID, in this order of preference:
CHEBI > CHEMBL > DRUGBANK > PUBCHEM
- Parameters
pharmgkb_id – PharmGKB drug ID to convert
drug_id_map – map of PharmGKB IDs to cross-referenced IDs
preferred_ids – dict of preferred IDs in descending order of preference, mapping ‘their string’ -> ‘canonical CURIE prefix’
pharmgkb_prefix – prefix to prepend to the PharmGKB ID (‘PHARMGKB’)
- Returns
preferred_id – preferred cross-referenced ID
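The preference order can be illustrated by walking an ordered dict and returning the first cross-reference that is present. This is a sketch under several assumptions, not the method's actual code: the function name, the simplified source keys (which differ from the documented defaults), the nested shape of drug_id_map, and the fallback to a prefixed PharmGKB ID are all hypothetical.

```python
def sketch_preferred_drug_id(pharmgkb_id, drug_id_map, preferred_ids,
                             pharmgkb_prefix="PHARMGKB"):
    """Hypothetical stand-in: pick the first available cross-reference."""
    xrefs = drug_id_map.get(pharmgkb_id, {})
    # Python dicts preserve insertion order, so iteration follows preference.
    for their_key, curie_prefix in preferred_ids.items():
        if their_key in xrefs:
            return f"{curie_prefix}:{xrefs[their_key]}"
    # Assumed fallback: keep the PharmGKB ID, prefixed as a CURIE.
    return f"{pharmgkb_prefix}:{pharmgkb_id}"


# Illustrative keys only, listed in the documented order of preference.
preferred = {
    "chebi": "CHEBI",
    "chembl": "CHEMBL",
    "drugbank": "DRUGBANK",
    "pubchem": "PUBCHEM",
}
# Made-up identifiers, not real database records.
drug_map = {
    "PA1": {"drugbank": "DB0001", "chembl": "C0001"},
    "PA2": {},
}

best_pa1 = sketch_preferred_drug_id("PA1", drug_map, preferred)
best_pa2 = sketch_preferred_drug_id("PA2", drug_map, preferred)
```

Here PA1 has no CHEBI entry, so the sketch selects the CHEMBL cross-reference ahead of the DRUGBANK one; PA2 has no cross-references and falls back to the prefixed PharmGKB ID.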
-
parse_pharmgkb_line(this_line: str, header_items) → dict
Parse a single line from relationships.tsv and return a dict with its data.
- Parameters
this_line – line from relationships.tsv to parse
header_items – header from relationships.tsv
- Returns
dict of key-value pairs containing the parsed data
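One common way to turn a delimited line plus its header into a dict is to zip the two together. The sketch below assumes a tab separator, assumes the dict is keyed by header item, and uses illustrative column names; the function name and the length check are also hypothetical rather than taken from the package.

```python
def sketch_parse_line(this_line, header_items, sep="\t"):
    """Hypothetical stand-in: pair each header item with its value."""
    items = this_line.rstrip("\n").split(sep)
    if len(items) != len(header_items):
        # Assumed behavior: refuse lines that do not match the header.
        raise ValueError(
            f"expected {len(header_items)} fields, got {len(items)}"
        )
    return dict(zip(header_items, items))


# Illustrative header and line; the real file's columns may differ.
header = ["Entity1_id", "Entity2_id", "Association"]
line_data = sketch_parse_line("PA1\tPA2\tassociated\n", header)
```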
-
run(data_file: Optional[str] = None)
-
-
exception kg_covid_19.transform_utils.pharmgkb.pharmgkb.PharmGKBFileError
Bases: Exception
-
exception kg_covid_19.transform_utils.pharmgkb.pharmgkb.PharmGKBInvalidEdge
Bases: Exception
-
exception kg_covid_19.transform_utils.pharmgkb.pharmgkb.PharmKGBInvalidNodeType
Bases: Exception
Module contents
-
class kg_covid_19.transform_utils.pharmgkb.PharmGKB(input_dir: str = None, output_dir: str = None)
Bases: kg_covid_19.transform_utils.transform.Transform
-
get_uniprot_id(this_id: str, pharmgkb_prefix: str = 'PHARMGKB')
-
make_id_mapping_file(map_file: str, sep: str = '\t', pharmgkb_id_col: str = 'PharmGKB Accession Id', id_key: str = 'Cross-references', id_sep: str = ',', id_key_val_sep: str = ':') → dict
Parse gene or drug ID mappings for PharmGKB IDs. This works for both the genes.tsv and drugs.tsv files.
- Parameters
map_file – genes.tsv or drugs.tsv file containing the mappings
pharmgkb_id_col – column containing the PharmGKB ID, to be used as the key for the map [PharmGKB Accession Id]
sep – separator between columns [\t]
id_key – name of the column that contains the cross-referenced IDs [Cross-references]
id_sep – separator between ID key:val pairs [,]
id_key_val_sep – separator between the key and val of each pair [:]
- Returns
-
make_pharmgkb_chemical_node(fh: TextIO, chem_id: str, name: str, biolink_type: str) → None
Write out a node for a chemical.
- Parameters
fh – file handle to write the chemical node to
chem_id – chemical ID
name – chemical name
biolink_type – Biolink type for Chemical
- Returns
None
-
make_pharmgkb_edge(fh: TextIO, line_data: dict) → None
-
make_pharmgkb_gene_node(fh: TextIO, this_id: str, name: str, biolink_type: str) → None
Write out a node for a gene.
- Parameters
fh – file handle to write the gene node to
this_id – PharmGKB gene ID
name – gene name
biolink_type – Biolink type for Gene (from make_gene_id_mapping_file())
- Returns
None
-
make_preferred_drug_id(pharmgkb_id: str, drug_id_map: dict, preferred_ids: dict = {'CHEMBL': 'CHEMBL', 'ChEBI:CHEBI': 'CHEBI', 'DrugBank': 'DRUGBANK', 'PubChem Compound:': 'PUBCHEM'}, pharmgkb_prefix: str = 'PHARMGKB') → str
Given a PharmGKB drug ID, convert it to a cross-referenced ID, in this order of preference:
CHEBI > CHEMBL > DRUGBANK > PUBCHEM
- Parameters
pharmgkb_id – PharmGKB drug ID to convert
drug_id_map – map of PharmGKB IDs to cross-referenced IDs
preferred_ids – dict of preferred IDs in descending order of preference, mapping ‘their string’ -> ‘canonical CURIE prefix’
pharmgkb_prefix – prefix to prepend to the PharmGKB ID (‘PHARMGKB’)
- Returns
preferred_id – preferred cross-referenced ID
-
parse_pharmgkb_line(this_line: str, header_items) → dict
Parse a single line from relationships.tsv and return a dict with its data.
- Parameters
this_line – line from relationships.tsv to parse
header_items – header from relationships.tsv
- Returns
dict of key-value pairs containing the parsed data
-
run(data_file: Optional[str] = None)
-