Known Cancer Gene Annotation
knownCancer
Short description
Annotates mutations with cancer gene information from COSMIC and, optionally, OncoKB, so that mutations in known cancer genes can be identified.
Signature
def knownCancer(self, annotation_table, output_path=None, compress_output=True, join_column="Hugo_Symbol", oncokb_table=None, in_place=False):
Parameters
Parameter | Type | Required | Description |
---|---|---|---|
annotation_table | str or Path | Yes | Path to the COSMIC annotation table (.tsv or .tsv.gz format). |
output_path | str or Path | No | Output file path. If not provided, saves with default naming convention. |
compress_output | bool | No | Whether to compress the output file with gzip (default: True). |
join_column | str | No | Column name to use for joining (default: "Hugo_Symbol"). |
oncokb_table | str or Path | No | Path to the OncoKB cancer gene list table (.tsv). Adds OncoKB annotations if provided. |
in_place | bool | No | If True, replaces self.data with annotated data. If False, returns annotated DataFrame (default: False). |
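The role of join_column can be pictured as a left join between the mutation rows and the annotation table. The following is a minimal standard-library sketch of that idea, not the library's implementation: the helpers read_tsv and annotate_rows, the COSMIC_ROLE column, and the sample rows are all hypothetical and exist only for illustration.

```python
import csv
import gzip
import io


def read_tsv(raw):
    """Parse TSV bytes into a list of dicts, accepting plain or gzip input."""
    if raw[:2] == b"\x1f\x8b":  # gzip magic number, as in a .tsv.gz file
        raw = gzip.decompress(raw)
    return list(csv.DictReader(io.StringIO(raw.decode("utf-8")), delimiter="\t"))


def annotate_rows(mutations, annotations, join_column="Hugo_Symbol"):
    """Left-join annotation columns onto mutation rows by join_column."""
    if mutations and join_column not in mutations[0]:
        raise ValueError(f"join column {join_column!r} not found")
    lookup = {row[join_column]: row for row in annotations}
    first = annotations[0] if annotations else {}
    extra_cols = [c for c in first if c != join_column]
    annotated = []
    for row in mutations:
        match = lookup.get(row[join_column], {})
        # Genes without a match keep their row and get empty annotation values.
        annotated.append({**row, **{c: match.get(c, "") for c in extra_cols}})
    return annotated


# Hypothetical annotation table, compressed the way a .tsv.gz file would be.
cosmic_bytes = gzip.compress(b"Hugo_Symbol\tCOSMIC_ROLE\nTP53\toncogene, TSG\n")
mutations = [
    {"Hugo_Symbol": "TP53", "Variant": "p.R175H"},
    {"Hugo_Symbol": "GENE_X", "Variant": "p.A1V"},
]
result = annotate_rows(mutations, read_tsv(cosmic_bytes))
```

In this sketch every mutation row is preserved, and only rows whose gene symbol appears in the annotation table receive a non-empty value. Whether the real method treats unmatched genes the same way is an assumption.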
Return value
If in_place=False, returns a pd.DataFrame containing the mutation data with cancer gene annotations. If in_place=True, returns None and updates self.data directly.
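The two return modes can be illustrated with a small stand-in class. This is only a sketch of the in_place contract described above: MutationSet, its annotate method, and the Is_Known_Cancer column are hypothetical names and do not come from the library.

```python
class MutationSet:
    """Hypothetical stand-in for the object that owns self.data."""

    def __init__(self, data):
        self.data = data

    def annotate(self, cancer_genes, in_place=False):
        # Build the annotated rows without modifying self.data.
        annotated = [
            {**row, "Is_Known_Cancer": row["Hugo_Symbol"] in cancer_genes}
            for row in self.data
        ]
        if in_place:
            self.data = annotated  # replace the stored data, return nothing
            return None
        return annotated  # self.data is left untouched


ms = MutationSet([{"Hugo_Symbol": "KRAS"}])
returned = ms.annotate({"KRAS"})                 # in_place=False: new data returned
untouched = "Is_Known_Cancer" not in ms.data[0]  # original data is unchanged
nothing = ms.annotate({"KRAS"}, in_place=True)   # in_place=True: returns None
```

With in_place=True the result must be read from the object afterwards, so assigning the return value to a variable would only capture None.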
Exceptions
FileNotFoundError: if an annotation file does not exist.
ValueError: if the join column is not found in the DataFrame.
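One way to handle these two exceptions is sketched below. Because the library itself is not imported here, check_inputs is a hypothetical stand-in that raises the same exception types under the conditions listed above. The order of its checks and the wording of its messages are assumptions.

```python
import tempfile
from pathlib import Path


def check_inputs(annotation_table, columns, join_column="Hugo_Symbol", oncokb_table=None):
    """Hypothetical pre-check raising the documented exception types."""
    for path in (annotation_table, oncokb_table):
        if path is not None and not Path(path).is_file():
            raise FileNotFoundError(f"annotation file not found: {path}")
    if join_column not in columns:
        raise ValueError(f"join column {join_column!r} not found in data")


def describe_failure(annotation_table, columns, **kwargs):
    """Map each documented exception to a short, user-facing message."""
    try:
        check_inputs(annotation_table, columns, **kwargs)
    except FileNotFoundError as exc:
        return f"missing file: {exc}"
    except ValueError as exc:
        return f"bad join column: {exc}"
    return "ok"


missing_file = describe_failure("no_such_dir/cosmic.tsv.gz", ["Hugo_Symbol"])
with tempfile.NamedTemporaryFile(suffix=".tsv") as tmp:
    # The file exists here, so only the column check can fail.
    missing_column = describe_failure(tmp.name, ["Gene"])
    all_good = describe_failure(tmp.name, ["Hugo_Symbol"])
```

Catching the two exceptions separately keeps a wrong path distinguishable from a wrong column name, which usually call for different fixes.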