Mutational Signature Analysis Example
This notebook demonstrates the mutational signature analysis visualizations. Somatic SNVs arise from different mutational processes (aging, UV exposure, APOBEC activity, etc.), each leaving a characteristic pattern across 96 trinucleotide mutation contexts.
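To make the "96 trinucleotide mutation contexts" concrete, the following sketch enumerates them: six substitution classes (written with the pyrimidine as the reference base) times four possible 5' bases times four possible 3' bases. The label format `A[C>A]A` and the helper name `trinucleotide_contexts` are illustrative assumptions, not necessarily the labels pyMut uses internally.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, with the pyrimidine (C or T) as the reference base.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_contexts():
    """Return the 96 context labels, e.g. 'A[C>A]A' (hypothetical label format)."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)  # 16 flanking-base pairs
    ]


contexts = trinucleotide_contexts()
print(len(contexts))  # 96
print(contexts[:3])   # ['A[C>A]A', 'A[C>A]C', 'A[C>A]G']
```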
For detailed documentation, see the Mutational Signature Analysis API Reference.
Load Data
Load the MAF file and create a PyMutation object.
from pyMut.input import read_maf
maf_path = "../../../src/pyMut/data/examples/MAF/ms.maf.gz"
print(f'📂 Loading file: {maf_path}')
# Read the MAF file and create the PyMutation object
py_mutation = read_maf(maf_path, assembly="37")
2025-11-12 21:00:51,703 | INFO | pyMut.input | Starting MAF reading: ../../../src/pyMut/data/examples/MAF/ms.maf.gz
2025-11-12 21:00:51,704 | INFO | pyMut.input | Loading from cache: ../../../src/pyMut/data/examples/MAF/.pymut_cache/ms.maf_054e7f928904eed3.parquet
2025-11-12 21:00:51,819 | INFO | pyMut.input | Cache loaded successfully in 0.12 seconds
📂 Loading file: ../../../src/pyMut/data/examples/MAF/ms.maf.gz
Configure Plot Settings
Enable high-quality rendering settings for publication-ready figures.
py_mutation.configure_high_quality_plots()
Complete Mutational Signature Analysis
The mutational_signature_analysis() method generates a comprehensive five-panel figure:
- Panel A: Signature profiles showing trinucleotide distribution for each signature
- Panel B: Cosine similarity heatmap comparing extracted signatures with COSMIC reference signatures
- Panel C: Heatmap of relative signature contributions per sample
- Panel D: Stacked bar chart of signature distribution across samples
- Panel E: Donut plot of cohort-level signature proportions
Key Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- cosmic_path: Path to COSMIC signature catalog TSV file (required for comparison)
- n_signatures: Number of signatures to extract. Default: 3
- figsize: Figure size. Default: (24, 12)
- title: Plot title. Default: "Mutational Signature Analysis"
py_mutation.mutational_signature_analysis(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa",
cosmic_path="../../../src/pyMut/data/examples/COSMIC_catalogue-signatures_SBS96_v3.4/COSMIC_72.tsv",
)
2025-11-12 21:01:16,705 | INFO | pyMut.core | Generating complete mutational signature analysis...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:01:21,212 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:01:21,212 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating complete mutational signature analysis...
2025-11-12 21:01:21,227 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature bar chart...
2025-11-12 21:01:21,377 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature bar chart generated in 0.15s
2025-11-12 21:01:21,382 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating cosine similarity heatmap...
2025-11-12 21:01:21,646 | INFO | pyMut.visualizations.mutational_signature_analysis | Cosine similarity heatmap generated in 0.26s
2025-11-12 21:01:21,652 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature contribution heatmap...
2025-11-12 21:01:21,917 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature contribution heatmap generated in 0.27s
2025-11-12 21:01:21,926 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature stacked bar chart...
2025-11-12 21:01:22,162 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature stacked bar chart rendered in 0.24s
2025-11-12 21:01:22,171 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature donut plot...
2025-11-12 21:01:22,179 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature donut plot rendered in 0.01s
/home/xuscbart/pyMut/src/pyMut/visualizations/mutational_signature_analysis.py:1932: UserWarning: This figure includes Axes that are not compatible with tight_layout, so results might be incorrect.
  fig.tight_layout(rect=(0, 0, 1, 0.8))
2025-11-12 21:01:22,181 | INFO | pyMut.visualizations.mutational_signature_analysis | Complete mutational signature analysis rendered in 0.97s
2025-11-12 21:01:22,190 | INFO | pyMut.core | Complete mutational signature analysis generated in 5.49s
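The log output above shows that signatures are extracted with NMF (non-negative matrix factorization) and that the solver hit its iteration limit. As a rough illustration of what that step computes, the sketch below factors a contexts-by-samples count matrix V into signature profiles W and per-sample exposures H using plain multiplicative updates in NumPy. This is not pyMut's implementation (the warning path indicates it relies on scikit-learn's NMF); the function name `nmf_multiplicative`, its parameters, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def nmf_multiplicative(V, n_signatures=3, n_iter=500, seed=0, eps=1e-10):
    """Factor V (contexts x samples) into W (contexts x k) and H (k x samples).

    Hypothetical helper: Lee-Seung multiplicative updates for the
    Frobenius-norm objective, not the solver pyMut uses.
    """
    rng = np.random.default_rng(seed)
    n_contexts, n_samples = V.shape
    W = rng.random((n_contexts, n_signatures)) + eps
    H = rng.random((n_signatures, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic toy data (not taken from the example MAF): 96 contexts x 20 samples,
# generated from 3 hidden "signatures".
rng = np.random.default_rng(1)
true_W = rng.random((96, 3))
true_H = rng.random((3, 20)) * 100
V = true_W @ true_H

W, H = nmf_multiplicative(V)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape)
print(f"relative reconstruction error: {rel_error:.4f}")
```

Columns of W play the role of signature profiles over the 96 contexts, and columns of H give each sample's exposure to them. scikit-learn's NMF accepts a max_iter argument that addresses the ConvergenceWarning; whether pyMut exposes that setting is not stated in this notebook.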
Individual Panel Components
Each panel from the complete analysis is also available as a standalone method.
Panel A: Signature Bar Chart
Multi-panel visualization showing each mutational signature as a bar chart with 96 trinucleotide contexts. Bars are grouped by the 6 substitution classes (C>A, C>G, C>T, T>A, T>C, T>G).
Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- n_signatures: Number of signatures to extract. Default: 3
- cosmic_path: Path to COSMIC catalog (optional, for signature annotation)
- figsize: Figure size. Default: auto-calculated
- title: Plot title. Default: "Mutational Signature Profiles"
py_mutation.signature_bar_chart(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa"
)
2025-11-12 21:00:51,837 | INFO | pyMut.core | Generating signature bar chart...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:00:56,411 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:00:56,412 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature bar chart...
2025-11-12 21:00:56,573 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature bar chart generated in 0.16s
2025-11-12 21:00:56,574 | INFO | pyMut.core | Signature bar chart generated in 4.74s
Panel B: Cosine Similarity Heatmap
Heatmap comparing extracted signatures with the COSMIC catalog of known mutational signatures. High similarity values (≥ 0.85) suggest that an extracted signature closely matches a known COSMIC signature, and therefore the mutational process associated with it.
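The comparison behind this heatmap can be sketched in a few lines: cosine similarity between an extracted profile and each reference profile, with the 0.85 threshold mentioned above applied to the best match. The helper names, the reference labels REF_A and REF_B, and the four-value toy profiles are all hypothetical; real profiles have 96 values and real COSMIC signatures have their own names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile matches nothing
    return dot / (norm_a * norm_b)


def best_match(extracted, reference, threshold=0.85):
    """Return (name, similarity, is_close) for the most similar reference profile."""
    name, score = max(
        ((ref_name, cosine_similarity(extracted, profile))
         for ref_name, profile in reference.items()),
        key=lambda pair: pair[1],
    )
    return name, score, score >= threshold


# Toy 4-context profiles for illustration only.
reference = {
    "REF_A": [0.7, 0.1, 0.1, 0.1],
    "REF_B": [0.1, 0.1, 0.1, 0.7],
}
extracted = [0.65, 0.15, 0.1, 0.1]

name, score, is_close = best_match(extracted, reference)
print(name, round(score, 3), is_close)
```

Because cosine similarity ignores overall scale, it does not matter whether the profiles are raw counts or normalized proportions.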
Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- cosmic_path: Path to COSMIC signature catalog (required)
- n_signatures: Number of signatures to extract. Default: 3
- figsize: Figure size. Default: (14, 3)
- title: Plot title. Default: "Cosine Similarity: Extracted vs COSMIC Signatures"
py_mutation.cosine_similarity_heatmap(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa",
cosmic_path="../../../src/pyMut/data/examples/COSMIC_catalogue-signatures_SBS96_v3.4/COSMIC_72.tsv",
)
2025-11-12 21:00:57,037 | INFO | pyMut.core | Generating cosine similarity heatmap...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:01:01,602 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:01:01,602 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating cosine similarity heatmap...
2025-11-12 21:01:01,783 | INFO | pyMut.visualizations.mutational_signature_analysis | Cosine similarity heatmap generated in 0.18s
2025-11-12 21:01:01,791 | INFO | pyMut.core | Cosine similarity heatmap generated in 4.75s
Panel C: Signature Contribution Heatmap
Heatmap showing the proportion of mutations in each sample attributable to each signature. Values are column-normalized (each sample sums to 1).
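Column normalization as described here can be sketched as follows, assuming a matrix whose rows are signatures and whose columns are samples. The helper name `normalize_columns` and the numbers are made up for illustration.

```python
def normalize_columns(exposures):
    """Scale each column (sample) so its entries sum to 1.

    exposures: list of rows, one row per signature, one column per sample.
    """
    n_samples = len(exposures[0])
    totals = [sum(row[j] for row in exposures) for j in range(n_samples)]
    return [
        # A sample with no attributed mutations is left at 0 to avoid dividing by zero.
        [row[j] / totals[j] if totals[j] > 0 else 0.0 for j in range(n_samples)]
        for row in exposures
    ]


# Toy exposures: 3 signatures (rows) x 2 samples (columns).
exposures = [
    [30.0, 5.0],
    [10.0, 5.0],
    [60.0, 40.0],
]
proportions = normalize_columns(exposures)
for row in proportions:
    print([round(value, 2) for value in row])
# [0.3, 0.1]
# [0.1, 0.1]
# [0.6, 0.8]
```

After normalization, a sample with 100 mutations and one with 50 are directly comparable, at the cost of hiding their difference in total mutation burden.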
Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- n_signatures: Number of signatures to extract. Default: 3
- cosmic_path: Path to COSMIC catalog (optional, for signature alignment)
- cmap: Colormap name. Default: 'Blues'
- show_values: Display numerical values in cells. Default: False
- figsize: Figure size. Default: auto-calculated
- title: Plot title. Default: "Signature Contributions per Sample"
py_mutation.signature_contribution_heatmap(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa"
)
2025-11-12 21:01:02,048 | INFO | pyMut.core | Generating signature contribution heatmap...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:01:06,499 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:01:06,500 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature contribution heatmap...
2025-11-12 21:01:06,633 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature contribution heatmap generated in 0.13s
2025-11-12 21:01:06,634 | INFO | pyMut.core | Signature contribution heatmap generated in 4.59s
Panel D: Stacked Bar Chart per Sample
Stacked bar chart showing signature contributions across samples. Each bar represents one sample with colored segments for each signature's proportion.
Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- n_signatures: Number of signatures to extract. Default: 3
- sort_samples: Sort samples by dominant signature. Default: False
- max_samples: Maximum number of samples to display. Default: None
- figsize: Figure size. Default: auto-calculated
- title: Plot title. Default: "Signature Contributions per Sample"
py_mutation.signature_stacked_bar_chart(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa"
)
2025-11-12 21:01:06,836 | INFO | pyMut.core | Generating signature stacked bar chart...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:01:11,360 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:01:11,361 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature stacked bar chart...
2025-11-12 21:01:11,476 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature stacked bar chart rendered in 0.12s
2025-11-12 21:01:11,477 | INFO | pyMut.core | Signature stacked bar chart generated in 4.64 seconds
Panel E: Donut Plot
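The sort_samples option for the stacked bar chart is described as sorting samples by dominant signature. One plausible reading of that ordering is sketched below: group samples by the signature with the largest proportion, then order each group by how strongly that signature dominates. How pyMut actually breaks ties or orders groups is not documented here, so the helper `sort_by_dominant_signature` and its tie-breaking rule are assumptions.

```python
def sort_by_dominant_signature(proportions):
    """Order sample names by dominant signature index, then by descending dominance.

    proportions: dict mapping sample name -> list of per-signature proportions.
    """
    def sort_key(item):
        _, props = item
        dominant = max(range(len(props)), key=props.__getitem__)
        return (dominant, -props[dominant])

    return [sample for sample, _ in sorted(proportions.items(), key=sort_key)]


# Toy per-sample proportions for 3 signatures.
samples = {
    "S1": [0.2, 0.7, 0.1],
    "S2": [0.6, 0.3, 0.1],
    "S3": [0.9, 0.05, 0.05],
    "S4": [0.1, 0.8, 0.1],
}
print(sort_by_dominant_signature(samples))  # ['S3', 'S2', 'S4', 'S1']
```

Grouping this way places samples with similar signature composition next to each other, which makes cohort-level patterns easier to see in a stacked bar chart.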
Donut chart showing the overall distribution of mutational signatures across the entire cohort. Unlike the per-sample views, this cohort-level summary reflects both how active each signature is within samples and how prevalent it is across them.
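A small numeric sketch shows why a cohort-level share mixes activity and prevalence. It assumes the cohort proportion is the sum of mutations attributed to a signature across all samples divided by the grand total; the notebook does not state pyMut's exact aggregation, so treat this as one reasonable interpretation. The helper name `cohort_proportions` and the data are invented.

```python
def cohort_proportions(exposures):
    """Share of all attributed mutations in the cohort that each signature accounts for.

    exposures: dict mapping signature name -> list of mutation counts per sample.
    """
    totals = {sig: sum(counts) for sig, counts in exposures.items()}
    grand_total = sum(totals.values())
    return {sig: total / grand_total for sig, total in totals.items()}


# Toy cohort of 4 samples.
exposures = {
    "Signature 1": [50, 50, 50, 50],  # moderate activity, present in every sample
    "Signature 2": [400, 0, 0, 0],    # very high activity, but in a single sample
    "Signature 3": [0, 50, 50, 50],   # moderate activity, present in most samples
}
shares = cohort_proportions(exposures)
for sig, share in shares.items():
    print(f"{sig}: {share:.3f}")
# Signature 1: 0.267
# Signature 2: 0.533
# Signature 3: 0.200
```

Here Signature 2 takes the largest slice even though it appears in only one sample, because that sample is heavily mutated. The per-sample panels are the place to check whether a large slice comes from broad prevalence or from a few highly active samples.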
Parameters:
- ref_genome: Path to reference genome FASTA file (required)
- n_signatures: Number of signatures to extract. Default: 3
- figsize: Figure size. Default: (8, 8)
- title: Plot title. Default: "Relative Contribution of Mutational Signatures"
py_mutation.signature_donut_plot(
ref_genome="../../../src/pyMut/data/resources/genome/GRCh37/Homo_sapiens.GRCh37.dna.toplevel.fa"
)
2025-11-12 21:01:11,896 | INFO | pyMut.core | Generating signature donut plot...
/home/xuscbart/miniconda3/envs/PyMutTFG/lib/python3.10/site-packages/sklearn/decomposition/_nmf.py:1728: ConvergenceWarning: Maximum number of iterations 500 reached. Increase it to improve convergence.
  warnings.warn(
2025-11-12 21:01:16,473 | INFO | pyMut.analysis.mutational_signature | NMF signature extraction completed in 0.01 seconds
2025-11-12 21:01:16,473 | INFO | pyMut.visualizations.mutational_signature_analysis | Generating signature donut plot...
2025-11-12 21:01:16,498 | INFO | pyMut.visualizations.mutational_signature_analysis | Signature donut plot rendered in 0.03s
2025-11-12 21:01:16,499 | INFO | pyMut.core | Signature donut plot generated in 4.60 seconds