ID8 · GRCh37 · COSMIC v93

Mutational profile

The height of each mutational profile bar represents the proportion of one ID mutation type among all ID mutation types in the signature. Although there is no single intuitive and naturally constrained set of ID mutation types (as there arguably are for SBSs and DBSs), an 83-subclass categorisation of ID mutations was designed.

The 83-subclass ID classification incorporates prior knowledge about IDs: that they commonly have sizes of 1-10 bp; that both insertions and deletions exist; that IDs of C and T occur at different rates; that IDs preferentially occur at repetitive elements; that the length of the repeat unit and the number of repeat units in a repeat stretch may each influence the likelihood of an ID occurring; that IDs are in some instances fostered by overlapping sequence microhomologies at the ID boundaries; and that different mutational processes may, in principle, be influenced differently by these features. We therefore designed an 83-subclass categorisation of IDs that allows some exploration of all these possibilities while constraining the number of categories to accommodate the relatively small numbers of IDs (compared to substitutions) found in most genomes. The classification categorises IDs by length, from 1 bp to >5 bp. For 1 bp IDs, it records whether the base is T or C and the number of single-base repeats in which the ID occurs, from 0 to >5. For longer IDs, it records the length of the non-single-base repeat unit, from 2 bp to >5 bp, the number of repeats, from 1 to >5, and the size of any microhomology, from 0 bp to >5 bp. We recognise that different classifications of IDs may be preferred by others. The ID mutation types are enumerated in the following Excel document.
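As an illustration of how a scheme of this kind can add up to 83 categories, the sketch below enumerates one possible binning and converts per-category counts into the proportions shown as bar heights. The exact bin boundaries (for example, counting repeat runs from 1 for deletions and from 0 for insertions, and limiting microhomology length by deletion size) are assumptions chosen so that the total comes to 83, and the label strings and function names are hypothetical; the Excel document remains the authoritative list.

```python
def capped(values, cap):
    """Render integers as strings, marking the open-ended top bin with '+'."""
    return [f"{v}+" if v == cap else str(v) for v in values]


def enumerate_id_classes():
    """Return hypothetical labels for an 83-category ID scheme (assumed bins)."""
    classes = []
    # 1 bp deletions of C or T, by length of the single-base run (1..6+): 12
    for base in ("C", "T"):
        for run in capped(range(1, 7), 6):
            classes.append(f"1bp_del_{base}_run{run}")
    # 1 bp insertions of C or T, by length of the single-base run (0..5+): 12
    for base in ("C", "T"):
        for run in capped(range(0, 6), 5):
            classes.append(f"1bp_ins_{base}_run{run}")
    # Longer deletions at repeats: unit length 2..5+, repeat units 1..6+: 24
    for size in capped(range(2, 6), 5):
        for units in capped(range(1, 7), 6):
            classes.append(f"{size}bp_del_repeats{units}")
    # Longer insertions at repeats: unit length 2..5+, repeat units 0..5+: 24
    for size in capped(range(2, 6), 5):
        for units in capped(range(0, 6), 5):
            classes.append(f"{size}bp_ins_repeats{units}")
    # Deletions with microhomology shorter than the deletion (sizes 2..4): 6
    for size in (2, 3, 4):
        for mh in range(1, size):
            classes.append(f"{size}bp_del_mh{mh}")
    # Deletions of 5+ bp with microhomology of 1..5+ bp: 5
    for mh in capped(range(1, 6), 5):
        classes.append(f"5+bp_del_mh{mh}")
    return classes


def profile_proportions(counts):
    """Convert per-category counts into proportions over all categories."""
    classes = enumerate_id_classes()
    total = sum(counts.get(c, 0) for c in classes)
    if total == 0:
        raise ValueError("no ID mutations to normalise")
    return {c: counts.get(c, 0) / total for c in classes}
```

Calling enumerate_id_classes() yields 12 + 12 + 24 + 24 + 6 + 5 = 83 labels, and profile_proportions returns values that sum to 1, matching the description of bar heights as proportions.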

Figure: ID8 mutational profile (v3.2_ID8_PROFILE_GA_GRCh37.jpg)

Genome: GRCh37

Proposed aetiology

ID8 appears to be caused by at least two underlying mechanisms. In the vast majority of tumours the aetiology is unknown, but it may involve repair of DNA double-strand breaks by non-homologous DNA end-joining mechanisms. The double-strand breaks could, in principle, be radiation induced, and the features of ID8 mutations have some similarities to those of radiation-induced mutations. The small number of tumours with the somatic p.K743N mutation in topoisomerase TOP2A, which always also have ID17, have a form of ID8 that shows evidence of transcription-associated damage and a strong enrichment for deletions of 6 to 8 base pairs compared to other ID8-mutated tumours. The vast majority of ID8 tumours (which do not have ID17) show no evidence of transcription-coupled damage and have a markedly different size distribution of deletions longer than 5 base pairs.

Comments

The ID8 mutation burden is correlated with age at cancer diagnosis, and this clock-like behaviour suggests that ID8 mutations may accumulate in normal cells.

Acceptance criteria

Summary of the technical and experimental evidence available in the scientific literature regarding the validation of the mutational signature.

Supporting evidence for mutational signature validity

Validated evidence for real signature
Unclear evidence for real signature
Evidence for artefact signature
Background
  Identification study: Alexandrov et al. 2020 Nature
  First included in COSMIC: v3

Identification
  NGS technique: WES & WGS
  Different variant callers: Yes
  Multiple sequencing centres: Yes

Technical validation
  Validated in orthogonal techniques: Yes
  Replicated in additional studies: Yes
  Extended context enrichment: Enrichment in 6-8 bp deletions (in cases with TOP2A mutation)

Proposed aetiology
  Mutational process: DSB repair by NHEJ / TOP2A mutation
  Support: Statistical association

Experimental validation
  Experimental study: -
  Species: -

Tissue distribution

Numbers of mutations per megabase attributed to the mutational signature across the cancer types in which the signature was found. Each dot represents an individual sample and only samples where the signature is found are shown. The number of mutations per megabase was calculated by assuming that an average whole-exome has 30 Mb with sufficient coverage, whereas an average whole-genome has 2,800 Mb with sufficient coverage.
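The per-megabase conversion described above is a simple division by the assumed covered size. A minimal sketch follows; the function name and the "WES"/"WGS" keys are hypothetical, while the 30 Mb and 2,800 Mb figures come from the text.

```python
# Assumed covered sizes in megabases, as stated in the text above.
COVERED_MB = {"WES": 30.0, "WGS": 2800.0}


def mutations_per_mb(n_mutations, assay):
    """Return mutations per megabase for a whole-exome or whole-genome sample."""
    if assay not in COVERED_MB:
        raise ValueError(f"unknown assay: {assay!r}")
    return n_mutations / COVERED_MB[assay]
```

For example, 280 mutations attributed to the signature in a whole genome correspond to 280 / 2,800 = 0.1 mutations per megabase, and 15 mutations in a whole exome correspond to 15 / 30 = 0.5.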

The numbers below the dots for each cancer type indicate the number of high confidence tumours in which the signature was attributed (above the blue horizontal line) and the total number of high confidence tumours analysed (below the blue horizontal line). Only high confidence data are displayed: samples with reconstruction accuracy >0.90.
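The two numbers shown for each cancer type can be sketched as a filter followed by a count. The record layout and field names below are hypothetical, and the reconstruction accuracy is taken as a precomputed value per sample; only the >0.90 threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cancer_type: str                 # hypothetical field names
    reconstruction_accuracy: float   # precomputed elsewhere
    id8_mutations: int               # mutations attributed to the signature


def tissue_counts(samples, threshold=0.90):
    """Map cancer type to (tumours with the signature, high confidence tumours)."""
    counts = {}
    for s in samples:
        # Keep only high confidence samples: accuracy strictly above threshold.
        if s.reconstruction_accuracy <= threshold:
            continue
        attributed, total = counts.get(s.cancer_type, (0, 0))
        has_signature = int(s.id8_mutations > 0)
        counts[s.cancer_type] = (attributed + has_signature, total + 1)
    return counts
```

The first element of each pair corresponds to the number above the blue line and the second to the number below it. A sample with an accuracy of exactly 0.90 is excluded under this reading of ">0.90".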

Figure: ID8 tissue distribution (v3.2_ID8_TISSUE.jpg)

Associated signatures

In some cases, ID8 appears to be caused by the same rare topoisomerase mutations that are associated with ID17.