# Mutational Signatures (v3 - May 2019)

## Doublet Base Substitution (DBS) Signatures

Each vignette has two figures:

The **first figure** (e.g., DBS1 in the first vignette) shows the
signature profile, where the height of each bar represents the proportion of
one DBS mutation type among all DBS mutation types in the signature. There
are 78 strand-agnostic DBS mutation types. These are enumerated in the
accompanying Excel document.

There are 78 strand-agnostic DBS mutation types for the following reason. First, there are 4 x 4 = 16 possible source doublets. Of these, AT, TA, CG, and GC are their own reverse complements. The remaining 12 can be represented as 6 strand-agnostic doublets (e.g., AC represents both AC and its reverse complement, GT). Thus, there are 4 + 6 = 10 strand-agnostic source doublets.

Because AT, TA, CG, and GC are their own reverse complements, each can be substituted by only 6 doublets (see the enumeration in the accompanying Excel document). For example, AT can be substituted by 3 doublets starting with C: CA, CC, and CG. But AT can be substituted by only 2 doublets starting with G: GA and GC. This is because AT>GG is already represented by its reverse complement, AT>CC. Similarly, AT can be substituted by only 1 doublet starting with T: TA. This is because AT>TC is represented by its reverse complement, AT>GA, and AT>TG is represented by AT>CA.

Each of the remaining 6 doublets, which are not their own reverse complements, has 3 x 3 = 9 possible DBS mutation types, because both bases must change. Thus, in total there are 4 x 6 + 6 x 9 = 78 strand-agnostic DBS mutation types.
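The counting argument above can be checked by brute force. The following is a minimal Python sketch, not the method used to produce the signatures. It enumerates every doublet substitution in which both bases change, then merges each one with its reverse complement. The helper names (`revcomp`, `canonical`, `enumerate_dbs_types`) are hypothetical, and the rule for choosing a representative (the lexicographically smaller of the two forms) is an assumption made for illustration, so the labels it produces may differ from those in the accompanying Excel document. Only the counts are expected to agree.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(ref, alt):
    """Pick one representative for a mutation and its reverse complement.

    Assumption: the lexicographically smaller (ref, alt) pair is used.
    This is an arbitrary illustrative choice, not a published convention.
    """
    return min((ref, alt), (revcomp(ref), revcomp(alt)))


def enumerate_dbs_types():
    """Enumerate strand-agnostic doublet substitutions as (ref, alt) pairs."""
    doublets = ["".join(p) for p in product(BASES, repeat=2)]
    types = set()
    for ref in doublets:
        for alt in doublets:
            # A doublet substitution changes both positions.
            if ref[0] != alt[0] and ref[1] != alt[1]:
                types.add(canonical(ref, alt))
    return sorted(types)


dbs_types = enumerate_dbs_types()
per_ref = Counter(ref for ref, _ in dbs_types)

print(len(dbs_types))  # 78
print(len(per_ref))    # 10 strand-agnostic source doublets
```

In this sketch, `per_ref` holds 6 mutation types for each of AT, TA, CG, and GC and 9 for each of the other six source doublets, matching 4 x 6 + 6 x 9 = 78.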

The **second figure**, "Cancer types in which the signature is
found," shows the numbers of mutations per megabase attributed to each
mutational signature in samples with the signature. Only those cancer types
with tumors in which signature activity is attributed are shown. The numbers
below the dots for each cancer type indicate the number of tumors in which
the signature was attributed (above the blue horizontal bar) and the total
number of tumors analyzed (below the blue bar).

The signatures are available in numerical form from synapse.org (Synapse ID syn12009743).