Datasheets

Full description of new v79 content

Summary

  1. New fully curated cancer genes;
    • PRKACA - 2,145 samples, 286 mutations, 55 papers
    • AR - 3,226 samples, 598 mutations, 141 papers
  2. Substantial updates to GNAS, GNAQ, and GNA11
  3. Curated Gene Fusions;
  4. Cancer Gene Census;
    • 7 new genes added
  5. Drug Resistance; 1 new gene (SMO) and 1 new drug (Vismodegib), 19 new unique resistance mutations curated.
  6. Removal of duplicate mutations
  7. Genome Data
    • ICGC release 22; August 23rd 2016
      • Mutations; 2 new studies
      • Copy Number Variants; 2 new studies, 3 studies updated
      • Structural Variants; 1 new study
    • Systematic Screens; 9 new papers (265 new genomes)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes

PRKACA

Activating mutations in PRKACA have been identified in adrenocortical adenomas associated with Cushing’s syndrome, with reported frequencies of up to 65%. PRKACA encodes the catalytic subunit alpha of protein kinase A (PKA). The hotspot mutation L206R in PRKACA lies at a specific binding site for the interaction between the catalytic and regulatory subunits of PKA. The mutation results in constitutive activation of the C-alpha subunit, leading to cAMP-independent PKA activity and enhanced cortisol production.

AR

Androgen receptor (AR) dysregulation plays a key role in prostate cancer development and progression. AR is a transcription factor that is essential for normal prostate cell growth and survival. Somatic mutations in AR are considered to be rare in primary prostate cancer but are more common in patients with advanced or castration-resistant prostate cancer. A portion of this literature is now represented in COSMIC. Mutations are most commonly found in the ligand binding domain.

Updates to GNAS, GNAQ, GNA11

GNAS, GNAQ and GNA11 expertly curated data have been substantially updated. 60 references reporting mutation screening for these genes are included in this release.

GNAS, encoding the alpha-subunit of the stimulatory G-protein, has mutation hotspots at codons 201 and 227 which result in an overactive G protein leading to abnormal cell growth. These mutations are frequently found in endocrine tumours such as growth hormone-secreting pituitary adenoma and in pancreatic intraductal papillary mucinous neoplasm, a precursor to pancreatic adenocarcinoma. GNAS mutations also occur in fibrous dysplasia of bone where they can support the differential diagnosis of fibrous dysplasia from ossifying fibroma. Activating GNAS mutations occurring in early development lead to mosaicism and multiple clinical manifestations, including polyostotic fibrous dysplasia, endocrine tumours and hormone hypersecretion, and cafe-au-lait skin pigmentation (McCune-Albright syndrome).

Activating mutations in GNAQ and GNA11 are driver mutations in uveal melanoma, with 85% having a mutation in one of these genes. Their occurrence is mutually exclusive and there is a recurrent mutation at Q209 and less frequently at R183. Mutations at these positions lead to a constitutive activation of the Gαq and Gα11 subunits by preventing their intrinsic GTPase activity and the return to an inactive state. The hotspot mutations in GNAQ or GNA11 are also common in benign nevi such as blue nevi. For GNA11 Q209 the frequency of mutations increases progressively from blue nevus to primary uveal melanoma to metastasis; an inverse pattern to that seen with GNAQ Q209 mutations. Mutations in GNAQ and GNA11 at codon 209 are also found in primary melanocytic neoplasms of the central nervous system, a range of rare neoplasms derived from scattered melanocytes located in the leptomeninges.


Curated Gene Fusions

CBFA2T3-GLIS2

CBFA2T3-GLIS2 is a recurrent fusion in paediatric non-Down’s syndrome acute megakaryoblastic leukaemia, where it confers poor prognosis. CBFA2T3, a member of the ETO family of nuclear corepressors, fuses to GLIS2, a member of the GLI-similar zinc finger protein family encoding a nuclear transcription factor with five C2H2-type zinc finger domains. The resultant protein retains the three CBFA2T3 N-terminal nervy homology regions that mediate protein interactions and the GLIS2 C-terminal zinc finger domains. This fusion has also been reported at low frequency in paediatric cytogenetically normal acute myeloid leukaemia.


Cancer Gene Census

Complete census list available here

Genes added:

EPAS1 - Endothelial PAS domain protein 1

Transcription factor involved in the induction of oxygen regulated genes.

PTPRT - Protein tyrosine phosphatase, receptor type T

Tyrosine phosphatase, dephosphorylates STAT3.

PPM1D - Protein phosphatase, Mg2+/Mn2+ dependent 1D

Regulator of TP53 and CHK2 by dephosphorylation, dephosphorylates ATM and H2AX.

BTK - Bruton tyrosine kinase

Non-receptor tyrosine kinase of the Tec family, plays role in BCR signalling, essential for B-cell development, negatively regulates beta catenin-dependent Wnt signalling in colorectal cancer.

PREX2 - Phosphatidylinositol-3,4,5-trisphosphate dependent Rac exchange factor 2

Guanine nucleotide exchange factor (GEF) for Rac1, activates Rac1, activates PI3K pathway by inhibition of PTEN.

TP63 - Tumour protein p63

Sequence specific DNA binding transcriptional activator or repressor, TAp63 isoform can activate p53 target genes and induce apoptosis, deltaNp63 acts in dominant negative way to counteract activator function.

QKI - QKI, KH domain containing RNA binding

RNA-binding protein, crucial for myelinisation and differentiation of oligodendrocytes, regulates the alternative splicing of NUMB via binding to two RNA elements in its pre-mRNA.


Genetics of Drug Resistance

We now include drug resistance data for the gene SMO (Vismodegib) as well as updates for EGFR (Gefitinib, Erlotinib and Afatinib), ESR1 (Endocrine therapy) and ALK (Alectinib).

All drug resistance data is detailed here, describing our curations across 11 genes and 21 pharmaceuticals. Links are provided to explore this information in detail, with charts showing the landscape of resistance to drugs targeting mutations in the gene of interest.


Removal of duplicated mutations

In previous releases, the process of uploading and annotating mutations from whole genome screens (ICGC/TCGA) introduced some duplication of mutations. This has not affected overall mutation counts, but any specific mutation-transcript annotation may have multiple IDs rather than one. We have been working to merge these duplicates; however, this process is complex and has not been completed in this release.

Some duplication has also been present in the mutation download files. Again, merging duplicated rows is a work in progress and has not been completed in this release.
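Until the merge is complete, users of the download files may wish to filter repeated rows themselves. The following is a minimal sketch only, assuming that duplicated rows are exact, character-for-character repeats of an earlier line in a plain-text download file. The function names and file paths are hypothetical and are not part of any COSMIC tool; duplicates that differ only in their mutation ID would not be caught by this approach.

```python
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line the first time it is seen, preserving order.

    Lines are compared without their trailing newline so that a final
    line lacking a newline still matches an earlier identical row.
    """
    seen = set()
    for line in lines:
        key = line.rstrip("\r\n")
        if key not in seen:
            seen.add(key)
            yield line


def dedupe_file(src_path: str, dst_path: str) -> int:
    """Copy src_path to dst_path without exact duplicate rows.

    Returns the number of rows written (the header row is included,
    since it is simply the first unique line).
    """
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in unique_lines(src):
            dst.write(line)
            kept += 1
    return kept
```

For example, `dedupe_file("mutations.tsv", "mutations.dedup.tsv")` (both paths are placeholders) would write a filtered copy and return the number of rows kept. The set of seen rows is held in memory, so very large files may need a different strategy.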


ICGC release 22

Simple Somatic Mutations

New Studies:

COSU668 BRCA-FR COSU669 BRCA-KR

Copy Number Variants

New Studies:

COSU661 PAEN-IT COSU668 BRCA-FR

Updated Studies:

COSU382 PACA-CA COSU656 SKCA-BR COSU589 THCA-SA (removed)

Structural Variants

New Studies:

COSU668 BRCA-FR


Systematic Screen Papers

Follow links below to the 9 papers which are new in v79, or view the full table of papers here.

COSP40874 COSP41819 COSP41825 COSP42045 COSP42051 COSP41614 COSP39263 COSP41476 COSP41563


Cell Lines Project

Cell line 'GT3TKB' (COSS907041) has been removed from the database. The copy number and genotype data have been removed for cell line 'MOG-G-CCM' (COSS908144). This is because checks have not verified the identity of GT3TKB, and the Affymetrix SNP6 array data for MOG-G-CCM is actually derived from MOG-G-UVW.


Full description of new v78 content

Summary

  1. New fully curated cancer genes;
    • HIF1A - 1,782 samples, 196 mutations, 56 papers
    • MTOR - 3,239 samples, 634 mutations, 132 papers
    • PTPN13 - 1,761 samples, 429 mutations, 85 papers
  2. Curated Gene Fusions;
    • ETV6-RUNX1 - 2,276 samples, 357 mutations, 37 papers
  3. Cancer Gene Census;
    • 9 new genes added, 1 removed
  4. Drug Resistance; 1 new gene (FLT3) and 2 new drugs (Quizartinib and Sorafenib), 76 new unique resistance mutations curated.
  5. Mouse insertional mutagenesis; the latest update has mouse data for 5,600 additional genes
  6. Genome Data
    • Copy Number Variation; Re-analysis of all TCGA studies with ASCAT v2
    • ICGC release 21; May 16th 2016
      • Gene Expression; 5 studies updated
      • Methylation; 3 studies updated
      • Copy Number Variants; 10 new studies, 28 studies updated
      • Structural Variants; 2 new studies
    • Systematic Screens; 31 new papers (1,165 new samples)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes

HIF1A

Hypoxia inducible factor 1, alpha subunit (HIF1A) drives cancer progression and resistance to therapy by mediating metabolic responses to intratumoural hypoxia and oncogenic mutations. Somatic missense mutations in the HIF1A gene have been found in endometrioid carcinoma and glioblastoma, and at lower levels in many other cancers. Other recognised mechanisms of HIF1A activation are indirect, e.g. increased HIF-1α synthesis and stability. HIF1A amplification and fusion genes have also been detected in cancer tissues.

MTOR

Mechanistic Target of Rapamycin (mTOR) is a conserved serine/threonine kinase involved in regulation of cellular processes including growth and proliferation. Somatic missense mutations are found in many cancer types including renal cell carcinoma and germ cell tumours. Mutations occur across the gene but recurrent activating mutations are often clustered in several domains in the C-terminal half of the gene and are targets for first and second generation mTOR inhibitors such as everolimus (a rapamycin structural analogue), pazopanib and AZD8055. Studies have shown that these mutations result in hyperactivation through a variety of mechanisms and predict rapamycin sensitivity. Acquired resistance mutations have been observed in the FRB and kinase domains following mTOR inhibitor therapy.

PTPN13

PTPN13 encodes a member of the protein tyrosine phosphatase (PTP) family. The large intracellular protein has a catalytic PTP domain at its C-terminus and two major structural domains: a region with five PDZ domains and a FERM domain that binds to plasma membrane and cytoskeletal elements. PTPN13 has been identified as a tumour suppressor gene, with mutations detected in non-small cell lung cancer, as well as gastric and colorectal cancers.


Curated Gene Fusions

ETV6-RUNX1

ETV6-RUNX1 (TEL-AML1) fusions are now represented in COSMIC. A proportion of the literature reporting on this fusion has been curated. ETV6-RUNX1 is the most common fusion in paediatric acute lymphoblastic leukaemia (ALL), occurring in 25% of cases. It is specific to B lineage ALL and is associated with favourable prognosis with conventional therapies. In the fusion, the helix-loop-helix domain and the exon 5 coded central region of ETV6 typically fuse to almost the entire RUNX1, including the DNA-binding domain. While the ETV6 breakpoint is consistent at exon 5 in fusion transcripts, RUNX1 has a major breakpoint at exon 2 and a minor breakpoint at exon 3.


Cancer Gene Census

Complete census list available here

Genes added:

APOBEC3B - Apolipoprotein B mRNA editing enzyme catalytic subunit 3B

Effector protein in the innate immune response to virus infection, source of DNA damage and mutagenesis in breast, head/neck, bladder, lung, cervical and ovarian cancer, germline deletion is associated with an increased risk of breast cancer.

B2M - Beta-2-microglobulin

Inactivating mutations in the 5’ region of the gene lead to impaired expression of HLA class I antigen, thus helping the tumour to avoid immune destruction; involved in gene fusions.

BCORL1 - BCL6 corepressor-like 1

Transcriptional co-repressor that contributes to the repression of E-cadherin, promoting cell migration and invasion; involved in gene fusions.

DDR2 - Discoidin domain receptor tyrosine kinase 2

Cell surface receptor for fibrillar collagen, regulates cell proliferation and extracellular matrix synthesis, facilitates invasion and metastasis via activating ERK signalling and stabilizing SNAIL1.

DDX3X - DEAD-box helicase 3, X-linked

Interacts with RNA and ribosomal machinery to help remodel the translation landscape in response to stress, positively regulates DNA damage-induced apoptosis and p53 stabilization.

DROSHA - Drosha ribonuclease III

Part of micro-RNA biogenesis machinery, recurrent mutation (E1147K) affecting a metal-binding residue of the RNase IIIb domain leads to a predominant downregulation of a subset of miRNAs and promotes tumorigenesis; involved in gene fusions.

KEAP1 - Kelch like ECH associated protein 1

Adaptor for CUL3 ubiquitin ligase in a complex targeting Nrf2, IKKB, Bcl2, master regulator of cytoprotective gene expression through Nrf2, LOF mutations reduce its affinity to Nrf2.

LRP1B - LDL receptor related protein 1B

Transmembrane receptor, inactivation results in changes to the tumour environment that confer on cancer cells an increased growth and invasive capacity; involved in gene fusions.

MAPK1 - Mitogen-activated protein kinase 1

Component of the MAPK pathway downstream of RAS, RAF and MEK, promotes cell survival, migration and invasiveness, recurrent E322K is located in the cytoplasmic retention motif, causes constitutive activation and results in enhanced phosphorylation of EGFR.

Genes Removed:

CHN1


ICGC release 21

Gene Expression - Updated Studies:

COSU329 GBM-US COSU376 COAD-US COSU418 LUSC-US COSU540 SKCM-US COSU545 LGG-US

Methylation - Updated Studies:

COSU414 BRCA-US COSU416 KIRC-US COSU417 LUAD-US

Copy Number Variants

New Studies:

COSU537 PRAD-CA COSU543 KIRP-US COSU652 BRCA-EU COSU657 WT-US COSU662 CHOL-US COSU663 MESO-US COSU664 PCPG-US COSU665 TGCT-US COSU666 THYM-US COSU667 UVM-US

Updated Studies:

COSU329 GBM-US COSU331 OV-US COSU375 READ-US COSU376 COAD-US COSU377 LAML-US COSU381 LICA-FR COSU382 PACA-CA COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU540 SKCM-US COSU541 STAD-US COSU542 THCA-US COSU545 LGG-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US COSU631 ACC-US COSU632 DLBC-US COSU633 ESCA-US COSU634 KICH-US COSU635 SARC-US COSU636 UCS-US

Structural Variants - New Studies:

COSU652 BRCA-EU COSU661 PAEN-IT


Systematic Screen Papers

Follow links below to the 31 papers which are new in v78, or view the full table of papers here.

COSP34346 COSP38283 COSP41536 COSP36571 COSP34281 COSP36571 COSP41431 COSP41400 COSP38252 COSP40071 COSP41130 COSP41412 COSP41015 COSP40300 COSP41173 COSP40999 COSP40684 COSP40463 COSP37931 COSP40448 COSP40557 COSP40391 COSP40527 COSP40657 COSP40767 COSP40422 COSP40595 COSP40526 COSP40574 COSP41195 COSP38745


Full description of new v77 content

Summary

  1. New fully curated cancer genes;
    • TBX3 - 55 papers, 2,724 samples, 241 mutations.
    • NFKBIE - 34 papers, 1,710 samples, 176 mutations.
    • ATR - 106 papers, 3,379 samples, 646 mutations.
  2. Curated Fusions;
  3. Genome Data
    • 16 New Systematic Screen papers (741 genome-wide tumour analyses).
  4. Genetics of Drug Resistance
    • New Content: Drug resistance data for 9 therapeutic target genes and 18 drugs
  5. Cancer Gene Census
    • 23 new cancer-causing genes annotated, with evidence.

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes & Fusions

TBX3

TBX3 is a member of the T-box family of genes that encode developmentally important transcription factors. In cancer, TBX3 has been found to have a role in cell proliferation, invasion and migration. Somatic mutations of TBX3 are found at low levels, mainly in epithelial tumours in organs such as breast, large intestine and skin. It has been proposed to function as either an oncogene or a tumour suppressor depending on the cellular context.

NFKBIE

NFKBIE encodes a protein which inhibits downstream nuclear factor-kappa-B signalling by sequestering NFKB transcription factors in the cytoplasm. It is a tumour suppressor gene with recurrent inactivating mutations in 10% of chronic lymphocytic leukaemia patients where NFKBIE appears to be an early-mutated gene. A frameshift deletion at Y254 is a common mutation. Recurrent missense mutations of NFKBIE have been identified in desmoplastic melanoma. These cases lack the common oncogenic mutations in BRAF and NRAS but do have a UV mutational signature.

ATR

ATR is a serine-threonine kinase belonging to the PIK family and closely related to ATM. ATR is involved in the DNA damage response, maintaining genomic stability by preventing mitosis in response to radiation and other damage. Somatic heterozygous insertions/deletions in a coding microsatellite repeat resulting in a truncated protein are found in some MSI-high endometrial and stomach cancers which have evidence of defective mismatch repair. These mutations are also thought to confer selective advantage to tumours due to their resistance to radiation and topoisomerase inhibitors. Missense mutations in ATR have also been observed in cancers, including myelomas and epithelial ovarian cancer.

STIL-TAL1

STIL-TAL1 fusions are exclusive to T cell acute lymphoblastic leukaemia (T-ALL), with greater frequency in childhood than adult T-ALL. The breakpoint in TAL1, which encodes a transcription factor of the basic helix-loop-helix family, occurs in the 5’-UTR, placing its whole coding sequence under the control of the regulatory elements of STIL. While the genomic breakpoint in STIL usually occurs in intron 1, the genomic breakpoint in TAL1 occurs predominantly at one of 2 sites in the UTR in 95% of cases.

DNAJB1-PRKACA

Recurrent DNAJB1-PRKACA fusions have been identified in fibrolamellar hepatocellular carcinoma, a rare, highly aggressive subtype of liver cancer affecting adolescents and young adults. The fused protein contains the amino-terminal domain of DNAJB1, which encodes a heat shock protein, fused to PRKACA, the catalytic domain of protein kinase A (PKA). The chimeric protein results in activation of cAMP-dependent PKA signalling.


Systematic Screen Papers

Follow links below to the 16 papers which are new in v77, or view the full table of papers here.

COSP41006 COSP40339 COSP40831 COSP41071 COSP40943 COSP28587 COSP40322 COSP40865 COSP38049 COSP40462 COSP40620 COSP39553 COSP40729 COSP40841 COSP40551 COSP40312


Drug Resistance

Follow the links below to the 9 genes with drug resistance data:

ABL1 ALK BRAF EGFR ESR1 KIT MAP2K1 MAP2K2 PDGFRA

Drugs: Vemurafenib, AZD9291, Ceritinib, Erlotinib, Gefitinib, Imatinib, Nilotinib, Tyrosine kinase inhibitor - NS, Afatinib, Endocrine therapy, Alectinib, PD0325901, Dasatinib, Crizotinib, Selumetinib, Sunitinib, Dabrafenib, Bosutinib


Cancer Gene Census

23 genes have been added to the Cancer Gene Census:

AR CHD4 CTCF CXCR4 ERBB4 FAT1 FAT4 HIF1A LEF1 LZTR1 MTOR NCOR2 PRKACA PTK6 PTPN13 RBM10 SDHA SMAD2 SMAD3 TGFBR2 USP8 ZFHX3


Full description of new v76 content

Summary

  1. Curated Genes;
    • PPP6C - 33 papers, 1,023 samples, 164 mutations.
    • SPOP - 74 papers, 2,230 samples, 224 mutations.
  2. Genome Data
    • ICGC release 20; November 27th 2015
      • 2 new studies
      • Gene Expression; 17 studies updated (187 new samples, 227,023 new variants)
      • Structural Mutations; 2 studies updated (243 new samples, 18,369 new variants)
      • Copy Number Variants; 18 studies updated (1,188 new samples, 44,689 new variants)
    • 17 New Systematic Screen papers (238 new samples).

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes

PPP6C

PPP6C encodes the catalytic subunit of the PP6 serine threonine phosphatase complex, a component of a signalling pathway regulating cell cycle progression. Recurrent mutations in PPP6C have been detected in melanomas, with hot spots including R264. Clusters of mainly missense mutations occur in or near highly conserved positions in the catalytic site and the surrounding substrate recognition area.

SPOP

Speckle-type POZ protein (SPOP) is an adaptor protein of the CUL3-RBX1 E3 ubiquitin ligase complex. It selectively recruits substrates via its N-terminal MATH domain, whereas its BTB domain mediates dimerization and interaction with CUL3. The majority of cancer-associated genomic SPOP mutations affect evolutionarily conserved residues in the MATH domain, suggesting that the mutations may alter the interaction of SPOP with its substrates and interrupt its tumour suppressor function by affecting target gene degradation. SPOP has been linked to the ubiquitination of several substrates including androgen receptor. SPOP is frequently mutated in prostate and endometrial cancers.


ICGC release 20

New ICGC Studies:

COSU651 AML-US COSU658 BTCA-JP

Gene Expression - Updated Studies:

COSU376 COAD-US COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU540 SKCM-US COSU542 THCA-US COSU545 LGG-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US COSU632 DLBC-US COSU635 SARC-US

Copy Number Variants - Updated Studies:

COSU584 NBL-US COSU329 GBM-US COSU340 CLLE-ES COSU382 PACA-CA COSU414 BRCA-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU540 SKCM-US COSU541 STAD-US COSU542 THCA-US COSU545 LGG-US COSU584 NBL-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US COSU635 SARC-US

Structural Mutations - Updated Studies:

COSU340 CLLE-ES COSU382 PACA-CA


Systematic Screen Papers

Follow links below to the 17 papers which are new in v76, or view the full table of papers here.

COSP32640 COSP40815 COSP40739 COSP40695 COSP40815 COSP40649 COSP40744 COSP40306 COSP39209 COSP40032 COSP39713 COSP40314 COSP40289 COSP40307 COSP40000 COSP39756 COSP40548


Full description of new v75 content

Summary

  1. Curated Genes;
    • GRIN2A - 93 papers, 2,004 samples, 667 mutations.
  2. Curated Fusions;
    • TCF3-PBX1 (E2A-PBX1) - 48 papers, 3,416 samples, 296 mutations.
  3. Cancer Gene Census; 4 names have been updated.
  4. Genome Data
    • 4 TCGA studies reanalysed by the WTSI Cancer Genome Project.
    • 17 new Systematic Screen papers (474 new samples).

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes

GRIN2A

The GRIN2A gene encodes the GluN2A protein, a regulatory subunit of the glutamate-gated N-methyl-d-aspartate receptor (NMDAR). NMDARs have a role in central nervous system synaptic transmission, cellular migration and survival. Somatic missense mutations in GRIN2A are frequently found in melanoma but also in other cancers such as colorectal, gastric and lung carcinoma. Functional studies have shown that these mutations can disrupt NMDAR complex formation, result in loss of function in calcium channel signalling, and increase cell migration and proliferation. Some clinical studies have linked GRIN2A mutations to faster disease progression and shorter overall survival.


Curated Fusions

TCF3-PBX1

TCF3-PBX1 (E2A-PBX1) fusions are now represented in COSMIC. A proportion of the literature reporting on this fusion has been curated. The TCF3-PBX1 fusion results from the t(1;19)(q23;p13) chromosomal translocation, one of the frequently recurring translocations in childhood acute lymphoblastic leukaemia, where it is strongly associated with pre-B immunophenotype. TCF3 encodes a member of the E protein (class I) family of helix-loop-helix transcription factors and PBX1 encodes a nuclear protein that belongs to the PBX homeobox family of transcription factors. The fusion protein includes the N-terminal transactivation domain of TCF3 fused to the C-terminal DNA-binding homeobox domain of PBX1. It transforms cells by constitutively activating transcription of genes regulated by the PBX protein family. Cases most commonly present fusion of TCF3 exon 16 to PBX1 exon 3.


Cancer Gene Census

4 gene names have been updated in the Cancer Gene Census. These are shown below, with the old names in square brackets.

CDKN2a(p14) [ CDKN2A(p14) ]
SEPT5 [ Sep-05 ]
SEPT6 [ Sep-06 ]
SEPT9 [ Sep-09 ]


Systematic Screen Papers

Follow links below to the 17 papers which are new in v75, or view the full table of papers here.

COSP36295 COSP37241 COSP35276 COSP36625 COSP36102 COSP36383 COSP36203 COSP40457 COSP37751 COSP28892 COSP35648 COSP30070 COSP38922 COSP39961 COSP39468 COSP39468 COSP36825


TCGA studies reanalysed by the Cancer Genome Project (CGP), Sanger Institute:

COSU375 READ-US COSU376 COAD-US COSU414 BRCA-US COSU417 LUAD-US

The methodology used is described here.


Full description of new v74 content

Summary

  1. Curated Genes;
    • POLE - 86 papers, 3,919 samples.
    • AXIN2 - 74 papers, 2,241 samples.
    • KDM6A - 140 papers, 4,144 samples.
  2. Curated Fusions;
    • BCR-ABL1 - 196 papers, 5,990 samples.
    • KMT2A (MLL) - 52 fusion pairs, 334 papers, 3,069 samples.
  3. Cancer Gene Census; FLT4 has been added to the census.
  4. Genome Data Imports
    • Simple Somatic Mutations (SSM);
      • ICGC release 19 (2 new studies, 49 studies updated, 367 new samples, 1.94 million new mutations).
      • 17 new Systematic Screen papers (453 new samples).
    • Structural Variants; 2 new and 9 updated studies, 85,076 new variants (ICGC release 19)
    • Copy Number Variants; 3 new and 13 updated studies, 447 new samples, 154,863 new variants (ICGC release 19)
    • Gene Expression Variants; 15 updated studies, 602 new samples, 642,701 new variants (ICGC release 19)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Curated Genes

POLE

POLE (DNA polymerase epsilon catalytic subunit A, 12q24.3) contains a proofreading exonuclease domain that functions to ensure low mutation rates in replicating cells. Missense mutations of the exonuclease domain of POLE have been documented in colorectal, endometrial and stomach carcinomas, melanoma and many other tumour types, and are associated with a high number of genetic mutations across the tumour genome. The POLE-induced ultramutated somatic phenotype is distinguishable from hypermutated tumours by its mutational signature. The frequently occurring P286R mutation seems to produce a particularly strong mutator phenotype, leading to increased frequencies of recurrent nonsense mutations in other key tumour suppressor genes and providing an explanation for why loss of heterozygosity is not required for the development of POLE-mutant tumours. With respect to clinical outcome, endometrial carcinomas with a POLE exonuclease domain mutation have been found to exhibit a less aggressive clinical course than endometrial carcinomas without one.

AXIN2

AXIN2 functions as a scaffold protein regulating a variety of signalling pathways and biological functions. It is a key component of Wnt signalling, binding to several of its members including APC, GSK3 and β-catenin. Mutations in the AXIN2 gene lead to stabilization of β-catenin and activation of target genes. AXIN2 mutations have been found in several cancers including colorectal, gastric and hepatocellular carcinomas. Mutations are found throughout the gene, including the APC, GSK3 and β-catenin binding domains; however, the G7 repeat in exon 7 (codons 663, 664 and 665) seems to be particularly prone to frameshift mutations in some microsatellite unstable tumours.

KDM6A (formerly UTX)

Located on Xp11.2, lysine (K)-specific demethylase 6A is positioned in a chromosomal region which escapes X-inactivation and has a Y paralogue (UTY). It encodes a ubiquitously expressed protein which controls levels of H3K27me3 and is thus associated with genomic silencing and repression of transcription. Somatic mutations have been found in a wide variety of solid and haematopoietic tumours including multiple myeloma, renal cell, bladder and oesophageal carcinomas, medulloblastoma, astrocytoma and B-cell/T-cell acute lymphoblastic leukaemias (ALL). Mutations are often inactivating frameshift or nonsense mutations and KDM6A has been shown to act as a tumour suppressor in both solid tumours and ALL, with mutant KDM6A leukaemias demonstrating sensitivity to H3K27me inhibitors. Some studies report a clear tendency for mutated females to show LOH. KDM6A mutations are mutually exclusive with those found in its binding partner KMT2D (MLL2) in a variety of tumours.


Curated Fusions

BCR-ABL1

BCR-ABL1 fusions are now represented in COSMIC. A proportion of the huge corpus of literature reporting on this fusion pair has been curated. This representative curation covers the range of variant BCR-ABL1 fusions which have been reported as well as the phenotypes in which they have been found. The BCR-ABL1 fusion results from the Philadelphia chromosome, a reciprocal translocation t(9;22)(q34;q11), which characterizes chronic myeloid leukaemia (CML). In the fusion product, the amino-terminus of ABL1, a protooncogene encoding a protein with non-receptor tyrosine kinase activity involved in a variety of cellular processes, is replaced by BCR and the product has constitutively active tyrosine kinase activity. There are 2 predominant variant transcripts found in CML with BCR breakpoints within the major breakpoint region: BCR exon 13 or exon 14 fused to ABL1 exon 2. A minority of CML cases have one of a few rarer variants, including those with breakpoints in the BCR micro breakpoint region: exon 19 of BCR fused to exon 2 of ABL1. In Philadelphia-positive acute lymphoblastic leukaemia, a BCR breakpoint at the minor breakpoint region, BCR exon 1 fused to ABL1 exon 2, is a common variant.

KMT2A (MLL) fusions

KMT2A (MLL) fusions are now represented in COSMIC. A proportion of the literature reporting on fusions involving KMT2A has been curated. This representative curation covers a wide range of KMT2A partners involved in these fusions as well as the phenotypes in which they have been found. Numerous partner genes have been reported for KMT2A but the majority of cases involve AFF1, MLLT1, MLLT3, MLLT4, MLLT10, ELL or EPS15. KMT2A fusions occur in 5-10% of acute myeloid and lymphocytic leukaemias, with higher frequencies in some patient subsets such as infant leukaemias. Some of the KMT2A fusions are associated with poor outcome in paediatric and adult acute leukaemias. KMT2A is also translocated in patients with therapy-related acute leukaemia induced by DNA topoisomerase II inhibitors.

KMT2A-ABI1 KMT2A-ABI2 KMT2A-ACTN4 KMT2A-AFF1 KMT2A-AFF3 KMT2A-AFF4 KMT2A-ARHGAP26 KMT2A-ARHGEF12 KMT2A-BTBD18 KMT2A-CASC5 KMT2A-CASP8AP2 KMT2A-CBL KMT2A-CREBBP KMT2A-CT45A2 KMT2A-DAB2IP KMT2A-EEFSEC KMT2A-ELL KMT2A-EP300 KMT2A-EPS15 KMT2A-FOXO3 KMT2A-FOXO4 KMT2A-FRYL KMT2A-GAS7 KMT2A-GMPS KMT2A-GPHN KMT2A-KIAA0284 KMT2A-KIAA1524 KMT2A-LASP1 KMT2A-LPP KMT2A-MAPRE1 KMT2A-MLLT1 KMT2A-MLLT10 KMT2A-MLLT11 KMT2A-MLLT3 KMT2A-MLLT4 KMT2A-MLLT6 KMT2A-MYO1F KMT2A-NCKIPSD KMT2A-NRIP3 KMT2A-PDS5A KMT2A-PICALM KMT2A-PRRC1 KMT2A-SARNP KMT2A-SEPT2 KMT2A-SEPT5 KMT2A-SEPT6 KMT2A-SEPT9 KMT2A-SH3GL1 KMT2A-SORBS2 KMT2A-TET1 KMT2A-TOP3A KMT2A-ZFYVE19


Cancer Gene Census

FLT4 has been added to the Cancer Gene Census.

FLT4


Simple Somatic Mutations (SSM)

Loaded from ICGC release 19

New studies:

COSU646 COCA-CN COSU656 SKCA-BR

Updated ICGC studies:

COSU322 LIRI-JP COSU323 LINC-JP COSU328 PACA-AU COSU329 GBM-US COSU331 OV-US COSU340 CLLE-ES COSU351 CMDI-UK COSU371 GACA-CN COSU375 READ-US COSU376 COAD-US COSU377 LAML-US COSU379 PBCA-DE COSU381 LICA-FR COSU382 PACA-CA COSU385 BRCA-UK COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU440 MALY-DE COSU486 BOCA-UK COSU534 EOPC-DE COSU535 ESAD-UK COSU537 PRAD-CA COSU538 PRAD-UK COSU539 ORCA-IN COSU540 SKCM-US COSU541 STAD-US COSU542 THCA-US COSU543 KIRP-US COSU544 LAML-KR COSU545 LGG-US COSU581 BLCA-CN COSU582 ESCA-CN COSU583 LUSC-KR COSU584 NBL-US COSU585 OV-AU COSU586 PAEN-AU COSU587 RECA-CN COSU588 RECA-EU COSU589 THCA-SA COSU628 LIHC-US COSU646 COCA-CN COSU648 LIHM-FR COSU650 PACA-IT COSU656 SKCA-BR

Systematic Screen Papers

Follow links below to the 17 papers which are new in v74, or view the full table of papers here.

COSP37004 COSP36721 COSP36980 COSP37415 COSP37137 COSP37714 COSP36179 COSP39462 COSP37188 COSP29073 COSP39525 COSP36278 COSP35291 COSP39189 COSP39667 COSP38509 COSP37842


Structural Variants

New studies from ICGC release 19:

COSU645 BOCA-FR COSU656 SKCA-BR

Updated studies:

COSU328 COSU440 COSU486 COSU534 COSU537 COSU538 COSU584 COSU585 COSU650


Copy Number Variants

New studies from ICGC release 19:

COSU645 BOCA-FR COSU646 COCA-CN COSU656 SKCA-BR

Updated studies:

COSU413 BLCA-US COSU414 BRCA-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU538 PRAD-UK COSU540 SKCM-US COSU542 THCA-US COSU545 LGG-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US


Gene Expression Variants

Updated studies:

COSU376 COAD-US COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU540 SKCM-US COSU542 THCA-US COSU545 LGG-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US


Full description of new v73 content

Summary

  1. Curated Genes; 9 new fully curated genes.
  2. Cancer Gene Census; synchronised the list with HGNC symbols and chromosome assignments.
  3. Genome Data Imports
    • Simple Somatic Mutations (SSM);
      • ICGC release 18 (3 new studies).
      • 26 new Systematic Screen papers (805 new tumour samples).
  4. COSMIC Genome Browser
    • New Tracks
      • ENCODE Regulatory Features.
      • dbSNP.
      • SNPs excluded from the COSMIC website.
    • Cancer Browser; a new Genome Browser added to view data by cancer type.
  5. Mutation Impact; FATHMM MKL scores added for coding mutations.
  6. Cell Lines project; Tissue/Histology classifications updated for 267 cell lines.


Curated Genes

SPEN

SPEN encodes an RNA binding co-regulatory protein which functions as a negative regulator of Notch signalling and participates in other pathways’ transcriptional responses. In the absence of activated Notch signalling it acts as a transcriptional repressor to down-regulate specific genes. Somatic mutations have recently been reported in adenoid cystic carcinoma, solid type (ACC), splenic marginal zone lymphoma (SMZL) and hepatitis C-associated diffuse large B cell lymphoma. Mutations are predominantly nonsense or frameshifts predicted to encode truncated proteins lacking the C-terminal domain involved in the inhibition of Notch signalling. Evidence of LOH is seen in some ACC cases, but not SMZL, suggesting no simple loss-of-function process in this cancer type.

IKBKB

IKBKB is a member of the NF-κB signalling pathway and is 1 of 3 core proteins making up the IKK kinase complex which regulates the NF-κB transcription factor. Recurrent somatically acquired missense mutations affecting codon Lys171 have been found in splenic marginal zone lymphoma. These mutations target the kinase domain near the activation loop, where mutations have been shown to activate IKBKB kinase activity and NF-κB signalling in vitro.

MYOD1

MYOD1 encodes a nuclear protein in the basic helix-loop-helix family of transcription factors. A recurrent somatic missense mutation has been found in a highly conserved residue of the basic region in embryonal type rhabdomyosarcoma (ERMS). MYOD1 mutated tumours are often of spindle cell morphology, located in the head and neck region, associated with older patients at diagnosis, poor outcome and accompanying PI3K-AKT pathway mutations. Thus they may define a specific subset of ERMS. It has been proposed that Leu122Arg can block WT function and bind to MYC consensus sequences, suggesting a possible switch from differentiation to proliferation in cooperation with PIK3CA mutations.

KDM5C

KDM5C (JARID1C) is a chromatin modulating tumour suppressor gene encoding a histone H3K4 demethylase. Subclonal somatic mutations have been found in clear cell renal cell carcinoma and appear to be associated with advanced tumour stage, grade and invasiveness. Mutations of all types are observed across the gene and these truncating and inactivating mutations are proposed to lead to sustained methylation of KDM5C targets and thus activated transcriptional activity.

ACVR1

Recurrent activating mutations have been identified in ACVR1 (ALK2) in paediatric diffuse intrinsic pontine glioma (DIPG). Mutation hot spots at codons R258, G328 and G356 are all within the serine/threonine kinase domain and codon R206 is within the glycine-serine-rich domain. ACVR1 encodes a type 1 activin receptor involved in bone morphogenetic protein signalling and germline mutations cause the congenital childhood developmental disorder fibrodysplasia ossificans progressiva (FOP). Some of the recurrent mutations found in DIPG are the same as those found in FOP. Somatic mutations seem to occur in a distinct subset of DIPG with predominance of females, relatively restricted age at onset and longer survival times.

ESR1

ESR1 encodes oestrogen receptor alpha which belongs to the super-family of nuclear hormone receptors and functions as a ligand-activated transcription factor. Highly recurrent activating mutations in ESR1 have been identified in oestrogen receptor-positive metastatic breast cancer. There are hot spots at Y537 and D538 in helix 12 of the ligand-binding domain. These mutations result in oestrogen receptor constitutive activity and confer acquired resistance to anti-hormone therapy. In premalignant lesions and primary breast cancer the mutation K303R is found, which confers an increased sensitivity to oestrogen and, in untreated tumours, is associated with aggressive clinical features.

CDKN2C

CDKN2C is a member of the gene family coding for inhibitors of cyclin-dependent kinases 4 and 6. Its protein functions as a cell growth regulator, controlling cell cycle G1 progression. CDKN2C is a tumour suppressor gene in multiple myeloma and glioma where the majority of mutations are homozygous deletions.

COL2A1

Collagen, type II, alpha 1 gene (COL2A1) encodes an extracellular matrix protein that has a role in collagen maturation and chondro-skeletal development. Constitutional disruption of this process is known to result in skeletal and ocular disorders. A high frequency of somatic mutations has been found in COL2A1 in chondrosarcoma and enchondroma but not in other cartilaginous tumours.

CACNA1D

CACNA1D encodes an L-type voltage-gated calcium channel. Recurrent somatic mutations have been found in some KCNJ5 mutation-negative adrenal aldosterone-producing adenomas (APAs) whose histological appearance generally represents a zona glomerulosa-like phenotype. These gain-of-function mutations often occur towards the ends of the S6 transmembrane segments which line the channel pore, involving codons Gly403 and Ile770. The mutations cause the channel to become more easily activated, and in the case of Gly403 also adversely affect channel inactivation. The resulting Ca2+ influx leads to increased aldosterone production and cell proliferation in the adrenal glomerulosa.


Cancer Gene Census

67 gene symbols have been updated in the Cancer Gene Census. These symbols now correspond to the approved symbols described in the HGNC database. The affected genes are listed below with the previous symbols in parentheses:

ACKR3(CMKOR1) ACSL6(FACL6) AFF1(MLLT2) AFF3(LAF4) AFF4(AF5q31) ARHGAP26(GRAF) C15orf65(FLJ27352) CASC5(AF15Q14) CCDC6(D10S170) CDKN2A(CDKN2a(p14)) CLP1(HEAB) CNBP(ZNF9) CNTRL(CEP1) CRTC1(MECT1) DUX4L1(DUX4) ERC1(ELKS) FCRL4(IRTA1) FLCN(BHD) FOXO1(FOXO1A) FOXO3(FOXO3A) FOXO4(MLLT7) HMGN2P46(C15orf21) HNF1A(TCF1) HSP90AA1(HSPCA) HSP90AB1(HSPCB) IGH(IGH@) IGK(IGK@) IGL(IGL@) KAT6A(RUNXBP2) KAT6B(MYST4) KDSR(FVT1) KLF6(COPEB) KMT2A(MLL) KMT2C(MLL3) MECOM(EVI1) MECOM(MDS1) MLLT11(AF1Q) MNX1(HLXB9) MYCL(MYCL1) NBN(NBS1) NCKIPSD(AF3p21) NUTM1(C15orf55) PATZ1(ZNF278) PDCD1LG2(CD273) PRRX1(PMX1) PTCH1(PTCH) RABEP1(RAB5EP) RAD51B(RAD51L1) RHOH(ARHH) RMI2(C16orf75) RNF213(ALO17) RNF217-AS1(STL) RUNX1T1(CBFA2T1) SDHAF2(SDH5) SEPT5(PNUTL1) SEPT9(MSF) SMAD4(MADH4) SPECC1(HCMOGT-1) SRSF3(SFRS3) STIL(SIL) TET1(LCX) TRA(TRA@) TRB(TRB@) TRD(TRD@) TRIM24(TIF1) ZBTB16(ZNF145) ZMYM2(ZNF198)
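The list above can be turned into a lookup table for renaming symbols in older files. The following is a minimal sketch, assuming the entries are whitespace-separated and written as NEW(OLD) as shown; the function name `parse_symbol_updates` is hypothetical and not part of any COSMIC tool.

```python
def parse_symbol_updates(text):
    """Map each previous symbol to its current HGNC-approved symbol.

    Assumes whitespace-separated entries of the form NEW(OLD),
    as in the list above.
    """
    old_to_new = {}
    for token in text.split():
        # Split at the first "(" only, so that an entry with nested
        # parentheses such as CDKN2A(CDKN2a(p14)) keeps its full old symbol.
        new, _, rest = token.partition("(")
        old = rest[:-1]  # drop the closing ")" of the entry
        old_to_new[old] = new
    return old_to_new


# A short excerpt of the list above
excerpt = "KMT2A(MLL) MECOM(EVI1) MECOM(MDS1) CDKN2A(CDKN2a(p14)) IGH(IGH@)"
updates = parse_symbol_updates(excerpt)
print(updates["MLL"])  # KMT2A
```

Keying the table on the previous symbol means that two old symbols merged into one approved symbol (EVI1 and MDS1 into MECOM) remain separate entries.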


Simple Somatic Mutations (SSM)

ICGC release 18

SSM data was loaded from ICGC release 18.

New studies:

COSU584 NBL-US COSU648 LIHM-FR COSU650 PACA-IT

New samples were added to the following studies:

COSU322 LIRI-JP COSU328 PACA-AU COSU375 READ-US COSU376 COAD-US COSU381 LICA-FR COSU382 PACA-CA COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU535 ESAD-UK COSU537 PRAD-CA COSU539 ORCA-IN COSU540 SKCM-US COSU542 THCA-US COSU543 KIRP-US COSU544 LAML-KR COSU545 LGG-US COSU583 LUSC-KR COSU628 LIHC-US

Systematic Screen Papers

Follow links below to the 26 new papers, or view the full table of papers here.

COSP38402 COSP38754 COSP35305 COSP36588 COSP39174 COSP39168 COSP35670 COSP37154 COSP38032 COSP38676 COSP38660 COSP37077 COSP32705 COSP38135 COSP38343 COSP38481 COSP36673 COSP36052 COSP39466 COSP36335 COSP38040 COSP35940 COSP39541 COSP36245 COSP39329 COSP39357


Copy Number Variants (CNV)

Copy number data was uploaded from ICGC release 18. Two additional studies have been added in v73:

COSU648 LIHM-FR COSU650 PACA-IT


COSMIC Genome Browser

New tracks:

  • dbSNP; imported from build 142.
  • SNPs; variants flagged as SNPs using a 'noise reduction' filtering system. These are excluded from the COSMIC website but are now visible as a track in the genome browser.
  • ENCODE regulatory features; imported from Ensembl Regulation. Feature types: Enhancer, Open chromatin, TF binding site, Promoter, Promoter Flanking Region and CTCF Binding Site.


FATHMM MKL - Mutation Impact

The mutation impact filters introduced in COSMIC v73 have been derived from the new FATHMM-MKL algorithm. This algorithm predicts the functional, molecular and phenotypic consequences of protein missense variants using hidden Markov models.

More information about FATHMM-MKL is available here

The new method improves on the older version of FATHMM and now incorporates ENCODE annotation for its prediction. This method is as powerful as CADD scores for coding variants and shows improved prediction for non-coding variants (compared to GWAVA and CADD).

The functional scores for individual mutations from FATHMM-MKL are in the form of a single p-value, ranging from 0 to 1. Scores above 0.5 are deleterious, but in order to highlight the most significant data in COSMIC, only scores ≥ 0.7 are classified as 'Pathogenic'. Mutations are classed as 'Neutral' if the score is ≤ 0.5. In addition, each functional score is classified into 10 groups of features, depending on whether it is a coding or non-coding variant. Please see the original publication for more details regarding the feature classification (doi:10.1093/bioinformatics/btv009).
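The thresholds described above can be sketched as a small function. This is an illustration only, not COSMIC's implementation: the function name is hypothetical, and returning `None` for scores between 0.5 and 0.7 is an assumption, since the text assigns no label to that range.

```python
def classify_fathmm_mkl(score):
    """Label a FATHMM-MKL score using the COSMIC thresholds quoted above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("FATHMM-MKL scores range from 0 to 1")
    if score >= 0.7:
        return "Pathogenic"
    if score <= 0.5:
        return "Neutral"
    # Above 0.5 but below 0.7: deleterious by the algorithm's own cut-off,
    # but the text gives no COSMIC label (assumption: leave unlabelled).
    return None


for s in (0.95, 0.7, 0.6, 0.5, 0.1):
    print(s, classify_fathmm_mkl(s))
```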

The following is reproduced from the publication in order to aid interpretation:

Description for each of the feature groups [A-J]

  • A. 46-Way Sequence Conservation: based on multiple sequence alignment scores, at the nucleotide level, of 46 vertebrate genomes compared with the human genome.
  • B. Histone Modifications (ChIP-Seq): based on ChIP-Seq peak calls for histone modifications.
  • C. Transcription Factor Binding Sites (TFBS PeakSeq): based on PeakSeq peak calls for various transcription factors.
  • D. Open Chromatin (DNase-Seq): based on DNase-Seq peak calls.
  • E. 100-Way Sequence Conservation: based on multiple sequence alignment scores, at the nucleotide level, of 100 vertebrate genomes compared with the human genome.
  • F. GC Content: based on a single measure for GC content calculated using a span of five nucleotide bases from the UCSC Genome Browser.
  • G. Open Chromatin (FAIRE): based on formaldehyde-assisted isolation of regulatory elements (FAIRE) peak calls.
  • H. Transcription Factor Binding Sites (TFBS SPP): based on SPP peak calls for various transcription factors.
  • I. Genome Segmentation: based on genome-segmentation states using a consensus merge of segmentations produced by the ChromHMM and Segway software.
  • J. Footprints: based on annotations describing DNA footprints across cell types from ENCODE.

Please note: The current FATHMM-MKL algorithm is trained on the Human Gene Mutation Database (HGMD, http://www.hgmd.cf.ac.uk/ac/index.php), which now also contains somatic variants. Results from the currently available version of FATHMM-MKL can be, and have been, used for somatic variants, but the user should be aware of the caveats. A cancer-specific version of FATHMM-MKL is under development, and these scores will be updated when it becomes available.


Full description of new v72 content

Summary

  1. Upgrade to genome version GRCh38, and run an additional website at http://grch37-cancer.sanger.ac.uk to display GRCh37 coordinates.
  2. Curated Genes; 22 new fully curated genes.
  3. Curated Fusions; 28 new fully curated fusion pairs added.
  4. Cancer Gene Census; 26 new genes added.
  5. Genome Data Imports
    • Differential Methylation; 12 new TCGA studies (4,377 samples, from ICGC release 18)
    • Simple Somatic Mutations (SSM);
      • ICGC release 17
      • Paediatric Cancer Genome Project; 7 studies (142 new samples).
      • 69 new Systematic Screen papers (841 new tumour samples).
    • Copy Number Variants (CNV); ICGC release 18 (1 new study, BOCA-UK). Data added for 16 new samples in Cell Lines Project
    • Gene Expression Variants; added for all samples in Cell Lines Project


Curated Genes

RHOA

RHOA is a member of the Rho GTPase family, important in regulation of cell motility, adhesion and cell-cell interactions. Recurrent, hotspot, heterozygous mutations have been found in specific T-cell lymphomas (angioimmunoblastic lymphomas and PTCL-NOS with features of AITL), paediatric Burkitt lymphoma and diffuse gastric cancer. Although mutant proteins have dominant-negative effects causing loss of wild type RHOA function, mutant RHOA is thought to gain different biochemical properties and to behave as a gain-of-function oncogene.

TBL1XR1

TBL1XR1 encodes transducin-β–like 1 X-linked receptor 1, which is a transcriptional regulator involved both in the Wnt/β-catenin and NFκB pathways. TBL1XR1 missense and frameshift mutations have been found in splenic marginal zone lymphoma and primary central nervous system lymphoma, and at a lower frequency in other cancer types. The lack of mutation hot spots and the recurrent deletions of 3q26.32 (TBL1XR1 locus) reported in some cancers suggest its potential role as a tumour suppressor.

DICER1

DICER1 is an RNase III endoribonuclease which catalyses cleavage of double-stranded RNA substrates to produce microRNAs and short interfering RNAs, thus indirectly affecting expression of >30% of genes. DICER1 is also involved in regulation of chromatin, alternative splicing, DNA replication timing, genome stability and cellular senescence. Both germline and somatic mutations are found in this gene. Somatic missense mutations are generally found in RNase IIIa and RNase IIIb domain hot spots in a number of different tumour types including germ cell tumours, sex cord-stromal tumours, pleuropulmonary blastoma and pituitary blastoma.

IL6ST

Recurrent somatic mutations are found in the interleukin-6 signal transducer (IL6ST) gene, which encodes the gp130 component of the IL-6 signal transduction machinery. Mutations are commonly found in inflammatory hepatocellular adenomas and in a subset of hepatocellular carcinomas. Mutations are commonly in-frame deletions in a hotspot region in the gp130 D2 domain, which is involved in IL-6 binding. Functionally activated gp130 is thought to lead to the constitutive activation of JAK and STAT3, resulting in a sustained inflammatory response that promotes adenoma formation.

CSF3R

CSF3R encodes the transmembrane receptor for granulocyte colony-stimulating factor (G-CSF), which provides the proliferative and survival signal for granulocytes. Somatic, truncating mutations occur in severe congenital neutropenia and are associated with enhanced G-CSF response and progression to MDS/leukaemia. Membrane proximal mutations, in particular T618I, are also found in these cases, as well as being the oncogenic driver in the majority of de novo chronic neutrophilic leukaemia and some atypical CML. A further distinct set of CSF3R mutations are found in some CMML cases.

ETNK1

ETNK1 is an ethanolamine-specific kinase involved in phosphatidylethanolamine (PE) biosynthesis, which is important for many diverse biochemical functions and is a precursor for several key biologically active molecules. Recurrent heterozygous mutations (H243Y, N244S/T/K and G245A) have been found in the kinase domain of ETNK1 in a minority of cases of aCML, CMML, systemic mastocytosis (SM) with eosinophilia, SM-AHD with eosinophilia and HES.

DNM2

Dynamin 2 (DNM2) is a member of a family of large GTPases and has a wide range of cellular functions. Frameshift, missense, nonsense and splice mutations have been found in adult and childhood ETP-ALL and non-ETP-ALL. The mutations are located across the gene in all the functional domains and are likely to result in loss of DNM2 function.

CUX1

CUX1 is a homeodomain transcription factor and deficiency activates phosphoinositide 3-kinase (PI3K) signalling, leading to increased tumour growth. Low frequencies of somatic mutations have been found in various solid and myeloid tumours. Heterozygous inactivating mutations are distributed across the gene in myeloid cancers, in particular myelodysplastic syndromes and AML, and CUX1 has been shown to act as a haploinsufficient tumour suppressor in these cases. It is overexpressed in solid cancers and may also function as an oncogene.

CNOT3, RPL5, and RPL10

CNOT3, part of the CCR4-NOT complex that regulates gene expression, has been identified as a tumour suppressor gene in T-cell acute lymphoblastic leukaemia (T-ALL), where it is associated with adult age. Recurrent oncogenic mutations in RPL5 and RPL10, 2 genes encoding ribosomal proteins, have also been found in T-ALL, where RPL10 mutations are associated with young age and a hot spot at R98.

NT5C2

Relapse-specific activating mutations in NT5C2 have been identified in acute lymphoblastic leukaemia, with hot spots at R238 and R367. NT5C2 encodes a 5’-nucleotidase enzyme which is responsible for the inactivation of nucleoside-analogue chemotherapy drugs. Acquired somatic mutations in NT5C2 result in chemoresistance to purine analogues.

CBLB

CBL proteins (CBL, CBLB, CBLC) are a highly conserved family of RING finger ubiquitin E3 ligases which are critical negative regulators of protein tyrosine kinase signalling. Mutations in the Ub-E2 protein-binding RING finger domain of CBLB are found in acute myeloid leukaemia and chronic myeloid leukaemia.

UBR5

Missense and truncating mutations in UBR5, a gene encoding an E3 ubiquitin-protein ligase, have been found in mantle cell lymphoma. The truncating mutations cluster within the HECT domain.

FBXO11

FBXO11 has been identified as a haploinsufficient tumour suppressor gene in diffuse large B cell lymphoma, where mutations and deletions contribute to lymphomagenesis through BCL6 stabilization. FBXO11 encodes a member of the F-box protein family and these proteins constitute one of the four subunits of ubiquitin protein ligase complex SKP1-CUL1-F-box protein (SCF). BCL6 is targeted for ubiquitylation and degradation by the SCF ubiquitin ligase complex that contains FBXO11.

MAX

MAX (Myc associated factor X; 14q23) functions as a transcriptional regulator through heterodimerization with MYC or MAD. Disruption of this signalling pathway has been associated with development of hereditary and sporadic neoplasms. Somatic mutations have been found at a low frequency in sporadic pheochromocytomas arising in the adrenal medulla and in some epithelial carcinomas such as endometrial and colon carcinomas.

RAD21

RAD21 homolog (S. pombe) (8q24) is a component of a multimeric cohesin complex that has a role in cohesion of sister chromatids, post replicative DNA repair and transcriptional regulation. Somatic mutations in RAD21 have been found in epithelial cancers such as endometrioid and colorectal carcinomas and myeloid neoplasms.

CASP8

Caspase 8 (CASP8, 2q33-q34) is a key protease involved in apoptosis. Somatic mutations in this gene result in defective apoptosis induction and have been found in various solid tumours. The mutations are spread across the gene reflecting its nature as a tumour suppressor.

BIRC3

BIRC3 encodes one of the components of a protein complex which negatively regulates the MAP3K14 serine-threonine kinase, the central activator of noncanonical NF-kappaB signalling. It is recurrently mutated in chronic lymphocytic leukaemia (CLL). Although mutations are uncommon in the early phase of disease, where they indicate a poor prognosis, they associate specifically with a chemorefractory phenotype. Inactivating mutations (commonly frameshift and nonsense) and/or gene deletions cause the truncation or elimination of the C-terminal RING domain, the E3 ubiquitin ligase activity of which is essential for MAP3K14 proteasomal degradation by BIRC3. Recurrent somatic mutations affecting the same RING domain have also been identified in mantle cell lymphoma, splenic marginal zone lymphoma and multiple myeloma.

FOXA1

FOXA1 encodes a member of the forkhead class of DNA-binding proteins which is commonly expressed in hormonally-driven cancers. Recurrent mutations in FOXA1 have been identified in prostate cancer; the majority occurring in the carboxy-terminal transactivating domain and in the forkhead domain. In breast cancer, FOXA1 recurrent mutations have also been found clustered in the forkhead domain.

ATP1A1 and ATP2B3

Somatic mutations have been found in ATP1A1 and ATP2B3 in some KCNJ5 mutation-negative adrenal aldosterone-producing adenomas (APA). These genes are P-type ATPase gene family members encoding an Na+/K+ ATPase α subunit and the plasma membrane Ca2+ transporting ATPase respectively. Recurrent ATP1A1 mutations, either missense substitutions or in-frame deletions, often involve p.Leu104 or p.Val332, located in the 1st and 4th of 10 transmembrane segments respectively. Both are suggested to be involved in K+ ion-binding and gating. Recurrent in-frame deletion mutations of ATP2B3 involve p.Val426 (also located in transmembrane segment 4) and are predicted to cause major disruption of the Ca2+ binding site. In contrast to KCNJ5 mutation-positive APA patients, these patients are predominantly older, male and present with a more severe phenotype including higher aldosterone levels.


Curated Fusions

CDKN2D-WDFY2

A fusion of CDKN2D, which encodes a cell cycle modulator that is also involved in DNA repair, with WDFY2, encoding a protein known to modulate AKT interactions with its substrates, has been found in 20% of high-grade serous ovarian carcinomas.

DCTN1-ALK

DCTN1 has been identified as a novel fusion partner to ALK in inflammatory myofibroblastic tumour. The fusion protein retains the cytoskeleton-associated protein-glycine domain and coiled-coil domain of DCTN1 and the receptor tyrosine kinase domain of ALK.

HIP1-ALK

HIP1 has been identified as a novel partner to ALK in non-small cell lung cancer. HIP1 belongs to the huntingtin interacting protein family which plays a role in clathrin-mediated endocytosis and receptor trafficking. The HIP1 fusion protein comprises the coiled-coil domain of HIP1 and the juxtamembrane intracellular region of ALK.

ROS1, NTRK1, ALK, BRAF and RET Kinase Fusions

Fusions involving the kinases ROS1, NTRK1 (* NTRK1_ENST00000392302), ALK, BRAF and RET occur in spitzoid neoplasms in a mutually exclusive manner. All the kinase fusion proteins retain the intact kinase domain at the 3’ end of the fusion transcript, with many of the 5’ partners contributing coiled-coil domains.

CLIP1-ROS1 PPFIBP1-ROS1 ZCCHC8-ROS1 MYO5A-ROS1 PWWP2A-ROS1 HLA-A-ROS1 ERC1-ROS1 TPM3-ROS1 KIAA1598-ROS1 CEP89-BRAF LSM14A-BRAF TPM3-ALK DCTN1-ALK GOLGA5-RET KIF5B-RET

TP53-NTRK1* LMNA-NTRK1*
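When processing fusion names like those listed above, note that a partner symbol can itself contain a hyphen (HLA-A-ROS1). A minimal sketch, assuming the 3’ kinase symbol contains no hyphen and that a trailing asterisk is only the footnote marker used above; `split_fusion` is a hypothetical helper name:

```python
def split_fusion(name):
    """Split a fusion name into (5' partner, 3' partner).

    Assumes the 3' gene symbol contains no hyphen, which holds for the
    kinases listed above (ROS1, NTRK1, ALK, BRAF, RET).
    """
    name = name.rstrip("*")  # drop the footnote marker, e.g. TP53-NTRK1*
    # Split on the LAST hyphen so that a hyphenated 5' symbol stays intact.
    five_prime, _, three_prime = name.rpartition("-")
    return five_prime, three_prime


print(split_fusion("HLA-A-ROS1"))   # ('HLA-A', 'ROS1')
print(split_fusion("TP53-NTRK1*"))  # ('TP53', 'NTRK1')
```

Splitting on the first hyphen instead would give HLA as the partner, which is a different gene symbol.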

CIC-FOXO4

CIC-FOXO4 fusions have been found in Ewing-like undifferentiated small round cell soft tissue sarcoma. CIC is a member of the HMG-box superfamily of transcription factors, and FOXO4 is a member of the forkhead family and the protein it encodes is highly homologous to the human forkhead protein FKHR.

CD74-NRG1

Recurrent CD74-NRG1 (CD74_ENST0000000953) fusions have been identified in KRAS mutation-negative invasive mucinous lung adenocarcinoma. CD74 replaces the transmembrane domain of NRG1 while the membrane-tethered EGF-like domain of NRG1 is preserved.

BRAF and RAF1

Fusions involving BRAF and RAF1 have been identified in pancreatic acinar cell carcinoma. Five different variants of the most prevalent fusion, SND1-BRAF, were found in 6 carcinomas. In all variants, the 5’ region of SND1 fuses to the complete kinase domain of BRAF.

SND1-BRAF HERPUD1-BRAF ZSCAN30-BRAF GATM-BRAF HACL1-RAF1

NAB2-STAT6

Recurrent NAB2-STAT6 gene fusions have been identified in solitary fibrous tumours (SFT), occurring in 100% of tumours in some reports. The transcriptional repressor NAB2 comprises an N-terminal EGR1-binding domain, a NAB conserved region 2 and a C-terminal transcriptional repressor domain, while the transcriptional activator STAT6 comprises a DNA-binding domain, SH2 domain and a C-terminal transcriptional activation domain. Multiple variants of the fusion with different locations for the breakpoints have been detected. The NAB2-STAT6 fusion has also been found in meningeal haemangiopericytoma, providing further evidence for the close association of these tumours with peripheral SFT.


Cancer Gene Census

The Cancer Gene Census has been updated with 26 new genes, bringing the total of known cancer genes substantiated by the scientific literature to 572. The new genes are:

ETNK1 RANBP2 STRN SND1 FAM131B CRTC1 AXIN2 IKBKB FGFR4 ATR PPP6C SPOP MYOD1 ACVR1 MAP3K1 NCOR1 TBX3 ARID1B CDKN1B SMARCD1 MAP3K13 SPEN POLE NFKBIE GRIN2A RHOA


Differential Methylation

Methylation data was downloaded (from ICGC release 18) and analysed for the following 12 TCGA studies -

COSU376 COAD-US COSU413 BLCA-US COSU414 BRCA-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU542 THCA-US COSU543 KIRP-US COSU627 HNSC-US COSU628 LIHC-US


Simple Somatic Mutations (SSM)

ICGC release 17

SSM data was loaded (from ICGC release 17) and the following new study was added -

COSU486 BOCA-UK

All studies (from ICGC release 17) which have contributed SSM data -

COSU322 LIRI-JP COSU323 LINC-JP COSU328 PACA-AU COSU329 GBM-US COSU331 OV-US COSU340 CLLE-ES COSU351 CMDI-UK COSU371 GACA-CN COSU375 READ-US COSU376 COAD-US COSU377 LAML-US COSU379 PBCA-DE COSU381 LICA-FR COSU382 PACA-CA COSU385 BRCA-UK COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU435 PRAD-US COSU440 MALY-DE COSU486 BOCA-UK COSU534 EOPC-DE COSU535 ESAD-UK COSU537 PRAD-CA COSU538 PRAD-UK COSU539 ORCA-IN COSU540 SKCM-US COSU541 STAD-US COSU543 KIRP-US COSU544 LAML-KR COSU545 LGG-US COSU582 ESCA-CN COSU583 LUSC-KR COSU585 OV-AU COSU586 PAEN-AU COSU587 RECA-CN COSU588 RECA-EU COSU589 THCA-SA

Pediatric Cancer Genome Project (PCGP)

Data was obtained from the Pediatric Cancer Genome Project and the following new studies were added to COSMIC.

COSU637 COSU638 COSU639 COSU640 COSU641 COSU642 COSU643

Systematic Screen Papers

Follow links below to the 69 new papers, or view the full table of papers here.

COSP28324 COSP28767 COSP29006 COSP32038 COSP32776 COSP32924 COSP33521 COSP33709 COSP33868 COSP33912 COSP33949 COSP33999 COSP34036 COSP34058 COSP34094 COSP34313 COSP34374 COSP34394 COSP34432 COSP34489 COSP34587 COSP34618 COSP34640 COSP34774 COSP34786 COSP34904 COSP35074 COSP35107 COSP35120 COSP35154 COSP35241 COSP35330 COSP35479 COSP35484 COSP35693 COSP35753 COSP35781 COSP35911 COSP36033 COSP36065 COSP36071 COSP36112 COSP36113 COSP36120 COSP36166 COSP36237 COSP36246 COSP36338 COSP36492 COSP36535 COSP36629 COSP36887 COSP36956 COSP37087 COSP37321 COSP37352 COSP37447 COSP37473 COSP37544 COSP37629 COSP37678 COSP37693 COSP37797 COSP38198 COSP38402 COSP38425 COSP38428 COSP38538 COSP38687


Copy Number Variants (CNV)

For the Cell Lines Project the 16 cell lines added in the last release (v71) have now been analysed using the Affymetrix SNP6.0 array and the copy number variants are visible on the website. The full copy number segmentation and genotype data are available to download for 1021 samples but there is no SNP6.0 data for Hs633T (COSS1240149), JHH-2 (COSS1240157), JHH-4 (COSS1240158), and SW156 (COSS1240220).

In COSMIC, copy number data was loaded (from ICGC release 18) and the following new study was added -

COSU486 BOCA-UK

All studies (from ICGC release 18) which have contributed copy number data -

COSU328 PACA-AU COSU329 GBM-US COSU331 OV-US COSU340 CLLE-ES COSU375 READ-US COSU376 COAD-US COSU377 LAML-US COSU379 PBCA-DE COSU381 LICA-FR COSU382 PACA-CA COSU385 BRCA-UK COSU413 BLCA-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU486 BOCA-UK COSU534 EOPC-DE COSU538 PRAD-UK COSU540 SKCM-US COSU541 STAD-US COSU542 THCA-US COSU545 LGG-US COSU585 OV-AU COSU586 PAEN-AU COSU589 THCA-SA COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US COSU631 ACC-US COSU632 DLBC-US COSU633 ESCA-US COSU634 KICH-US COSU635 SARC-US COSU636 UCS-US


Gene Expression Variants

For the Cell Lines Project we have added gene expression data across all samples from the Affymetrix Human Genome U219 Array platform.

In COSMIC, these studies (from ICGC release 18) have contributed gene expression data -

COSU329 GBM-US COSU331 OV-US COSU375 READ-US COSU376 COAD-US COSU377 LAML-US COSU414 BRCA-US COSU415 CESC-US COSU416 KIRC-US COSU417 LUAD-US COSU418 LUSC-US COSU419 UCEC-US COSU435 PRAD-US COSU540 SKCM-US COSU541 STAD-US COSU542 THCA-US COSU545 LGG-US COSU627 HNSC-US COSU628 LIHC-US COSU629 PAAD-US COSU632 DLBC-US COSU633 ESCA-US COSU634 KICH-US COSU635 SARC-US COSU636 UCS-US