Release Notes

v83 - 7th November 2017

v83 (November 2017) includes 3 new fully curated genes, a substantial curation update for VHL, 1 new fusion pair, 1,138 genomes from 13 new systematic screen papers, updates from ICGC release 25, and updated resistance mutation data (8 new samples and 11 new resistance mutations). We have also added 5 new genes to the Cancer Gene Census (Tier 1) and expanded the census table functionality to display both tiers. In this release we have retired the legacy website but, in response to user feedback, we have added full support for the GRCh37 coordinate system across the main website.

Data Updates

  1. New fully curated cancer genes (3);
    • TGFBR2 - 6,164 samples, 857 mutations, 183 papers
    • ERBB4 - 7,967 samples, 232 mutations, 296 papers
    • BCL9L - 271 samples, 8 mutations, 83 papers
  2. Substantial curation update (1);
    • VHL - 15,790 samples (+1,140), 2,487 mutations (+322), 569 papers (+80)
  3. Curated Gene Fusions (1);
    • ETV6-ABL1 - 595 samples, 32 mutations, 24 papers
  4. Cancer Gene Census;
  5. Resistance Mutations Update;
  6. Whole Genome data
    • 13 Systematic Screen Papers; 1,138 genomes
    • Updates from ICGC release 25

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.

Support for GRCh37 and 38 Reference Genomes

We have added a new menu to the main navigation bar on the COSMIC website called 'Genome Version'. The default is set to GRCh38 but GRCh37 can be selected from this menu. When set to GRCh37 the 'GRCh37 Archive' logo will appear at the top of each web page. Please note that you will need to enable cookies in order for this to work.

Curated Genes

TGFBR2

TGFBR2, transforming growth factor beta receptor 2, encodes a transmembrane member of the Ser/Thr protein kinase family which forms a heterodimeric complex with TGF-beta receptor type-1 and binds TGF-beta. TGFBR2 acts as a tumour suppressor gene in colorectal cancer, where its mutational inactivation is the most common genetic event affecting the TGF-beta signalling pathway, occurring in approximately 30% of these cancers. A 10-base pair polyadenine tract in the extracellular domain is a hotspot, where insertions and deletions result in a frameshift and a non-functioning protein lacking the receptor's transmembrane domain and intracellular kinase domain. These mutations are common in cancers displaying microsatellite instability, with unique clinicopathological features, including an increased incidence in the proximal colon, presentation at an early stage and better prognosis than microsatellite stable (MSS) colon cancer. The frameshift mutations also occur in gastric cancer and missense mutations are found in the kinase domain in MSS colon cancer. Genome studies have identified TGFBR2 as a significantly mutated gene in cervical cancer, and head and neck squamous cell carcinoma.

ERBB4

ERBB4 is a member of the Epidermal Growth Factor Receptor (EGFR) subfamily of receptor tyrosine kinases, along with EGFR, ERBB2 and ERBB3. Ligands include neuregulins and several EGF family members. Activated ERBB4 can function as both a homodimer and a heterodimer with other EGFR family members, resulting in a range of cellular responses. ERBB4 is a comparatively less well understood member of the EGFR family. Somatic mutations are seen across various cancers (including breast, lung and melanoma) and in many different regions of the gene. No hotspot mutations have been identified. ERBB4 has been proposed to act as both an oncogene and a tumour suppressor and is being investigated as a potential drug target.

BCL9L

BCL9L (B-cell CLL/lymphoma 9 like) is a co-activator of Wnt/beta-catenin signalling. It increases the expression of a subset of Wnt target genes but also regulates genes that are required for early stages of intestinal tumour progression. Somatic loss-of-function alterations in BCL9L are frequent in aneuploid colorectal carcinoma but are also found in other tumour types at lower frequency. BCL9L has been proposed to function as an oncogene or as a tumour suppressor depending on the cellular context.

VHL (updated)

VHL is a tumour suppressor gene that plays a role in a rare inherited disorder called Von Hippel-Lindau syndrome but also in sporadic forms of cancer. The current update in COSMIC brings together the historic collection and the latest published data on the somatic mutations in the VHL gene, including novel mutations and VHL mutations in new histological entities and ethnic groups. Early inactivation of VHL is commonly seen in clear cell renal cell carcinoma (ccRCC), the most common form of renal cancer. A recent publication by Corrò et al. (PMID: 28214514) explores the feasibility of using circulating tumour DNA as a biomarker in this disease. Cho et al. (PMID: 27994516) sequenced Taiwanese pancreatic neuroendocrine tumours (pNETs) for a large customised panel of genes. They observed that Asian patients with pNETs were more frequently mutated for the mTOR and angiogenesis (including VHL) pathways when compared to Caucasian patients, which could partially explain the better outcome observed for targeted therapy in Asian patients with pNETs. Other reports analysed VHL mutations in tumour types such as parotid mucoepidermoid carcinoma, glioblastoma, breast cancer, colorectal cancer, and clear cell microcystic adenoma.

Curated Gene Fusions

ETV6-ABL1

ETV6-ABL1, resulting from t(9;12)(q34;p13) or a complex rearrangement, is a rare but recurrent fusion in a wide range of haematological malignancies including myelodysplastic neoplasm, acute lymphoblastic leukaemia, acute myeloid leukaemia and Philadelphia chromosome-negative chronic myeloid leukaemia. ETV6 encodes an ETS family transcription factor which contains two functional domains, an N-terminal pointed domain that is involved in protein-protein interactions with itself and other proteins, and a C-terminal DNA-binding domain. Two types of ETV6-ABL1 transcript are detected: type A has an ETV6 breakpoint at exon 4 and type B at exon 5. The ABL1 breakpoint is consistent at exon 2. Both types result in constitutive tyrosine kinase activity similar to that seen with the BCR-ABL1 fusion. Eosinophilia is a common characteristic of patients with ETV6-ABL1 fusion.

Cancer Gene Census (CGC)

New CGC genes (Tier 1)

BARD1, IRS4, PIK3CB, POLD1, POLQ

Based on the concept defined by D. Hanahan and R. A. Weinberg, COSMIC, in collaboration with Open Targets, integrates functional descriptions focused on Hallmarks of Cancer into the CGC. The Hallmark pages visually explain the role of a gene in cancer by highlighting which of the classic behaviours are displayed by the gene and whether they are promoted or suppressed. A concise overview with associated references is available for 227 census genes and this will continue to be expanded.

All CGC genes have been re-evaluated and classified with regard to their function in cancer, as oncogenes or tumour suppressive genes, as well as genes participating in fusions, where applicable.

To be able to provide high-confidence and comprehensive data, the CGC has been divided into two tiers.

To classify into Tier 1 of the CGC, a gene must possess a documented activity that may drive or suppress cancer, and there must be evidence of mutations in this gene, detected in cancer, and changing the activity of the protein in a way that promotes the oncogenic transformation. We also take into account the existence of somatic mutation patterns in cancer samples, typical for tumour suppressor genes (broad range of inactivating mutations) or oncogenes (well defined hotspots of missense mutations).

Tier 2 of the Cancer Gene Census consists of genes with strong indications of roles in cancer but with less expansive available evidence, compared to Tier 1. It currently contains 127 genes and is being expanded.

The complete CGC list (Tier 1 and 2) is available here, but please note that any reference to the CGC (or 'Census') across the website which doesn't specify tier, refers to the Tier 1 list.

Systematic Screen Papers

Follow links below to the 13 papers which are new in v83, or view the full table of papers here.

COSP37488, COSP39873, COSP41061, COSP41673, COSP42002, COSP42117, COSP42498, COSP42745, COSP42774, COSP42802, COSP42924, COSP43750, COSP44021

COSMIC Statistics:

  • Samples: 1,343,214 (+16,867)
  • Coding mutations: 5,366,273 (+530,287)
  • Papers: 25,501 (+331)
  • Fusions: 18,845 (+36)
  • Whole genomes: 32,514 (+1,238)
  • Copy number variants: 1,180,789 (+0)
  • Gene expression variants: 9,176,464 (+428)
  • Differentially methylated CpGs: 7,879,142 (+0)

v82 - 3rd August 2017

v82 (August 2017) includes 4 new fully curated genes, a substantial curation update for SMAD4, 1 new fusion pair, 342 genomes from 11 new systematic screen papers, updates from ICGC release 24, and updated resistance mutation data; 1 new drug and 4 updated. We also launch the new COSMIC website featuring new styles and layout as well as an enhanced version of the Cancer Gene Census and additional website download options.

New COSMIC Website

The new COSMIC website has now been launched. We welcome your feedback; please email cosmic@sanger.ac.uk with any issues or suggestions for improvement.

The old websites have been updated to v82 and will continue as the legacy website and GRCh37 (archive) legacy website. These will be available until the next release in November 2017, but we do not plan to maintain them beyond that date. However, we will continue to provide our download files as both GRCh38 and GRCh37 versions for the foreseeable future.

New features include:

  • Updated styles and page layouts
  • New Cancer Gene Census pages (Hallmarks of cancer)
  • New 'Targeted Screen' filter on the gene page and cancer browser
  • Option to download filtered datasets from download page directly, avoiding SFTP (requires login)

Oracle download

For users who download the COSMIC Oracle database dumps, please note that we now only support Oracle 12c. This is because Oracle 11.2 is no longer supported by Oracle.

Data Updates

  1. New fully curated cancer genes (4);
    • KEAP1 - 2,665 samples, 149 mutations, 94 papers
    • DROSHA - 1,774 samples, 121 mutations, 83 papers
    • BTK - 1,810 samples, 55 mutations, 83 papers
    • EPAS1 - 964 samples, 72 mutations, 96 papers
  2. Substantial curation update (1);
    • SMAD4 - 13,198 samples, 710 mutations, 443 papers
  3. Curated Gene Fusions (1);
  4. Cancer Gene Census;
    • 49 genes removed. See section below for details.
    • New web pages integrate functional descriptions focused on Hallmarks of Cancer.
  5. Resistance Mutations Update;
  6. Whole Genome data
    • 11 Systematic Screen Papers; 342 genomes
    • ICGC release 24; 5 new studies, 1 updated

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.

Curated Genes

KEAP1

Kelch-like ECH-associated protein 1 (KEAP1) is a component of the Cullin 3-based E3 ubiquitin ligase complex and controls the stability and accumulation of NRF2 protein. When cells are exposed to oxidative damage, KEAP1 releases NRF2 which translocates into the nucleus where it specifically recognises an enhancer sequence known as Antioxidant Response Element (ARE) resulting in the activation of redox balancing genes. Several studies have reported somatic mutations of the interacting domain between KEAP1 and NRF2 leading to a permanent NRF2 activation. Somatic mutations of the KEAP1 gene are found in non-small cell lung cancer, hepatocellular carcinoma, endometrial cancer, melanoma and many other cancer types and have been associated with a poor outcome and resistance to chemotherapy. The mutations are generally widely distributed in the KEAP1 gene and the frequency of mutations depends on the cancer type and origin.

DROSHA

MicroRNAs (miRNAs) are vital regulators of gene expression. Together with its co-factor DGCR8, the miRNA processing gene DROSHA (drosha ribonuclease III) is involved in the early stages of miRNA processing and is essential for the biogenesis of most miRNAs. Low DROSHA expression levels are observed in several cancer types, including neuroblastoma, endometrial and ovarian cancer, and are associated with advanced stages of several cancer types. In contrast, copy number increases (seen in advanced cervical squamous cell carcinoma) and over-expression are observed in other cancer types, including serous ovarian carcinoma, gastric and non-small cell lung cancers, often associated with prognosis or progression. DROSHA is frequently mutated in Wilms tumour, with the majority of mutations found in the RNase IIIb domain, at p.E1147. The recurrent mutation p.E1147K affects miRNA processing via a dominant-negative mechanism, resulting in downregulation of miRNAs.

BTK

BTK encodes Bruton tyrosine kinase, a TEC family cytoplasmic tyrosine kinase required for the development, activation and differentiation of B cells, and an early component of the B-cell receptor signalling pathway. Recurrent mutations at BTK C481 have been identified in patients with chronic lymphocytic leukaemia (CLL) who have progressed after an initial response to ibrutinib treatment. Ibrutinib is a highly specific BTK inhibitor, inactivating by irreversible binding to C481 within the ATP-binding domain of BTK. While C481 mutations are most common among CLL patients who progress on ibrutinib, mutations at the non-kinase SH2 domain at T316 have also been reported. Progression of mantle cell lymphoma after a durable response to ibrutinib may also be due to C481 BTK mutation. This same mutation has also been detected in Waldenstrom macroglobulinaemia patients progressing on ibrutinib.

EPAS1

Hypoxia-inducible factors (HIFs) are transcription factors that respond to changes in tissue oxygen concentration. One of these, Hypoxia-inducible factor 2-alpha (HIF-2-alpha), is encoded by EPAS1. Somatic mutations in EPAS1 occur recurrently in sporadic pheochromocytomas and paragangliomas, as well as in somatostatinomas as part of Pacak-Zhuang syndrome (multiple paragangliomas and somatostatinomas associated with polycythaemia). In some patients with multiple tumours, these somatic EPAS1 mutations are mosaic, having arisen post-zygotically. The majority of somatic EPAS1 mutations are found in exon 12, and gain of function mutations in this region have been shown to cause stabilisation of the HIF2A protein, resulting in transcription of genes involved in the hypoxia response and promotion of angiogenesis and proliferation.

SMAD4 (updated)

The expertly curated data for SMAD4 have been updated. Over 40 publications which include screening of SMAD4, often alongside other genes, are included in this release. SMAD4 encodes a member of the Smad family of signal transduction proteins which plays a pivotal role in signal transduction of the transforming growth factor beta superfamily cytokines by mediating transcriptional activation of target genes. SMAD4, a tumour suppressor gene, is one of the major driver genes in pancreatic cancer. A lack of SMAD4 mutations in high-grade pancreatic intraepithelial neoplasia, the major precursor of pancreatic ductal adenocarcinoma, indicates these are late genetic alterations in pancreatic carcinoma. SMAD4 mutations are also found in colorectal carcinoma (CRC), where they have a prognostic role in metastatic CRC cases, and less frequently in other tumours, including lung cancer.

Curated Gene Fusions

SET-NUP214

The SET-NUP214 fusion results from a recurrent genetic abnormality at 9q34 and is found predominantly in T-cell acute lymphoblastic leukaemia (T-ALL), with a reported frequency of up to 10%. The fusion is rarely detected in acute myeloid leukaemia, acute undifferentiated leukaemia and B-cell acute lymphoblastic leukaemia. In T-ALL, the SET-NUP214 fusion is associated with elevated expression of HOXA cluster genes and with corticosteroid/chemotherapy resistance. SET encodes a protein with a critical role in chromatin binding and remodelling, while NUP214 encodes an FG-repeat-containing nucleoporin involved in the cell cycle and transportation of material between the nucleus and cytoplasm. Most commonly the breakpoints in the SET-NUP214 transcript are at exon 7 of SET and exon 18 of NUP214.

Cancer Gene Census

Based on the concept defined by D. Hanahan and R. A. Weinberg, COSMIC, in collaboration with Open Targets, integrates functional descriptions focused on Hallmarks of Cancer into the Cancer Gene Census. New Hallmark pages visually explain the role of a gene in cancer by highlighting which of the classic behaviours are displayed by the gene and whether they are promoted or suppressed. A concise overview with associated references is available for 226 census genes and will be expanded on a regular basis.

All Cancer Gene Census (CGC) genes have been re-evaluated and classified with regard to their function in cancer, as oncogenes or tumour suppressive genes, as well as genes participating in fusions, where applicable.

To be able to provide high-confidence and comprehensive data, we have divided the CGC into two tiers. Currently, only the Tier 1 genes are shown on the website and in the download files.

To classify into Tier 1 of the CGC, a gene must possess a documented activity that may drive or suppress cancer, and there must be evidence of mutations in this gene, detected in cancer, and changing the activity of the protein in a way that promotes the oncogenic transformation. We also take into account the existence of somatic mutation patterns in cancer samples, typical for tumour suppressor genes (broad range of inactivating mutations) or oncogenes (well defined hotspots of missense mutations).

Tier 2 of the Cancer Gene Census consists of genes with strong indications of roles in cancer but with less expansive available evidence, compared to Tier 1. It currently contains 41 genes from the previous release of the Cancer Gene Census and is being expanded, with a planned initial release of about 200 genes in November 2017, along with v83.

The complete census list (Tier 1) is available here.

CGC Genes moved to Tier 2 of the Cancer Gene Census

1. PMS1 - a component of the DNA damage response (DDR). Only one recurrent frameshift mutation, K164fs*6, has been found, in four samples; newer papers about mismatch repair (MMR) genes in cancer do not mention this gene; mice deficient in PMS1 do not develop tumours; and there is no evidence for significant activity in MMR in vitro [PMID: 10542278]

2. Fusion genes with only one case (or rare partners of potent oncogenes known to be fused to multiple partners and able to drive the transformation on their own):

  • CGC Gene    Fusion partner

    C15orf65    CIITA
    COX6C       HMGA2
    FNBP1       KMT2A
    GMPS        KMT2A
    KIAA1598    ROS1
    NCKIPSD     KMT2A
    OMD         USP6
    PWWP2A      ROS1
    TFRC        BCL6
    THRAP3      USP6
    ZCCHC8      ROS1

3. Fusion genes transcribed with a shifted reading frame or untranscribed upon fusion, for which there is insufficient evidence for tumour suppressing activity:

  •    ACSL6
  •    ALDH2
  •    HMGN2P46 (also a pseudogene)
  •    LHFP
  •    MDS2
  •    RALGDS
  •    RUNDC2A
  •    TFPT

4. Non-coding genes and pseudogenes do not fit the current schema of Tier 1 of the Cancer Gene Census. We are working on better characterisation of the role of such genes in cancer. Temporarily they are classified as Tier 2 CGC genes:

  •    DUX4L1
  •    MALAT1

5. Genes known to be involved in cancer only through fusions, where the oncogenic mechanisms depend on disruption of the structure of their fusion partner and there is no evidence of their other cancer-promoting activity so far:

  • CGC Gene    Fusion partner

    AKAP9       BRAF
    CEP89       BRAF
    ELN         PAX5
    FAM131B     BRAF
    KIAA1549    BRAF
    LSM14A      BRAF
    SRGAP3      BRAF

6. Genes known to be involved in cancer only through fusions, for which there is not enough data describing their participation in oncogenic transformation:

  • CHIC2 (fusion partner: ETV6) - CHIC2 is also deleted in the process of FIP1L1 to PDGFRA fusion generation, but no function of CHIC2 suggesting tumour suppressive or other cancer-related properties has been described
  • CLP1 (fusion partner: MLLT10) - This fusion is generated through an insertion and coexists with MLLT10-KMT2A
  • JAZF1 (fusion partner: SUZ12) - The fusion protein is antiapoptotic, but identical to the fusion product arising from physiologically regulated trans-splicing; it is possible that the presence of the fusion transcript in cancer cells is a result rather than a cause of transformation
  • LCP1 (fusion partner: BCL6) - LCP1 is involved in invasion and metastasis and is a biomarker of renal cancer. Lai et al. classify the LCP1-BCL6 fusion as a secondary structural change that increases invasive capacity of the developed NHL [PMID: 9614913]. More evidence is needed before classifying this gene as a Tier 1 CGC gene
  • MNX1 (fusion partner: ETV6) - Beverloo et al. suggest that the transformation mechanism may be fully dependent on disruption of ETV6 structure and function [PMID: 11454678]; the involvement of MNX1 in oncogenesis remains to be confirmed
  • NACA (fusion partner: BCL6) - There is evidence of antiapoptotic activity of NACA and its involvement in regulation of haematopoiesis; it is a rare fusion partner of BCL6 in NHL. More evidence is needed before classifying this gene as a Tier 1 CGC gene
  • SEPT5, SEPT6, SEPT9 (fusion partner: KMT2A) - Septins form a distinct class of MLL fusion partners, but these fusions are rare; there are no data on septin function in cancer and they have no recurrent somatic mutations in cancers
  • SPECC1 (fusion partner: PDGFRB) - Found in fusion with PDGFRB in 1 sample; has a recurrent somatic N303fs mutation in colon adenocarcinoma and in other cancers; may be a TSG, but the functional data is still missing
  • VTI1A (fusion partner: TCF7L2) - The mechanism of oncogenesis in the case of the VTI1A-TCF7L2 fusion is TCF7L2-dependent and the role of VTI1A is not determined [PMID: 21892161]

Genes removed from the Cancer Gene Census (Tier 1 and 2)

Gene and reason for removal:

  • BCL5 - obsolete name of BCL6, currently the name of a phenotype; all data in COSMIC refers to BCL6
  • C12orf9 - withdrawn from all gene databases; translocation non-coding RNA sequence, partner of LPP in the LPP-LRFT fusion transcript. It has been shown that truncated LPP is oncogenic on its own
  • CDKN2A(p14) - CDKN2A and CDKN2A(p14) have been merged, as it is a single gene coding both p14 and p16 TSGs
  • PCSK7 - PAFAH1B2 (a CGC gene) and not PCSK7, located in the same unstable genomic region, is the gene disrupted by the translocation [PMID: 10362256]; only one paper describes the fusion between IGH and the 3'UTR of PCSK7
  • RANBP17 - in only one ALL case was TRD fused to exon 24 of RANBP17 [PMID: 12399963]; however, the TSS of TLX3, a CGC gene implicated in ALL, is located just 10kb downstream of RANBP17
  • RNF217-AS1 - oncogenic properties may not arise from a fusion protein but rather from the disruption or the transcriptional deregulation of the STL/RNF217 locus
  • TCL6 - in fusion with TRA; lncRNA located in a breakpoint region; the closest coding genes are TCL1B and TCL1A, both CGC oncogenes involved in T-cell leukaemias
  • TTL - an obsolete name of the LINC00598 non-coding RNA, involved in fusion with ETV6 in ALL; TTL is the current name of a non-cancer gene

In total, the following 49 genes have been removed from Tier 1 of the CGC:

NCKIPSD, AKAP9, ALDH2, BCL5, C12orf9, HMGN2P46, CDKN2A(p14), CHIC2, COX6C, DUX4L1, ELN, ACSL6, C15orf65, FNBP1, GMPS, SPECC1, CLP1, MNX1, JAZF1, KIAA1549, LCP1, LHFP, MDS2, SEPT9, NACA, OMD, PCSK7, PMS1, SEPT5, RALGDS, RANBP17, RUNDC2A, SEPT6, SRGAP3, RNF217-AS1, TCL6, TFPT, TFRC, THRAP3, TTL, VTI1A, ECT2L, MALAT1, PWWP2A, ZCCHC8, KIAA1598, CEP89, LSM14A, FAM131B

ICGC release 24

New ICGC Studies:

  • COSU647 LIAD-FR
  • COSU672 GBM-CN
  • COSU675 PRAD-FR
  • COSU676 THCA-CN
  • COSU677 UTCA-FR

New Copy number data:

COSU535 ESAD-UK

Systematic Screen Papers

Follow links below to the 11 papers which are new in v82, or view the full table of papers here.

COSP39924, COSP40964, COSP41366, COSP42109, COSP42647, COSP42872, COSP42908, COSP42955, COSP43068, COSP43148, COSP43326

COSMIC Statistics:

  • Samples: 1,326,347 (+24,554)
  • Coding mutations: 4,835,986 (+359,460)
  • Papers: 25,170 (+327)
  • Fusions: 18,809 (+46)
  • Whole genomes: 31,276 (+1,471)
  • Copy number variants: 1,180,789 (+12,732)
  • Gene expression variants: 9,176,036 (+574)
  • Differentially methylated CpGs: 7,879,142 (+0)

v81 - 9th May 2017

v81 (May 2017) includes 6 new fully curated genes, a substantial curation update for TET2, 1 new fusion pair, 220 genomes from 9 new systematic screen papers and updated resistance mutation data; 1 new drug and 5 updated. We also announce the launch of a new COSMIC beta site featuring new styles and layout as well as an enhanced version of the Cancer Gene Census and additional website download options.


COSMIC Beta Site

The new COSMIC Beta site http://cancer-beta.sanger.ac.uk has now been launched. This site will be updated regularly over the next 3 months. We welcome your feedback; please email cosmic@sanger.ac.uk with any issues or suggestions for improvement.

New features include:

  • Updated styles and page layouts
  • New Cancer Gene Census pages (Hallmarks of cancer)
  • New 'Targeted Screen' filter on the gene page and cancer browser
  • Option to download filtered datasets from download page directly, avoiding SFTP (requires login)

Oracle download

For users who download the COSMIC Oracle database dumps, please note that from v82 we will only support Oracle 12c. This is because Oracle 11.2 is no longer supported by Oracle.


Data Updates

  1. New fully curated cancer genes (6);
    • DDR2 - 3,879 samples, 278 mutations, 95 papers
    • SMAD2 - 3,531 samples, 217 mutations, 82 papers
    • SMAD3 - 2,836 samples, 212 mutations, 85 papers
    • PREX2 - 2,909 samples, 919 mutations, 138 papers
    • NCOR1 - 2,172 samples, 595 mutations, 129 papers
    • PPM1D - 912 samples, 156 mutations, 56 papers
  2. Substantial curation update (1);
    • TET2 - 17,144 samples (+2,027), 2,946 mutations (+277), 606 papers (+46)
  3. Curated Gene Fusions (1);
    • SET-NUP214 - 473 samples, 41 mutations, 14 papers
  4. Cancer Gene Census;
    • New web pages integrate functional descriptions focused on Hallmarks of Cancer.
  5. Resistance Mutations Update;
    • Imatinib: 20 new samples, 3 new unique resistant mutations
    • Tyrosine kinase inhibitor-NS: 30 new samples
    • Gefitinib: 1 new sample
    • Crizotinib: 1 new gene (MET), 3 new samples, 5 new unique resistant mutations
    • Nilotinib: 1 new gene (KIT), 1 new sample, 1 new unique resistant mutation
    • New drug:
      • Savolitinib: 1 gene (MET), 1 sample, 1 unique resistant mutation
    • Change of drug name:
      • AZD9291: changed to Osimertinib
  6. Merging of duplicate mutations
    • Copy Number Variations have been merged where there were multiple data sources for the same sample ID. This has resulted in a slight drop in the overall number of CNV segments in v81.
  7. Genome Data
    • Re-loaded ICGC study COSU417 LUAD-US (307 samples with 130,235 novel mutations).
    • Cell Lines Project, copy number download; 4 samples were missing GRCh38 coordinates, these have now been included.
    • Systematic Screens; 9 new papers (220 new genomes)
    • SNVs and indels loaded from a Colorectal Cancer Organoids study from the suppresSTEM consortium.

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.

Curated Genes

DDR2

Oncogenic gain-of-function mutations in DDR2 have been identified in squamous cell carcinoma (SqCC) of the lung. DDR2 encodes the discoidin domain receptor 2, a collagen-stimulated receptor tyrosine kinase. These kinases are involved in the regulation of cell differentiation, cell migration and cell proliferation. DDR2 mutations are present in 4% of lung SqCC where they are associated with sensitivity to dasatinib. Low frequency DDR2 mutations have been found in other cancer types such as endometrial, kidney, brain, breast and colorectal, and in recurrent/metastatic head and neck SqCC.

SMAD2 and SMAD3

Mutations in SMAD2 and SMAD3 occur at very low frequency in various cancer types. SMAD2 mutations have been found in cervical and colorectal cancer, hepatocellular carcinoma and non-small cell lung cancer. SMAD3 mutations have been detected in colorectal cancer and in oral squamous cell carcinoma. Most of the mutations observed are missense mutations. Both SMAD2 and SMAD3 encode proteins which are major signalling molecules acting downstream of the serine/threonine kinase receptors.

NCOR1

NCOR1 (nuclear receptor corepressor 1) plays a part in maintenance of genomic integrity. It has been reported among the most frequently mutated drivers in breast cancer. Downregulation of NCOR1 expression abrogates HDAC3 function and results in genomic instability. Breast cancer patients with high NCOR1 expression levels have been found to have a better prognosis than those with low expression (Zhang et al., 2005). NCOR1 mutations also play a role in skin cancer, colorectal carcinoma and many other cancer types. Predicted damaging and somatic mutations in epigenetic regulators were detected in one third of high hyperdiploid acute lymphoblastic leukaemia (HD-ALL) patients (de Smith AJ 2016).

PPM1D

Protein phosphatase, Mg2+/Mn2+-dependent, 1D (PPM1D) encodes WIP1, a member of the PP2C family of serine/threonine protein phosphatases. PPM1D dephosphorylates DNA damage response mediators such as CHEK2 and p53, antagonising their function and promoting reentry into the cell cycle. Recurrent PPM1D mutations have been observed in brainstem gliomas, with many of these resulting in truncation of the C-terminal regulatory domain and leaving the phosphatase domain intact.

PREX2

Mutations across the PREX2 gene, including numerous truncating mutations, have been found in metastatic melanoma, including in desmoplastic melanoma, and also in other cancers such as basal cell carcinoma, pancreatic ductal and lung adenocarcinomas, and Merkel cell carcinoma. PREX2 has been recognised as playing a role in melanoma for some years, although the precise nature of all the mechanisms of its involvement remains uncertain. Some in vivo and mouse studies have demonstrated that cancer-associated PREX2 mutations promote the growth of human melanoma cells. It is a GTP/GDP exchange factor and both mutated and wild type PREX2 inhibit the tumour suppressor PTEN, but PTEN can no longer inhibit mutated PREX2, hence mutual inhibition is disrupted, promoting tumour growth via activation of the PI3K signalling pathway. Increased RAC-dependent invasiveness is also associated with mutated PREX2.

TET2 (updated)

TET2 (ten-eleven-translocation gene) is an epigenetic regulator responsible for converting DNA cytosine methylation to hydroxymethylation, a process disrupted by mutations which are known to be associated with myeloproliferative neoplasms (MPN), leukaemias and mastocytosis. An update of 46 publications which included screening of TET2, often alongside other genes or gene panels, has been carried out. Overall 2,027 new samples were curated, which identified 277 new mutations of all types located across the gene. Publications included reports of many haematopoietic and lymphoid disorders, as well as 2 where solid cancers progressed following hormone or tyrosine kinase therapy. One of these publications reported TET2 mutations associated with metastatic prostate cancer after hormone therapy and the second publication reported 12% TET2 mutated samples in non-small cell lung cancer progressions following tyrosine kinase therapy. MPN publications curated include those where TET2 was found associated with progression, and chronic myelomonocytic leukaemia, where mutated TET2 was predictive of inferior prognosis when co-occurring with ASXL1 mutation; myelodysplastic syndrome (MDS) and chronic eosinophilic leukaemia (CEL), including a report where mutated TET2 could help distinguish MDS/CEL from reactive disorders and hypereosinophilic syndrome respectively. Leukaemia publications include HTLV-1-associated adult T cell leukaemia/lymphoma (with TET2 as the most commonly mutated gene); angioimmunoblastic T cell leukaemia and peripheral T cell leukaemia, where TET2 mutations are associated with shorter PFS; and somatic TET2 mutation associated with AML in a family with familial platelet disorder.

Curated Gene Fusions

SET-NUP214

The SET-NUP214 fusion results from a recurrent genetic abnormality at 9q34 and is found predominantly in T-cell acute lymphoblastic leukaemia (T-ALL), with a reported frequency of up to 10%. The fusion is rarely detected in acute myeloid leukaemia, acute undifferentiated leukaemia and B-cell acute lymphoblastic leukaemia. In T-ALL, the SET-NUP214 fusion is associated with elevated expression of HOXA cluster genes and with corticosteroid/chemotherapy resistance. SET encodes a protein with a critical role in chromatin binding and remodelling, while NUP214 encodes an FG-repeat-containing nucleoporin involved in the cell cycle and transportation of material between the nucleus and cytoplasm. Most commonly the breakpoints in the SET-NUP214 transcript are at exon 7 of SET and exon 18 of NUP214.

Cancer Gene Census

Complete census list available here

Based on the concept defined by D. Hanahan and R. A. Weinberg, COSMIC, in collaboration with Open Targets, integrates functional descriptions focused on Hallmarks of Cancer into the Cancer Gene Census. New Hallmark pages visually explain the role of a gene in cancer by highlighting which of the classic behaviours are displayed by the gene and whether they are promoted or suppressed. A concise overview with associated references is initially available for 116 census genes and will be expanded on a regular basis.

Systematic Screen Papers

Follow links below to the 9 papers which are new in v81, or view the full table of papers here.

COSP43009, COSP42418, COSP41127, COSP42048, COSP42046, COSP42047, COSP42049, COSP42050, COSP41741

SNVs and indels have also been uploaded from a Colorectal Cancer Organoids study from the suppresSTEM consortium: COSU670


COSMIC Statistics:

1,301,793
Samples
4,476,526
Coding mutations
24,843
Papers
18,763
Fusions
29,805
Whole genomes
1,168,057
Copy number variants
9,175,462
Gene expression variants
7,879,142
Differentially methylated CpGs

v80 - 13th February 2017

v80 (Feb 2017) includes a major new tool "COSMIC-3D" supporting target characterisation and pharmaceutical design alongside significant updates to our cancer genome and key cancer gene curations.


COSMIC-3D - expanding COSMIC's support for pharmaceutical design

We have a new interface to explore cancer mutations on 3D protein structures, "COSMIC-3D", now available for public evaluation. Produced in partnership with Astex Pharmaceuticals (Cambridge, UK), it shows interactive 3D visualisations of over 8,000 human proteins (using PDB structures), with COSMIC mutations mapped, and options to see frequency and effect. Putative small-molecule drug pockets are identified, and can be explored alongside cancer mutations to identify, characterise and design molecules against new targets across oncology. All the information is correct, but as this is a beta-evaluation release we would value your feedback on the web interface, so we can make it as useful as possible.

New Curations in v80

In our traditional way, full and exhaustive literature curations are now provided across cancer genes USP8, FAT1, FAT4, CXCR4 and fusion pair PML-RARA; substantial curation updates are made to AR and CTNNB1 and the Cancer Gene Census describes 7 new genes. Genome-wide molecular profiles have been curated from the ICGC (release 23, Oct 7th 2016) and 421 new genomes have been added by curation of 18 systematic screen publications. For full details of the new content in v80 please see the Datasheet.

From our Pipeline

We use recommendations from the HGVS for syntax when annotating the data within COSMIC. As part of our ongoing commitment to data quality we are currently in the process of ensuring all our mutation data are described in the most modern ways, including the latest HGVS nomenclature and gene structures. Over the last 6 months we have been working on a new system to continually annotate COSMIC data to the latest standards. Of course, to ensure the new annotations are exactly correct, we are including expert manual oversight, so it takes a little time to completely validate our huge dataset. Once we have verified the precision of our system, it will be deployed in forthcoming releases.

Newsletter

For more information about release v80 and other news please see the first issue of our Newsletter. We will be using this to communicate with you more frequently about the project and the exciting developments we have in the pipeline. This issue includes details about the COSMIC Workshop on March 6th and the beta release of COSMIC-3D.


v79 - 14th November 2016

v79 (Nov 2016) includes substantial updates to our cancer genome and key cancer gene curations. Full literature curations are now provided across cancer genes PRKACA and AR, and fusion pair CBFA2T3-GLIS2; substantial curation updates are made, especially to GNAS, GNAQ, and GNA11, and the Cancer Gene Census describes 7 new genes. Genome-wide molecular profiles have been curated from the ICGC (release 22, Aug 2016) and 265 new genomes have been added by curation of 9 systematic screen publications. A new drug, Vismodegib, has been added to our Genetics of Drug Resistance, describing 19 therapy-resistance variants in the gene SMO.


Data Updates in brief (for full details of this latest release, please see the v79 Datasheet).

  1. New fully curated cancer genes;
    • PRKACA - 2,145 samples, 286 mutations, 55 papers
    • AR - 3,226 samples, 598 mutations, 141 papers
  2. Substantial updates to GNAS, GNAQ, and GNA11
  3. Curated Gene Fusions;
  4. Cancer Gene Census;
    • 7 new genes added
  5. Drug Resistance; 1 new gene (SMO) and 1 new drug (Vismodegib), 19 new unique resistance mutations curated.
  6. Genome Data
    • ICGC release 22; August 23rd 2016
      • Mutations; 2 new studies
      • Copy Number Variants; 2 new studies, 3 studies updated
      • Structural Variants; 1 new study
    • Systematic Screens; 9 new papers (265 new genomes)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Genetics of Drug Resistance

We now include drug resistance data for the gene SMO (Vismodegib) as well as updates for EGFR (Gefitinib, Erlotinib and Afatinib), ESR1 (Endocrine therapy) and ALK (Alectinib).

All drug resistance data is detailed here, describing our curations across 11 genes and 21 pharmaceuticals. Links are provided to explore this information in detail, with charts showing the landscape of resistance to drugs targeting mutations in the gene of interest.


Cancer Gene Census

7 genes have been added to the Cancer Gene Census:
EPAS1, PTPRT, PPM1D, BTK, PREX2, TP63, QKI

The complete list is available in the census table, which describes the role of each gene in cancer progression (tumour suppressor or oncogene). Currently this information is available for 244 census genes. This content, as well as additional functional annotation is being substantially expanded for future releases.


COSMIC and St Jude Children's Research Hospital data-sharing agreement

COSMIC data have been combined with the ProteinPaint data mining and visualization system at St. Jude Children's Research Hospital in Memphis TN, to support the discovery and understanding of genetic mutations in paediatric cancers.


COSMIC Workshop March 2017

On Monday 6th March 2017 we are holding a workshop titled 'An introduction to COSMIC' at the Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.

The course will begin with a presentation overviewing the COSMIC project, followed by a hands-on tutorial introducing the COSMIC website and strategies for exploring cancer variation data and investigating the genetic causes of human cancers. In addition, there will be short presentations describing exciting new developments scheduled for future release, and opportunities to engage the team in a group Q&A session and informal discussions about the COSMIC website and future plans.

Registration will open in January, but please email cosmic@sanger.ac.uk if you would like more information or wish to express an interest in attending.

If you would be interested in hosting a COSMIC workshop at your workplace, we would be very pleased to hear from you. Please contact the COSMIC helpdesk (cosmic@sanger.ac.uk).


Website Developments

We are planning to merge the functionality of the COSMIC and Whole Genomes websites in February 2017 (v80). We will be introducing a new 'whole genomes' filter on the gene and cancer browser pages, and as a consequence the Whole Genomes site will become redundant and will be retired.

An API and new web interfaces for downloading COSMIC data will also be developed and rolled out in 2017. As part of these developments, and due to incompatibility between BioMart (0.7) and the latest version of our Oracle databases, we are discontinuing support for the COSMICMart in this release.

If you have any questions about these changes please email the COSMIC helpdesk (cosmic@sanger.ac.uk).


COSMIC Statistics:

1,257,487
Samples
4,175,878
Coding Mutations
23,870
Papers
18,165
Fusions
29,112
Whole Genomes
2,113,866
Copy Number Variants
9,175,462
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v78 - 5th September 2016

COSMIC has been updated significantly in v78 (Sept 2016). This major data release includes new full literature curations of cancer genes HIF1A, MTOR and PTPN13, drug resistance profiles across Sorafenib & Quizartinib, and a complete update of genome-wide analysis from the ICGC (release 21, May 2016). We have also added 9 new genes to the Cancer Gene Census, and fully re-analysed the copy number data across all TCGA samples using the ASCAT2 algorithm.


Data Updates in brief (for full details of this latest release, please see the v78 Datasheet).

  1. New fully curated cancer genes;
    • HIF1A - 1,782 samples, 196 mutations, 56 papers
    • MTOR - 3,239 samples, 634 mutations, 132 papers
    • PTPN13 - 1,761 samples, 429 mutations, 85 papers
  2. Curated Fusion;
    • ETV6-RUNX1 - 2,276 samples, 357 mutations, 37 papers
  3. Mouse insertional mutagenesis
    • The latest update adds mouse data for an additional 5,600 genes
  4. Genetics of Drug Resistance
    • Resistance data for 1 new therapeutic target gene and 2 additional drugs
  5. Genome Data
    • Copy Number Variation; Re-analysis of all TCGA studies with ASCAT v2
    • ICGC release 21; May 16th 2016
      • Gene Expression; 5 studies updated
      • Methylation; 3 studies updated
      • Copy Number Variants; 10 new studies, 28 studies updated
      • Structural Variants; 2 new studies
    • Systematic Screens; 31 new papers (1,165 new samples)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Genetics of Drug Resistance

New in v78: FLT3 with the drugs Quizartinib and Sorafenib, detailing a total of 76 new unique resistance mutations.

All drug resistance data is now detailed here, describing our curations across 10 genes and 20 pharmaceuticals. Links are provided to explore this information in detail, with charts showing the landscape of resistance to drugs targeting mutations in the gene of interest.


Cancer Gene Census

9 genes have been added to the Cancer Gene Census:
DDR2, MAPK1, BCORL1, KEAP1, LRP1B, DROSHA, B2M, DDX3X, APOBEC3B

The complete list is available in the census table, which describes the role of each gene in cancer progression (tumour suppressor or oncogene). Currently this information is available for 237 census genes. This content, as well as additional functional annotation is being substantially expanded for future releases.


Cell Lines Project

Over time we have added filters aimed at selecting those variants within the cell lines that are more likely to contribute to carcinogenesis. These have included the ability to select variants in genes known to contribute to cancer (Cancer Gene Census), as well as an estimation of the mutation impact on the protein as determined by FATHMM. We have now extended this list of filters to include a filter that identifies variants within the cell lines that are similar to variants seen recurrently in whole genome screened tumour samples. The criteria for calling a variant recurrent differ based on mutation type. For further details please see the Genome Annotation page.


COSMIC Expansion

We welcome two new starters to the COSMIC team, Dr. John Tate and Ms. Bhavana Harsha. John is our new web design and visualisation specialist who will be driving new developments and improving the design of the website. Bhavana, our new bioinformatic specialist, is developing a new annotation system to handle the ever increasing volume and complexity of genomic variation data.

Thank you for your continued support.


COSMIC Workshop

On Monday 26th September 2016 we are holding a workshop at the University of Cambridge, UK, titled 'COSMIC: Exploring cancer genetics at high resolution'.

During the course we will use the live COSMIC website and genome browser to show you how to access and explore cancer variation data, seeking to identify genetic causes and targets in all human cancers.

For more details please see the course timetable.

If you wish to attend the workshop, please visit the registration page.

If you would be interested in hosting a COSMIC workshop at your workplace, we would be very pleased to hear from you. Please contact cosmic@sanger.ac.uk.


Oracle database downloads

We are considering changing the compatibility of the Oracle data pump export files from supporting Oracle 10g to 11g (11.2). If this change will cause problems for you, please let us know by emailing the COSMIC helpdesk (cosmic@sanger.ac.uk).


COSMIC Statistics:

1,235,846
Samples
4,067,689
Coding Mutations
23,489
Papers
18,029
Fusions
28,366
Whole Genomes
1,271,436
Copy Number Variants
9,175,462
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v77 - 16th May 2016

COSMIC now encompasses the Genetics of Drug Resistance across 9 therapeutic target genes and 18 drugs (release v77). This release also adds full mutation profiles across ATR, TBX3 and NFKBIE, the STIL-TAL1 and DNAJB1-PRKACA gene fusions, and over 700 new cancer genomes.


Data Updates in brief (for full details of this latest release, please see the v77 Datasheet).

  1. New fully curated cancer genes;
    • TBX3 - 55 papers, 2,724 samples, 241 mutations.
    • NFKBIE - 34 papers, 1,710 samples, 176 mutations.
    • ATR - 106 papers, 3,379 samples, 646 mutations.
  2. Curated Fusions;
  3. Genome Data
    • 16 New Systematic Screen papers (741 genome-wide tumour analyses).
  4. Genetics of Drug Resistance
    • New Content: Drug resistance data for 9 therapeutic target genes and 18 drugs
  5. Cancer Gene Census
    • 23 new cancer-causing genes annotated, with evidence.

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Genetics of Drug Resistance

In this COSMIC release, we now encompass the genetics of drug resistance: somatic mutations that allow a tumour to continue growing despite targeted therapeutics. Initial curations cover 9 genes and 18 pharmaceutical therapies (listed below), detailing 226 resistance-driving mutations.

Genes: ABL1, ALK, BRAF, EGFR, ESR1, KIT, MAP2K1, MAP2K2, PDGFRA
Drugs: Vemurafenib, AZD9291, Ceritinib, Erlotinib, Gefitinib, Imatinib, Nilotinib, Tyrosine kinase inhibitor - NS, Afatinib, Endocrine therapy, Alectinib, PD0325901, Dasatinib, Crizotinib, Selumetinib, Sunitinib, Dabrafenib, Bosutinib

This information is available in the 'Drug Resistance' tab of the gene analysis pages, where a table and charts show the landscape of resistance to drugs targeting mutations in the gene of interest. For example, please look at the Tyrosine Kinase Inhibitors associated with EGFR.


Cancer Gene Census

23 genes have been added to the Cancer Gene Census:
AR, CHD4, CTCF, CXCR4, ERBB4, FAT1, FAT4, HIF1A, LEF1, LZTR1, MTOR, NCOR2, PRKACA, PTK6, PTPN13, RBM10, SDHA, SMAD2, SMAD3, TGFBR2, USP8, ZFHX3

New information is added to the census table, describing the role of each gene in cancer progression (tumour suppressor or oncogene). Currently this information is available for 156 census genes. This content, as well as additional functional annotation is being substantially expanded for future releases.


COSMIC Expansion

This spring, we welcome three new additions to the COSMIC team. Dr Laura Ponting and Dr Raymund Stefancsik join us from Cambridge University (UK) as curator scientists. They are now enhancing our team of expert manual curators, aiming to comprehensively describe the range of cancer-causing mutations across all cancer genes (driven by the Cancer Gene Census, describing 595 genes).

In addition, Charalambos (Harry) Boutselakis joins us from London's Farr Institute, bringing substantial informatic expertise across databases and data analytics. He will be expanding the ways in which COSMIC can be used while ensuring its immediate responsiveness as the database increases in size and scope.

Thank you for continuing to support us.


Registration and email announcements

Please ensure you are registered (here) for data downloads, and to ensure you receive future communications.


COSMIC Statistics:

1,209,567
Samples
4,118,156
Coding Mutations
23,084
Papers
17,628
Fusions
25,875
Whole Genomes
1,064,039
Copy Number Variants
9,479,893
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v76 - 16th February 2016

v76 includes full curations across cancer genes PPP6C and SPOP, genomic content from 17 systematic screen publications, and a complete update from ICGC release 20. We welcome two new scientists to the COSMIC team who will be focused on identifying targets and biomarkers across the expanding COSMIC dataset. The streamlining of our website also continues, improving the layout and design of many large-data webpages, and we have improved our Download files to simplify frequency calculations across COSMIC datasets.


Data Updates

For full details of this latest release, please see the v76 Datasheet; in brief:

  1. Curated Genes;
    • PPP6C - 33 papers, 1,023 samples, 164 mutations.
    • SPOP - 74 papers, 2,230 samples, 224 mutations.
  2. Genome Data
    • ICGC release 20; November 27th 2015
      • 2 new studies; AML-US and BTCA-JP
      • Gene Expression; 17 studies updated (187 new samples, 227,023 new variants)
      • Structural Mutations; 2 studies updated (243 new samples, 18,369 new variants)
      • Copy Number Variants; 18 studies updated (1,188 new samples, 44,689 new variants)
    • 17 New Systematic Screen papers (238 new samples).

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


COSMIC Expansion

We welcome two new scientists who will be investigating the curated database and annotating the most interesting target and biomarker opportunities across this enormous database.

Dr. Sam Thompson is a medical statistician with expertise in clinical trials. In collaboration with Bayer Pharmaceuticals, she will be exploring correlations across the different types of variant annotation in COSMIC, aiming to systematically identify novel markers for disease.

Dr. Harry Jubb brings a proteomic perspective to COSMIC. Working together with Astex Pharmaceuticals, Harry will spend the next three years enhancing our visualisation of coding mutations, and investigating which mutated peptide domains are tractable for pharmaceutical design.

Thank you for your support, allowing us to enhance the utility of the curations in COSMIC.


Website Updates

We have extended the layout and design used on the Gene page to the Cancer Browser, Sample, Study, and Mutations pages. Tabulations showing variant annotations from multiple datatypes have been combined into a 'Variants' tab on these pages.

On the Overview tab of the Gene page various icons indicate if the selected gene is part of a significant dataset. The census, classic and mouse icons indicate, respectively, a cancer census gene, an expert-curated gene, and a gene with a significant role in oncogenesis as evidenced by mouse insertional mutagenesis experiments.

Substantial changes are made on the Genome Browser home page with a new smart search feature with the option to select any of the specific datasets; COSMIC, Whole Genomes or the Cell Lines Project.


Download Files

We have updated the structure of the mutation files in our Download site to simplify the calculation of mutation frequencies. Data has been separated according to the type of screening method used; targeted gene screen and whole genome screen. We have also enhanced the information available from the sample details file so that whole genome samples can be extracted for use in whole genome screen mutation frequency calculations. Please see our FAQ for details.
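
As a rough illustration of the frequency calculation this restructuring is meant to simplify, the sketch below divides the number of whole genome screened samples carrying a mutation by the total number of whole genome screened samples. The record layout, the key names 'sample_id' and 'whole_genome_screen', and the function name are assumptions made for this example; they are not the column names of the download files, which are described in the FAQ.

```python
def genome_screen_frequency(samples, mutated_sample_ids):
    """Fraction of whole-genome-screened samples carrying a mutation of interest.

    samples: iterable of dicts with the hypothetical keys 'sample_id' and
             'whole_genome_screen' (True or False).
    mutated_sample_ids: ids of samples in which the mutation was observed.
    """
    # Only whole genome screened samples belong in the denominator.
    screened = {s["sample_id"] for s in samples if s["whole_genome_screen"]}
    if not screened:
        return 0.0  # nothing screened, so no frequency can be estimated
    # Mutated samples outside the screened set (e.g. targeted screens) are ignored.
    mutated = screened & set(mutated_sample_ids)
    return len(mutated) / len(screened)


# Illustrative records only; these are not real COSMIC samples.
samples = [
    {"sample_id": "S1", "whole_genome_screen": True},
    {"sample_id": "S2", "whole_genome_screen": True},
    {"sample_id": "S3", "whole_genome_screen": False},
    {"sample_id": "S4", "whole_genome_screen": True},
]
frequency = genome_screen_frequency(samples, {"S1", "S3"})
print(f"{frequency:.3f}")
```

Keeping targeted-screen samples out of the denominator is the point of the separation: a sample screened for only a handful of genes says nothing about genes that were never examined.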


Registration and email announcements

We are changing the way we communicate release updates to COSMIC users. Please register to ensure you receive future communications.


COSMIC Statistics:

1,192,776
Samples
3,942,175
Coding Mutations
22,844
Papers
17,245
Fusions
25,133
Whole Genomes
1,064,039
Copy Number Variants
9,479,893
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v75 - 24th November 2015

v75 includes curations across GRIN2A, fusion pair TCF3-PBX1, and genomic data from 17 systematic screen publications. We are also beginning a reannotation of TCGA exome datasets using Sanger's Cancer Genome Project analysis pipeline to ensure consistency; four studies are included in this release, to be expanded across the next few releases. The Cancer Gene Census now has a dedicated curator, Dr. Zbyslaw Sondka, who will be focused on expanding the Census, enhancing the evidence underpinning it, and developing improved expert-curated detail describing each gene's impact in cancer. Finally, as we begin to streamline our ever-growing website, we have combined all information for each gene onto one page and simplified the layout and design to improve navigation.


Data Updates

For full details of this latest release, please see the v75 Datasheet; in brief:

  1. Curated Genes;
    • GRIN2A - 93 papers, 2,004 samples, 667 mutations.
  2. Curated Fusions;
    • TCF3-PBX1 (E2A-PBX1) - 48 papers, 3,416 samples, 296 mutations.
  3. Cancer Gene Census; 4 names have been updated.
  4. Genome Data
    • 4 TCGA studies reanalysed by the WTSI Cancer Genome Project.
    • 17 new Systematic Screen papers (474 new samples).

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


The Cancer Gene Census

We welcome Dr. Zbyslaw Sondka to the COSMIC team. Working in collaboration with The Centre for Therapeutic Target Validation (CTTV) he will be curating the Cancer Gene Census; building the evidence behind existing genes as well as extending the census list.


Website Updates

Gene Pages upgrade and redesign

Overview information has been merged into the Gene Analysis page. This page also has a fully featured Genome Browser which responds to filters. The page layout has also been redesigned, with tabulations organised under a single 'Data' tab and studies and publications combined under the 'References' tab.

COSMIC Beacon

The GA4GH (Global Alliance for Genomics and Health) Beacon Project is a project to encourage international sites to share genetic data in the simplest of all technical contexts. The service is designed merely to accept a query of the form "Do you have any genomes with an 'A' at position 100,735 on chromosome 3" (or similar data) and respond with one of "Yes" or "No."

The Beacon Network lists all the known beacons, including the newly released COSMIC Beacon.
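
The yes/no question described above can be sketched in a few lines. This is only an illustration of the idea, not the COSMIC Beacon implementation or the GA4GH API: the function name, the in-memory set of alleles and its contents are all assumptions made for the example.

```python
# Hypothetical in-memory index of observed alleles: (chromosome, position, allele).
# A real beacon sits in front of a variant database; this set is illustrative only.
OBSERVED_ALLELES = {
    ("3", 100735, "A"),
    ("7", 55000, "T"),
}


def beacon_query(chromosome, position, allele):
    """Answer a Beacon-style question with 'Yes' or 'No' and nothing else."""
    # Accept both '3' and 'chr3' style chromosome names.
    if chromosome.lower().startswith("chr"):
        chromosome = chromosome[3:]
    key = (chromosome, position, allele.upper())
    return "Yes" if key in OBSERVED_ALLELES else "No"


if __name__ == "__main__":
    print(beacon_query("3", 100735, "A"))
    print(beacon_query("chr3", 100735, "G"))
```

Returning only "Yes" or "No" is the design choice that keeps the service simple: the caller learns whether an allele has been observed, but nothing about the samples behind it.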

Cancer Genome Browser

A new miRNA track has been added across all browsers, with the data sourced from miRBase.


Registration and email announcements

We are changing the way we communicate website updates to COSMIC users. As from this release all our registered users will receive email notification of updates to the website. We would encourage all those who have subscribed to the mailing list cosmic-announce@sanger.ac.uk to register as communication via this list is being phased out. If you are registered but prefer not to receive emails you can opt out by logging in and going to the Account Settings page.

We have also introduced a new 'non-affiliated' category to allow users who do not belong to a recognised academic or corporate organisation to register for email updates using their personal email address.


COSMIC Statistics:

1,177,397
Samples
3,702,312
Coding Mutations
22,621
Papers
17,209
Fusions
23,223
Whole Genomes
1,019,350
Copy Number Variants
9,252,870
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v74 - 8th September 2015

v74 brings a new focus on curating blood cancer fusion genes, starting with BCR/ABL and KMT2A (MLL) fusions. We are also beginning to capture much greater clinical details on the samples we curate, now available for download. More traditionally, somatic mutations are curated from three new cancer genes, POLE, AXIN2 and KDM6A. Substantial new genomic data are included from 17 systematic screen publications, and a full update to the latest ICGC release (v19).


Data Updates

For full details of this latest release, please see the v74 Datasheet; in brief:

  1. Curated Genes; 3 new fully curated genes (POLE, AXIN2 and KDM6A).
  2. Curated Fusions; Representative curations of blood cancer fusions BCR/ABL and KMT2A (MLL).
  3. Cancer Gene Census; FLT4 has been added to the census.
  4. Genome Data Imports
    • Simple Somatic Mutations (SSM);
      • ICGC release 19 (2 new studies, 49 studies updated).
      • 17 new Systematic Screen papers.
    • Structural Variants; 2 new and 9 updated studies (ICGC release 19)
    • Copy Number Variants; 3 new and 13 updated studies (ICGC release 19)
    • Gene Expression Variants; 15 updated studies (ICGC release 19)

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Website Updates

Non-coding variants

"Mutation Impact" scores (via FATHMM-MKL) are now available for non-coding variants. These values can be viewed on the NCV, Study and Sample overview pages, and the COSMIC Genome Browser (functionally significant variants are coloured blue). They are also included in the download files on the SFTP site. There are 422,212 functionally significant variants (scores ≥ 0.7). Please see the Mutation Impact section of Cancer Genome Annotation for help interpreting the scores.

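The 0.7 cut-off quoted for non-coding variants can be applied as a simple filter. The sketch below assumes variants arrive as (identifier, score) pairs; the identifiers and function names are hypothetical, and only the threshold itself comes from the text above.

```python
SIGNIFICANCE_THRESHOLD = 0.7  # release note: functionally significant if score >= 0.7


def is_functionally_significant(score):
    """True when a Mutation Impact score meets the stated cut-off."""
    return score >= SIGNIFICANCE_THRESHOLD


def filter_significant(variants):
    """Keep the ids of variants whose score meets the cut-off.

    variants: iterable of (variant_id, score) pairs; the ids are hypothetical.
    """
    return [variant_id for variant_id, score in variants
            if is_functionally_significant(score)]


if __name__ == "__main__":
    example = [("ncv_a", 0.70), ("ncv_b", 0.69), ("ncv_c", 0.95)]
    print(filter_significant(example))
```

The comparison is inclusive, so a score of exactly 0.7 counts as functionally significant, matching the "≥ 0.7" wording.
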
COSMIC Sample Features

We are now capturing substantially more clinical feature annotations on the samples we curate. Across 24 new columns we are capturing, where available, annotations such as therapeutic regimes and responses, mutation allele specification, tumour stage/grade/cytogenetics, patient age/ethnicity/gender. This full information is available via COSMIC Downloads, and is also displayed on the website on each individual Sample Overview page. For full details of these rich expanded clinical annotations, please see the 'Cosmic sample features' section (describing the CosmicSample.tsv.gz file) here.


COSMIC Statistics:

1,144,255
Samples
3,480,051
Coding Mutations
22,276
Papers
16,648
Fusions
22,690
Whole Genomes
1,018,171
Copy Number Variants
9,252,792
Gene Expression Variants
7,879,142
Differentially Methylated CpGs

v73 - 24th June 2015

v73 contains full expert curation across 9 cancer genes, 26 systematic screen publications and ICGC release 18. 'Mutation impact' filters across the website now estimate pathogenic functional consequences, based on the new FATHMM-MKL algorithm. Substantial new information is now present in the COSMIC Genome Browser: regulatory features from ENCODE are now available, particularly enhancing the utility of the differential methylation and non-coding variant data; human SNPs are now shown alongside COSMIC somatic mutations, and genome browsing is now navigable via our Cancer Browser.


Data Updates

Below is a summary of new data in v73; please see the v73 Datasheet for further description.

Expert Curations
  1. Curated Genes; 9 new fully curated genes (SPEN, IKBKB, MYOD1, KDM5C, ACVR1, ESR1, CDKN2C, COL2A1, and CACNA1D).
  2. Cancer Gene Census; 67 gene symbols have been updated to correspond with the approved symbols described in the HGNC database.
  3. Genome Data Imports
    • Simple Somatic Mutations (SSM);
      • ICGC release 18 (3 new studies).
      • 26 new Systematic Screen papers (805 new samples).
  4. Cell Lines Project; tissue/histology classifications updated in 267 cell lines.

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Website Updates

Mutation Impact

We have upgraded our 'Mutation Impact' filters to use scores generated by a new version of FATHMM (FATHMM-MKL). See the v73 Datasheet for more information.

Genome Browser
  • 3 new tracks have been added:
    • ENCODE Regulatory Features.
    • dbSNP (build 142).
    • SNPs flagged by our 'noise reduction' filtering system which are excluded from the COSMIC website.
  • A new version of the Genome Browser (JBrowse) has been added to the Cancer Browser page to view data by disease.

COSMIC Statistics:

1,121,509
Samples
3,430,789
Coding Mutations
21,631
Papers
10,894
Fusions
21,076
Whole Genomes
863,308
Copy Number Variants
8,610,091
Gene Expression Variants
7,882,461
Differentially Methylated CpGs

v72 - 31st March 2015

v72 is our largest release ever, containing new annotations across 5,466 cancer genomes and full literature curation across 22 new cancer genes and 28 fusion pairs; 26 genes have been added to the Cancer Gene Census. We provide our first integration of differential methylation data and many additional mutations, copy number aberrations and expression variants from recent ICGC & TCGA releases. All genomic events in COSMIC have been upgraded to GRCh38 (with a GRCh37 archive available). Finally, we present a new curated resource, to be regularly updated, describing the characterisation of 30 mutation signatures across human cancer.


Licensing

COSMIC is adopting a new licensing strategy for v72, to grow the scope of our literature curations, enhance the analytics available across our data, and support the capacity to sustain this ever-growing database into the future. The key changes are:

  1. Access to the COSMIC website will stay free for all users.
  2. For-profit organisations will be required to pay a fee to download COSMIC datasets.
  3. Download by academic and non-profit organisations will remain free.

All licensing payments are used to grow COSMIC, its coverage and analytic usefulness for oncology insight. We will also be inviting licensees to tell us which priorities we might best pursue, to ensure the direction of COSMIC best supports these industries' commercial oncology research. Please see our Licensing page for more details.


Data Updates

This v72 release is too large to describe here in detail. Here is a summary; please see the v72 Datasheet for further description.

Expert Curations
  • 22 new cancer genes with full literature curation.
  • 28 new fully curated fusion pairs.
  • 26 new genes added to Cancer Gene Census.

Our curations are generated by expert postdoctoral scientists, described here.

Cancer Genomes
  • Upgrade to genome version GRCh38 (GRCh37 archive website available at http://grch37-cancer.sanger.ac.uk)
  • Differential Methylation; now integrated across 12 new TCGA studies (4,377 samples, from ICGC release 18)
  • Copy Number Variants (CNV); ICGC release 18 (1 new study, BOCA-UK). Data added for 16 new samples in Cell Lines Project
  • Gene Expression Variants; added for all samples in Cell Lines Project
  • Simple Somatic Mutations (SSM);
    • ICGC release 17
    • Paediatric Cancer Genome Project; 7 studies (142 new samples).
    • 69 new Systematic Screen papers (841 new tumour samples).

COSMIC's cancer genome data is interpreted into standardised annotations from a variety of sources, described here.


Website Updates

Genome Version GRCh38 Upgrade

We have updated the genomic coordinates in COSMIC to GRCh38. However, we are also hosting a parallel website to display the data on the GRCh37 reference. This GRCh37 site will be maintained and updated throughout 2015 with any new source data where the original coordinates are on GRCh37. However, it will not be updated with any new data where the original coordinates are on GRCh38.

Mutation Signatures

Different mutational processes generate unique combinations of mutation types, termed "Mutational Signatures". Based on an analysis of 10,952 exomes and 1,048 whole-genomes across 40 distinct types of human cancer we have added a Mutation Signatures page on the website; a curated census of signatures providing the profiles of, and additional information about, known mutational signatures.

Integration of Differential Methylation data

We have integrated methylation data across the COSMIC website. The Gene Analysis page has been extended to show a methylation track on the mutation histogram and differential methylation counts in the tissue tab, and a new 'Methylation' tab has been added to display a table of variants. The Cancer Browser, Study and Sample Overview pages have also been updated to integrate methylation data. As the majority of methylation annotations are outside gene footprints, COSMIC's Genome Browser is the best way to explore this information.

Genome Browser

The COSMIC Genome Browser is a valuable tool for exploring COSMIC data in its genomic context. This browser can be used to explore the data in COSMIC, COSMIC genomes (WGS) and the COSMIC Cell Lines Project, on either the GRCh37 or GRCh38 reference sequence. It can also be used to view the data for an individual sample if selected. Please see the COSMIC Genome Browser homepage for more details.


COSMIC Statistics:

Samples 1,103,964
Coding Mutations 3,158,657
Papers 21,086
Fusions 10,890
Genomic Rearrangements 61,232
Whole Genomes 19,672
Copy Number Variants 842,651
Gene Expression Variants 8,228,797
Methylation Variants 7,652,950

v71 - 4th Nov 2014

v71 includes full literature curation of PTPRB, PLCG1, POT1 and STAG2, the addition of 25 new census genes and an update of gene expression and copy number data from ICGC release 17 (Sept 2014).

Census Genes

The Cancer Gene Census has been updated with 25 new genes, bringing the total of known cancer genes substantiated by the scientific literature to 547. The new genes are:

CUX1 COL2A1 PTPRB PLCG1 NAB2 STAT6 FOXO4 NFATC2 DCTN1 RSPO2 RSPO3 EIF3E PTPRK NRG1 HLA-A MYO5A PPFIBP1 ERC1 PWWP2A CLIP1 ZCCHC8 KIAA1598 LMNA CEP89 LSM14A


Cell Lines Project Update

We have added an additional 16 cell lines to the Cell Lines Project. The lines are:

SW403 MCAS UDSCC2 KYSE-30 MC-1010 CRO-AP3 BL-70 GEO LIM1215 Sarc9371 OUMS-23 HUH-6-clone5 NCI-H820 MCC13 MCC26 Set2


Website Updates

Mouse Data

We have included an initial integration of mouse insertional mutagenesis data for 851 COSMIC genes from the CCGD (Candidate Cancer Gene Database), adding supporting evidence for cancer driver genes. These data are integrated in the Gene Overview page; more details can be found here.

Mutation Matrix (Study/Publication Page)

A mutation matrix plot has been added to the Study Overview page, enabling the relationship between genes, point mutations, copy number gains/losses, over/under gene expression and samples to be investigated for a specific study or publication.

Genome Browser (Sample Overview Page)

For whole genome analysed samples the Sample Overview page now includes a Genome Browser (JBrowse), allowing all mutation types for a sample (including coding and non-coding mutations, and aberrant copy number and gene expression) to be viewed in genomic context with COSMIC and Ensembl gene annotations (GRCh37).

Tutorials

There is a new tutorial section in the help pages including 4 new tutorials demonstrating the Sample, Gene, Fusion and Cancer Browser pages.

Copy Number Variation (CNV) Data

We have added 7,148 new copy number variants from 8 new TCGA studies (source ICGC release 17, re-analysed with ASCAT).

Gene Expression

We have nearly doubled the gene expression data in COSMIC by adding data from 10 new studies from TCGA (source ICGC release 17). The platforms supported are: IlluminaHiSeq_RNASeqV2, IlluminaGA_RNASeqV2, IlluminaHiSeq_RNASeq, and IlluminaGA_RNASeq. Please note that as of this release we no longer show results from the array platforms AgilentG4502A_07_2 and AgilentG4502A_07_3. For more information please visit the gene expression help page.


Literature Curation

PTPRB and PLCG1

PTPRB, encoding a tyrosine phosphatase specific to the vascular endothelium that inhibits angiogenesis, has been identified as a tumour suppressor gene in angiosarcoma. Mutations were found in secondary tumours or those with MYC amplification, a biomarker of radiation-associated secondary angiosarcoma. PLCG1, encoding a tyrosine kinase signal transducer in the phosphoinositide pathway, also has recurrent, likely activating mutations in angiosarcoma. PLCG1 gain of function mutations have previously been identified in cutaneous T-cell lymphoma.

POT1

POT1 encodes a single-stranded telomere-binding component of the shelterin complex. It is the only shelterin that contains 2 N-terminal oligonucleotide/oligosaccharide-binding (OB) domains. Recurrent mutations in POT1 have been found in chronic lymphocytic leukaemia where they occur in the clinically aggressive subtype with wild-type IGHV@. The POT1 mutations are most often found in gene regions encoding the 2 OB folds.

STAG2

Stromal antigen 2 (STAG2) is a subunit of the cohesin complex and has a role in chromatid separation during cell division. Genetic disruption of this process can lead to aneuploidy in cancer. A number of tumour types have been found to harbour somatic mutations in STAG2, including bladder cancer, myeloid neoplasms and glioblastoma. The gene maps to the X-chromosome (Xq25) and is present as a single copy in males; in females the other X-chromosome is inactivated. Hence, complete genetic inactivation of STAG2 requires only a single mutational event. STAG2 has also been suggested to act as a tumour suppressor via other mechanisms distinct from its role in cohesion.

Systematic screens

We have added mutation data for 841 tumour samples from publications where genome-wide analyses have been used. More details can be found here.


COSMIC Statistics:

Genes 28977
Samples 1058292
Coding Mutations 2710449
Papers 20247
Unique Variants 2139424
Fusions 10567
Genomic Rearrangements 61232
Whole Genomes 15047
Copy Number Variation 702652
Gene Expression 118886698

IE Support

As of this release we no longer support Internet Explorer version 8. This allows us to develop tools for the latest browsers and provide a richer user experience. We apologise for the inconvenience caused to IE 8 users.

v70 - 14th Aug 2014

v70 includes an initial integration of gene expression data from TCGA, full literature curation of CALR, CD79A and CD79B, 12 whole-genome sequencing publications, and extensive updates to point mutation and structural variant data from ICGC (release 16, May 2014) and TCGA.

Website Updates

Gene Expression

Gene expression level 3 data has been integrated into COSMIC from 10 publicly accessible TCGA studies. The platform codes currently used to produce the COSMIC gene expression values are: IlluminaGA_RNASeqV2, IlluminaHiSeq_RNASeqV2, AgilentG4502A_07_2, AgilentG4502A_07_3. COSMIC now includes gene expression alongside coding mutations and copy number aberrations on the cancer browser, sample overview, gene analysis and study/paper overview pages. We have also added a gene expression track to the histogram on the gene analysis page and the circos diagram on the sample overview page; more details can be found here.

Mutation Matrix

A mutation matrix has been added to the cancer browser, enabling the relationship between genes, point mutations, copy number gains/losses, over/under gene expression and samples to be investigated for a specific cancer.

The mutation matrix chart shows 20 x 175 boxes, with each box representing a gene-sample combination. Genes are ranked by the number of samples with variations (depending on the selected data type) and the samples are sorted using a clustering algorithm to group them in relation to the ranked genes; more details can be found here.
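
The ranking step described above can be illustrated with a minimal Python sketch. It is not COSMIC's implementation: the input shape (a list of gene/sample pairs), the function names and the sample ordering are all assumptions made for illustration. In particular, the clustering algorithm is not specified in these notes, so the sketch substitutes a simple sort on each sample's hit pattern across the ranked genes.

```python
from collections import defaultdict


def rank_genes(events, top_n=20):
    """Rank genes by the number of distinct samples carrying a variant.

    `events` is a list of (gene, sample) pairs; this shape is a
    hypothetical stand-in, not COSMIC's internal format.
    """
    samples_by_gene = defaultdict(set)
    for gene, sample in events:
        samples_by_gene[gene].add(sample)
    # Descending sample count, then gene name so ties are deterministic.
    ranked = sorted(samples_by_gene.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(gene, len(samples)) for gene, samples in ranked[:top_n]]


def build_matrix(events, top_n=20, max_samples=175):
    """Return (genes, samples, grid); grid[i][j] is True if gene i varies in sample j."""
    genes = [gene for gene, _ in rank_genes(events, top_n)]
    hits = defaultdict(set)
    for gene, sample in events:
        if gene in genes:
            hits[sample].add(gene)

    # Stand-in for the unspecified clustering step: samples hit in the
    # top-ranked genes sort first (False sorts before True).
    def pattern(sample):
        return tuple(gene not in hits[sample] for gene in genes)

    samples = sorted(hits, key=lambda s: (pattern(s), s))[:max_samples]
    grid = [[gene in hits[s] for s in samples] for gene in genes]
    return genes, samples, grid
```

The defaults of 20 genes and 175 samples mirror the chart dimensions quoted above; any real clustering method would replace the `pattern` sort.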


Data Filtering

To improve the value of COSMIC data we have tried to identify the most significant high-value data within cancer genomes using the following filtering strategies -

Mutations

We have excluded data from any sample with over 15,000 mutations. In addition, we have flagged all known SNPs as defined by the 1000 genomes project, dbSNP and a panel of 378 normal (non-cancer) samples from Sanger CGP sequencing. Using this approach 812,136 mutations have been flagged. Although all data are included in our download files, we have excluded flagged mutations from the website.
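
As a rough illustration of this two-stage filter, the following Python sketch drops samples above the mutation threshold and then separates flagged SNPs from the remaining mutations. The record layout (dictionaries with `sample` and `variant` keys), the function name and the SNP identifiers are hypothetical; only the 15,000 threshold comes from the text.

```python
from collections import Counter

MAX_MUTATIONS_PER_SAMPLE = 15_000  # threshold stated in the release note


def filter_mutations(mutations, known_snps, max_per_sample=MAX_MUTATIONS_PER_SAMPLE):
    """Split mutations into (shown, flagged).

    mutations  : list of dicts with 'sample' and 'variant' keys (assumed shape)
    known_snps : set of variant identifiers treated as known SNPs (assumed)
    Samples with more than `max_per_sample` mutations are excluded entirely.
    """
    counts = Counter(m["sample"] for m in mutations)
    retained = [m for m in mutations if counts[m["sample"]] <= max_per_sample]

    shown, flagged = [], []
    for m in retained:
        # Flagged mutations stay available (e.g. for download files)
        # but are kept apart from those displayed.
        (flagged if m["variant"] in known_snps else shown).append(m)
    return shown, flagged
```

Returning the flagged records rather than discarding them reflects the note that flagged mutations remain in the download files while being hidden from the website.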

Copy Number Variation (CNV) Data

Although no CNV data has been excluded from the website, we have applied filtering so that by default only the most significant variants are shown. For these CNVs the minor allele and total copy number values are known and gain/loss has been defined using stringent criteria [ see the Copy Number Variants section in the help pages ]. However, at the head of every table showing CNVs there is an option to switch off the filter and view all the data.

Sample Overview Page

In order to make it easier to examine each sample, analysis filters have been introduced on the sample overview page. These filters allow you to specify that the mutations viewed should be likely pathogenic (as defined by FATHMM analysis), in the cancer census genes, or of a particular mutation type. In future releases, we will be developing further filters across these data to enhance their analysis.
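
The three filters can be thought of as independent, optional predicates combined with a logical AND. The Python sketch below shows that idea only; the field names (`gene`, `type`, `fathmm`) and the label `"PATHOGENIC"` are assumptions for illustration and are not taken from COSMIC's data model.

```python
def apply_filters(mutations, census_genes=None, pathogenic_only=False, mutation_type=None):
    """Return the mutations that pass every filter that is switched on.

    mutations     : list of dicts with 'gene', 'type' and 'fathmm' keys (assumed shape)
    census_genes  : optional set of gene symbols; None disables this filter
    mutation_type : optional string such as 'missense'; None disables this filter
    """
    result = []
    for m in mutations:
        if pathogenic_only and m.get("fathmm") != "PATHOGENIC":
            continue  # hypothetical label for a likely-pathogenic prediction
        if census_genes is not None and m["gene"] not in census_genes:
            continue
        if mutation_type is not None and m["type"] != mutation_type:
            continue
        result.append(m)
    return result
```

With no arguments beyond the mutation list, every record is returned, which corresponds to viewing a sample with all filters off.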


Tutorials

We have started to upgrade our help pages and have introduced two new tutorials to help users navigate the COSMIC website. These first tutorials cover the components of the website [ Site Tour ] and searching COSMIC [ Search ].


Literature Curation

CALR

The recently identified oncogene calreticulin (CALR) is a multi-functional Ca2+-binding protein chaperone localised in the endoplasmic reticulum. CALR somatic mutations are now the second most prevalent mutation seen in patients with myeloproliferative neoplasms; mutations have been found in the majority of JAK2/MPL mutation-negative essential thrombocythaemia (ET) and primary myelofibrosis (PMF) patients, in addition to a small number of myelodysplastic patients (RARS, RARS-T, CMML and aCML). Almost all the reported mutations are insertion, deletion or complex mutations generating a +1 bp frameshift and an extended novel CALR C-terminal domain. CALR mutations appear to be associated with a more benign clinical course, younger age and male sex.

CD79A and CD79B

The Ig-alpha and Ig-beta proteins encoded by CD79A and CD79B are necessary for expression and function of the B-cell antigen receptor. Recurrent activating mutations in CD79A and CD79B have been identified in diffuse large B cell lymphoma where they occur more frequently in the activated B-cell-like subtype. The ITAM (immunoreceptor tyrosine-based activation motif) domain is targeted, with a hot spot at Y196 in CD79B. Mutations in both genes have also been found in Waldenström's macroglobulinaemia.

Systematic screens

In this release 12 systematic screen publications have been curated in COSMIC; more details can be found here.


COSMIC Statistics:

Genes 28735
Samples 1029547
Coding Mutations 2002811
Papers 19703
Unique Variants 1564699
Fusions 10435
Genomic Rearrangements 61299
Whole Genomes 12542
Copy Number Variation 695504
Gene Expression 60119787

IE Support

We have decided to drop support for Internet Explorer version 8 from November 2014. This allows us to develop tools for the latest browsers and provide a richer user experience. We apologise in advance for the inconvenience caused to IE 8 users.

v69 - 2nd June 2014

COSMIC v69 Release

This 69th release of COSMIC contains a major update on cancer genomes, including over a million novel mutations from ICGC resequencing projects.


Cancer Genome Updates.

Earlier this year ICGC release 15.1 (Feb 2014) defined over 1.5 million new somatic mutations, and these are now curated and combined into COSMIC. This update includes more than 1800 new cancer genomes and updates to many others, allowing COSMIC to represent 33 ICGC studies. COSMIC now contains mutant details across a total of 9424 cancer genomes.


Genome noise reduction.

Cancer genomes can be a very noisy source of data. It is estimated that an individual's tumour is caused by 5-10 driver mutations, but genome resequencing regularly reveals over 10,000 somatic mutations per tumour, with much larger numbers not unusual in hypermutated samples (we've seen samples with over 100,000 mutations each, the greatest being 178,763). Across studies from different groups using different techniques, it is unclear whether these huge numbers reflect true hypermutation, substantial germline variation or technical artefacts. To try to improve the value of these data, we are beginning to define a cancer genome noise reduction strategy. Initially, we will exclude any sample with over 15,000 mutations, as this immediately introduces huge noise into COSMIC; these can be reintroduced at a later date. In addition, we are removing all known SNPs from new genome uploads (initially, these are defined by the 1000 genomes project and a panel of normal (non-cancer) samples from Sanger CGP sequencing). We will be assessing how to enhance these filters and best apply them to our curated genes over the next release. Ultimately, we aim to identify the most significant high-value data within cancer genomes, making it much easier to identify actionable biomarkers.


Download registration

COSMIC is a free resource for the cancer research community, released 4 times a year and available in a number of different ways. While the website is best used to explore the information, the most comprehensive datasets are obtained via download, where the entire database can be obtained in various formats. Over future releases we intend to tidy these systems. To do this most effectively, we are asking those who download our large data files to register their interest, so we can begin to understand which files are most useful and which are irrelevant. This will also enable us to discuss with those most interested how these files and systems should evolve as we begin to consider how the project should develop over the next few years.


COSMIC v69 Total Statistics

Genes 27829
Samples 999872
Coding Mutations 1808915
Unique Variants 1412383
Papers 19031
Fusions 10389
Genomic Rearrangements 7584
Whole Genomes 9424
Copy Number Aberrations 674592


Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 1015 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.



v68 - 4th Feb 2014

COSMIC's 10th anniversary & 68th release

The COSMIC system was first released to the public on 4th February 2004, and this 68th release marks our 10th anniversary. From an initial release containing only 4 genes, our latest version presents full mutation spectra across 132 known cancer genes, 208 fusion gene pairs and almost 8000 cancer genomes. This information is available in many forms, most easily via our website, which has been designed to make the data easy to browse and explore. The website is updated every release with new functionality, and if you have any suggestions on changing or adding anything, please tell us at our email address: cosmic@sanger.ac.uk. Over these years, we have paid particular attention to genes strongly implicated in cancer, maintaining a list called the Cancer Gene Census, and this has grown far beyond initial expectations, currently numbering 522 genes, with a rate of increase that shows no sign of slowing. We are very grateful for the continued strong support from our director Professor Sir Michael Stratton, our core funder The Wellcome Trust, and our industrial collaborators, Bayer, Cancer Research Technology and AstraZeneca, who enable COSMIC to grow rapidly both in content and scope, maintaining our long-term aim to describe in detail every form of somatic mutation in human cancer.


CENSUS - 9 new genes

Nine new genes now have enough evidence to be considered known cancer genes. These are added to the Cancer Gene Census, bringing the total to 522: CASP8, FOXA1, H3F3B, STAG2, TRRAP, CALR, ATP1A1, ATP2B3, CACNA1D


Full mutation spectra are now curated for these new cancer genes

KCNJ5

KCNJ5 (11q24) is a G-protein sensitive potassium channel. Recurrent somatic heterozygous missense mutations are reported in adrenal aldosterone producing adenomas, which result in primary aldosteronism and secondary hypertension. Mutations are all found in or near the selectivity filter (most commonly p.G151R and p.L168R) and result in depolarization of the cell and activated Ca2+ entry, stimulating aldosterone production and cell proliferation.

STAT5B

STAT5B (17q11) is involved in both signal transduction and activation of transcription. Several recurrent somatic missense mutations have been observed in the highly conserved SH2 domain in large granular lymphocytic leukaemias and acute promyelocytic leukaemia, and in the transcriptional activation region in basal cell carcinoma. Mutations in the SH2 domain have been shown to increase the transcriptional activity of STAT5B.

RNF43

RNF43 (Ring finger protein 43) encodes an E3 ubiquitin-protein ligase that acts as a negative regulator of Wnt/β-catenin signalling. It is a tumour suppressor gene with somatic mutations in mucinous neoplasms of the pancreas and ovary, biliary tract carcinoma and at a low level in many other cancers.

SETBP1

SETBP1 (SET binding protein 1, 18q21.1). The SETBP1-SET nuclear oncoprotein complex acts as a negative regulator of the putative tumour suppressor PP2A, and SETBP1 also appears to act as a transcriptional activator of the HOXA9 and HOXA10 genes in myeloid progenitors. Recurrent heterozygous somatic missense mutations are observed in a variety of myeloid malignancies including aCML, sAML, MDS/MPN-U, CMML and JMML, with mutations clustered in the SKI-homologous domain. SETBP1 mutations have been observed during transformation from MDS to sAML and have been proposed to contribute to tumour progression/evolution. They are generally associated with a poor prognosis.

TERT

TERT: Highly recurrent somatic point mutations have been identified in the core promoter of TERT (telomerase reverse transcriptase) in >70% of melanomas. The changes at the two mutually exclusive hot spots (-124 and -146 bp from the ATG start site) create new binding sites for ETS transcription factors resulting in increased induction of the TERT gene. TERT encodes the reverse transcriptase component of telomerase which adds telomere repeats to chromosome ends enabling cell replication. Maintenance of telomere length is a key process in malignant progression. TERT promoter mutations are the single most frequently occurring mechanism to affect telomere length. As well as the common mutations at g.1295228 (C228T) and g.1295250 (C250T), changes at other promoter positions have also been identified. Numerous cancer types have TERT promoter mutations: adult primary glioblastoma, liposarcoma, hepatocellular carcinoma, urothelial carcinoma, squamous cell carcinoma of the tongue, medulloblastoma, follicular cell-derived thyroid cancers, meningioma, cutaneous basal and squamous cell carcinoma, malignant mesothelioma, atypical fibroxanthoma and pleomorphic dermal sarcoma.
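
The two ways of naming the hot spots (offset from the ATG, and genomic coordinate) can be cross-checked with a little arithmetic. The Python sketch below assumes that TERT lies on the reverse strand, so that positions upstream of the ATG have larger genomic coordinates, and that both quoted coordinates come from the same genome assembly. The function name is illustrative. If the figures in the text are consistent, both hot spots should imply the same ATG position.

```python
# Genomic coordinates and upstream offsets exactly as quoted in the text.
HOTSPOTS = {
    "C228T": (1295228, 124),
    "C250T": (1295250, 146),
}


def implied_atg_position(genomic_pos, upstream_offset):
    """Genomic coordinate of the ATG implied by one promoter hot spot.

    Assumes a reverse-strand gene: moving upstream of the ATG increases
    the genomic coordinate, so the ATG sits at pos - offset.
    """
    return genomic_pos - upstream_offset


implied = {name: implied_atg_position(pos, off) for name, (pos, off) in HOTSPOTS.items()}

# Both hot spots should agree on one ATG coordinate.
consistent = len(set(implied.values())) == 1
```

Here both hot spots resolve to the same implied coordinate, 1295104, which supports reading the -124 and -146 offsets and the two genomic positions as descriptions of the same pair of sites.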


Literature Curation - Gene fusion mutations

TCF3-PBX1

The TCF3 (E2A)-PBX1 fusion, usually associated with acute lymphoblastic leukaemia, has been identified in solid tumours where it is found more frequently in adenocarcinoma than in other subtypes of non-small cell lung carcinoma. The TCF3-PBX1 fusion protein contains the transactivation domain of TCF3 and the DNA-binding domain of PBX1.

RET/Papillary Thyroid Carcinoma (CCDC6-RET, PRKAR1A-RET, NCOA4_ENST00000452682-RET, GOLGA5-RET, TRIM24-RET, TRIM33_ENST00000358465-RET, KTN1-RET, ERC1-RET, PCM1-RET, TRIM27-RET and HOOK3-RET)

"RET/PTC" fusions are largely restricted to papillary thyroid carcinoma (PTC). The most frequent translocations are those involving CCDC6 (RET/PTC1), PRKAR1A (RET/PTC2) and NCOA4 (RET/PTC3) but novel RET partners have been reported in both irradiated and non-irradiated papillary thyroid carcinomas. All RET/PTC fusions result in the juxtaposition of the tyrosine kinase domain of RET to the 5' portion of the heterologous genes. The translocated amino terminal regions have dimerization domains capable of inducing ligand-independent dimerization of the RET/PTC oncoproteins.


Literature Curation - Systematic Screens:

12 new large-scale studies have been curated into COSMIC for release v68; more details can be found here.

Website updates

The COSMIC genome browser, an instance of GMOD's "GBrowse", is now retired and replaced by a more user-friendly system based on the new "JBrowse". This browser provides much more intuitive click-and-drag utility and retains all the functionality of the old system. With this upgrade, we will be able to improve the exploration of genomic information, providing views which can be restricted to your selections to investigate selected diseases, samples or mutation types.


COSMIC website search

As the data in COSMIC grows, easily searching it is becoming increasingly critical. We have devised a new system for searching COSMIC, presenting better summaries of what's available for your search terms. Under the summary, concise information is shown to easily inform your choice of which linked webpage to visit.

Cancer Cell Lines Project

Examining tumour samples by full-genome resequencing is generating remarkable amounts of mutation data. In order to make it easier to examine the content of each sample, we have begun to introduce analysis filters on Sample Detail pages in the Cell Lines website. These filters allow you to specify that the mutations viewed should be likely pathogenic, in cancer census genes, or of a particular mutation type. In future releases, we will be developing further filters across these data to enhance their analysis.


COSMIC v68 Total Statistics

Genes 25660
Samples 981720
Mutations 1627878
Unique Variants 1292597
Papers 18465
Fusions 10251
Genomic Rearrangements 7584
Whole Genomes 8236
Copy Number Aberrations 425776



v67 - 24th Oct 2013

COSMIC v67 Release

Version 67 is a major upgrade to COSMIC, as it now includes copy number aberrations as somatic mutations. This release also includes 3 new curated genes, 11 new fusion gene pairs, 20 new systematic screen publications and 6 new census genes. Additionally, COSMIC's Cancer Cell Line Project now contains the full exome sequencing of 1015 cell lines.


COSMIC now contains somatic Copy Number changes

Genomic amplifications and deletions have been known to promote cancer growth for many years, and recent improvements in genome resequencing have made this data much more available. COSMIC has now been upgraded to include significant copy number gains and losses alongside coding mutations. This information has been uploaded from TCGA, ICGC and CGP. For consistency of annotation, data from TCGA & CGP has been interpreted using the ASCAT algorithm (for primary tumours) or the PICNIC algorithm (for cell lines); CNV data on 130 ICGC samples have been taken directly from the ICGC data portal. This CNV data, across a total of 4223 tumour samples, can be examined in similar ways to the coding mutations in the COSMIC website, including per-gene and per-tissue navigation, and the usual mutation and tumour analysis filters. Further details are available here.


COSMIC cancer cell-line project

A major update is also available to the COSMIC cancer cell line project. The capillary sequencing of 800 cell lines through 64 genes has been retired here. In its place, the project and website now comprise full exome sequencing of 1015 cell lines. 1146874 coding mutations are now available for examination across these cell lines, each of which has been assessed for mutation impact (predicted oncogenicity), highlighting those most likely involved in tumour formation. Primary interpretations of sequence context and mutation patterns are also provided. 1010 of these cell lines also have full copy number annotations for fuller investigation of cancer driver mutations, integrated into the website as described above.


CONAN

The copy-number analysis tool “CONAN” has been retired from the CGP website and brought into COSMIC alongside the new copy number data. Available from COSMIC's front page, CONAN's functionality has been incorporated into the COSMIC website. With a substantial update to the copy number data available, CONAN can now be used to explore 3083 tumour samples.


Census

The Cancer Gene Census has been updated with six new genes recently identified in the formation of blood tumours. This brings the total of known cancer genes substantiated by the scientific literature to 513. The new genes are: CSF3R, PTPRC, RAD21, STAT5B, TBL1XR1, UBR5.


Literature curation: Point-mutated cancer genes

RAC1

RAC1 is a RAS-related member of the Rho subfamily of GTPases. It has a role in cytoskeleton rearrangement, cellular adhesion, migration and invasion. Somatic mutations in RAC1 have been found at low prevalence in epithelial cancers. In melanoma a recurring mutation P29S causes an increased binding of the protein to downstream effectors and promotes melanocyte proliferation and migration.

KLF4 and TRAF7

TRAF7 mutations have been identified in meningioma where they commonly occur with a recurrent KLF4 (K409Q) or AKT1 mutation (E17K). Secretory meningioma, which has a more aggressive clinical course, is defined by combined KLF4 and TRAF7 mutations. The KLF4 mutations occur in the first of three zinc fingers which is central to DNA binding. TRAF7 is a pro-apoptotic E3 ubiquitin ligase containing an N-terminal RING finger, a TRAF domain and a coiled-coil domain followed by seven WD40 repeats. Unlike KLF4, TRAF7 mutations are not restricted to secretory meningioma or to a single hot spot, although many map to the WD40 domains. Novel mutations in KLF4 and TRAF7 are both mutually exclusive of NF2 mutations and these non-NF2 meningiomas are clinically distinctive, and originate from the skull base as opposed to the cerebral and cerebellar hemispheres.


Literature Curation - Gene fusion mutations

QKI-NTRK2 and NACC2-NTRK2

Recurrent NTRK2 fusions have been identified in non-cerebellar pilocytic astrocytoma. NTRK2 encodes a member of the neurotrophic tyrosine receptor kinase (NTRK) family and is a membrane-bound receptor which, on neurotrophin binding, phosphorylates itself and members of the MAPK pathway. Genetic alterations within the MAPK signalling pathway are typical of pilocytic astrocytoma.

RNF130-BRAF, CLCN6-BRAF, MKRN1-BRAF and GNAI1-BRAF

Novel BRAF fusions, with alternative partners to KIAA1549, have been found in a minority of pilocytic astrocytomas. These fusions all result in the loss of the N-terminal regulatory region of BRAF.

EPC1-PHF1, JAZF1-PHF1, MEAF6-PHF1, MBTD1-CXorf67 and ZC3H7B-BCOR

(curated as: EPC1-PHF1, JAZF1-PHF1, MEAF6-PHF1, MBTD1_ENST00000586178-CXorf67_ENST00000342995, ZC3H7B-BCOR_ENST00000378444)

A subset of low-grade endometrial stromal sarcomas (ESS) has fusions involving PHF1, a gene encoding a protein with two zinc finger motifs. In each case the entire coding region of PHF1 is included in the fusion. Additionally, the novel fusions MBTD1-CXorf67 and ZC3H7B-BCOR have also been identified in ESS.


Literature Curation - Systematic Screens:

In this release 20 systematic screen publications have been curated in COSMIC; more details can be found here.

COSMIC v67 Total Statistics

Genes 25606
Samples 947213
Mutations 1592109
Unique Variants 1273479
Papers 17731
Fusions 9190
Genomic Rearrangements 7584
Whole Genomes 7954
Copy Number Aberrations 422314



v66 - 25th Jul 2013

COSMIC v66 Release

COSMIC v66 contains curation of cancer genes TSC1 & TSC2, four new fusion gene pairs, 17 whole-genome sequencing publications, and extensive updates from ICGC & TCGA.


CENSUS update

We have reviewed the current knowledge on genes involved in human cancer and as a result, added 17 new genes to the Cancer Gene Census (a list of genes with significant proof they contribute to human cancer). We will be adding more genes to this census in future releases, when we consider their involvement proved in the literature.


COSMIC releases scheduled every 3 months

This July release is the last performed on a bimonthly schedule. COSMIC's next release will be in October, when the schedule will become one release every three months.


Literature curation: Point-mutated cancer genes

TSC1 and TSC2

Somatic alterations in tumour suppressor gene TSC1 have been detected in sporadic tumours such as bladder cancer, renal cell carcinoma and hepatocellular carcinoma, and TSC2 alterations in sporadic pulmonary LAM, renal angiomyolipoma and head and neck cancers. TSC1 and TSC2 gene products, hamartin and tuberin, form a protein complex that plays a critical role in growth control as a primary regulator of the mammalian target of Rapamycin (mTOR) pathway. Germline mutations in these genes cause tuberous sclerosis complex (TSC), a neurocutaneous syndrome characterized by seizures, mental retardation, and benign tumours of many organs.


Literature Curation - Gene fusion mutations

EWSR1-YY1

A subgroup of mesotheliomas is characterised by EWSR1-YY1 fusions. The EWSR1 breakpoint is similar to that found in other fusions involving EWSR1 such as EWSR1-FLI1 and EWSR1-DDIT3. YY1 encodes a ubiquitously distributed transcription factor belonging to the GLI-Kruppel class of zinc finger proteins. The EWSR1-YY1 fusion protein includes the transactivation domain of EWSR1 and the DNA-binding domain of YY1.

EWSR1-NFATC1

Further broadening the range of tumours associated with EWSR1 fusions, EWSR1-NFATC1 has been identified in haemangioma of the bone. NFATC1 encodes a component of the nuclear factor of activated T cells DNA-binding transcription complex which is involved primarily in immune response. The transactivation domain of EWSR1 is retained in the fusion where it's fused to the DNA-binding domain, the REL-homology region, of NFATC1.

IRF2BP2-CDX1

IRF2BP2-CDX1 has been identified as an alternative fusion to HEY1-NCOA2 in mesenchymal chondrosarcoma. IRF2BP2 encodes an interferon regulatory factor-2 (IRF2) binding protein that interacts with the C-terminal transcriptional repression domain of IRF2. CDX1 belongs to the homeobox gene family. For the fusion protein, the N terminal includes the IRF2BP2 zinc finger motif and the C terminal includes the CDX1 homeodomain.

STRN-ALK

STRN, encoding a calmodulin-binding protein, has been identified as a novel ALK fusion partner in lung adenocarcinoma. As in most ALK fusions, the kinase domain of ALK is preserved in the fusion protein.


Literature Curation - Systematic Screens:

Tarpey et al (2013). Frequent mutation of the major cartilage collagen gene COL2A1 in chondrosarcoma. Nature genetics (epub)

Zhang et al (2013). Genetic heterogeneity of diffuse large B-cell lymphoma. Proceedings of the National Academy of Sciences of the United States of America 110:1398

Reuss et al (2013). Secretory meningiomas are defined by combined KLF4 K409Q and TRAF7 mutations. Acta neuropathologica 125:351

Ho et al (2013). The mutational landscape of adenoid cystic carcinoma. Nature genetics 45:791

Lui et al (2013). Frequent mutation of the PI3K pathway in head and neck cancer defines predictive biomarkers. Cancer discovery 3:761

Murtaza et al (2013). Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497:108

Chen et al (2013). Next-generation-sequencing-based risk stratification and identification of new genes involved in structural and sequence variations in near haploid lymphoblastic leukemia. Genes, chromosomes & cancer 52:564

Han SW et al (2013). Targeted sequencing of cancer-related genes in colorectal cancer using next-generation sequencing. PloS one 8:e64271

Bettegowda et al (2013). Exomic Sequencing of Four Rare Central Nervous System Tumor Types. Oncotarget 4:572

Clark et al (2013). Genomic Analysis of Non-NF2 Meningiomas Reveals Mutations in TRAF7, KLF4, AKT1, and SMO. Science 339:1077

Yost SE et al (2013). High-resolution mutational profiling suggests the genetic validity of glioblastoma patient-derived pre-clinical models. PloS one 8:e56185

Zang et al (2012). Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes. Nature genetics 44:570

Iyer et al (2012). Genome sequencing identifies a basis for everolimus sensitivity. Science 338:221

Kridel R et al (2012). Whole transcriptome sequencing reveals recurrent NOTCH1 mutations in mantle cell lymphoma. Blood 119:1963

Totoki et al (2011). High-resolution characterization of a hepatocellular carcinoma genome. Nature genetics 43:464

Wang et al (2011). Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer. Nature genetics 43:1219

Lee et al (2010). The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465:473


These Cancer Genome Consortium studies have been updated in this release

TCGA

Ovarian Serous Cystadenocarcinoma

Rectum Adenocarcinoma

Colon Adenocarcinoma

Acute Myeloid Leukemia

Glioblastoma Multiforme

Prostate Adenocarcinoma

Bladder Urothelial Carcinoma

Breast Invasive Carcinoma

Cervical Squamous Cell Carcinoma

Kidney Renal Clear Cell Carcinoma

Lung Adenocarcinoma

Lung Squamous Cell Carcinoma

Uterine Corpus Endometrioid Carcinoma

Head and Neck Squamous Cell Carcinoma


ICGC (Riken)

Japanese liver cancer

ICGC (NCC)

Japanese liver cancer


COSMIC v66 Total Statistics

Genes 25563
Samples 908687
Mutations 1524954
Unique Variants 1216270
Papers 17157
Fusions 9054
Genomic Rearrangements 7584
Whole Genomes 7677
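
As a rough illustration of how the totals above compare with the v65 totals listed later in these notes, the following Python sketch computes the change per category. The function name release_deltas and the dictionary layout are illustrative assumptions and not part of any COSMIC tool; the figures are copied from the two "Total Statistics" tables.

```python
# Totals copied from the v65 and v66 "Total Statistics" tables in these notes.
V65 = {
    "Genes": 24715,
    "Samples": 885051,
    "Mutations": 1146761,
    "Unique Variants": 873677,
    "Papers": 16514,
    "Fusions": 9014,
    "Genomic Rearrangements": 7584,
    "Whole Genomes": 6989,
}
V66 = {
    "Genes": 25563,
    "Samples": 908687,
    "Mutations": 1524954,
    "Unique Variants": 1216270,
    "Papers": 17157,
    "Fusions": 9054,
    "Genomic Rearrangements": 7584,
    "Whole Genomes": 7677,
}


def release_deltas(previous, current):
    """Return {category: (absolute change, percent change)} between two releases.

    Hypothetical helper for illustration only.
    """
    deltas = {}
    for category, new_total in current.items():
        old_total = previous[category]
        change = new_total - old_total
        percent = 100.0 * change / old_total
        deltas[category] = (change, percent)
    return deltas


if __name__ == "__main__":
    # Print one line per category, e.g. the change in whole genomes.
    for category, (change, percent) in release_deltas(V65, V66).items():
        print(f"{category}: {change:+d} ({percent:+.1f}%)")
```

Keeping the totals in plain dictionaries keyed by category name makes it straightforward to add further releases and compare any pair.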


Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v65 - 28th May 2013

COSMIC v65 Release

COSMIC v65 includes full curation of the genes SH2B3, MAP2K1 and MAP2K2, recently identified as causing blood and epithelial cancers, together with 5 new USP6 gene fusions, found in aneurysmal bone cysts. In addition, substantial updates have been made to studies curated from the ICGC and TCGA. 6989 samples in COSMIC now have full-genome annotations.


COSMIC release schedule

In recent years, COSMIC has released a new version every two months, six times a year. However, in order to focus on maximising the content of each release and reducing the workload for those integrating the COSMIC database into their own resources, we will be changing to a three-monthly schedule from July. After July, the next COSMIC release will be in October 2013, with further releases every three months after that.


Literature curation: Point-mutated cancer genes

SH2B3

SH2B3 (LNK, 12q24.12) is a plasma membrane-bound adapter protein and a negative regulator of cytokine signalling involved in normal haematopoiesis. Its functions include inhibition of wild type and mutant JAK2 signalling and it is overexpressed in myeloproliferative neoplasms (MPN) as well as myelodysplastic syndrome and leukaemic cells; growth of some transformed cells is inhibited by overexpression of SH2B3 and loss of LNK in murine models enhances development of MPNs. Mutations have mainly been found in MPNs and idiopathic erythrocytosis, predominantly heterozygous PH2 domain mutations (hotspot E208_D234) in wild type JAK2/MPL blast phase MPNs, implicating SH2B3 in MPN progression. However, mutations have also been found in chronic phase MPNs, and can occur in JAK2/MPL mutated tumours and in other SH2B3 domains. Mutations have also been seen in small numbers of early T-cell precursor acute lymphoblastic leukaemia samples and solid tumours. The loss of inhibition of JAK-STAT activation may be related to haploinsufficiency of SH2B3 or due to a dominant-negative effect of the mutant protein.

MAP2K1 and MAP2K2

MAP2K1 and MAP2K2 code for dual specificity protein kinases which belong to the mitogen-activated protein (MAP) kinase kinase family. These proteins phosphorylate both a threonine and a tyrosine residue in the activation loops of extracellular signal-regulated kinases (ERK1/2). MAP2K1 and MAP2K2 are activated by point mutations at low prevalence in epithelial cancers such as non-small cell lung cancer. MAP2K1 mutations cluster in 2 hot spots at a hinge region (Q56P, K57N, D67N) between the coiled-coil and catalytic domains, and at the activation loop of the kinase domain (E203K, I204T). Mutations have also been identified in malignant melanomas where they often occur with BRAF or NRAS mutations. There is also evidence that some MAP2K1 exon 3 mutations might confer resistance in melanomas to MEK/RAF inhibitors.


Literature Curation - Gene fusion mutations

CDH11-USP6, THRAP3-USP6, OMD-USP6, CNBP-USP6, COL1A1-USP6

(CDH11-USP6_ENST00000250066, THRAP3-USP6_ENST00000250066, OMD-USP6_ENST00000250066, CNBP-USP6_ENST00000250066, COL1A1-USP6_ENST00000250066)

Fusions involving USP6, partnered with one of five different genes (CDH11, THRAP3, OMD, CNBP or COL1A1), are found in aneurysmal bone cyst, a locally aggressive bone lesion with a propensity to recur. In each translocation the entire ubiquitin-specific protease coding sequence is fused downstream to the promoter region of the partner gene.
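
The fusion labels above pair each gene symbol with an optional Ensembl transcript accession (for example CDH11-USP6_ENST00000250066). The Python sketch below shows one way such labels could be split into their two partners. The names FusionPartner and parse_fusion_label are hypothetical, and the sketch assumes that neither gene symbol contains a hyphen and that a transcript, when given, is written as an underscore followed by "ENST" and digits.

```python
import re
from typing import NamedTuple, Optional, Tuple


class FusionPartner(NamedTuple):
    """One side of a fusion label (hypothetical structure for illustration)."""
    gene: str
    transcript: Optional[str]  # Ensembl transcript accession, if the label gives one


# A partner is a gene symbol, optionally followed by "_ENST" plus digits.
_PARTNER = re.compile(r"^(?P<gene>[A-Za-z0-9.]+)(?:_(?P<transcript>ENST\d+))?$")


def parse_fusion_label(label: str) -> Tuple[FusionPartner, FusionPartner]:
    """Split a label such as 'CDH11-USP6_ENST00000250066' into its two partners.

    Assumes gene symbols contain no hyphen, so the label has exactly one '-'.
    """
    parts = label.strip().split("-")
    if len(parts) != 2:
        raise ValueError(f"expected exactly two partners in {label!r}")
    partners = []
    for part in parts:
        match = _PARTNER.match(part)
        if match is None:
            raise ValueError(f"unrecognised partner {part!r} in {label!r}")
        partners.append(FusionPartner(match["gene"], match["transcript"]))
    return partners[0], partners[1]


if __name__ == "__main__":
    first, second = parse_fusion_label("CDH11-USP6_ENST00000250066")
    print(first.gene, second.gene, second.transcript)
```

Labels in which only the first partner carries a transcript, or in which both do, are handled by the same pattern because the transcript group is optional on each side.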


Literature Curation - Whole genome studies:

TCGA Cervical Squamous cell

TCGA Bladder Urothelial

TCGA Glioblastoma

TCGA Breast Carcinoma

TCGA Colon Cancer

TCGA Lung Adenocarcinoma

ISC/MICINN Chronic Lymphocytic Leukaemia

TCGA Acute Myeloid Leukaemia

TCGA Ovarian Serous Cystadenocarcinoma

TCGA Prostate Adenocarcinoma

TCGA Rectum Adenocarcinoma

TCGA Lung Squamous cell


Literature Curation - Systematic Screens:

Leich E et al (2013). Multiple myeloma is affected by multiple and heterogeneous somatic mutations in adhesion- and receptor tyrosine kinase signaling molecules. Blood cancer journal 3:e102

Sausen M et al (2013). Integrated genomic analyses identify ARID1A and ARID1B alterations in the childhood cancer neuroblastoma. Nature genetics 45(1):12-7

Agrawal N et al (2013). Exomic Sequencing of Medullary Thyroid Cancer Reveals Dominant and Mutually Exclusive Oncogenic Mutations in RET and RAS. The Journal of clinical endocrinology and metabolism 98(2):E364-9

Dulak AM et al (2013). Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity. Nature genetics 45(5):478-86

Landau DA et al (2013). Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell 152(4):714-26

Pugh TJ et al (2013). The genetic landscape of high-risk neuroblastoma. Nature genetics 45(3):279-84

Green MR et al (2013). Hierarchy in somatic mutations arising during genomic evolution and progression of follicular lymphoma. Blood 121(9):1604-11

Streppel MM et al (2013). Next-generation sequencing of endoscopic biopsies identifies ARID1A as a tumor-suppressor gene in Barrett's esophagus. Oncogene 1476-5594

Zhou D et al (2013). Exome capture sequencing of adenoma reveals genetic alterations in multiple cellular pathways at the early stage of colorectal tumorigenesis. PloS one 8(1):e53310

Kim SC et al (2013). A high-dimensional, deep-sequencing study of lung adenocarcinoma in female never-smokers. PloS one 8(2):e55596

Demeure MJ et al (2012). Cancer of the ampulla of Vater: analysis of the whole genome sequence exposes a potential therapeutic vulnerability. Genome medicine 4(7):56

Newey PJ et al (2012). Whole-Exome Sequencing Studies of Nonhereditary (Sporadic) Parathyroid Adenomas. The Journal of clinical endocrinology and metabolism 1538-7445


COSMIC v65 Total Statistics

Genes 24715
Samples 885051
Mutations 1146761
Unique Variants 873677
Papers 16514
Fusions 9014
Genomic Rearrangements 7584
Whole Genomes 6989



v64 - 26th Mar 2013

COSMIC v64 Release

COSMIC v64 contains full curation of gene fusions CIC-DUX4 and ACTB-GLI1 in solid tumours. Twelve additional genome-wide sequencing publications bring the number of WGS samples in COSMIC to 5023.

Website update

The new COSMIC website (http://cancer.sanger.ac.uk) now replaces the old one (www.sanger.ac.uk/cosmic) which is no longer available. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. Substantial help is available to navigate the new system here, but if you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk

Literature Curation - Gene fusion mutations

CIC-DUX4

The recurrent CIC-DUX4 fusion is found in a subset of paediatric and young adult primitive round cell undifferentiated soft tissue sarcomas, distinct from the Ewing sarcoma family of tumours. CIC is a member of the HMG-box superfamily of transcription factors and DUX4 is a double-homeobox gene belonging to the family of double homeodomain transcription activators. The CIC-DUX4 fusion preserves most of the functional regions of the CIC gene, including the DNA-binding HMG-box and most of the MAPK phosphorylation sites, but both DUX4 homeobox domains are lost.

ACTB-GLI1

This novel fusion has been found in a discrete set of soft tissue sarcomas with distinctive pericytic features. The DNA-binding zinc finger domains of GLI1 are retained in the fusion and the GLI1 promoter region is replaced with that of the ubiquitously expressed ACTB gene.

Literature Curation - Systematic screens

Horn S et al (2013). TERT Promoter Mutations in Familial and Sporadic Melanoma. Science 339:959

Roberts KG et al (2012). Genetic alterations activating kinase and cytokine receptor signaling in high-risk acute lymphoblastic leukemia. Cancer cell 22:153

Le Gallo M et al (2012). Exome sequencing of serous endometrial tumors identifies recurrent somatic mutations in chromatin-remodeling and ubiquitin ligase complex genes. Nature genetics 44:1310

Agrawal N et al (2012). Comparative genomic analysis of esophageal adenocarcinoma and squamous cell carcinoma. Cancer discovery 2:899

Kannan K et al (2012). Whole-exome sequencing identifies ATRX mutation as a key molecular determinant in lower-grade glioma. Oncotarget 3:1194

Lindberg J et al (2012). The Mitochondrial and Autosomal Mutation Landscapes of Prostate Cancer. Eur Urol. 63:702

Nichols AC et al (2012). A Pilot Study Comparing HPV-Positive and HPV-Negative Head and Neck Squamous Cell Carcinomas by Whole Exome Sequencing. ISRN Oncol. 2012:809370

Seshagiri S et al (2012). Recurrent R-spondin fusions in colon cancer. Nature 488:660

Seo JS et al (2012). The transcriptional landscape and mutational profile of lung adenocarcinoma. Genome research 22:2109

Liu J et al (2012). Genome and transcriptome sequencing of lung cancers reveal diverse mutational and splicing events. Genome research 22:2315

Piazza R et al (2012). Recurrent SETBP1 mutations in atypical chronic myeloid leukemia. Nature Genetics 45:18

Rossi D et al (2012). The coding genome of splenic marginal zone lymphoma: activation of NOTCH2 and other pathways regulating marginal zone development. The Journal of experimental medicine 209:1537


COSMIC v64 Total Statistics

Genes 24394
Samples 847698
Mutations 913166
Unique Variants 682654
Papers 16123
Fusions 8945
Genomic Rearrangements 7584
Whole Genomes 5023

Genomics of Drug Sensitivity in Cancer release 4 (March 2013)

This release features improvements increasing functionality of the GDSC website to facilitate analysis and interpretation of results.

Drug overview pages

These pages provide a visual summary of the screening results for each drug. Cell line IC50 values including confidence intervals are plotted as well as summary statistics for each drug. The overview page also contains separate plots for cell line IC90, IC75, IC25 and AUC (area under the curve) values for each drug.

IC50 scatter plots filtered by mutation and tissue type

The scatter plots of cell line IC50 values for significant drug-gene associations have been improved so that they can be filtered by mutation type (coding mutation, amplification or deletion) or by tissue type. A non-parametric test is performed for each resulting scatter plot to assess the significance of each association.

The Genomics of Drug Sensitivity in Cancer team (http://www.cancerRxgene.org) can be contacted at cancerrxgene@sanger.ac.uk



v63 - 30th Jan 2013

COSMIC v63 Release

COSMIC v63 includes full curation of cancer genes STAT3 and TNFRSF14, together with further FGFR and EWSR1 fusion gene pairs. Nine additional systematic screen papers from 2012 ensure our curation of cancer genome analysis remains very current.

Website update

The new COSMIC website (cancer.sanger.ac.uk) will replace the old one (www.sanger.ac.uk/cosmic) in March 2013, with our v64 release. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. If you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk.

CCDS gene upgrade

COSMIC has annotated cancer mutations across more than 24,000 genes over the last eleven years, and these have been collected from a variety of sources. In order to better standardise our gene information, we are now updating our gene sequences onto the better CCDS standard of human transcripts, where the sequences have been agreed by consensus between several genome annotation projects. In this release, we include the update of the first 19584 gene transcripts to CCDS standard.

A COSMIC job opportunity

We're planning an expansion of the COSMIC project to include more useful somatic datatypes and further analytical software/webpages. We're based in Cambridge, UK, and our bioinformatics development work is focused on the Perl programming language, making much use of relational databases (Oracle, PostgreSQL). If you have expertise in these areas and would enjoy working on this challenging project, please reply to our job advert below, or email cosmic@sanger.ac.uk for more details by 15th February 2013. https://jobs.sanger.ac.uk/wd/plsql/wd_portal.show_job?p_web_site_id=1764&p_web_page_id=161595

Literature curation: Point-mutated cancer genes

TNFRSF14

TNFRSF14 (tumour necrosis factor receptor superfamily member 14; 1p36). LIGHT-mediated triggering of non-mutated TNFRSF14 renders B-cell lymphomas more immunogenic and more sensitive to FAS-induced apoptosis, and non-mutated TNFRSF14 can inhibit proliferation of adenocarcinoma cells, suggesting a tumour suppressor role. Somatic mutations have been found in follicular lymphomas (FL) and diffuse large B-cell lymphoma (DLBCL), the majority being nonsense and missense mutations, but also including frameshift, splice site and insertion mutations distributed across the gene, consistent with loss of function of a tumour suppressor. Mutated DLBCL are associated with higher risk clinical features and a worse response to rituximab. FL are often associated with del 1p36; individuals carrying a TNFRSF14 mutation have a worse prognosis than those carrying a 1p36 deletion alone, and patients with both alterations have the worst prognosis.

STAT3

STAT3 (signal transducer and activator of transcription 3 (acute-phase response factor); 17q21) is activated through phosphorylation and, following dimerization, acts as a transcription activator, playing a role in many cell processes including cell growth and apoptosis. It is persistently phosphorylated in cancer cell lines and primary tumours including several haematological malignancies and hepatocellular tumours. Heterozygous somatic mutations have been found particularly in large granular lymphocytic leukaemia (T-cell LGL and chronic lymphoproliferative disorders of NK cells) and inflammatory hepatocellular adenomas lacking IL-6 mutations. Mutations are often found in the SRC homology (SH2) domain, responsible for dimerization and activation of STAT3. STAT3 has been shown to be phosphorylated in patients with SH2 mutations, and the frequently reported Y640F and D661 substitution mutations have been shown to increase transcriptional activity of STAT3.

Literature Curation - Gene fusion mutations

EWSR1-ZNF444

(EWSR1-ZNF444_ENST00000337080)

Another novel fusion gene involving EWSR1 has been identified in a small subset of myoepithelial tumours of soft tissue. ZNF444 encodes a zinc finger protein which activates transcription of a scavenger receptor gene involved in the degradation of acetylated low density lipoprotein. The EWSR1 breakpoint in this fusion is in a position frequently found in other EWSR1 fusions.

FGFR3-BAIAP2L1, FGFR3-TACC3 and FGFR1-TACC1

(FGFR3-BAIAP2L1, FGFR3-TACC3, FGFR1_ENST00000447712-TACC1)

FGFR3 fusions involving 2 different partners, which generate constitutively activated fusion proteins, have been identified in urothelial carcinoma. The FGFR3 component of the fusion is the same in all cases; the tyrosine kinase coding domains are retained but the final exon that includes the PLCgamma1 binding site is lost. The TACC3 fusion component retains the transforming acidic coiled-coil domain that mediates microtubule binding and the BAIAP2L1 component retains the IRSp53/MIM domain that mediates actin binding and Rac interaction. Recurrent FGFR-TACC fusions have also been found in a small subset of glioblastoma multiforme.

Literature Curation - Systematic screens

Hodis et al (2012). A landscape of driver mutations in melanoma. Cell 150:251

Imielinski et al (2012). Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 150:1107

Castellarin et al (2012). Clonal evolution of high-grade serous ovarian carcinoma from primary to recurrent disease. J Pathol (epub)

Biankin et al (2012). Pancreatic cancer genomes reveal aberrations in axon guidance pathway genes. Nature 491:399

Love et al (2012). The genetic landscape of mutations in Burkitt lymphoma. Nat Genet 44:1321

Nikolaev et al (2012). A Single-Nucleotide Substitution Mutator Phenotype Revealed by Exome Sequencing of Human Colon Adenomas. Cancer Res 72:6279

Wang et al (2011). Whole-exome sequencing of human pancreatic cancers and characterization of genomic instability caused by MLH1 haploinsufficiency and complete deficiency. Genome Res 22:208

Gerlinger et al (2012). Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. NEJM 366:883

Dolnik et al (2012). Commonly altered genomic regions in acute myeloid leukemia are enriched for somatic mutations involved in chromatin remodeling and splicing. Blood 120:e83

Curation of the ICGC v11 data portal:

Malignant Lymphoma

COSMIC v63 Total Statistics

Genes 24517
Samples 821455
Mutations 834571
Unique Variants 620857
Papers 15613
Fusions 8860
Genomic Rearrangements 7584
Whole Genomes 4677



v62 - 29th Nov 2012

COSMIC v62 Release

COSMIC v62 includes full curation of genes H3F3A, BCOR and HIST1H3B, together with RSPO2/3 fusions in colon cancer and NTRK1 fusions in thyroid cancer. In addition, 1068 whole-genome screens are included from recent TCGA releases, with many more from 13 newly curated systematic screen publications.

Website update

The new COSMIC website (cancer.sanger.ac.uk) will replace the old one (www.sanger.ac.uk/cosmic) in March 2013, with our v64 release. Existing links and bookmarks will still work, but will redirect to the appropriate page on the new website. If you have any questions or comments about the new website, please contact us at cosmic@sanger.ac.uk.

Literature curation: Point-mutated cancer genes

H3F3A and HIST1H3B

Mutations in H3F3A (encoding histone H3.3) or in the related HIST1H3B (encoding H3.1) have been identified as molecular drivers in diffuse intrinsic pontine glioma, and paediatric and young adult glioblastoma. Mutations consistently occur at 2 key regulatory sites within the highly conserved N-terminal histone tail which influences the dynamic regulation of chromatin structure and accessibility. These hotspot mutations appear linked to tumour location.

BCOR

BCOR, a gene encoding a transcriptional corepressor, has been identified as a tumour suppressor gene in acute myeloid leukaemia. Somatic mutations are more frequent in patients with normal karyotype compared to those with abnormal cytogenetics.

Literature Curation - Gene fusion mutations

EIF3E-RSPO2 and PTPRK-RSPO3

The R-spondin family members RSPO2 and RSPO3 have been identified in recurrent fusions in microsatellite-stable colon adenocarcinoma at a frequency of 10%. R-spondins encode secreted proteins that can potentiate canonical WNT signalling. In the EIF3E-RSPO2 fusion, EIF3E exon 1 fuses to RSPO2 to produce a functional RSPO2 protein driven by the EIF3E promoter. In the most commonly identified PTPRK-RSPO3 fusion, PTPRK exon 1 fuses to RSPO3 exon 2, preserving the coding sequence of RSPO3 and replacing its secretion signal sequence with that of PTPRK.

TPM3-NTRK1_ENST00000392302, TPR_ENST00000367478-NTRK1_ENST00000392302 and TFG-NTRK1_ENST00000392302

TRK oncogenes, fusions involving NTRK1, are found in a subset of papillary thyroid carcinomas. NTRK1 encodes a cell-surface transmembrane tyrosine kinase protein acting as receptor for nerve growth factor. In TRKs the 3' terminal sequence of the tyrosine kinase domain of NTRK1 fuses with the 5' terminal sequence of one of 3 activating genes, TPM3, TPR or TFG, all of which contain coiled-coil domains that mediate protein dimerization and consequent tyrosine kinase activation.

Literature Curation - Systematic screens

Huang et al (2012). Exome sequencing of hepatitis B virus-associated hepatocellular carcinoma. Nature Genetics 44:1117-1121

Xu et al (2012). Single-cell exome sequencing reveals single-nucleotide mutation characteristics of a kidney tumor. Cell 148:886-895

Greif et al (2012). GATA2 zinc finger 1 mutations associated with biallelic CEBPA mutations define a unique genetic entity of acute myeloid leukemia. Blood 120:395-403.

Dahlman et al (2012). BRAF L597 mutations in melanoma are associated with sensitivity to MEK inhibitors. Cancer Discovery 2:791-797

Northcott et al (2012). Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488:49-56

Wang et al (2012). Mutations in isocitrate dehydrogenase 1 and 2 occur frequently in intrahepatic cholangiocarcinomas and share hypermethylation targets with glioblastomas. Oncogene (epub)

Walker et al (2012). Intraclonal heterogeneity and distinct molecular mechanisms characterize the development of t(4;14) and t(11;14) myeloma. Blood 120:1077-1086

Koo et al (2012). Janus Kinase 3-Activating Mutations Identified in Natural Killer/T-cell Lymphoma. Cancer Discovery 2:591-597

Peifer et al (2012). Integrative genome analyses identify key somatic driver mutations of small-cell lung cancer. Nature genetics 44:1104-1110

Kalender Atak et al (2012). High accuracy mutation detection in leukemia on a selected panel of cancer genes. PLoS One 7:e38463

Kuhn et al (2012). Identification of Molecular Pathway Aberrations in Uterine Serous Carcinoma by Genome-wide Analyses. J Natl Cancer Inst 104:1503-1513

Rudin et al (2012). Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer. Nature Genetics 44:1111-1116

Barber et al (2011). Comprehensive Genomic Analysis of a BRCA2 Deficient Human Pancreatic Cancer. PLoS One 6:e21639

Curation of TCGA genome screens from the ICGC v10 data portal:

Bladder Urothelial Carcinoma (TCGA, US).

Breast Invasive Carcinoma (TCGA, US).

Cervical Squamous Cell Carcinoma (TCGA, US).

Kidney Renal Clear Cell Carcinoma (TCGA, US).

Lung Adenocarcinoma (TCGA, US).

Lung Squamous Cell Carcinoma (TCGA, US).

Uterine Corpus Endometrioid Carcinoma (TCGA, US).

Prostate Adenocarcinoma (TCGA, US).

COSMIC v62 Total Statistics

Genes 24691
Samples 803415
Mutations 745924
Unique Variants 540795
Papers 15365
Fusions 8789
Genomic Rearrangements 7584
Whole Genomes 4172

Genomics of Drug Sensitivity in Cancer release 3

New data

This release sees the addition of 4,901 new IC50 values including data for 4 new anti-cancer drugs as well as new data for previously released compounds.

Number of new drugs: 4
Total number of drugs: 142

Number of new IC50 values: 4,901
Total number of IC50 values: 78,070

The therapeutic target(s) of drugs in this release are:

PI3Kbeta (TGX221)
IGF1R (GSK-1904529A)
HDAC (LAQ824)
PDPK1 (OSU-03012)

Enhanced cell line IC50 scatter plots

Cell lines are now colour coded based on whether they have a coding mutation, amplification or homozygous deletion in a given cancer gene. This makes it simple to determine what types of mutations occur in a specific cancer gene and whether mutation type influences drug response.

The Genomics of Drug Sensitivity in Cancer team (http://www.cancerRxgene.org) can be contacted at cancerrxgene@sanger.ac.uk



v61 - 26th Sep 2012

COSMIC v61 Release

COSMIC v61 focuses on whole genome screen publications with information from 17 major new reports, including the new TCGA Colon & Rectal cancer studies. In addition, the full literature on point mutations in PHF6 has been curated, along with 4 new gene fusion pairs.

Literature curation:

Curated cancer genes

PHF6, encoding a plant homeodomain (PHD) factor containing 4 nuclear localization signals and 2 PHD-type finger domains, and with a proposed role in transcriptional regulation, has been identified as an X-linked tumour suppressor gene in T-cell acute lymphoblastic leukaemia and acute myeloid leukaemia. Mutations are evenly distributed throughout the gene with recurrent missense mutations in the second zinc finger domain. Mutation prevalence is greater in male than in female patients.

Newly curated gene fusions

FN1-ALK

A novel ALK fusion involving FN1 which encodes fibronectin, a ubiquitous component of extracellular matrix and plasma, has been found in ovarian malignant stromal sarcoma. The resultant fusion protein contains the amino-terminal 1201 amino acids of FN1 and the carboxyl-terminal 598 amino acids of ALK which include the transmembrane and cytoplasmic regions.

KLC1-ALK

An additional ALK fusion partner has been identified in lung carcinoma. KLC1, encoding a member of the kinesin light chain family, fuses to the canonical ALK exon 20 recombination site in bronchioloalveolar carcinoma.

FAM131B-BRAF

A recurrent oncogenic BRAF fusion involving FAM131B, a currently uncharacterized gene on chromosome 7q34, has been shown to be an alternative mechanism of MAPK pathway activation in pilocytic astrocytoma. In common with other BRAF and RAF1 fusions, the FAM131B-BRAF fusion product lacks the RAF auto-inhibitory domain. Of note is the small number of FAM131B exons, consisting mostly of 5' UTR, included in the fusion.

UBE2L3-KRAS

An oncogenic KRAS fusion has been identified in a metastatic prostate cancer cell line. UBE2L3-KRAS encodes a protein encompassing most of the UBE2L3 protein, a member of the E2 ubiquitin-conjugating enzyme family, and full length KRAS.

Systematic screen curation

Focus on recent high-impact systematic screens:

Guichard et al (2012). Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma. Nat Genet. 44:694-8.

Jones et al (2012). Low-grade serous carcinomas of the ovary contain very few point mutations. J Pathol. 226:413-20.

Gui et al (2011). Frequent mutations of chromatin remodeling genes in transitional cell carcinoma of the bladder. Nat Genet. 43:875-8.

Galante et al (2011). Distinct patterns of somatic alterations in a lymphoblastoid and a tumor genome derived from the same individual. Nucleic Acids Res. 39:6056-68.

Robinson et al (2012). Novel mutations target distinct subgroups of medulloblastoma. Nature. 488:43-8.

The Cancer Genome Atlas Network (2012). Comprehensive molecular characterization of human colon and rectal cancer. Nature. 487:330-7.

Pugh et al (2012). Medulloblastoma exome sequencing uncovers subtype-specific somatic mutations. Nature. 488:106-10.

Zhang et al (2011). Key pathways are frequently mutated in high-risk childhood acute lymphoblastic leukemia: a report from the Children's Oncology Group. Blood. 118:3080-7.

Yip et al (2012). Concurrent CIC mutations, IDH mutations, and 1p/19q loss distinguish oligodendrogliomas from other cancers. J Pathol. 226:7-16.

Lee et al (2012). A remarkably simple genome underlies highly malignant pediatric rhabdoid cancers. J Clin Invest. 122:2983-8.

Bass et al (2011). Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet. 43:964-8.

Zhang et al (2012). The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature. 481:157-63.

Jones et al (2012). Dissecting the genomic complexity underlying medulloblastoma. Nature. 488:100-5.

Fujimoto et al (2012). Whole-genome sequencing of liver cancers identifies etiological influences on mutation patterns and recurrent mutations in chromatin regulators. Nat Genet. 44:760-4.

Pe??a-Llopis et al (2012). BAP1 loss defines a new class of renal cell carcinoma. Nat Genet. 44:751-9.

Ong et al (2012). Exome sequencing of liver fluke-associated cholangiocarcinoma. Nat Genet. 44:690-3.

Duns et al (2012). Targeted exome sequencing in clear cell renal cell carcinoma tumors suggests aberrant chromatin regulation as a crucial step in ccRCC development.Hum Mutat. 33:1059-62.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHF6, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2

COSMIC v61 Total Statistics

Genes 22170
Samples 773098
Mutations 405271
Unique Variants 224649
Papers 14819
Fusions 8931
Genomic Rearrangements 7503
Whole Genomes 2556

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v60 - 19th Jul 2012

COSMIC v60; Drug Sensitivity v2 Release

The 60th release of COSMIC includes the first full version of our new website, 9 new systematic screens and significant updates to the mutation spectra of known cancer genes.

Full release of new COSMIC website.

Following the recent successful trial of our new website, we now present its first full release. As before, there are three parallel websystems. The main COSMIC website provides access to the full COSMIC database, all somatic mutation data collected over the last eight years. The Cell Lines Project site focuses on the analysis of a set of 770 commonly used cell lines, unchanged from the previous release. Finally, the previous CGP studies website has been retired in favour of a new Genomes site. This exclusively presents data from genome-wide screens, including full genome resequencing, exome resequencing and low-coverage rearrangement screens (currently comprising a large majority of exome screens). Please send us any comments about your experiences with the new system (cosmic@sanger.ac.uk), which will help us make it as good as possible.

Literature curation:

Curated cancer genes

For this release and the next, we are focusing on updating existing curated genes to bring these up-to-date with recent publications.

Systematic screen curation

Focus on recent high-impact systematic screens:

Berger et al (2012). Melanoma genome sequencing reveals frequent PREX2 mutations. Nature 485:502-6.

Molenaar et al (2012). Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483:589-93.

Grasso et al (2012). The mutational landscape of lethal castration-resistant prostate cancer. Nature 487:239-43.

Barbieri et al (2012). Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet. 44:685-9.

Berger et al (2011). The genomic complexity of primary human prostate cancer. Nature 470:214-20.

Morin et al. (2011). Frequent mutation of histone-modifying genes in non-Hodgkin lymphoma. Nature 476:298-303.

Nikolaev et al (2011). Exome sequencing identifies recurrent somatic MAP2K1 and MAP2K2 mutations in melanoma. Nat Genet. 44:133-9.

Wu et al (2011). Whole-exome sequencing of neoplastic cysts of the pancreas reveals recurrent mutations in components of ubiquitin-dependent pathways. Proc Natl Acad Sci U S A. 108:21188-93.

Guo et al (2011). Frequent mutations of genes encoding ubiquitin-mediated proteolysis pathway components in clear cell renal cell carcinoma. Nat Genet. 44:17-9.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1

COSMIC v60 Total Statistics

Genes 21850
Samples 743601
Mutations 340585
Unique Variants 170263
Papers 14310
Fusions 8004
Genomic Rearrangements 5494
Whole Genomes 1894
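
To see how the database grew between releases, the v60 totals above can be set against the v61 totals listed in the v61 notes. The following is a minimal Python sketch of that arithmetic; the figures are copied from these notes, while the names V60, V61 and release_growth are illustrative only and not part of any COSMIC tool.

```python
# Totals copied from the v60 and v61 "Total Statistics" lists in these notes.
V60 = {
    "Genes": 21850, "Samples": 743601, "Mutations": 340585,
    "Unique Variants": 170263, "Papers": 14310, "Fusions": 8004,
    "Genomic Rearrangements": 5494, "Whole Genomes": 1894,
}
V61 = {
    "Genes": 22170, "Samples": 773098, "Mutations": 405271,
    "Unique Variants": 224649, "Papers": 14819, "Fusions": 8931,
    "Genomic Rearrangements": 7503, "Whole Genomes": 2556,
}


def release_growth(old, new):
    """Map each shared category to (absolute increase, percent increase)."""
    return {
        key: (new[key] - old[key], 100.0 * (new[key] - old[key]) / old[key])
        for key in old
        if key in new
    }


growth = release_growth(V60, V61)
for category, (delta, percent) in growth.items():
    print(f"{category}: +{delta} ({percent:.1f}%)")
```

The same function can be reused for any two releases that report the same categories; v59 and earlier use a different set of headings, so only the shared keys would be compared.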

Genomics of Drug Sensitivity in Cancer v2 release

New data
This release sees the addition of 25,421 new IC50 values including data for 9 new anti-cancer drugs as well as new data for previously released compounds.

Number of new drugs: 9
Total number of drugs: 138

Number of new IC50 values: 25,421
Total number of IC50 values: 73,169

Number of new cell lines: 76
Total number of cell lines: 714

Drug sensitivity predictions with elastic net modeling.

We have enhanced the analysis of drug sensitivity data by including elastic net (EN) modeling. This approach is able to scan across the genome to identify genomic, transcriptomic and tissue-type features associated with drug sensitivity or resistance. EN modeling results are presented as heatmaps and the results of this analysis are freely downloadable from the website.

EN modeling is available in addition to the multivariate ANOVA of drug sensitivity.
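
As a rough illustration of the general technique (not the analysis code used for this release), elastic net regression fits a linear model of drug response against many features while penalising the coefficients with a mix of L1 and L2 terms, so that weakly associated features are shrunk to exactly zero. The sketch below is a plain-Python coordinate descent on a toy data set; the data, the function names and the alpha and l1_ratio settings are all assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    Assumes X and y are already centred, so no intercept is fitted."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant-zero feature carries no information
            # Correlation of feature j with the residual, excluding its own fit
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w


# Hypothetical toy data: 8 cell lines, 3 features coded +1/-1 (e.g. mutant/wild type).
X = [
    [1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1],
    [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1],
]
# Centred response: strong effect of feature 0, very weak effect of feature 1.
y = [-1.98, -1.98, -2.02, -2.02, 2.02, 2.02, 1.98, 1.98]

weights = elastic_net(X, y)
print(weights)
```

In this toy case only the first feature keeps a non-zero weight; the weak second feature and the unrelated third feature are dropped, which is the behaviour that makes the approach suitable for scanning many candidate features at once.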

Scatter plots of IC50 values

This new functionality allows users to generate scatter plots of cell line IC50 values for significant drug-gene associations. Users are able to select which data to plot depending on their drug or gene of interest. Scatter plot images are downloadable and cell line IC50 values are directly linked to the COSMIC database facilitating integration of drug sensitivity data with detailed cell line information.

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Cosmic Whole Genomes, all tumours with genome-wide somatic annotations

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



New COSMIC website - 29th Jun 2012

New COSMIC website

A new, improved COSMIC website (http://cancer.sanger.ac.uk/)

We are pleased to announce the development of a new website for the COSMIC database, designed to improve the exploration of this increasingly complex data. This more modern system will additionally form a better platform to extend COSMIC with new types of data and additional analysis functions. This new website is now available to the scientific community, presenting the current v59 COSMIC release. We invite all comments and feedback on our email: cosmic@sanger.ac.uk.

(Source: http://cancer.sanger.ac.uk/)

Entry into the system has been kept as simple as possible, focusing on a single search box, which is much more helpful in finding the correct information. Behind this, the 'By Gene' and 'By Tissue' options have also been enhanced. The tissue browser has been significantly redesigned, showing all available site/histology options with counts of mutated samples; once a phenotype has been selected, press 'Go' and substantial details will immediately appear beneath, with links deeper into the new website. The gene browser requests a search string, as little as one letter, and will search all gene names and synonyms, then present a list of available gene options.

The website content has been broadly reorganised in line with the current Sanger style. This allows much information to be presented on one webpage, organised into separate tabs along the top of the main panel, rather than as a series of boxes on a long deep web page. This should reduce the scrolling needed to navigate each page and presents the information in more logical tab-formatted units. For instance, on the Gene Overview page:

(Source: http://cancer.sanger.ac.uk/cosmic/gene/overview?ln=KRAS)

Amongst numerous improvements, particular new tools include a better zoomable mutations histogram and searchable, exportable data tables. The new mutations histogram is zoomable in a click-and-drag style (instead of +/- 5aa as before). Additional filter options are also available, and these are more noticeably presented; all tabs on the Gene Analysis webpage reflect the filters selected.

(Source: http://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=KRAS#histo)

All tables of data in COSMIC are now presented in a new style which allows sorting by selected columns, searching of the table contents, and exporting of this information, as displayed on the screen.

(Source: http://cancer.sanger.ac.uk/cosmic/gene/analysis?ln=KRAS#ts)

This new system will form a major release in the near future, and the two websystems will run in parallel from the same release database until early 2013, allowing everyone time to adjust to the new system. We invite you to send us any comments on the new website, including what you like or dislike, or any difficulties you experience so that we can make this system the best possible for all of our users. Please contact us at: cosmic@sanger.ac.uk

Good luck with your research,

The COSMIC Team

(http://cancer.sanger.ac.uk/)

cosmic@sanger.ac.uk



v59 - 23rd May 2012

COSMIC v59 Release

This latest release of COSMIC includes full mutation spectra across three new cancer genes and 9 new gene fusions. Also included are updates from the latest TCGA and ICGC releases, together with 4 recent whole-genome publications.

Genes with fully curated mutation spectra.

SRSF2, U2AF1, ZRSR2
In addition to SF3B1, somatic mutations have been identified in other genes coding for components of the splicing machinery responsible for the processing of pre-mRNA to mature RNA. SRSF2, U2AF1 and ZRSR2 are mutated in haematopoietic neoplasms where there's further evidence for an association between clinical phenotype and the different splice gene mutations. Mutated SRSF2 is more frequently associated with leukaemic transformation and has a negative prognostic impact. For SRSF2 the recurrent mutations cluster at P95, while for U2AF1 the mutations cluster at 2 hotspots at S34 and Q157. ZRSR2 mutations are least frequent.

Curated fusion gene pairs:

YWHAE-FAM22A, YWHAE-FAM22B
Recurrent fusions of YWHAE and one of two FAM22 family members (FAM22A, FAM22B) have been identified as specific to high-grade endometrial stromal sarcoma (ESS), a clinically aggressive form of uterine sarcoma and distinct from the classic JAZF1-rearranged ESS. YWHAE encodes a member of the 14-3-3 protein family which plays a role in various signal transduction pathways and YWHAE-FAM22 fusions maintain a structurally and functionally intact YWHAE protein-binding domain. Identical fusions involving YWHAE-FAM22 have also been identified in clear cell sarcoma of kidney.

KIF5B-RET and CCDC6-RET
Fusions involving the kinesin family 5B gene and RET have been found recurrently in lung adenocarcinomas. Several transcript variants have been identified in which the KIF5B portion differs, but all retain the KIF5B coiled-coil domain that mediates homodimerization and all retain the full RET kinase domain. An additional RET fusion, involving CCDC6, has also been identified in lung adenocarcinoma.

EZR-ROS1, LRIG3-ROS1, SDC4-ROS1 and TPM3-ROS1
Four new fusion partners for ROS1, a receptor tyrosine kinase, have been found in non-small cell lung cancer, specifically adenocarcinoma. All of the breakpoints in ROS1 enable the resulting fusion to retain the ROS1 kinase domain and in the SDC4-ROS1 fusion, with a breakpoint at exon 32, the ROS1 transmembrane domain is also retained.

C2orf44-ALK
A novel ALK fusion involving C2orf44 has been identified in colorectal cancer. C2orf44, containing a coiled-coil domain, fuses to the canonical ALK exon 20 recombination site.

Whole genome resequencing data:

TCGA
The TCGA recently released large mutation datasets on three cancer types, Rectal, Colorectal and AML. This data has been curated from the TCGA data portal.

ICGC
The March release of the ICGC DCC (version 8) included four major new datasets, on liver, paediatric brain and two independent sets of pancreatic cancers (QCMG, Australia; OICR, Canada). This information has been curated from the ICGC data release portal.

CGP
The complete genome sequences of 21 breast tumours have allowed the elucidation of the mutational process moulding these tumours. This data has been curated from pre-publication datasets here.

Shah, et al. (2012). The clonal and mutational evolution spectrum of primary triple-negative breast cancers. Nature (epub, pre-publication).

Focus on Blood tumours

Ding, et al. (2012). Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481:506-10.

Graubert, et al. (2012). Recurrent mutations in the U2AF1 splicing factor in myelodysplastic syndromes. Nature Genetics 44:53-7.

Yoshida, et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478:64-9.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, SRSF2, STK11, TET2, TNFAIP3, TP53, TSHR, U2AF1, VHL, WT1, ZRSR2

COSMIC v59 Total Statistics

Experiments 4799855
Tumours 713382
Samples 716988
Mutant Samples 282779
Mutations 298517
Unique Mutations 140641
Papers curated 13728
Genes 21425
Fusions 7732
Structural Variants 2882

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



29th Mar 2012

Genomics of Drug Sensitivity in Cancer v1 release

We have launched a new website to present genomic markers of sensitivity to anti-cancer compounds screened across our >1000 cancer cell line resource.

New website

The website (www.cancerRxgene.org) has enhanced user interfaces making it simple to access our data and analyses. All of our genetic and drug sensitivity data, including results from future screening, are freely available and can be downloaded. The website will be regularly updated as new data becomes available.

Drug sensitivity data

We have released sensitivity data for 130 anti-cancer drugs screened across a large subset of our cell line resource. Drug sensitivity data for more than 600 cell lines have been correlated with extensive genetic data to identify genomic events associated with sensitivity and resistance. This is the largest publicly available dataset of its type, representing >48,000 drug-cell line combinations, and provides a comprehensive view of the genomics of drug sensitivity in cancer.

Cancer cell line resource

Our >1000 cell line resource represents the spectrum of common and rare types of adult and childhood cancers of epithelial, mesenchymal and haematopoietic origin. Cell lines have been subjected to sequencing of the full coding exons of 64 commonly mutated cancer genes, genome-wide analysis of copy number gain and loss using Affymetrix SNP6.0 microarrays, and expression profiling of 14,500 genes using Affymetrix HT-U133A microarrays. The presence of seven commonly rearranged cancer genes and of microsatellite instability (MSI) has also been investigated. Genetic data for the cell lines are available through our website and the Cancer Genome Project webpages. The cell lines have been submitted for whole-exome sequencing and this data will soon be available.

Integration with COSMIC

Our analyses have been integrated with the Catalogue of Somatic Mutations in Cancer (COSMIC) database providing a comprehensive resource linking somatic mutations and other information related to cancer with drug sensitivity information.

Data for the following drugs are included in this release:

681640, 17-AAG, A-443654, A-769662, A-770041, ABT-263, ABT-888, AICAR, AKT inhibitor VIII, AMG-706, AP-24534, AS601245, ATRA, AUY922, Axitinib, AZ628, AZD-0530, AZD-2281, AZD-6244, AZD-6482, AZD-7762, AZD-8055, BAY613606, Bexarotene, BI-2536, BI-D1870, BIBW2992, Bicalutamide, BIRB 0796, Bleomycin, BMS-509744, BMS-536924, BMS-754807, Bortezomib, Bosutinib, Bryostatin 1, BX-795, Camptothecin, CEP-701, CGP-082996, CGP-60474, CHIR-99021, CI-1040, Cisplatin, CMK, Cyclopamine, Cytarabine, Dasatinib, DMOG, Docetaxel, Doxorubicin, Elesclomol, Embelin, Epothilone B, Erlotinib, Etoposide, FH535, FTI-277, GDC-0449, GDC0941, Gefitinib, Gemcitabine, GNF-2, GSK-650394, GSK269962A, GW 441756, GW843682X, Imatinib, IPA-3, JNK Inhibitor VIII, JNK-9L, JW-7-52-1, KIN001-135, KU-55933, Lapatinib, Lenalidomide, LFM-A13, Metformin, Methotrexate, MG-132, Midostaurin, Mitomycin C, MK-2206, MS-275, Nilotinib, NSC-87877, NU-7441, Nutlin-3a, NVP-BEZ235, NVP-TAE684, Obatoclax Mesylate, OSI-906, PAC-1, Paclitaxel, Parthenolide, Pazopanib, PD-0325901, PD-0332991, PD-173074, PF-02341066, PF-562271, PHA-665752, PLX4720, Pyrimethamine, QS11, Rapamycin, RDEA119, RO-3306, Roscovitine, S-Trityl-L-cysteine, Salubrinal, SB216763, SB590885, Shikonin, SL 0101-1, Sorafenib, Sunitinib, Temsirolimus, Thapsigargin, Tipifarnib, Vinblastine, Vinorelbine, Vorinostat, VX-680, VX-702, WH-4-023, WZ-1-84, XMD8-85, Z-LLNle-CHO, ZM-447439

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content


v58 - 15th Mar 2012

COSMIC v58 Release

Five new fusion gene pairs and one new cancer gene are curated in this 58th release of COSMIC, together with 6 recent genome-wide mutation screens.

Cancer Gene Census Update

Our curations have highlighted 13 new cancer genes KIF5B, C2orf44, CCDC6, SDC4, SLC34A2, EZR, LRIG3, H3F3A, DNM2, ECT2L, PHF6, WWTR1, CAMTA1, which have been added to the cancer gene census, bringing the total number of genes implicated in cancer to 487.

New curated genes:

AXIN1 functions as a scaffold protein regulating a variety of signalling pathways and biological functions. It is a key component of Wnt signalling, binding to several of its members including APC, GSK3 and β-catenin. Mutations in the AXIN1 gene, a tumour suppressor, lead to stabilization of β-catenin and activation of target genes. AXIN1 mutations have been found in several cancers including colorectal, endometrial, prostate and hepatocellular as well as in hepatoblastoma and sporadic medulloblastoma. Mutations are found throughout the whole gene including the APC, GSK3 and β-catenin binding domains. Biochemical and functional studies have shown that these mutations interfere with AXIN1 binding to GSK3 and interaction with two upstream activators Frat1 and DVL.

New Curated Gene Fusions:

HEY1-NCOA2, NUP107-LGR5
The novel, recurrent fusion HEY1-NCOA2 has been identified in mesenchymal chondrosarcoma as a potential molecular diagnostic marker. HEY1 is a downstream effector of Notch signalling and NCOA2 is a member of the p160 nuclear hormone receptor transcriptional co-activation family. In the HEY1-NCOA2 fusion, the N-terminal basic helix-loop-helix DNA-binding/protein dimerization domain from HEY1 is retained while the C-terminal portion is replaced by the NCOA2 transcriptional activation domains AD1/CID and AD2. NUP107-LGR5 has also been identified, in dedifferentiated liposarcoma, as a novel but not recurrent translocation event.

SLC34A2-ROS1 and CD74-ROS1
Two novel translocations have been identified in non small cell carcinoma of the lung. Fusion of ROS1, a receptor tyrosine kinase of the insulin receptor family, to the transmembrane solute carrier protein SLC34A2, where the N-terminal region of the latter is fused to the transmembrane region of ROS1, results in a protein with 2 transmembrane domains. Also fusion of ROS1 with CD74, a type II transmembrane protein with high affinity for the MIF immune cytokine, similarly results in a fusion protein with 2 transmembrane domains. Another ROS1 fusion, GOPC-ROS1, previously found in a glioblastoma cell line, has also been identified as a recurrent fusion in cholangiocarcinoma.

PAX8-PPARG
A subset of thyroid follicular carcinomas, and some follicular adenomas, has the PAX8-PPARG fusion. The DNA binding domains of PAX8, a thyroid cell transcription factor essential for the differentiation of follicular cells and the regulation of thyroid-specific genes, are fused to domains A-F of the peroxisome proliferator-activated receptor PPARG. Several splice variants have been identified in affected tumours with multiple transcripts expressed in some.

Systematic screen curations:

Yan XJ, et al. (2011). Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia. Nat Genet. 43:309-15.

Pasqualucci L, et al. (2011). Analysis of the coding genome of diffuse large B-cell lymphoma. Nat Genet. 43:830-7.

Quesada V, et al (2012). Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet. 44:47-52.

Jiao X, et al. (2012). Somatic mutations in the NOTCH, NF-KB, PIK3CA, and hedgehog pathways in human breast cancers. Genes Chromosomes Cancer.

The mutation data from exome sequencing of 10 Gastric tumours has been available in ICGC release 7 and is now incorporated into this release of COSMIC, here.

Stephens P and Tarpey P, et al. (2012). 100 breast exomes have been sequenced by the CGP and are presented here as a pre-publication release. Nature (In Press).

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, AXIN1, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCB1, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v58 Total Statistics

Experiments 4720191
Tumours 698444
Samples 701963
Mutant Samples 233349
Mutations 242608
Unique Mutations 89393
Papers curated 13457
Genes 20948
Fusions 7633
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



Census Update - 1st Feb 2012

Census update

Four genes have been newly implicated in cancer: YWHAE, FAM22A and FAM22B translocations are associated with endometrial stromal carcinoma, whilst BCOR mutations are associated with Retinoblastoma, AML (Acute Myeloid Leukaemia) and APL (Acute promyelocytic leukemia). The census has been updated to reflect these new findings.



v57 - 18th Jan 2012

COSMIC v57 Release

Twelve new fusion gene pairs and one new cancer gene are curated in this 57th release of COSMIC, together with 8 recent genome-wide mutation screens.

Cancer Gene Census Update

Our curations have highlighted 6 new cancer genes FAM46C, FBXO11, HEY1, SRSF2, U2AF1, ZRSR2 which have been added to the cancer gene census, bringing the total number of genes implicated in cancer to 473.

New curated genes:

NFE2L2 encodes a transcription factor that induces expression of cytoprotective proteins upon oxidative stress. Oncogenic NFE2L2 mutations have been identified in lung, head-neck, oesophageal and skin cancers where they occur within or near 2 amino terminal motifs, DLG and ETGE, both of which are important in the interaction of NFE2L2 with KEAP1, an E3 ubiquitin ligase involved in the negative regulation of NFE2L2 expression.

New Curated Gene Fusions:

EWSR1-NFATc2, EWSR1-SMARCA5
NFATc2, a transcription factor which is not a member of the ETS family and is involved in T-cell differentiation and immune response, is a recurrent novel 3' partner to EWSR1 in a histological variant of Ewing sarcoma. The resultant EWSR1-NFATc2 fusion protein lacks the COOH-terminal RNA binding domain of EWSR1, similar to other Ewing sarcoma-specific translocations, and the NH2-terminal transactivation domain and regulatory domain of NFATc2. Another novel 3' partner to EWSR1 in Ewing sarcoma is SMARCA5, a member of the WSTF-SNF2h chromatin-remodelling complex family of genes. The fusion protein, in addition to the SNYG N-terminal of EWSR1, contains 5 conserved domains and motifs of SMARCA5.

ZNF700-MAST1, TADA2A-MAST1, NFIX-MAST1, ARID1A-MAST2, GPBP1L1-MAST2, SEC16A-NOTCH1, NOTCH1-GABBR2
Gene fusions involving members of the MAST kinase family have been identified in breast cancer at a frequency of 3-5%. All 5 MAST fusions encode contiguous ORFs, some of which retain the MAST serine-threonine kinase domain and all of which retain the PDZ domain and the 3' kinase-like domain. Additionally, gene fusions involving NOTCH1 have also been found in breast cancer where the exons that encode the Notch intracellular domain, responsible for inducing the transcriptional programme following Notch activation, are retained.

TCF12-NR4A3, TFG-NR4A3
TCF12 can replace EWSR1 or TAF15 as a fusion partner to NR4A3 in extraskeletal myxoid chondrosarcoma. The resulting fusion transcript encodes a protein in which the NH2-terminal domain of the basic helix-loop-helix protein TCF12 is fused to the whole NR4A3 protein. Also in extraskeletal myxoid chondrosarcoma a fusion has been identified where TFG is a novel 5' partner to NR4A3.

DDX5-ETV4
In prostate carcinoma, DDX5 is an additional 5' partner to ETS transcription factor ETV4.

Systematic screen curations:

Focus on Melanoma

Prickett TD, et al. (2011). Exon capture analysis of G protein-coupled receptors identifies activating mutations in GRM3 in melanoma. Nat Genet. [Epub ahead of print]

Wei X, et al. (2011). Analysis of the disintegrin-metalloproteinases family reveals ADAM29 and ADAM7 are often mutated in melanoma. Hum Mutat. 32:E2148-75.

Wei X, et al (2010). Mutational and functional analysis reveals ADAMTS18 metalloproteinase as a novel driver in melanoma. Mol Cancer Res. 8:1513-25.

Cárdenas-Navia LI, et al. (2010). Novel somatic mutations in heterotrimeric G proteins in melanoma. Cancer Biol Ther. 10:33-7.

Cronin JC, et al (2009). Frequent mutations in the MITF pathway in melanoma. Pigment Cell Melanoma Res. 22:435-44.

Palavalli LH, et al. (2009). Analysis of the matrix metalloproteinase family reveals that MMP8 is often mutated in melanoma. Nat Genet. 41:518-20.

Solomon DA, et al. (2008). Mutational inactivation of PTPRD in glioblastoma multiforme and malignant melanoma. Cancer Res. 68:10300-6.

Focus on Squamous Cell Carcinoma

Durinck S, et al. (2011). Temporal dissection of tumorigenesis in primary cancers. Cancer Discov. 1:137-143.

   (with related data from Wang NJ et al (2011). Loss-of-function mutations in Notch receptors in cutaneous and lung squamous cell carcinoma. PNAS 108:17761-6.)

The following genes have been updated in this release:

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, BRCA1, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, EZH2, FAM123B, FGFR3, FLT3, FOXL2, HRAS, IDH1, IDH2, IL7R, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MLH1, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NFE2L2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PIK3CA, PIK3R1, PRDM1, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, TET2, TNFAIP3, TP53, VHL, WT1

COSMIC v57 Total Statistics

Experiments 4671089
Tumours 683702
Samples 687082
Mutant Samples 217031
Mutations 224634
Unique Mutations 75109
Papers curated 13121
Genes 20259
Fusions 7428
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v56 - 15th Nov 2011

COSMIC v56 Release

Three new cancer genes together with 13 new fusion gene pairs and 7 recent genome-wide screens have been fully curated into COSMIC for this latest release.

Cancer Gene Census Update

Our curations have highlighted 4 new cancer genes IL7R, VTI1A, TCF7L2, NDRG1 which have been added to the cancer gene census, bringing the total number of genes implicated in cancer to 468.

New curated genes:

IL7R Interleukin-7 receptor alpha (IL7R) is required for normal lymphoid development and has been shown to carry somatic, heterozygous, gain of function mutations in 8-10% of B and T cell leukaemias. To date, mutations have commonly fallen into 2 classes: an S185C substitution in the extracellular domain (B-ALL), or in-frame insertions/deletions in the extracellular juxtamembrane transmembrane interface region (T- and B-ALL), the majority of which also introduce an unpaired cysteine residue. Biochemical and functional assays have demonstrated that the mutations are activating.

MED12 Oncogenic somatic mutations in MED12, an X chromosome gene encoding a subunit of the Mediator Complex, have been found in uterine leiomyomas (fibroids) with a frequency of 70%. Mutations are clustered in exon 2.

SF3B1, a gene encoding a core component of the RNA splicing machinery, is oncogenic in myelodysplastic syndromes (MDS), particularly those associated with increased ringed sideroblasts. With high mutation frequency and specificity, SF3B1 mutations are strongly implicated in the pathogenesis of these MDS subtypes. Recurrent SF3B1 mutations have also been identified in chronic lymphocytic leukaemia with higher frequency in chemorefractory cases.

New Curated Gene Fusions:

VTI1A-TCF7L2
Recurrent VTI1A-TCF7L2 fusions have been found in colorectal adenocarcinomas. VTI1A encodes a v-SNARE protein while TCF7L2 encodes the transcription factor TCF4 which dimerizes with beta-catenin. The fusion results in the omission of the TCF4 beta-catenin-binding domain.

GOPC-ROS1
A fusion involving the amino-terminal portion of GOPC and the carboxy-terminal kinase domain of ROS1, a receptor tyrosine kinase, occurs in glioblastoma multiforme. The resulting fusion protein is a constitutively activated tyrosine kinase.

HNRNPA2B1-ETV1, SLC45A3-ETV1, ACSL3-ETV1, KLK2-ETV1, KLK2-ETV4, CANT1-ETV4, SLC45A3-ERG, NDRG1-ERG
In prostate carcinoma, minor novel 5' partners, apart from TMPRSS2, have been identified in recurrent fusions with ETS transcription factors ETV1, ETV4 and ERG.

SLC45A3-ELK4, SLC45A3-ETV5, TMPRSS2-ETV5
Both ELK4 and ETV5 have been identified as other ETS transcription factors involved as 3' partners in recurrent fusions in prostate carcinoma.

Systematic screen curations:

TCGA (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474:609-15

Stransky et al (2011). The mutational landscape of head and neck squamous cell carcinoma. Science 333:1157-60

Li et al (2011). Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma. Nature Genetics 43:828-9

Prickett et al (2009). Analysis of the tyrosine kinome in melanoma reveals recurrent mutations in ERBB4. Nature Genetics 41:1127-32

Bettegowda et al (2011). Mutations in CIC and FUBP1 contribute to human oligodendroglioma. Science 333:1453-5

Papaemmanuil et al (2011). Somatic SF3B1 mutation in myelodysplasia with ring sideroblasts. N Engl J Med 365:1384-95

Malcovati et al (2011). Clinical significance of SF3B1 mutations in myelodysplastic syndromes and myelodysplastic/myeloproliferative neoplasms. Blood (epub).

The following genes have been updated in this release:

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, ATRX, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CREBBP, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EP300, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, IL7R, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MED12, MEN1, MET, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SF3B1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, TET2, TNFAIP3, TP53, TSHR, WT1

COSMIC v56 Total Statistics

Experiments 4626046
Tumours 662467
Samples 665763
Mutant Samples 206711
Mutations 213615
Unique Mutations 67405
Papers curated 12818
Genes 20242
Fusions 7224
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v55 - 13th Sep 2011

COSMIC v55 Release

New curations include the tumour suppressor gene PRDM1, fusions of CRTC1/CRTC3-MAML2 and three new systematic screen publications. A new release of the Cancer Gene Census raises the number of known cancer genes to 464.

Cancer Gene Census Update

A number of genes have recently been implicated in oncogenesis and when this is confirmed, they are added to our Census of known cancer genes. The latest release details 464 known cancer genes, recently including SF3B1, ARID2, CCNE1, CDK12, FUBP1, XPO1, MED12.

New curated genes:

PRDM1 belongs to the PRDM gene family of transcriptional repressors characterized by DNA-binding Kruppel-type zinc fingers and the PR domain at the amino terminus. The encoded protein acts as a master regulator of B cell differentiation. PRDM1 has been identified as a tumour suppressor gene in the activated B cell-like subtype of diffuse large B cell lymphoma.

New Curated Gene Fusions:

CRTC1-MAML2 and CRTC3-MAML2
CRTC1, a member of a family of CREB coactivators, is fused with MAML2, a gene belonging to a family of Mastermind-like genes, in mucoepidermoid carcinoma (MEC) of the salivary gland and lung. The recurrent fusion consistently has breakpoints in intron 1 of both gene partners resulting in the N terminal basic domain of MAML2 being replaced by the CREB-binding coiled-coil domain of CRTC1. The same CRTC1-MAML2 fusion is also found in a subset of Warthin's tumours of the salivary gland and in some hidradenomas. CRTC3 is occasionally an alternative fusion partner for MAML2 in MEC.

Systematic screen curations:

Agrawal et al (2011). Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 333:1154-7

Wei et al (2011). Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nature Genetics 43:442-6

Zang et al (2010). Genetic and structural variation in the gastric cancer kinome revealed through targeted deep sequencing. Cancer Research 71:29-39

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CSF1R, CTNNB1, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLL3, MPL, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PIK3CA, PIK3R1, PPP2R1A, PRDM1, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v55 Total Statistics

Experiments 4577043
Tumours 639580
Samples 642744
Mutant Samples 186431
Mutations 192795
Unique Mutations 52524
Papers curated 12441
Genes 19885
Fusions 7062
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v54 - 12th Jul 2011

COSMIC v54 Release

Five new cancer genes have received full curation of their mutation spectrum, together with seven new fusion gene pairs focusing on the zinc finger protein PLAG1. Four new systematic screens are included, covering a range of cancer phenotypes. This extensive literature curation brings the total number of publications in COSMIC to over 12,000.

New curated genes:

MLL2 and MLL3
MLL2 and MLL3, members of the mixed-lineage leukaemia family, have been confirmed as tumour suppressor genes in childhood medulloblastoma. They encode histone methyl-transferases that methylate key lysine residues of histone tails through their SET domain activities. Both of these proteins are members of large transcriptional regulatory complexes involved in regulating chromatin structure and transcriptional co-activation.

NTRK3
NTRK3 encodes 1 of 3 high-affinity neurotrophin receptors that regulate growth, differentiation and apoptosis of neurons. It has been implicated in several tumour types including medulloblastoma, breast, colorectal, pancreas and large cell neuroendocrine tumours of the lung.

CREBBP and EP300
The histone acetyltransferase genes CREBBP and EP300 encode proteins which act as transcriptional co-activators in multiple signalling pathways. Both genes have been identified as tumour suppressors, with evidence of haploinsufficiency, in diffuse large B-cell lymphoma and follicular lymphoma where mutations affect the HAT coding domain. Mutations in CREBBP and EP300 are uncommon in epithelial cancers but have been found at high frequency in relapsed acute lymphoblastic leukaemia.

New Curated Gene Fusions:

CTNNB1-PLAG1, LIFR-PLAG1, CHCHD7-PLAG1, TCEA1-PLAG1, FGFR1-PLAG1
A subset of pleomorphic adenomas of the salivary gland has PLAG1 fusions and 5 partner genes have so far been identified. Consistent breakpoints occur in the 5' non-coding region of PLAG1 leading to promoter swapping with the fusion partner gene. For 3 of the partners the breakpoint also occurs in the non-coding region while for TCEA1 and FGFR1 it interrupts the coding sequence. The PLAG1 protein contains an N-terminal zinc-finger DNA binding domain and a C-terminal transactivation domain.

HAS2-PLAG1 and COL1A2-PLAG1
PLAG1 fusions have also been detected in lipoblastomas where again the promoter region of the partner gene, in these cases HAS2 or COL1A2, is fused to the coding sequence of PLAG1.

Systematic screen curations:

Parsons DW, et al (2011). The genetic landscape of the childhood cancer medulloblastoma. Science 331:435-9. A large genome-wide candidate gene screen assessing childhood medulloblastoma, this study identified two new chromatin remodelling genes as cancer genes, MLL2 and MLL3.

Kan Z, et al (2010). Diverse somatic mutation patterns and pathway alterations in human cancers. Nature 466:869-73. This study examined 441 tumours from a variety of sites and morphologies, through over 1500 known or candidate cancer genes, defining roles for over 100 in oncogenesis.

Jones S, et al (2010). Frequent Mutations of Chromatin Remodeling Gene ARID1A in Ovarian Clear Cell Carcinoma. Science 330:228-31. A whole-exome resequencing study of 8 ovarian clear cell carcinomas further implicates chromatin remodelling genes (PPP2R1A and ARID1A) in cancer.

Puente, et al (2011). Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia. Nature 475:101-5. Full-genome resequencing of four Chronic Lymphocytic Leukaemia patients suggests significant roles for at least four known cancer genes (NOTCH1, XPO1, MYD88 and KLHL6). The data in this publication have been extended with extra annotations from their submission to the ICGC DCC (R5 release).

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CREBBP, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, EP300, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MLL2, MLL3, MPL, MSH2, MSH6, MYD88, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, NTRK3, PAX5, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v54 Total Statistics

Experiments 4531163
Tumours 619320
Samples 622464
Mutant Samples 177322
Mutations 183107
Unique Mutations 46615
Papers curated 12026
Genes 19737
Fusions 6365
Structural Variants 2752

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v53 - 18th May 2011

COSMIC v53 Release

COSMIC v53 release includes full curation of IKZF1, PIK3R1, PAX5 and 11 new gene fusion pairs.

This fifty-third release of COSMIC brings the number of fully curated cancer genes to 98, together with a total of 81 curated fusion gene pairs. With the inclusion of 10 systematic screen publications, together with substantial output from the ICGC, TCGA and CGP studies, over 3 million curated gene screening experiments are now available in COSMIC, covering 19439 genes, with mutations identified in 11111 of these.

New curated genes:

IKZF1
IKZF1 encodes the zinc finger transcription regulator IKAROS, which is involved in normal lymphoid development, and is an important cancer gene in human B-cell progenitor acute lymphoblastic leukaemia (B-ALL). Whole gene deletions resulting in haploinsufficiency, focal somatic IKZF1 deletions (usually including exons encoding the DNA binding domain or regions immediately upstream of IKZF1) and somatic missense and frameshift mutations are observed in B-ALL. Focal deletions have been reported to result in truncated transcripts that can encode dominant-negative proteins, as have some missense/frameshift mutations.

PIK3R1
PIK3R1 encodes p85alpha, the regulatory subunit of the phosphatidylinositol 3-kinase. It has an N-terminal SH3 domain, a domain homologous to the Rho GTPase-activating protein domain of the BCR gene product (BCR domain) and 2 SH2 domains that flank an intervening anti-parallel coiled-coil domain (iSH2) which mediates binding to p110alpha (PIK3CA) catalytic subunit, also a known cancer gene. PIK3R1 mutations have been identified in both colorectal and endometrial cancers where most are localized to the iSH2 domain.

PAX5
PAX5, a gene within the B-cell development pathway, encodes a transcription factor belonging to the family of paired-box domain transcription factors. Somatic mutations in PAX5, including partial and whole deletions, and point mutations which mostly cluster in the DNA-binding or transcriptional regulatory domains, have been identified in B-cell progenitor acute lymphoblastic leukaemia.

New Curated Gene Fusions:

BCR-JAK2, PCM1-JAK2, ETV6-JAK2, SSBP2-JAK2, SEC31A-JAK2, PAX5-JAK2
JAK2 fusions involving multiple partner genes have been identified in haematological malignancies. Whereas the fusion proteins BCR-JAK2, ETV6-JAK2 and PAX5-JAK2 include the protein tyrosine kinase domain (JH1) of JAK2, PCM1-JAK2 includes JH1 and JH2 domains. In the SSBP2 fusion protein the N-terminal LisH domain may be analogous to the dimerization and DNA binding domains of other JAK2 fusion partners. SEC31A-JAK2 has been identified in classic Hodgkin's lymphoma while PAX5-JAK2 has been found in childhood acute lymphoblastic leukaemia.

KIF5B-ALK, PPFIBP1-ALK, SQSTM1-ALK, VCL-ALK
Novel fusion genes involving ALK have been found in a variety of cancers. In non-small cell lung cancer (NSCLC) the occurrence of KIF5B as an alternative to EML4 in ALK fusions strengthens the role of ALK signalling in the pathogenesis of NSCLC. The KIF5B-ALK fusion protein comprises the motor domain and coiled-coil domain of KIF5B and the juxtamembrane intracellular region of ALK, including the entire tyrosine kinase domain. Other novel ALK fusions include VCL-ALK which has been identified as a recurrent oncogenic mechanism in renal medullary carcinoma, a highly malignant sickle cell trait-associated cancer; PPFIBP1-ALK which has been found in pulmonary inflammatory myofibroblastic tumour; and SQSTM1-ALK which has been identified in ALK-positive large B cell lymphoma (ALK+ LBCL) where SQSTM1, a ubiquitin binding protein, replaces the more common ALK partners in ALK+ LBCL i.e. NPM1 and CLTC.

AKAP9-BRAF
Fusions between AKAP9 and BRAF have been identified in papillary thyroid carcinoma, where they are rare in sporadic tumours and more common following radiation exposure. The fusion protein lacks the N-terminal regulatory domain of BRAF and has an intact catalytic domain. AKAP9 belongs to the group of A-kinase anchor proteins which have the common function of binding to the regulatory subunit of protein kinase A.

Systematic screen curations - Focus on pancreatic cancers:

Jones S, et al (2008). Core signalling pathways in human pancreatic cancers revealed by global genomic analyses. Science 321:1801-6. Analysis of over 20,000 candidate genes for DNA mutations through 114 tumours revealed high mutation rates in 12 cell signalling pathways. The curation of this publication describes the mutations from this study not included in the recent ICGC release r3, detailing 337 additional mutations.

Jiao Y, et al (2011). DAXX/ATRX, MEN1, and mTOR pathway genes are frequently altered in pancreatic neuroendocrine tumors. Science 331:1199-203. This exome resequencing analysis of pancreatic neuroendocrine tumours examined the link between gene mutations and clinical prognosis, finding genes in the DAXX/ATRX, MEN1, and mTOR pathways particularly important. Full exomes were sequenced in 10 tumours, and key genes were followed up in a further 58; 218 mutations were described.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CARD11, CBL, CDC73, CDH1, CDKN2A, CEBPA, CRLF2, CSF1R, CTNNA1, CTNNB1, CYLD, DAXX, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GATA2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, IKZF1, JAK1, JAK2, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PAX5, PBRM1, PDGFRA, PHOX2B, PIK3CA, PIK3R1, PPP2R1A, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v53 Total Statistics

Experiments 3432750
Tumours 604950
Samples 608042
Mutant Samples 171209
Mutations 176856
Unique Mutations 43182
Papers curated 11680
Genes 19439
Fusions 6307
Structural Variants 2752
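
As an illustration of how these totals compare with the previous release, the following Python sketch parses two statistics blocks in the "label value" layout used in these notes and reports the change per category. The helper names parse_stats and growth are hypothetical and are not part of any COSMIC tool; the figures are copied from the v52 and v53 statistics listed in these notes.

```python
def parse_stats(block: str) -> dict[str, int]:
    """Parse lines of the form '<label> <integer>' into a dictionary."""
    stats = {}
    for line in block.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Labels may contain spaces, so split on the last run of whitespace only.
        label, value = line.rsplit(None, 1)
        stats[label] = int(value)
    return stats


def growth(old: dict[str, int], new: dict[str, int]) -> dict[str, int]:
    """Return new minus old for every label present in both releases."""
    return {label: new[label] - old[label] for label in new if label in old}


# Figures copied from the v52 and v53 statistics blocks in these notes.
V52 = """
Experiments 2968661
Tumours 589549
Samples 592626
Mutant Samples 166206
Mutations 171651
Unique Mutations 42491
Papers curated 11437
Genes 19389
Fusions 6261
Structural Variants 2752
"""

V53 = """
Experiments 3432750
Tumours 604950
Samples 608042
Mutant Samples 171209
Mutations 176856
Unique Mutations 43182
Papers curated 11680
Genes 19439
Fusions 6307
Structural Variants 2752
"""

if __name__ == "__main__":
    for label, delta in growth(parse_stats(V52), parse_stats(V53)).items():
        print(f"{label}: {delta:+d}")
```

The same two helpers can be pointed at any pair of statistics blocks in these notes, provided each line ends with a single integer.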

Additional CGP resources:

The Cancer Gene Census, a listing of all genes known to be involved in cancer promotion

The Cancer Cell Line Project, defining key driver mutations in 800 common cancer cell lines

Genomics of Drug Sensitivity, Analysis of drug sensitivity data in human cancer cell lines.

CGP Copy Number Analysis in Cancer, examining tumours for gains or losses of genomic content



v52 - 23rd Mar 2011

COSMIC v52 integrates Genomics of Drug Sensitivity

Also, 4 new genes and 16 new fusion pairs are curated from the literature, TCGA and ICGC portal data are updated and some improvements are made to COSMIC web pages.

Genomics of Drug Sensitivity in Cancer integration

The Genomics of Drug Sensitivity in Cancer, a collaborative project between the Sanger Institute and Massachusetts General Hospital, is screening a range of anti-cancer therapeutics against a large number of genetically characterized human cancer cell lines, generating drug sensitivity correlations. This release of COSMIC includes references to this work, detailing drugs where mutant gene/drug interactions have been shown to modify cell growth responses. For example, the recently described drug PLX4720 has been shown to have a significant growth modifying effect on cells containing mutant BRAF.

Cancer Gene Census update

21 new genes have been added to the cancer gene census. With the rise of large-scale genomic sequencing, the number of novel genes implicated in cancer is increasing. We aim to keep the Census up-to-date with the latest publications.

Website improvements

The front page of COSMIC has been redesigned to make it much easier to find data and sub-projects that were previously difficult to identify. For instance, the Cell Line Project, characterising the cancer genotypes of 800 common tumour cell lines, has always been a significant COSMIC sub-project, and is now appropriately highlighted. Also, the FTP site, which comprises export files of each release, is now much easier to find.

In addition, we have made the Tissue Overview page (e.g. the summary page for Skin) more informative. The graphic has been extended to include the top 20 genes mutated in the tissues/phenotypes selected, followed by a much simpler summary of the mutation load in the selection.

Data Curation

This release (v52) of COSMIC contains full curations of 4 new cancer genes together with 16 new fusion gene pairs. In addition, our curation of TCGA data, output via the TCGA portal, has been updated with further new ovarian serous carcinoma mutations. Also, we have completed our curation of the validated mutations in the third release of the ICGC, bringing in structural rearrangements for two Japanese liver cancer screens, HX5T and RK-003-C.

New curated genes:

DAXX and ATRX
DAXX and ATRX have been established as tumour suppressor genes in pancreatic neuroendocrine tumours. The protein product of ATRX has an ADD domain at the amino-terminus and a carboxy-terminal helicase domain, and forms a heterodimer with DAXX. The latter gene is an H3.3-specific histone chaperone.

MYD88
Highly recurrent oncogenic mutations in MYD88 have been identified in activated B-cell-like subtype of diffuse large B-cell lymphoma. MYD88 encodes an adaptor protein which mediates toll and interleukin-1 receptor signalling. The most common mutation is MYD88 L265P, which is also detected in mucosa-associated lymphoid tissue lymphoma.

CARD11
CARD11 has been identified as an oncogene in diffuse large B-cell lymphoma, particularly the activated B-cell-like subtype. The mutations commonly occur in the coiled-coil domain which mediates CARD11 oligomerization and NF-kappaB pathway activation.

New Curated Gene Fusions:

BRD4-C15orf55, BRD3-C15orf55
BRD4, from the BET family of nuclear proteins that carry 2 bromodomains and an additional extra terminal domain, functions in the regulation of cell cycle progression. It is fused to C15orf55 (NUT) in poorly differentiated (midline) carcinomas, a clinically aggressive form of carcinoma. The oncogenic fusion protein contains the N-terminal BRD4 sequence up to the serine-rich region followed by almost the entire NUT sequence. In a few cases BRD4 is replaced by the highly homologous BRD3.

PAX3-FOXO1, PAX7-FOXO1, PAX3-NCOA1, PAX3-NCOA2
FOXO1, a member of the fork head family of transcription factors, fused to either PAX3 or PAX7 is characteristic of alveolar rhabdomyosarcoma, although the former is more frequently detected. A consistent fusion protein results where the PAX DNA binding domain is fused to the fork head domain and C-terminal region of FOXO1. The novel variant translocations PAX3-NCOA1 and PAX3-NCOA2 have also been detected in rhabdomyosarcoma.

HMGA2-LPP, HMGA2-RAD51L1, HMGA2-LHFP, HMGA2-EBF1, HMGA2-CCNB1IP1, HMGA2-COX6C, HMGA2-ALDH2, HMGA2-NFIB, HMGA2-FHIT, HMGA2-WIF1
HMGA2, which encodes a protein belonging to the non-histone chromosomal high mobility group protein family, is fused to multiple partners in lipoma, pulmonary chondroid hamartoma, chondroma and uterine leiomyoma. The most common fusion is HMGA2-LPP which results in a protein consisting of the N-terminal DNA-binding domain of HMGA2 and the C-terminal LIM domain of LPP. HMGA2 fused to NFIB, FHIT and WIF1 has been detected in salivary gland pleomorphic adenomas.

The following genes have been updated in this release:

ABL1, AKT1, APC, ASXL1, ATRX, BRAF, BRCA1, BRCA2, CARD11, CBL, CDKN2A, CTNNB1, DAXX, EGFR, EML4, ERBB2, EZH2, FAM123B, FGFR3, FLT3, GATA1, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK2, KIT, KRAS, MAP2K4, MEN1, MET, MPL, MYD88, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, SMAD4, SMARCB1, STK11, TET2, TNFAIP3, TP53, VHL

v52 Statistics

Experiments 2968661
Tumours 589549
Samples 592626
Mutant Samples 166206
Mutations 171651
Unique Mutations 42491
Papers curated 11437
Genes 19389
Fusions 6261
Structural Variants 2752



v51 - 27th Jan 2011

COSMIC v51 Release

COSMIC release v51 includes the curation of 3 newly identified cancer genes, 4 gene fusions and updates from the ICGC and TCGA international consortia. In addition, upgrades to the website are now available, to improve the analysis of mutation distribution within a gene, and to navigate the increasing number of curated large-scale systematic screens. Both the Genomics of Drug Sensitivity in Cancer website and the Cancer Gene Census Listing were also recently updated (in December 2010).

New genes curated from the literature:

DNMT3A
DNMT3A is a member of a family of methyltransferase enzymes which catalyse addition of methyl groups to sequences containing CpG dinucleotides. Somatic mutations have been found in 22% of adult acute myeloid leukaemia patients and are associated with poor overall survival. Over half of the cases examined have a recurrent missense mutation at arginine 882. The mutations have been shown to impair normal enzymatic function and are heterozygous. These data suggest potential dominant negative activity of the mutations. Additional missense, nonsense and frameshift mutations have also been found throughout the latter half of the gene.

BAP1
Mutational inactivation of BAP1, a tumour suppressor gene, has been identified in uveal melanomas where it coincides with metastasis. BAP1 encodes a nuclear ubiquitin carboxy-terminal hydrolase (UCH) that also contains a UCH37-like domain and binding domains for BRCA1, BARD1 and HCFC1.

GNA11
As previously described for GNAQ, mutations affecting Q209 have been found in GNA11 in melanocytic tumours. The frequency of mutations increases progressively from blue nevus to primary melanomas to uveal melanoma metastases; an inverse pattern to that seen with GNAQ Q209 mutations. Activation of this GTPase pathway appears to be a predominant route to the development of uveal melanoma.

New curated fusion pairs:

TAF15-NR4A3
TAF15 can replace EWSR1 as a fusion partner to NR4A3 in extraskeletal myxoid chondrosarcoma. The resulting transcript (TAF15-NR4A3) is structurally and functionally similar to the EWSR1-NR4A3 fusion.

FUS-CREB3L2, FUS-CREB3L1
FUS-CREB3L2 is tumour-specific for low-grade fibromyxoid sarcoma (LGFMS), so it enables the accurate diagnosis of a sarcoma with sometimes indistinct histological features. In these fusions there is a diversity of genomic breakpoints, and these are often exonic rather than intronic. The rare variant FUS-CREB3L1 is occasionally detected in LGFMS.

FUS-DDIT3
Myxoid/round cell liposarcoma is characterized by the recurrent fusion FUS-DDIT3 where the 5' half of the FUS gene is fused to the entire reading frame of DDIT3, which encodes a leucine-zipper transcription factor belonging to the CCAAT/enhancer-binding protein family.

Curations of large international systematic screens:

TCGA - Full exome resequencing of Ovarian tumours
The Cancer Genome Atlas (TCGA) has recently released somatic mutation data from the exon screening of 325 serous ovarian cystadenocarcinoma tumours. We have now curated the majority of this information into COSMIC, comprising over 13,000 mutations. This can now be viewed here.

Study Samples Mutations
Ovarian Cystadenocarcinoma 325 13650

ICGC - Curation of third ICGC release 'Simple Mutations'
The International Cancer Genome Consortium (ICGC) has recently completed its third data release. We have curated into COSMIC all the Validated Somatic Simple Mutation data from this ICGC release, which can be viewed in the following studies:

Study                           Samples  Mutations
Japanese liver cancer (Riken)         1         30
Japanese liver cancer (NCC)           1         59
Breast cancer (JHU)                  18         28
Colorectal cancer (JHU)              13         29
Glioblastoma Multiforme (JHU)        23        209
Pancreatic cancer (JHU)              71       1450
Glioblastoma Multiforme (TCGA)       20         29
Lung Adenocarcinoma (TSP)            21         26

Website improvements:

We have improved the Distribution section of the Histogram page, providing much more analytical pie charts. Once a gene is selected, the mutation spectrum can be explored in a number of different ways. Nucleotide substitution breakdowns are presented as pie charts, and lengths of insertions and deletions are presented as histograms. The same filters as usual can be applied to this Distribution section, including tumour phenotype, mutation type and sample source. Links are also provided to view the mutation data in tabulated detail form, and to export it for external analysis. An example of this new system, presenting the data from the KIT oncogene can be seen here.

Systematic screens:

In the last few years, cancer genome screens have been growing substantially in size, and both whole-genome and large candidate gene screens are being curated into COSMIC. As well as curating publications, we are also collecting somatic mutation information from the data portals of large international consortia, beginning with the TCGA and ICGC. A new page now makes these easier to identify and navigate, please click here.

Related CGP Resources

Genomics of Drug Sensitivity in Cancer
The Genomics of Drug Sensitivity Website was updated on 23rd December 2010 with cell line sensitivity data and genomic correlates of sensitivity for Docetaxel, Gefitinib, CI-1040, BIBW 2992 and PLX4720. This large-scale project is a joint collaboration between the Wellcome Trust Sanger Institute (WTSI) and Massachusetts General Hospital Cancer Centre to correlate genomics with response to cancer drugs in 1,000 cancer cell lines. Within the next year we plan to integrate the data from this project into the COSMIC database and develop new web-based tools to browse and mine these data. Click here to visit the site.

Cancer Cell Line Project
The Cancer Cell Line Project website holds the data from the Cancer Genome Project's large-scale systematic characterization of a panel of 770 cancer cell lines across 64 known cancer genes. Click here to visit the site.

Cancer Gene Census
The Cancer Gene Census was updated at the beginning of December 2010. The Census is an up-to-date listing of genes causally implicated in cancer and now stands at 436 genes. Click here to view the listing.

The following genes have been updated in this release:

ABL1, ACVR1B, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BAP1, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CTNNA1, CTNNB1, DNMT3A, EGFR, EML4, ERBB2, ERG, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA3, GNA11, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MET, MLH1, MPL, MSH2, MSH6, MYB_ENST00000341911, NF1, NF2, NOTCH2, NPM1, NRAS, PBRM1, PDGFRA, PHOX2B, PIK3CA, PPP2R1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCA4, SMARCB1, TET2, TNFAIP3, TP53, TSHR, VHL, WT1

COSMIC v51 Total Statistics
Experiments 2946792
Tumours 577304
Samples 580306
Mutant Samples 161787
Mutations 167193
Unique Mutations 41405
Papers curated 11062
Genes 19001
Fusions 5573
Structural Variants 2729



v50 - 30th Nov 2010

COSMIC v50 Release

Our 50th release significantly enhances the genomic focus of the COSMIC system, including a full genome browser linked directly into the COSMIC websystem. Also included are genome-wide examinations of a further 24 tumours, comprising rearrangement screens of 17 pancreatic tumours and full exome analyses of 7 renal tumours. Two new cancer genes are curated from the scientific literature, together with a further systematic screen publication.

GBrowse

With the inclusion of increasing quantities of genomic data in COSMIC, new methods to navigate and visualise this information are now required. We have implemented a version of GMOD GBrowse and linked it into COSMIC where genomic co-ordinates are available, including coding and non-coding mutations, gene footprints, structural rearrangements and copy number variants (from CONAN). Full genome annotations have been imported from Ensembl, so that COSMIC data can be examined in this wider genomic context. Throughout the COSMIC websystem, links to GBrowse are available either as links in descriptive text or via the GBrowse icon, which will present the selected data in the local genomic context.

Genome rearrangement screens of 17 Pancreatic tumours

The recent Campbell et al (2010) publication "The patterns and dynamics of genomic instability in metastatic pancreatic cancer", describing genome structure rearrangements as early events in this cancer type, can be viewed in COSMIC, here.

Full-exome sequencing of 7 renal tumours; identification of PBRM1 gene as a tumour suppressor

The Cancer Genome Project has sequenced the full exomes of 7 renal tumours. These data, soon to be published, are available in COSMIC here.

Curation of Ding et al (2010) full-genome resequencing of 3 related breast tumours

In the Ding et al (2010) paper, 3 basal-like breast tumours were taken from one individual and their genomes were fully resequenced and comparatively analysed. We have curated the coding, non-coding and structural rearrangement mutations described in their study, available here.

New Curated genes:

ARID1A has been identified as a tumour suppressor gene in ovarian clear cell and endometrioid carcinomas. It encodes AT-rich interactive domain-containing protein 1A, which is a component of the ATP-dependent chromatin remodelling complex SWI/SNF.

PPP2R1A has been identified as an oncogene in ovarian clear cell carcinoma and in breast and lung carcinomas. It encodes a regulatory subunit of serine-threonine phosphatase 2 and this subunit forms the scaffold of the holoenzyme.

These curated genes have been updated this release

ABL1, AKT1, ALK, APC, ARID1A, ASXL1, ATM, BRAF, CDC73, CDH1, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FBXW7, FGFR3, FLT3, FOXL2, GATA1, GNAQ, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MEN1, MET, MLH1, MPL, NF2, NPM1, NRAS, PDGFRA, PIK3CA, PPP2R1A, PTEN, PTPN11, RUNX1, SETD2, SMARCA4, SMARCB1, STK11, TET2, TP53, TSHR, VHL, WT1


COSMIC v50 Total Statistics

Experiments 2908403
Tumours 562843
Samples 565823
Mutant Samples 142586
Mutations 147613
Unique Mutations 25601
Papers curated 10819
Genes 18660
Fusions 5112
Structural Variants 2729



v49 - 29th Sep 2010

COSMIC v49 Release

This release of COSMIC focuses on curation of data from the scientific literature. 57 curated genes have received updates; an additional systematic candidate gene screen has added 1121 mutations, and a further full-genome screen has contributed 45 confirmed coding mutations.

Systematic screen curations:

Shah et al (2009) describes the full-genome examination of a metastatic ER+ lobular breast cancer and compares the mutation spectrum to the primary tumour, finding 32 somatic non-synonymous coding mutations in the metastasis, but only 13 in the primary:
Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution. Shah SP, Morin RD, Khattra J, Prentice L, Pugh T, Burleigh A, Delaney A, Gelmon K, Guliany R, Senz J, Steidl C, Holt RA, Jones S, Sun M, Leung G, Moore R, Severson T, Taylor GA, Teschendorff AE, Tse K, Turashvili G, Varhol R, Warren RL, Watson P, Zhao Y, Caldas C, Huntsman D, Hirst M, Marra MA, Aparicio S. (2009) Nature 461:809-13.

Wood et al (2007), a further analysis expanding that of Sjoblom et al (2006), examines the same tumour set through a set of RefSeq genes additional to the CCDS genes covered in the earlier analysis. 1121 mutations were identified, expanding the mutation spectrum defined in the earlier publication:
The genomic landscapes of human breast and colorectal cancers. Wood LD, Parsons DW, Jones S, Lin J, Sjoblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, Silliman N, Szabo S, Dezso Z, Ustyanksky V, Nikolskaya T, Nikolsky Y, Karchin R, Wilson PA, Kaminker JS, Zhang Z, Croshaw R, Willis J, Dawson D, Shipitsin M, Willson JK, Sukumar S, Polyak K, Park BH, Pethiyagoda CL, Pant PV, Ballinger DG, Sparks AB, Hartigan J, Smith DR, Suh E, Papadopoulos N, Buckhaults P, Markowitz SD, Parmigiani G, Kinzler KW, Velculescu VE, Vogelstein B. (2007) Science. 318:1108-13.

These curated genes have been updated this release

ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, CBL, CDKN2A, CEBPA, CRLF2, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, EZH2, FAM123B, FBXW7, FGFR2, FLT3, FOXL2, GATA1, GNAQ, GNAS, HRAS, IDH1, IDH2, JAK1, JAK2, KIT, KRAS, MEN1, MET, MPL, NOTCH1, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RUNX1, SETD2, SMARCA4, STK11, TET2, TNFAIP3, TP53, TSHR, VHL, WT1


COSMIC v49 Total Statistics

Experiments 2888511
Tumours 548399
Samples 551325
Mutant Samples 138836
Mutations 143772
Unique Mutations 25079
Papers curated 10578
Genes 18647
Fusions 5050
Structural Variants 2306



v48 - 27th Jul 2010

COSMIC v48 Release

This release brings the majority of curated p53 mutation data into COSMIC in collaboration with IARC, significantly improving our coverage of the key cancer genes. Other new curated genes are TET2 & SETD2, together with ten new fusion pairs. Two new systematic screens are also included. The system has been updated to provide genomic coordinates on both NCBI36 and the later GRCh37 genome builds.

TP53

In collaboration with the p53 group at IARC (http://www-p53.iarc.fr/), we have imported the majority of p53 mutation data into COSMIC. The system has previously lacked substantial coverage of this gene, since it has been fully curated at IARC. However, our new collaboration has brought these two datasets together in COSMIC. Over 73% of the IARC Somatic dataset is now present in COSMIC, comprising a total of 20129 samples mutated, of 66242 samples analysed. The remainder of IARC's curated p53 data will become available in a later release. This p53 data can be viewed here.
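
As a quick check on the figures quoted above, the proportion of analysed samples carrying a p53 mutation can be worked out directly. The snippet below is only an illustrative sketch; the function name is hypothetical and is not part of any COSMIC tool.

```python
def mutated_fraction(mutated: int, analysed: int) -> float:
    """Return the fraction of analysed samples that carry a mutation."""
    if analysed <= 0:
        raise ValueError("analysed must be a positive number of samples")
    return mutated / analysed


# Figures quoted in the v48 notes for the imported IARC p53 data.
fraction = mutated_fraction(20129, 66242)
print(f"{fraction:.1%}")  # prints 30.4%
```

In other words, roughly three in ten of the analysed samples in this import carry a p53 mutation.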

Full mutation details have been curated from the scientific literature for two new genes and are now available:

SETD2, a tumour suppressor gene, encodes a histone H3 lysine 36 methyltransferase and is inactivated in clear cell renal carcinoma.

TET2, The tumour suppressor gene TET2 (ten-eleven-translocation gene, 4q24) was found to be heterozygously deleted in MDS and leukaemia patients whose remaining copy carried a somatic point mutation. Subsequently a wide spectrum of somatic mutations, often nonsense or frameshift, has been found in a variety of myeloproliferative neoplasms, leukaemias (AML, sAML, CMML) and mastocytosis. These mutations are thought likely to be an early event in the pathology of these diseases and there is some evidence that TET2 is both a tumour suppressor and haematopoietic regulator.

10 new fusion gene pairs have been curated from the scientific literature:

ASPSCR1-TFE3, A characteristic translocation in alveolar soft part sarcoma results in the fusion of the N-terminal region of ASPSCR1 to the C-terminal region of TFE3. Two alternative fusion breakpoints are observed in TFE3 resulting in expression of 2 distinct fusion transcripts. ASPSCR1-TFE3 is also found in a subset of renal cell carcinomas.

PRCC-TFE3, SFPQ-TFE3, NONO-TFE3, CLTC-TFE3 Papillary renal carcinomas have fusions involving the TFE3 transcription factor gene. Most commonly the fusions are ASPSCR1-TFE3 or PRCC-TFE3 but variant translocations have also been identified in which TFE3 is fused to SFPQ, NONO or CLTC.

ETV6-NTRK3, A recurrent rearrangement in congenital (infantile) fibrosarcoma fuses the helix-loop-helix protein dimerization domain of ETV6 with the protein tyrosine kinase domain of NTRK3. ETV6-NTRK3 fusions are also found in congenital mesoblastic nephroma, a pathogenetically related tumour, and in a rare form of breast cancer, secretory carcinoma.

SLC45A3-BRAF, ESRP1-RAF1, AGTRAP-BRAF While SLC45A3-BRAF and ESRP1-RAF1 fusions have been identified in ETS rearrangement-negative prostate cancers, AGTRAP-BRAF has been found in stomach cancer; all highlighting the role of RAF pathway fusions in solid tumours.

MYB-NFIB, A recurrent fusion of the MYB oncogene to NFIB, a member of the human nuclear factor I gene family of transcription factors, has been identified in adenoid cystic carcinomas of the breast and the head and neck. The common feature of multiple splice variations is the deletion of exon 15 of MYB and its 3'-UTR.

Systematic screen curations:

Mardis et al (2009) describes the full-genome sequencing of a single AML tumour, resulting in 12 coding mutations and 52 confirmed high quality non-coding mutations, with a follow-up study examining a set of 187 AML tumours through the genes mutated in the primary sample.

Ding et al (2008) examines 188 lung adenocarcinomas through 623 genes known to be involved with cancer. 1013 mutations were described, the most mutated genes being TP53, KRAS, STK11 and EGFR.

Genome co-ordinate update to GRCh37

All genes and mutations with NCBI36 genome co-ordinates have now been updated to GRCh37. New DAS tracks have been created, and the website and data exports have been altered to include both co-ordinate systems.

These curated genes have been updated this release

ABL1, ACVR1B, AKT1, ALK, APC, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SETD2, SMAD4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TET2, TP53, TSHR, VHL, WT1


COSMIC v48 Total Statistics

Experiments 2760220
Tumours 541928
Samples 544809
Mutant Samples 136326
Mutations 141212
Unique Mutations 23907
Papers curated 10383
Genes 18490
Fusions 4946
Structural Variants 2306



v47 - 24th May 2010

COSMIC v47 Release

This latest release of COSMIC includes six new point-mutated genes with full literature curation, together with curation of five new fusion genes involving ALK. Additionally, recent updates to the public TCGA Glioblastoma dataset have been included, as have a number of whole gene deletions interpreted from SNP6.0 microarray data. Recent publications have been curated to update sixty one other curated genes.

New curated genes

GATA2, This gene encodes a member of the GATA family of zinc-finger transcription factors and the encoded protein plays an essential role in regulating transcription of genes involved in normal haematopoietic cell differentiation and survival. Somatic mutations in GATA2 are associated with acute myeloid transformation in a subset of chronic myelogenous leukaemia (CML) patients.

GATA3, This gene encodes a protein which belongs to the GATA family of transcription factors. The protein contains two GATA-type zinc fingers; it is a regulator of T-cell development and plays an important role in endothelial cell biology. Germline mutations in this gene are found in individuals with HDR syndrome (hypoparathyroidism with sensorineural deafness and renal dysplasia). These mutations cluster in the region of the highly conserved second zinc finger. Somatic mutations in the same region have been identified in tumour tissue from both familial and sporadic breast cancer patients.

EZH2, This gene encodes a member of the Polycomb-group (PcG) family; a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27). Somatic mutations have been found within the catalytic SET domain in diffuse large B-cell lymphoma and follicular lymphoma.

KDR, KDR encodes the kinase insert domain receptor, one of the two receptors of the VEGF (Vascular endothelial growth factor) - a major growth factor for endothelial cells. It is a vascular specific, type III receptor tyrosine kinase. Germline mutations of this gene are implicated in infantile capillary haemangiomas. Somatic mutations have been found in samples derived from angiosarcomas of the breast/chest wall.

CRLF2, The type I cytokine receptor subunit CRLF2 (thymic stromal lymphopoietin receptor, TSLPR) has been identified as a proto-oncogene in adult and high risk paediatric B-ALL. Over-expression is common in cases lacking rearrangements of TEL, MLL, TCF3 and BCR-ABL and can result from CRLF2 rearrangement or mutually exclusive somatic gain of function point mutations including those in CRLF2, JAK1 or JAK2. The predominant CRLF2 mutation, Phe232Cys, promotes constitutive dimerization and cytokine independent growth.

JAK1, This gene encodes a member of the Janus kinase family, which comprises JAKs 1, 2 and 3 as well as Tyk2. Activating JAK1 somatic mutations have been found in the SH, pseudokinase and kinase domains in T-ALL, B-ALL and, more rarely, AML patients. Mutations have also been found in a variety of non-haematopoietic cancers. Mutations include JAK1 V658F, corresponding to the JAK2 V617F mutation commonly found in PV and ET as well as other MPNs; and R724H, corresponding to JAK2 R683 and JAK3 R657, mutations of which have been found in DS-ALL/B-ALL and DS-AMKL respectively.

New curated ALK fusion genes:

The following gene fusions have been curated from the scientific literature:
ATIC-ALK, CARS-ALK, TFG-ALK, TPM3-ALK, TPM4-ALK Variant ALK fusions, including ATIC-ALK, TPM3-ALK, TPM4-ALK and TFG-ALK, have been identified in ALK-positive anaplastic large cell lymphoma. Each translocation product retains the ALK kinase domain. ALK activation is also a recurrent oncogenic event in inflammatory myofibroblastic tumours, where this is sometimes achieved through fusion with ATIC, TPM3, TPM4 or CARS.

TCGA Glioblastoma update

We have updated our recent curation of the TCGA somatic Glioblastoma mutation data, now including phase II data direct from the public TCGA data portal. The combined data can be browsed here.

Statistics

Glioblastoma Samples 424
Genes 1,212
Sequencing Experiments 513,888
Mutations 1,243

Whole gene deletions

55 whole gene or whole-exon deletions have been defined in the core cell lines by interpretation of SNP6.0 microarray data. While many of these have been confirmed, the few unverified mutations are presented with links to the originating data for independent examination. Mutations involving the deletion of whole gene sequences have been reannotated "p.0?" in line with current HGVS recommendations.
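
The reannotation rule described above can be sketched as follows. This is only an illustration under stated assumptions: the `Interval` type, the function names and the coordinates are hypothetical, both intervals are assumed to lie on the same chromosome, and partial deletions are simply given the generic HGVS unknown-effect syntax `p.?` here rather than whatever COSMIC actually records for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Genomic interval on a single chromosome (1-based, inclusive)."""
    start: int
    end: int


def covers(deletion: Interval, gene: Interval) -> bool:
    """True when the deleted interval spans the whole gene footprint."""
    return deletion.start <= gene.start and deletion.end >= gene.end


def protein_syntax(deletion: Interval, gene: Interval) -> str:
    """Return 'p.0?' for a whole gene deletion, 'p.?' otherwise.

    'p.0?' is the HGVS form for "probably no protein is produced".
    """
    return "p.0?" if covers(deletion, gene) else "p.?"


# Hypothetical coordinates, for illustration only.
gene = Interval(start=200, end=800)
whole = protein_syntax(Interval(start=100, end=900), gene)
partial = protein_syntax(Interval(start=100, end=500), gene)
print(whole, partial)
```

Keeping the interval test separate from the annotation step makes it straightforward to swap in a stricter definition of "whole gene" (for example, coding sequence only) without touching the syntax logic.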

New search system

COSMIC is now combined into the new Sanger-wide search system. This allows much richer searching, using multiple terms such as "BRAF melanoma", to much more easily find data without complex navigation through the website. The system additionally searches other Sanger genomic data, giving indications where compatible data might be found elsewhere on the Institute website.

These curated genes have been updated this release

ABL1, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDH1, CDKN2A, CEBPA, CSF1R, CTNNB1, CYLD, EGFR, ERBB2, EZH2, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, GATA1, GATA2, GATA3, HNF1A, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KDR, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PDGFRA, PIK3CA, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMO, SOCS1, SRC, STK11, SUFU, VHL, WT1


COSMIC v47 Total Statistics

Experiments 2564761
Tumours 464139
Samples 466851
Mutant Samples 113286
Mutations 116977
Unique Mutations 20090
Papers curated 9202
Genes 18485
Fusions 4722
Structural Variants 2306



v46 - 8th Mar 2010

COSMIC v46 Release

The second full-genome resequencing study from the CGP at the Sanger Institute, UK is now available, together with the curation of Parsons et al (2008), a systematic candidate gene screen of Glioblastomas. In addition, the published literature has been fully curated for fusion mutations between seven new gene pairs.

Full Genome resequencing of NCI-H209

The recent Pleasance et al (2010) publication "A small-cell lung cancer genome with complex signatures of tobacco exposure" (Nature 463, 184-190) is now available within COSMIC; please click here.

Systematic Screen Curation

The largest published candidate gene screen of Glioblastomas, Parsons et al (2008), is now curated in COSMIC; please click here:

An integrated genomic analysis of human glioblastoma multiforme. Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, Vogelstein B, Velculescu VE, Kinzler KW. Science. 2008;321;1807-12. PMID: 18772396 DOI: 10.1126/science.1164382

Statistics

Samples 105
Mutations 2449

Fusion mutations between 7 new gene pairs have been curated from the literature for this release.

FUS-ERG, FUS-FEV, FUS-ATF1 Both FUS-ERG and FUS-FEV fusions have been identified as alternatives to EWSR1-ETS transcription factor fusions in Ewing's sarcoma, and FUS-ERG also occurs in t(16;21) myeloid leukaemia as well as in these solid tumours. FUS-ATF1 is found in angiomatoid fibrous histiocytoma, where the fusion of the N-terminus of FUS and the DNA binding domain of ATF1 is similar to the EWSR1-ATF1 fusion found in clear cell sarcoma.

SS18-SSX1 This fusion is characteristic for synovial sarcoma along with SS18-SSX2 and more rarely, SS18-SSX4 fusions. Through its N-terminal SNH domain SS18 protein is involved in the remodelling of chromatin structures and functions as a transcriptional activator whereas SSX proteins have 2 putative transcription-repressor domains, one of which, an SSXRD domain in the C-terminal region, is preserved in the fusion protein.

SRGAP3-RAF1 This oncogenic fusion has been identified in paediatric pilocytic astrocytoma as an alternative to the previously described KIAA1549-BRAF fusion. It also activates the ERK/MAPK pathway; the auto-inhibitory domain of RAF1 being replaced by SRGAP3.

COL1A1-PDGFB This recurrent fusion characterizes dermatofibrosarcoma protuberans and its juvenile form, giant cell fibroblastoma. The fusion consistently deletes exon 1 of PDGFB, releasing this growth factor from its normal regulation. The breakpoints in COL1A1, which encodes an extracellular matrix protein, occur in various exons in the alpha-helical domain.

JAZF1-SUZ12 A fusion involving these two genes is common but not universal in endometrial stromal sarcomas, occurring less frequently in high-grade tumours. The genes encode novel proteins with zinc finger motifs and these are retained in the fusion.

The following curated genes have been updated in this release

ABL1, ACVR1B, AKT1, ALK, APC, ASXL1, ATM, BRAF, BRCA1, BRCA2, CBL, CDC73, CDH1, CDKN2A, CEBPA, CSF1R, CTNNA1, CTNNB1, CYLD, EGFR, EML4, ERBB2, ERG, FAM123B, FBXW7, FGFR1, FGFR2, FGFR3, FLT3, FOXL2, GATA1, GNAQ, GNAS, HNF1A, HRAS, IDH1, IDH2, JAK2, JAK3, KIT, KRAS, MAP2K4, MEN1, MET, MLH1, MPL, MSH2, MSH6, NF1, NF2, NOTCH1, NOTCH2, NPM1, NRAS, PDGFRA, PHOX2B, PIK3CA, PRKAR1A, PTCH1, PTEN, PTPN11, RB1, RET, RUNX1, SMAD4, SMARCA4, SMARCB1, SMO, SOCS1, SRC, STK11, SUFU, TNFAIP3, TSHR, VHL, WT1

COSMIC v46 Total Statistics


Experiments 2077858
Tumours 449676
Samples 451972
Mutant Samples 108773
Mutations 112256
Unique Mutations 19239
Papers curated 8911
Genes 18478
Fusions 4657
Structural Variants 2307



v45 - 21st Jan 2010

COSMIC v45 Release

The first full-genome resequencing study is now available, together with the genome-wide rearrangement screens of 24 breast tumours. In addition, five new cancer genes have been curated from the literature.

To make the data easier to investigate in depth, the website has been upgraded with new specialisation features, together with new views on mutation spectrum and distribution. Finally, we are introducing a new COSMIC Biomart, where all COSMIC's information will be available in this industry-standard data mining tool.

Full Genome resequencing of COLO-829

The recent Pleasance et al (2010) publication "A comprehensive catalogue of somatic mutations from a human cancer genome" (Nature 463, 191-196) is now available within COSMIC; please click here.

Whole-genome rearrangement screen of 24 Breast tumour samples:

Also, the CGP Stephens et al (2009) paper "Complex landscapes of somatic rearrangement in human breast cancer genomes" (Nature 462, 1005-1010) is now available in COSMIC; please click here. A paired-end genome-wide Illumina sequencing strategy revealed numerous rearrangements in very diverse patterns between the samples examined.

New genes curated from the scientific literature

GNAQ is the alpha subunit of one of the heterotrimeric GTP-binding proteins that mediate stimulation of protein kinase C signalling. Mutations in GNAQ, occurring at codon 209 in the catalytic domain, have been found as common and early mutational events in uveal melanomas.

TNFAIP3 is a negative regulator of the NF-kappa B pathway functioning through the removal of activating Lys63-linked ubiquitins and the Lys48-linked ubiquitination of receptor-interacting proteins. TNFAIP3 has been shown to be a genetic target in B-lineage lymphomas such as mucosa-associated lymphoma and Hodgkin's lymphoma of nodular sclerosing histology.

CBL encodes a protein with multiadaptor function and E3 ubiquitin ligase activity that targets a variety of tyrosine kinases for degradation. Mutations in CBL have been identified in myeloid malignancies, occurring in the critical linker and ring finger domains of the protein.

JAK3 is a member of the non-receptor tyrosine kinase family which includes JAK2. Rare but significant JAK3 activating mutations located in the JH2 (pseudokinase) and JH6 (receptor binding) domains have been found in Down syndrome and Non-DS acute megakaryoblastic leukaemia (AML-M7). Mutations have also been found in various myeloproliferative neoplasms, lymphomas and carcinomas.

NOTCH2 is a Type 1 transmembrane protein with an extracellular domain consisting of multiple epidermal growth factor-like (EGF) repeats, and an intracellular domain consisting of multiple different domain types. The Notch2 receptor and its 5 ligands, which include Jagged1, Jagged2, and Delta-like 1, 3 and 4, send signals that are important for development before birth. After birth, Notch2 signaling is involved in tissue repair. Mutations in the NOTCH2 gene have been identified in a small percentage of people with Alagille syndrome and malformations in the kidneys, especially in filtering structures. NOTCH2 is also preferentially expressed in mature B cells, is essential for marginal zone B-cell generation, and mutations are evident in a subset of individuals with diffuse large B-cell lymphomas.

Web site enhancements

The main histogram page of the COSMIC website has been improved to provide better ways of selecting and viewing subsets of data. In the navigation bar on the left side, new options are now available to redraw the histogram and associated tables based on four parameters: mutation type (eg deletion, nonsense substitutions, etc), sample source (cultured or tissue sample), somatic status (confirmed somatic or unknown) and systematic screen (genome-wide screen). In addition to redrawing the histogram and tables, a new "Distribution" button displays pie charts of relevant information about the data selected.

The sample summary page has also been upgraded, with every CGP sample (examined through numerous genes) receiving a mutation spectrum diagram. This comprises a histogram showing the relative frequencies of each substitution type, together with a count of insertion/deletion mutations. This is highly useful when looking for mutation signatures which may show characteristics of, for instance, tobacco or UV light exposure.
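
The kind of tally behind such a diagram can be sketched in a few lines. This is a minimal illustration, not the code used by the website: the function name and the input format (pairs of reference and alternate allele strings, with an empty string for the missing side of an insertion or deletion) are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def mutation_spectrum(
    mutations: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, float], int]:
    """Return (relative frequency per substitution type, indel count).

    Each mutation is a (ref, alt) pair of allele strings.
    """
    substitutions: Counter = Counter()
    indels = 0
    for ref, alt in mutations:
        if len(ref) == 1 and len(alt) == 1 and ref != alt:
            # Single-base substitution, e.g. C>T.
            substitutions[f"{ref}>{alt}"] += 1
        elif len(ref) != len(alt):
            # Allele lengths differ: insertion or deletion.
            indels += 1
    total = sum(substitutions.values())
    frequencies = {k: n / total for k, n in substitutions.items()} if total else {}
    return frequencies, indels


# Made-up sample data, for illustration only.
sample = [("C", "T"), ("C", "T"), ("G", "A"), ("C", "A"), ("AT", ""), ("", "G")]
freqs, indel_count = mutation_spectrum(sample)
print(freqs, indel_count)
```

Frequencies are computed relative to substitutions only, so the indel count sits alongside the histogram rather than distorting its proportions.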

Biomart

The new COSMIC biomart is now available; please click here. This system allows much more specialised selection of data in COSMIC and is very useful for data mining. In addition, it can be directly linked to Ensembl for federated querying across both databases.

The following curated genes have been updated in this release

JAK2, JAK3, MAP2K4, GNAS, MPL, SOCS1, WT1, CYLD, FBXW7, MEN1, NF1, RUNX1, ASXL1, NOTCH2, IDH1, IDH2, APC, CDH1, VHL, GNAQ, BRAF, HRAS, CEBPA, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, AKT1, EGFR, ERBB2, CDKN2A, CBL, GATA1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, MSH6, PRKAR1A, KRAS, PIK3CA, MET, TNFAIP3

COSMIC v45 Total Statistics

Experiments 1654274
Tumours 434364
Samples 436577
Mutant Samples 101860
Mutations 105171
Unique Mutations 16788
Papers curated 8624
Genes 13634
Fusions 3635
Structural Variants 2249



v44 - 4th Nov 2009

COSMIC v44 Release

This release of COSMIC includes 4 new curated genes, 8 new curated fusion pairs and the TCGA systematic screen publication of 91 Glioblastoma tumour samples. In addition, a new CGP study is available (Adenoid cystic carcinoma) together with substantial updates to existing data.

New curated genes

IDH2 encodes a mitochondrial NADP(+)-dependent isocitrate dehydrogenase which catalyzes oxidative decarboxylation of isocitrate to alpha-ketoglutarate. It is now implicated in the pathogenesis of malignant gliomas and some secondary glioblastomas lacking IDH1 mutations have IDH2 mutations at the analogous amino acid (R172).

AKT1 encodes a serine-threonine protein kinase which is activated by phosphorylated phosphoinositides and is a central mediator of the PI3kinase signalling pathway. A common mutation (E17K) has been identified in the pleckstrin homology domain in cancers of the colon, breast, lung and ovary.

ASXL1 belongs to a family of proteins regulating chromatin remodelling. It was originally implicated via aCGH on MDS/AML samples. Mutations are mainly frameshifts; the predicted truncated proteins lack the PHD finger domain, potentially compromising the function of the associated chromatin modifiers.

FOXL2, forkhead box L2 is a winged helix/forkhead transcription factor gene, encoding a nuclear protein that is specifically expressed in eyelids and in fetal and adult ovarian follicular cells. Germline mutations in FOXL2 are responsible for BPES - blepharophimosis ptosis epicanthus inversus syndrome - an autosomal dominant disorder consisting of eyelid abnormalities (only, in Type II) and ovarian failure (Type I). Somatic mutations have recently been described in ovarian granulosa cell tumours.

New curated gene fusion pairs:

The following gene fusions have been curated from the scientific literature:
EML4 / ALK
MSN / ALK
NPM1 / ALK
CLTC / ALK
SEC31A / ALK
RANBP2 / ALK
SS18 / SSX2
SS18 / SSX4

Systematic screen curation:

The first systematic screen of the Cancer Genome Atlas Research Network (PMID 18772890) is now curated in COSMIC.

Comprehensive genomic characterization defines human glioblastoma genes and core pathways.
Cancer Genome Atlas Research Network
Nature. 2008;455;1061-8. PMID: 18772890 DOI: 10.1038/nature07385

Statistics


Glioblastoma Samples 91
Genes 599
Sequencing Experiments 54509
Mutations 662

New CGP resequencing study: Adenoid Cystic Carcinoma Candidate Gene Screen

Adenoid cystic carcinoma is a slow growing tumour of the secretory glands, arising most commonly in the salivary glands but also occurring in other parts of the body. As part of an ongoing research effort funded by the Adenoid Cystic Carcinoma Research Fund (www.accrf.org), 400 candidate genes (including genes implicated in cancer, cell signaling and growth control) were sequenced for small point mutations. This work was carried out on 25 samples (provided by ACCRF collaborative research group member Dr. Adel El-Naggar). PCR products were generated for the entire set of PCR amplimers, all amplimers were then concatenated individually for each tumour and matching normal DNA sample, and this material was sequenced using next generation sequencing. In total 8 somatic point mutations were identified in 8 genes. No highly prevalent point mutation was identified in this set of genes.

These curated genes have been updated this release

KRAS, PIK3CA, FGFR2, MET, ABL1, FGFR1, JAK2, MAP2K4, GNAS, EML4, FOXL2, PTCH1, MPL, SOCS1, HNF1A, WT1, NF2, CYLD, FBXW7, MEN1, NF1, RUNX1, IDH1, IDH2, ASXL1, FAM123B, APC, CDH1, SMAD4, VHL, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RB1, RET, SMARCB1, SUFU, ACVR1B, AKT1, ALK, ATM, EGFR, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, MLH1, MSH2, MSH6, PRKAR1A

COSMIC v44 Total Statistics

Experiments 1631186
Tumours 419018
Samples 421193
Mutant Samples 97932
Mutations 101138
Unique Mutations 16072
Papers curated 8336
Genes 13501
Fusions 3521
Structural Variants 40



v43 - 26th Aug 2009

COSMIC v43 Release

The COSMIC curation systems have been extended to encompass the entry of large-scale systematic screen papers. For this release, we have entered the first such paper, the Sjoblom et al (2006) screen of human breast and colorectal cancers. This release also contains two new genes successfully curated from the scientific literature (IDH1, SMARCA4) and the finalisation of two of the Cancer Genome Project's current resequencing studies.

Systematic Screen Papers Curated in COSMIC

For this release of COSMIC we have entered the Sjoblom et al (2006) systematic screen paper of human breast and colorectal cancers. An additional 8,648 genes have been added to COSMIC along with the 1,672 mutations from the paper. The COSMIC reference overview page for this publication is available here.

The consensus coding sequences of human breast and colorectal cancers. Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE. Science. 2006 Oct 13;314(5797):268-74. Epub 2006 Sep 7. PMID: 16959974

CGP resequencing studies completed

The resequencing of candidate genes in Pilot and Renal tumour sets has now been completed. The finalised studies examined 2978 samples through 4766 genes, discovering a total of 5437 mutations. All of these can be found in COSMIC's CGP Resequencing Studies Site.

New curated genes

IDH1 is a catalytic enzyme causing NADP+ dependent oxidative decarboxylation of isocitric acid. It plays an important role in the control of glucose-stimulated insulin secretion and the cholesterol and fatty acid biosynthetic pathways. Originally implicated in human cancer in genome-wide sequencing scans, when mutated it is an indicator for the longer survival of these patients.

SMARCA4 is a scaffold protein, forming a functional part of the SWI/SNF complex involved in the control of transcription.

These curated genes have been updated this release

FBXW7, MEN1, NF1, BRAF, HRAS, CSF1R, CTNNB1, FLT3, KIT, PDGFRA, PTEN, RET, SMARCB1, SUFU, ACVR1B, ATM, EGFR, ERBB2, SRC, CDKN2A, FAM123B, GATA1, SMO, NOTCH1, NPM1, PTPN11, NRAS, FGFR3, BRCA1, BRCA2, APC, CDH1, SMAD4, VHL, TSHR, MLH1, MSH2, MSH6, SMARCA4, RUNX1, PHOX2B, GNAS, KRAS, PIK3CA, FGFR2, FGFR1, IDH1, JAK2, JAK3, MAP2K4, TET2, PRKAR1A, CDC73, PTCH1, MPL, CTNNA1, SOCS1, HNF1A, WT1, ERG, NF2

COSMIC v43 Total Statistics


Experiments 1506545
Tumours 366477
Samples 368592
Mutant Samples 85749
Mutations 88727
Unique Mutations 14971
Papers curated 7797
Genes 13423
Fusions 2770
Structural Variants 40



v42 - 28th May 2009

COSMIC v42 Release

For this release of COSMIC two known cancer genes (GNAS and ALK) and 3 gene fusions (FCHSD1 / BRAF, KIAA1549 / BRAF, EWSR1 / NR4A3) have been successfully curated from the scientific literature. The Cancer Cell Line Project has also been updated with the addition of 80 mutations.


Cancer Cell Line Project Update

The Cancer Cell Line Data has been updated with the addition of 80 mutations. The project has also published a further set of variants identified by the screen which have been classified as Tentatively Oncogenic Variant (TOV) or Unknown Variant (UV). These variants are currently available from our website as an Excel file.

Curation of known cancer genes ALK and GNAS

Two further cancer genes have been curated with the addition of 95 mutations for ALK and 235 mutations for GNAS.

Curation of gene fusions

The following gene fusions have been curated from the scientific literature:
FCHSD1 / BRAF
KIAA1549 / BRAF
EWSR1 / NR4A3

Genes updated: KRAS, PIK3CA, ABL1, FGFR1, JAK2, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, SUFU, ERBB2, SRC, STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MLH1, MSH2, MSH6, APC, CDH1, SMAD4, MET, EGFR, FLT3, PTCH1, MPL, WT1, CYLD, FBXW7, NF1, ALK, FGFR3, RET, NOTCH1, NF2, GNAS

COSMIC v42 Total Statistics


Experiments 1111579
Tumours 339481
Samples 341522
Mutant Samples 76132
Mutations 78933
Unique Mutations 12905
Papers curated 7386
Genes 4775
Fusions 2424
Structural Variants 40



v41 - 4th Mar 2009

COSMIC v41 release

This release of COSMIC comprises an update of published data in which 44 genes have been updated with the addition of 22516 samples and a further 7387 mutations.


Gene Update

STK11, CDKN2A, GATA1, SMO, NPM1, PTPN11, NRAS, BRCA2, MSH2, KRAS, PIK3CA, JAK2, MAP2K4, BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, RB1, ATM, ERBB2, FBXW7, NF1, FAM123B, APC, CDH1, VHL, MET, EGFR, FLT3, PTCH1, MPL, SOCS1, HNF1A, WT1, CYLD, FGFR3, RET, RUNX1, TSHR, PHOX2B, NOTCH1.


COSMIC v41 Total Statistics

Experiments 1078748
Tumours 313780
Samples 315778
Mutant Samples 70086
Mutations 72718
Unique Mutations 12349
Papers curated 6876
Genes 4773
Fusions 2266
Structural Variants 40



v40 - 26th Nov 2008

COSMIC release 40

This release of COSMIC comprises an update of the existing genes totalling almost 3000 new mutations.

Gene Update
2947 new mutations have been added in release 40; the following curated genes have been updated: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, ERBB2, MLH1, MSH2, KRAS, PIK3CA, JAK2, CDKN2A, GATA1, NPM1, PTPN11, NRAS, FAM123B, APC, VHL, MET, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, PRKAR1A, RUNX1, FGFR3, RET, TSHR, NOTCH1

Cancer Gene Census

On the 5th November, the Cancer Gene Census was brought up to date, with the addition of three genes newly identified in the causation of cancer, IDH1, MDM2, KIAA1549.

COSMIC v40 Total Statistics

Experiments 1050480
Tumours 291551
Samples 293262
Mutant Samples 62930
Mutations 65331
Unique Mutations 11789
Papers curated 6486
Genes 4773
Fusions 2266
Structural Variants 40



v39 - 15th Oct 2008

COSMIC release 39, Annotating Cancer Genomes

For this release of COSMIC the database and web interfaces have been upgraded to handle Next Generation Sequencing Data. This is part of ongoing work to allow COSMIC to handle the increased volumes and complexity of somatic data that is anticipated from Next Generation Sequencers. In particular, for this release we have concentrated on adapting COSMIC to handle large-scale structural variants (including translocations, large insertions/deletions, inversions, and duplications).

The structural variants from the Campbell et al. 2008 paper, which comprehensively characterizes 2 lung cancer cell lines, have been entered into COSMIC (click here for study overview). Sample Summary pages are available for both cancer cell lines (NCI-H2171 and NCI-H1770).

Circular plots (Circos plots developed by Martin Krzywinski) have been added to the sample overview page. These give a clear overview of all the structural variants, along with copy number changes and COSMIC point mutations, for a particular sample (Figure 1). More detailed views of complex rearrangements are available on the mutation details page.



Figure 1. Circos Plot showing structural variants in relation to copy number and COSMIC Point Mutations.

Tabular views and exports are also available for these data (Figure 2). Due to the complexity of these rearrangements, a short descriptive term for the variant is given where possible (e.g. deletion, tandem duplication, translocation). The variant is also fully described using HGVS mutation nomenclature. For example, in chr11:g.36585230_76606619del, chr11: denotes the chromosome involved, g. indicates genomic coordinates, 36585230 is the deletion start point, 76606619 is the deletion end point and del indicates a deletion event.
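As an illustration of how such a description breaks down into its parts, the following Python sketch parses the example above. It is a minimal sketch only: the function name parse_deletion and the GenomicDeletion type are hypothetical, and the pattern handles nothing beyond simple deletions written in the chrN:g.START_ENDdel form shown here. It is not COSMIC's own parser.

```python
import re
from typing import NamedTuple


class GenomicDeletion(NamedTuple):
    """Hypothetical container for the parts of a simple deletion description."""
    chromosome: str
    start: int
    end: int


# Matches only the form used in the example: chr<name>:g.<start>_<end>del
_DELETION = re.compile(
    r"^chr(?P<chrom>[0-9A-Za-z]+):g\.(?P<start>\d+)_(?P<end>\d+)del$"
)


def parse_deletion(description: str) -> GenomicDeletion:
    """Split a description such as chr11:g.36585230_76606619del into its parts."""
    match = _DELETION.match(description)
    if match is None:
        raise ValueError(f"not a simple genomic deletion: {description!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError("deletion start point must not exceed the end point")
    return GenomicDeletion(match["chrom"], start, end)


variant = parse_deletion("chr11:g.36585230_76606619del")
print(variant.chromosome, variant.start, variant.end)
```

Other event types (duplications, inversions, translocations) use different suffixes and would each need their own pattern; they are deliberately left out of this sketch.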


Figure 2. Summary Structural Variants Table


Bioinformatics Primer on COSMIC published

An NCI/Nature Pathway Interaction Database Primer on COSMIC has been published and is available from here.


Update of the Cancer Gene Census

The Cancer Gene Census was updated on 11th August 2008. The Census now contains information on 379 genes, of which 343 harbour somatic alterations and 70 harbour germline alterations.


COSMIC v39 General Statistics

Experiments 1035943
Tumours 281307
Samples 282777
Mutant Samples 60007
Mutations 62352
Unique Mutations 11642
Papers curated 6168
Genes 4773
Fusions 2266
Structural Variants 40



v38 - 3rd Jul 2008

COSMIC release 38

For this release of COSMIC we have concentrated our efforts on significantly updating the following genes: BRAF, HRAS, CTNNB1, KIT, PDGFRA, PTEN, RB1, ERBB2, MAP2K4, CDKN2A, GATA1, SMO, NPM1, NRAS, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, SMAD4, EGFR, FLT3, PTCH1, MPL, HNF1A, FBXW7, NF1, FGFR3, RET, NF2, NOTCH1.

External links

In collaboration with the HUGO Gene Nomenclature Committee (HGNC) and the Atlas of Genetics and Cytogenetics in Oncology and Haematology (Atlas Genetics Oncology), links are now available from COSMIC's gene summary page to further information at these resources.



A Current Protocol for COSMIC

An article describing COSMIC, its contents and usage, has been published in Current Protocols in Human Genetics, unit 10.11. Describing in detail how the website and exported datasheets may be used and interpreted, this is available at the Wiley Interscience website.



COSMIC v38 total statistics


Experiments 1019304
Tumours 268938
Samples 270095
Mutant Samples 56918
Mutations 59187
Unique Mutations 11400
Papers curated 5902
Genes 4773
Fusions 2266



v37 - 7th May 2008

COSMIC release 37

This month's release extends our complete curation of oncogenic EWSR1 fusion partners, together with two new curated genes, PHOX2B & PRKAR1A. CGP's resequencing studies and cell line projects are also significantly updated, each receiving over 100 new mutations. In total, over 1200 new mutations have been added to COSMIC this release.


Curated genes

PHOX2B - This gene encodes a highly conserved homeobox transcription factor known to cause congenital central hypoventilation syndrome with associated neuroblastoma.

PRKAR1A - This is a regulatory subunit of the cAMP dependent protein kinase holoenzyme. An apparent tumour suppressor gene, it has also been observed to be oncogenic in fusions with RET and RARA.


Gene Unique Samples Samples Experiments Mutants Papers Mutations Unique Mutations
PHOX2B 410 410 411 6 4 6 5
PRKAR1A 232 232 233 7 5 7 7


Curated EWSR1 fusions

EWSR1/ETV4; EWSR1/FEV; EWSR1/PATZ1; EWSR1/PBX1; EWSR1/POU5F1; EWSR1/ZNF384

EWSR1 has been observed in oncogenic gene fusions with over 15 partners. This month we release our curation of the literature describing its fusion with a further six partners, bringing the total to 14.

Genes Unique Breakpoints Mutations Unique Fusions Papers Mutant Samples
EWSR1 / ETV4 3 6 4 2 3
EWSR1 / FEV 5 10 4 4 5
EWSR1 / PATZ1 1 2 2 1 1
EWSR1 / PBX1 1 3 3 1 1
EWSR1 / POU5F1 5 10 4 2 5
EWSR1 / ZNF384 2 4 4 1 2

The following curated genes have received significant updates: BRAF, HRAS, KIT, PTEN, RB1, SMARCB1, ERBB2, STK11, CDKN2A, PTPN11, NRAS, BRCA2, MLH1, MSH2, KRAS, PIK3CA, JAK2, APC, VHL, MSH6, MET, EGFR, MPL, FBXW7, PRKAR1A, RET, RUNX1, NOTCH1, NF2, PHOX2B.



COSMIC v37 total statistics

Experiments 1006553
Tumours 258584
Samples 259684
Mutant Samples 53569
Mutations 55779
Unique Mutations 11207
Papers curated 5706
Genes 4773
Fusions 2249



v36 - 5th Mar 2008

COSMIC release 36

The March 2008 release of COSMIC contains full curation of the TSHR gene together with a further 6 EWSR1 gene fusion pairs.

Curated genes

TSHR - Thyroid stimulating hormone receptor is a 7-TM cell surface receptor expressed in follicular thyroid cells. Upon binding of its ligand, thyrotropin, a signalling cascade is commenced resulting in a range of transcriptional alterations. Somatic mutations in this gene have been described in thyroid adenomas and carcinomas.

Gene Samples Experiments Mutants Papers Mutations Unique Mutations
TSHR 665 669 210 36 210 61

Curated Fusions

EWSR1/ATF1; EWSR1/CREB1; EWSR1/DDIT3; EWSR1/ETV1; EWSR1/SP3; EWSR1/WT1

EWSR1 is fused to multiple partner genes via recurrent chromosomal translocation, primarily in Ewing sarcoma. We are currently curating the complete mutation data for this gene, which has so far been found fused with over 10 partners. Having already released our curation of EWSR1 with ERG & FLI1, we now release the data for six more gene partners.


Genes Mutant Samples Mutations Unique fusions Papers
EWSR1 / ATF1 72 175 16 17
EWSR1 / CREB1 24 36 5 3
EWSR1 / DDIT3 11 22 7 6
EWSR1 / ETV1 4 7 3 3
EWSR1 / SP3 1 3 3 1
EWSR1 / WT1 102 198 22 28

The following curated genes have received significant updates:
BRAF, BRCA1, BRCA2, CDH1, CDKN2A, CEBPA, EGFR, ERBB2, FLT3, HRAS, KRAS, MLH1, MSH2, MSH6, NF2, NRAS, PDGFRA, PTEN, SMARCB1, STK11, TSHR, VHL


COSMIC v36 total statistics

Experiments 1000842
Tumours 254673
Samples 255767
Mutant Samples 52343
Mutations 54519
Unique Mutations 10995
Papers curated 5614
Genes 4772
Fusions 2174



v35 - 16th Jan 2008

COSMIC release 35

This release of COSMIC contains the curation of four new tumour suppressor genes, and further curation of EWSR1/FLI1 gene fusions in Ewing's sarcoma. We also announce a significant upgrade to the CGP Trace Archive, which is now updated daily with our latest sequencing results.



Literature Curation



MLH1

MLH1 is a tumour suppressor gene, involved in mismatch repair. The encoded protein is a subunit of the large 'BRCA1-associated genome surveillance complex' (BASC) involved in DNA damage detection and repair. This particular subunit dimerises with PMS2 to provide endonuclease capacity within the complex. MLH1 germline mutations give rise to HNPCC (hereditary non-polyposis colorectal cancer). Somatic mutations in this gene are important in sporadic colorectal cancers. Mutations of MLH1 lead to a mutator phenotype often manifested by microsatellite instability.



MSH2

MSH2 is a tumour suppressor gene, also involved in mismatch repair. It resides within the 'BRCA1-associated genome surveillance complex' (BASC) which detects and repairs DNA damage. MSH2, in complex with MSH6, forms a sliding clamp which traverses the DNA backbone detecting mismatched bases. MSH2 germline mutations also give rise to HNPCC. Similar to MLH1, somatic mutations in MSH2 are found predominantly in colorectal cancers. Mutations of MSH2 lead to a mutator phenotype often manifested by microsatellite instability.



CDC73

CDC73 (HRPT2) is a tumour suppressor forming part of the PAF protein complex, which is associated with RNA polymerase II and may therefore be involved in both initiation of RNA synthesis and RNA elongation. Mutations in this gene have been identified in tumours of the parathyroid, most often causing the endocrine disorder hyperparathyroidism (with or without jaw tumour).



MAP2K4

MAP2K4 is one part of the mitogen-activated protein kinase (MAPK) pathway, a signal transduction cascade which mediates certain extracellular signals via RAS/RAF, resulting in transcriptional control of a wide range of genes. The MAP2K family of peptides regulates MAPK activity by phosphorylation. MAP2K4 mutations appear to be involved in many tumour types.



Gene Samples Experiments Mutations Unique Mutations Papers
MLH1 1328 1325 44 38 25
MSH2 1306 1304 36 33 23
CDC73 278 272 39 32 11
MAP2K4 1557 1559 22 19 9


EWSR1/FLI1 Gene fusions

Ewing's sarcoma is a rare bone tumour, infrequently of extraskeletal origin, most frequently occurring in teenage children. The majority of these tumours contain a t(11;22)(q24;q12) translocation which fuses the EWSR1 gene on chromosome 22 with the FLI1 gene on chromosome 11. We have now curated the existing literature describing fusions between this gene pair.



Genes Mutant Samples Papers Unique Mutations
EWSR1/FLI1 1133 115 28


The following curated genes have been updated for this release: CDKN2A, PTPN11, NRAS, MLH1, MSH2, KRAS, JAK2, MAP2K4, BRAF, HRAS, CTNNB1, MEN1, NF1, APC, VHL, EGFR, FLT3, PTCH, MPL, WT1, RET, CDC73, RUNX1, EWSR1, FLI1.



Web site upgrade

Genomic co-ordinates for individual mutations are now available in the data export section, together with the datasheets in the FTP site.



CGP Trace Archive

The CGP trace archive has been updated to contain all the sequencing traces used in our analysis of the samples and genes presented in the CGP Resequencing project (COSMIC red pages). The number of traces available for download is now approaching 9.5 million. The Archive itself has also been upgraded, so that it receives daily updates of CGP sequencing traces as they pass through our sequencing pipeline. Daily updates are available as separate files; these will be integrated into the main download files once per week.

Samples with trace data Total number of traces available
276 9465645


COSMIC Statistics


Experiments 991743
Tumours 250869
Samples 251847
Mutant Samples 50949
Mutations 53098
Unique Mutations 10779
Papers curated 5449
Genes 4763
Fusions 1957




v34 - 8th Nov 2007

COSMIC 34

This release of COSMIC includes the addition of BRCA1, BRCA2, and the EWSR1/ERG gene fusion from the scientific literature. The website has been enhanced with an update of old gene names and the addition of further links (NCBI Entrez Gene, CCDS, Swiss-Prot and TrEMBL). The CGP Trace and Genotype Archive holding the group's sequence traces and genotype data is also now available.

Literature Curation



BRCA1 and BRCA2

BRCA1 and BRCA2 are tumour suppressor genes initially identified as inherited cancer susceptibility genes for breast and ovarian cancer. Both proteins have been shown to have roles in genome surveillance, detection of DNA damage and its subsequent repair. However, they associate with different DNA repair complexes and generate different tumour histologies and spectra. Somatic mutations of either gene are rare, with BRCA2 being more frequently found to have somatic mutations, particularly in ovarian and pancreatic carcinomas.

We report that mutations in these two genes have been discovered at fairly low frequencies (2-3%), with BRCA2 mutated in a wider tissue range than BRCA1.

Gene Unique Samples Samples Experiments Mutant Samples Papers Mutations Unique Mutations
BRCA1 1106 1106 1114 25 22 25 23
BRCA2 1142 1146 1145 29 16 33 29


EWSR1/ERG fusion

Fusions of EWSR1 and ERG are common events in skeletal (and the rarer extraskeletal) Ewing's sarcoma. These fusions, found at a frequency of approximately 10% in bone tumours, result from complex rearrangements, since the two partner genes are not transcribed in the same chromosomal direction.

Genes Mutant Samples Papers Unique Mutations
EWSR1/ERG 77 49 11


COSMIC Data Updates

The CGP Resequencing screens and the following curated genes have received updates: BRAF, HRAS, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, ACVR1B, ATM, ERBB2, BRCA1, BRCA2, KRAS, PIK3CA, FGFR2, ABL1, FGFR1, JAK2, SRC, STK11, CDKN2A, PTPN11, NRAS, FAM123B, APC, SMAD4, VHL, MSH6, MET, EGFR, FLT3, FBXW7, MEN1, NF1, RUNX1, FGFR3, RET.



CGP Trace and Genotype Archive

The group's sequence traces and genotype data are now available from the CGP Trace and Genotype Archive site. In order to access the data a Data Transfer Agreement must be completed and approved. A unique username and password will then be provided to access this resource.

Samples with trace data 276
Samples with genotyping data 1,135
Total number of traces 7,254,445


Gene Name Update

244 genes had their names updated (5.2%). It is still possible to search by the old gene name.



Website Upgrades

There has been an addition of several external gene links on the gene summary page. This includes links to NCBI Entrez gene, CCDS, Swiss-Prot and TrEMBL.

The sample summary page now also contains sample source information.



General Statistics


Experiments 984673
Tumours 246369
Mutant Samples 50032
Mutations 52146
Unique Mutations 10533
Papers curated 5271
Genes 4762
Fusions 685




v33 - 5th Sep 2007

COSMIC 33: Improved CGP data release

The WTSI Cancer Genome Project (CGP) announces an updated data release policy. We will now be releasing confirmed somatic mutations on a bi-monthly basis. Confirmed and annotated somatic mutations identified in the previous two months will be released in COSMIC, continuing at two-monthly intervals. Data will still appear within the current COSMIC architecture of gene family/gene set and under appropriate studies. This new policy will result in expedited pre-publication release of curated somatic mutations as they are identified.

This new data will be available in the COSMIC blue pages, but will be most noticeable in COSMIC's CGP resequencing studies site (red pages), as this distinguishes CGP data from the literature curation.

CGP resequencing data is broadly divided (in the red pages) into 3 categories, 'Kinase', 'Pilot' and a new project, 'Renal'. Whilst the Kinase data is completed and published, the other two studies are much larger and still in progress. A collection of approximately 4000 genes has been selected for resequencing in a set of 40 matched pair cell lines ('Pilot' project) and 96 primary clear cell renal cancers. Each tumour sample in these projects has a matched normal sample, which allows the distinction of somatic mutations from germline variants. The Pilot project currently comprises 1865 somatic sequence changes, whilst the Renal project, although less advanced than the Pilot, has identified 84 mutations to date. These will be automatically updated with all our confirmed data every bimonthly release.



Literature curation


RUNX1 (AML1) has been fully curated

RUNX1 is one subunit of the PEBP2 transcription factor, binding to DNA at enhancer sequences. This gene is one of the most frequent targets of chromosome translocations associated with leukaemia. Small somatic mutations have also been observed, most frequently in myeloblastic leukaemia types (acute myeloblastic leukaemia, myelodysplastic syndrome) and it is these that we have curated in COSMIC. Our data suggest a somatic mutation rate of approximately 10% in this phenotype.


Curated Gene Update

The following curated genes have received updates from the literature: APC, ATM, BRAF, CDH1, CDKN2A, CTNNA1, CTNNB1, CYLD, EGFR, ERBB2, ERG, ETV1, FBXW7, FGFR3, FLT3, GATA1, HRAS, JAK2, KIT, KRAS, MADH4, MPL, MSH6, NF1, NF2, NOTCH1, NPM1, NRAS, PIK3CA, PTCH, PTEN, PTPN11, RB1, RET, SMARCB1, SMO, SOCS1, STK11, SUFU, TMPRSS2, VHL, WT1, WTX.


General Statistics

This release includes 1563 new mutations identified in the set of 4799 genes; 1495 genes are new this month.

Experiments 968416
Tumours 239766
Mutant Samples 48959
Mutations 51054
Unique Mutations 10390
Papers curated 5103
Genes 4799
Fusions 445
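The new-mutation and new-gene counts quoted for this release can be recovered as differences between the v33 statistics above and the v32 General Statistics listed further down this page. The short Python sketch below does that subtraction; the dictionary names are illustrative only, and the figures are copied from the two statistics blocks.

```python
# General statistics copied from the v33 and v32 release notes on this page.
v33 = {
    "Experiments": 968416,
    "Tumours": 239766,
    "Mutant Samples": 48959,
    "Mutations": 51054,
    "Unique Mutations": 10390,
    "Papers curated": 5103,
    "Genes": 4799,
    "Fusions": 445,
}
v32 = {
    "Experiments": 521624,
    "Tumours": 235207,
    "Mutant Samples": 47470,
    "Mutations": 49491,
    "Unique Mutations": 9699,
    "Papers curated": 5053,
    "Genes": 3304,
    "Fusions": 445,
}

# Increase in each statistic between the two releases.
delta = {name: v33[name] - v32[name] for name in v33}

for name, change in delta.items():
    print(f"{name}: +{change}")
```

The differences for Mutations (1563) and Genes (1495) match the figures quoted in the sentence introducing the v33 statistics.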




v32 - 8th Aug 2007

COSMIC v32

This release includes four new tumour suppressor genes and improved availability in Ensembl.

New external integration: Ensembl

We are continually striving to improve the utility of the data in COSMIC by integrating it closely with external resources. In this release, we provide a much closer integration with the Ensembl genome browser than previously. All our gene & mutation data now have location coordinates on the NCBI36 genome sequence, allowing us to use Ensembl "DAS" technology to display this information within their genome browser, aligned with their standard genome annotations. We have made this easily available, via a single link from our pages.





Literature curation

Four new tumour suppressor genes have been introduced to COSMIC this month, all receiving full literature curation of their somatic mutation data.

NF1

Neurofibromatosis is a familial disease with a complex phenotype including tumours of the central nervous system, caused by mutations in the NF1 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

NF2

The central form of neurofibromatosis is a similar familial central nervous system tumour syndrome, caused by mutations in the NF2 tumour suppressor gene. Somatic mutations in tumours have also been identified in this gene, and it is these that we have fully curated.

SOCS1

SOCS1 downregulates cellular cytokine signalling by its direct interaction with JAK1. It was first implicated in cancer after aberrant methylation was observed to inactivate it, causing hepatocellular carcinoma. Somatic mutations have also been observed which inactivate this tumour suppressor and these have been curated.

TCF1

TCF1 binds to the promoters of several (largely liver-specific) genes, to enhance their expression. Somatic and germline mutations in this gene have been found which cause liver adenomas, and we have curated the somatic component.



The following curated genes have received updates from the scientific literature: KRAS, PIK3CA, JAK2, BRAF, HRAS, KIT, PDGFRA, PTEN, CDKN2A, VHL, EGFR, FBXW7, MEN1, RET



General Statistics for this release


Experiments 521624
Tumours 235207
Mutant Samples 47470
Mutations 49491
Unique Mutations 9699
Papers curated 5053
Genes 3304
Fusions 445




v31 - 27th Jul 2007

COSMIC (v31) now includes Gene Fusion Data

The CGP COSMIC team is pleased to announce the addition of gene fusion/translocation somatic mutation data from the literature to the database. Currently, the census of known cancer genes is dominated by somatically generated fusion genes that have been identified primarily in leukaemias, lymphomas and soft tissue tumours. Until now, we have concentrated on curating somatically point mutated cancer genes for COSMIC. Almost all known cancer genes that have somatic point mutations are, however, now curated in COSMIC. In the coming months we will therefore be searching the scientific literature and annotating genes involved in gene fusions and their partners for addition into the COSMIC database.

We have launched this new facility, complete with new views for this data type, with the curation of TMPRSS2, a gene frequently found to be fused to ETS family transcription factors in adenocarcinoma of the prostate. These mutations have served to spur increased investigation into the potential role of fusion genes in adult solid tumours. The move to curate fusion genes is an important addition and will further enhance COSMIC as the most comprehensive source for somatic mutation data from human cancers.



Fusion Gene Pairs



TMPRSS2/ETV1

TMPRSS2/ERG



Website Upgrades

The fusion data has been integrated into existing pages and overviewed in new pages: Translocations Overview and Translocations Summary.

This new data can be viewed graphically and textually.

The image above shows the table of inferred breakpoints (determined from a sample's observed fusion mRNA spectrum) for a fusion gene pair.

The image above shows a graphical representation of the observed mRNA transcripts from which the inferred breakpoints are calculated.

Further information on the new gene fusion website features is available in the help pages.



Genes from Literature Curation

A new homepage has been created for genes which have received full curation of the scientific literature. This is a new page which allows the distinction of these genes from CGP's data release, for which no literature has been curated.

Curated Gene Update

The following curated genes have also received updates from the scientific literature: CDKN2A, GATA1, NOTCH1, NPM1, NRAS, JAK2, KRAS, PIK3CA, BRAF, HRAS, CEBPA, CSF1R, CTNNB1, KIT, PDGFRA, PTEN, RB1, MET, EGFR, FLT3, WT1, APC, MADH4, FBXW7, FGFR3.



General Statistics for this release


Experiments 515535
Tumours 230057
Mutant Samples 46978
Mutations 48911
Unique Mutations 9014
Papers curated 4938
Genes 3302
Fusions 438



v30 - 6th Jun 2007

COSMIC v30

Today we release full literature curations of five tumour suppressor genes MEN1, ATM, CYLD, FBXW7, WTX; 4712 samples were examined in 112 papers, recording 468 mutations. Additionally, we release two new CGP resequencing studies which add a further 91 new genes to COSMIC.



Literature Curation

Curation of the scientific literature has been completed for five new genes from the cancer census. All five genes are tumour suppressors, causing phenotypes via their inactivation:



MEN1 (Multiple Endocrine Neoplasia Type 1)

Somatic mutations in this gene have been found in tumours from several endocrine sites, recapitulating those seen in patients carrying germline mutations, including tumours in the pituitary, pancreas and parathyroid. MEN1 encodes a nuclear protein thought to be a transcriptional regulator.



CYLD (Cylindromatosis)

This gene has been found to have mutations in sporadic cylindromas, tumours arising from skin adnexal structures (such as hair follicles and glands), principally on the face and scalp. CYLD encodes a deubiquitinating enzyme regulating cell signalling including the NF-kappaB pathway.



FBXW7 (CDC4)

Mutations inactivating FBXW7 have been found in a range of cancer types including colorectal, ovarian and T-ALL. The protein is involved in targeting a number of key proteins, including NOTCH1 and MYC, for ubiquitin-mediated degradation.



ATM

This gene encodes a protein kinase involved in cell cycle checkpoint control. Amongst other key cell cycle components, it has been shown to phosphorylate TP53 and CHEK2 in response to DNA damage. Germline mutations cause ataxia-telangiectasia (AT), a recessive disorder characterized by cerebellar ataxia, telangiectases, immune defects, and a predisposition to malignancy, primarily lymphoid in origin.



WTX (FAM123B)

Recently discovered, WTX is inactivated in approximately 30% of Wilms Tumours. Located on the X chromosome, this tumour suppressor only requires a 'single-hit' for tumourigenic inactivation.



New tumour suppressor gene statistics:


Gene Samples Experiments Mutations Papers
MEN1 1680 1683 196 66
ATM 1714 1692 198 33
FBXW7 1207 1204 60 10
WTX 82 82 7 1
CYLD 29 29 7 2
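The headline totals quoted at the top of this release (4712 samples, 112 papers, 468 mutations) are the column sums of the table above. A brief Python sketch of that check follows; the variable names are illustrative and the values are copied directly from the table.

```python
# (samples, experiments, mutations, papers) per gene, copied from the table above.
new_genes = {
    "MEN1": (1680, 1683, 196, 66),
    "ATM": (1714, 1692, 198, 33),
    "FBXW7": (1207, 1204, 60, 10),
    "WTX": (82, 82, 7, 1),
    "CYLD": (29, 29, 7, 2),
}

# Sum each column of interest across the five genes.
total_samples = sum(row[0] for row in new_genes.values())
total_mutations = sum(row[2] for row in new_genes.values())
total_papers = sum(row[3] for row in new_genes.values())

print(total_samples, total_papers, total_mutations)  # 4712 112 468
```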


The following curated genes have also received updates: BRAF, HRAS, CEBPA, CTNNB1, KIT, PDGFRA, PTEN, SMARCB1, ERBB2, JAK2, CDKN2A, PTPN11, NRAS, KRAS, PIK3CA, APC, CDH1, MADH4, EGFR, FLT3, MPL, WT1, FGFR3



CGP resequencing studies

91 new genes have been examined in our pilot set of matched pair cell lines, resulting in the discovery of 22 new mutations:

Study Genes Experiments Samples Mutations
Integrin alpha family 16 640 40 11
Miscellaneous genes of interest from literature sources 75 3000 40 11


General Statistics for this release


Experiments 499958
Tumours 217944
Mutant Samples 44491
Mutations 46364
Unique Mutations 8855
Papers curated 4794
Genes 3302



v29 - 9th May 2007

COSMIC v29 released

COSMIC release 29 includes 22 new CGP resequencing studies, comprising 567 new genes within which 192 new mutations have been identified. Additional updates to our curation of the scientific literature have also been included, adding a total of 1041 mutations to this release.



CGP Resequencing Studies



567 genes have been examined in our pilot set of matched pair cell lines:



2
Study Genes Mutations
PAX transcription factor family 11 5
Tripartite motif-containing protein family 56 28
Genes on APC/CTNNB1 pathway 59 25
FK506/rapamycin binding protein family 26 5
Diacylglycerol kinases and other lipid kinases 18 17
SMAD protein family 24 7
Histone acetyltransferase 7 2
Dual specificity phosphatases 23 2
Genes associated with ERB family of RTKs 8 3
Genes associated with MYC proteins 21 12
Ubiquitin specific peptidase family 50 16
C-X-C/C-C motif chemokine receptor genes 19 8
Essential For Cell Division - derived from a siRNA screen in human cells 21 5
Genes from RNAi TSG gene screen 5
Glycolysis associated genes 23 4
Integrin beta family 8 8
Small ubiquitin-like modifier (SUMO) protein family 14 2
14_3_3 family of scaffold protein 8 1
STAT and SOCS gene families 43 7
Serpin/TIMP peptidase inhibitor families 46 17
Sorting NeXin family 27 3
Genes associated with RAS proteins 53 13




Literature curation



89 new publications have been curated, updating the information for the following genes: JAK2, HRAS, CEBPA, PTEN, RB1, RET, ERBB2, CDKN2A, GATA1, NRAS, KRAS, PIK3CA, EGFR, CTNNA1, APC, CDH1, MADH4



General Statistics for this release


Experiments 482902
Tumours 206972
Mutant Samples 42266
Mutations 44062
Unique Mutations 8420
Papers curated 4515
Genes 3220




v28 - 4th Apr 2007

COSMIC v28 Released

This month's COSMIC release comprises a substantial increase in the CGP resequencing data, adding 1033 new genes to the system, together with updates to the scientific literature curation.

CGP Resequencing Studies


26 new studies have been included in this release, containing 1033 new genes which have been examined through the pilot matched pair cell line set.

Study Genes Mutations
ADAM metallopeptidase family 40 27
Cyclins and Genes associated with RB 68 20
Nfkappa signalling family 58 14
Phospholipase C Family 13 16
Protein Kinase anchor proteins 32 15
Ral Guanine nucleotide dissociation factors 6 3
Hypoxia inducible factor pathway 23 11
SerThr Phosphatases (PPP) 69 17
Integrin Binding proteins 27 1
K homology RNA-binding domain, type I 25 9
Cytochrome C oxidase family 24 3
DNA methylation and histone deacetylation 38 10
Heat shock proteins 81 20
Ets transcription factor family 28 5
High Mobility Group proteins 24 2
Immediate early/regulator of G-protein signalling family 25 5
Kallikrein protease family 16 5
Matrix metallopeptidase family 21 7
Genes implicated in stem cell regulation 63 12
TCA cycle genes 56 16
Forkhead transcription factor family 43 11
TP53 responsive genes 76 22
Ubiquitination pathway genes 63 21
Ubiquitin Ligases 72 36
DEAD Box proteins 60 25
Genes associated with TP53 and targets 47 16


Curated Gene Update


The following fully curated genes also received minor updates : APC, BRAF, CDKN2A, CTNNB1, EGFR, ERBB2, HRAS, KIT, KRAS, NOTCH1, NPM1, NRAS, PDGFRA, PTEN, PTPN11, RB1, WT1.

General Statistics for this Release

Experiments 455765
Tumours 204457
Mutant Samples 41259
Mutations 43021
Unique Mutations 8122
Papers curated 4426
Genes 2671



v27 - 14th Mar 2007

COSMIC v27 released

This month's release of COSMIC comprises upgrades to both the web site (which now allows searching by gene/sample name or keyword) and data, with new CGP resequencing studies and curated genes. COSMIC now contains data on over 200,000 tumour samples and 400,000 individual experiments. Of these 202109 tumours, 40331 were found to contain one or more mutations (19.9%).



CGP Resequencing Studies

Two new studies examine our pilot data set comprising 40 cancer cell lines that have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.



Notch signalling proteins
This group of proteins comprises the Notch receptors and other proteins which are involved in Notch signalling. The Notch signalling pathway allows cells to communicate with each other and plays a crucial role in developmental regulation. NOTCH1 mutations have been associated with T-cell acute lymphoblastic leukaemia.



Phosphatidylinositol metabolism
This gene set includes proteins which control the synthesis and turnover of phosphatidylinositol which is synthesised in the endoplasmic reticulum before translocating to cytosolic membrane surfaces where it plays an important role in many cellular processes including cell signalling. Mutations in the phosphatidylinositol-3-kinase PIK3CA and the lipid phosphatase PTEN are associated with many types of cancer.



Literature Curation



STK11 (LKB1)
STK11 is a tumour suppressor, physically associating with p53 to effect growth suppression via p53-dependent apoptosis pathways; restoring gene activity into cancer cell lines defective for its expression results in a G1 cell cycle arrest. It has been identified as the cause of Peutz-Jeghers syndrome, an autosomal dominant disorder inducing an increased risk of melanocytic macules, gastrointestinal polyps and various neoplasms.



STK11 Statistics
Samples 2344
Mutations 92
Unique Sequence Changes 63




WT1
Wilms tumour is a solid cancer usually occurring in childhood, caused by malignant transformation of renal stem cells retaining embryonic differentiation potential. Several tumour suppressor genes have been associated with the development of WT, most classically the WT1 zinc finger DNA binding protein located at chromosome 11p13. A number of isoforms of the transcription factor WT1 exist, unusually exerting control over expression of target genes during both their transcription and splicing.



WT1 Statistics
Samples 1710
Mutations 106
Unique Sequence Changes 68




Website Upgrades



Search Facility
A major update to the COSMIC website this month is the Exalead search facility, allowing for easier navigation of the site. In the 'Text Search' field on the home page, you can search for a gene name or accession number, a sample name or id, or a tumour primary site or sub-site. There is a help page for more advanced searches, which can be accessed by clicking on the question mark in the search box, or the help button in the sidebar.



General Statistics for this release


Experiments 408164
Tumours 202109
Mutant Samples 40331
Mutations 42057
Unique Mutations 7736
Papers curated 4348
Genes 1638




v26 - 14th Feb 2007

COSMIC third anniversary release (v26)

This release comprises a significant increase in the number of CGP resequencing studies. The five new studies all examine our pilot sample set comprising 40 cancer cell lines that all have a matched normal cell line, allowing all of the mutations to be confirmed as somatic.

Nuclear receptors and cofactors

A related but diverse array of transcription factors interacting with a wide range of coregulatory proteins to form a complex network of multicomponent assemblies serving as coactivators or corepressors of transcription.

SCF and APC cell cycle control complex components

Skp1-cullin-F-box-protein complex (SCF) and the anaphase-promoting complex/cyclosome (APC) are ubiquitination complexes regulating progression through the cell cycle.

Nucleocytoplasmic transport components

Factors involved in both import and export of proteins from the nucleus, including nuclear pore components.

Human homologues of putative target "cancer" genes from transposon screens in the mouse

Human orthologues of genes targeted by insertions in transposon insertion screens for cancer genes in the mouse.

Protein Tyrosine Phosphatases (PTP)

Critical regulators of signal transduction, effecting the reversible phosphorylation of tyrosine residues in cell signalling proteins.

Curated Gene Update

The following fully curated genes also received minor updates: BRAF, CDKN2A, EGFR, ERBB2, FLT3, KIT, KRAS, PDGFRA, PTEN, PTPN11.

General Statistics for this release

Experiments 394675
Tumours 194928
Mutant Samples 39520
Mutations 41228
Unique Mutations 7505
Papers curated 4180
Genes 1516



v25 - 10th Jan 2007

COSMIC v25 released

This month's COSMIC release comprises significant updates to CGP resequencing studies and curation of the scientific literature.

Update to CGP Resequencing Studies

The six non-kinase CGP resequencing studies have received substantial updates to the number of genes included and the number of mutations found (the kinase studies were updated in November 2006). Fifty two new genes have been added to the DNA repair study, together with three in the Apoptosis and two in the GAP-GEF studies. The number of mutations discovered in each of the six studies has increased as shown below:

Study Mutation Count (v24) Mutation Count (v25)
Inositol Polyphosphate Phosphatases 8 12
Heterotrimeric G-Proteins 5 6
DNA repair genes 114 194
Apoptosis genes 38 82
Small monomeric GTPases 8 28
GAP-GEF genes 48 91


Literature Curation

In addition to the CGP resequencing studies, significant updates have been made to those genes which have received complete scientific literature curation. Three genes have been extensively updated, BRAF (19.1%, increased to 19224 samples), JAK2 (25.1%, increased to 11190 samples) and NOTCH1 (75.4%, increased to 488 samples), whilst eighteen other genes have received minor updates (less than 10% increase in sample number): ABL1, APC, CDKN2A, CEBPA, CTNNB1, EGFR, ERBB2, FLT3, KRAS, MET, NRAS, PDGFRA, PIK3CA, PTEN, PTPN11, RET, SRC, VHL.

General Statistics for this release

Experiments 387254
Tumours 193513
Mutant Samples 39003
Mutations 40672
Unique Mutations 7272
Papers curated 4082
Genes 1398



v24 - 14th Dec 2006

COSMIC v24 released

This month's release of Cosmic includes the curation of NPM1 and CDH1.

Curation details for NPM1

NPM1 (Nucleophosmin) is a nucleocytoplasmic shuttling protein and critical regulator of TP53. Frequent mutations have been found in both childhood and adult AML. 20 papers have been manually curated for this gene resulting in the addition of 45 unique mutations (exon 12).


NPM1 statistics

Samples 3870
Experiments 3875
Mutants 1171
Papers 20
Unique Mutations 45

Curation details for CDH1

CDH1 (E-cadherin) is a calcium ion-dependent cell adhesion molecule with loss of function of this gene implicated in cancer invasion and metastasis. In particular, somatic mutations of this gene have been reported in gastric and lobular breast cancer. 181 mutations have been added to Cosmic for this gene from the curation of 46 papers.


CDH1 statistics

Samples 1958
Experiments 1970
Mutants 205
Papers 46
Unique Mutations 181

General Statistics for this release

Experiments 380741
Tumours 190358
Mutant Samples 38206
Mutations 39839
Unique Mutations 7032
Papers curated 4023
Genes 1343



v23 - 30th Nov 2006

COSMIC v23 released

This month's release of Cosmic includes a major update to the protein kinase screens.

Protein Kinase Somatic Data Information

The Cancer Genome Project is pleased to release the full set of protein kinase somatic mutation data resulting from the screening of over 200 human cancers through the full set of 518 annotated genes. Over 1000 mutations have been identified in a combined total of 247 megabases sequenced. This dataset is intended to serve as a catalyst for further biological investigation of mutated kinases and pathways, hopefully leading to new insights and therapeutic opportunities in human cancer.


http://www.sanger.ac.uk/genetics/CGP/Studies/Kinases/

Copy number data update

Oligo array CGH data (using the Affymetrix 10K SNP array) for a further 233 cancer cell lines and 70 primary tumours has been made available, increasing the total available from 834 to 1136 samples.



General Statistics for this release

Experiments 374169
Tumours 184092
Mutant Samples 36252
Mutations 37857
Unique Mutations 6758
Papers curated 3945
Genes 1342



v22 - 11th Oct 2006

COSMIC v22 released

This month's release of Cosmic includes the curation of the APC gene from the scientific literature. In addition, information on the similarity between cell lines is now recorded and displayed in Cosmic.

Curation of APC literature


Mutations in the APC gene are one of the initiating events in colorectal tumorigenesis, both familial and sporadic. Our curation confirms that the majority of mutations occur in the central portion of the gene (the mutation cluster region, 'MCR') where mutations are associated with the most severe phenotype of huge numbers of polyps at a young age, often with extracolonic manifestations. Mutations outside the MCR cause a much milder and late onset phenotype, generating few polyps. We curated 206 papers for APC, finding 1420 mutated samples out of 7115 (almost 20%). As expected, there were no major hotspots; the most frequent mutation was p.R1450*, found in 87 samples (6%).
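The percentages quoted for APC can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that the 6% figure for p.R1450* is expressed relative to the 1420 mutated samples rather than all 7115 samples; the function name `percentage` is illustrative only.

```python
def percentage(part: int, whole: int) -> float:
    """Return `part` expressed as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Figures quoted in the APC curation paragraph.
mutated_samples = 1420
total_samples = 7115
top_mutation_samples = 87  # samples carrying p.R1450*

# Share of all screened samples that carry an APC mutation.
mutated_pct = percentage(mutated_samples, total_samples)

# Assumption: the 6% figure is relative to mutated samples, not all samples.
top_mutation_pct = percentage(top_mutation_samples, mutated_samples)

print(f"Mutated samples: {mutated_pct:.1f}%")
print(f"p.R1450* among mutated samples: {top_mutation_pct:.1f}%")
```

Under that assumption both values agree with the rounded figures given in the text.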

Web site


On the web site, we have begun to include information in Cosmic on samples found to be significantly similar by their genotype as assessed using Affymetrix SNP arrays. To date, 241 cell lines have been found to have genotypes which are greater than 80% identical with at least one other. This data is now displayed in the Cosmic Sample page under the heading of "Other Cancer Samples from the Same Individual"; here is an example: http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=687448
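The notes do not describe how genotype identity was calculated, so the following is only an illustrative sketch of one plausible approach: count the fraction of SNPs at which two samples have the same genotype call and compare it with the 80% threshold mentioned above. The call encoding ("AA", "AB", "BB", with None for a no-call) and all function names are assumptions made for this example.

```python
from typing import Optional, Sequence

SIMILARITY_THRESHOLD = 0.80  # "greater than 80% identical", as stated in the notes


def genotype_identity(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Fraction of SNPs with identical calls, skipping SNPs where either call is missing."""
    if len(a) != len(b):
        raise ValueError("both samples must be typed on the same SNP set")
    compared = 0
    matches = 0
    for call_a, call_b in zip(a, b):
        if call_a is None or call_b is None:
            continue  # assumption: no-calls are excluded from the comparison
        compared += 1
        if call_a == call_b:
            matches += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return matches / compared


def significantly_similar(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> bool:
    """True when the identity exceeds the 80% threshold."""
    return genotype_identity(a, b) > SIMILARITY_THRESHOLD


# Hypothetical genotype calls for two cell lines.
line_1 = ["AA", "AB", "BB", "AA", "AB", None, "BB", "AA", "AB", "BB", "AA"]
line_2 = ["AA", "AB", "BB", "AA", "BB", "AA", "BB", "AA", "AB", "BB", "AA"]

print(genotype_identity(line_1, line_2))
print(significantly_similar(line_1, line_2))
```

A real comparison would use many thousands of SNPs per array and would probably need a statistical treatment of call errors; this sketch only illustrates the thresholding idea.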

General Statistics for this release

Experiments 290332
Tumours 173929
Mutant Samples 32891
Mutations 34390
Papers curated 3781
Genes 1342



v21 - 14th Sep 2006

COSMIC v21 released

This month's release of Cosmic includes major updates to the Cancer Cell Line Project and microsatellite instability status data sets. In addition, published somatic mutation data from two additional genes, MPL and FGFR1, have been added to Cosmic.

Cancer Cell Line Project Major Update


The Cancer Cell Line Project aims to systematically screen a large panel of cancer cell lines for mutations in known cancer genes, thus empowering these cell lines as biological reagents for further work in anti-cancer agent development and further work on cancer molecular and cellular biology.

For this release of Cosmic, a further 137 cell lines have been added to the working set and 78 duplicate cell lines have been removed. This brings the total number of samples to 787. A further 98 mutations have also been added (See: http://www.sanger.ac.uk/genetics/CGP/CellLines/).

Statistics for the CGP Cancer Cell Line Project

Experiments 12887
Samples 787
Mutant samples 1087
Mutations 1144
Unique Mutations 3519
Genes 21

MSI Data Update


The microsatellite instability (MSI) status for CGP samples under study has been updated, bringing the total number of samples with MSI status to 1,530 (See: http://www.sanger.ac.uk/genetics/CGP/MSI/msi_page.shtml).

Curation of MPL and FGFR1


Somatic mutations reported in the published scientific literature for the FGFR1 and MPL genes have now been added to Cosmic.
MPL - http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=MPL
FGFR1 - http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=gene&ln=FGFR1

General Statistics for this release

Experiments 278786
Tumours 164905
Mutant samples 30566
Mutations 31933
Papers Curated 3611
Genes 1339



v20 - 5th Jul 2006

COSMIC v20 released

This month's release includes NCI-60 updates and mutation data from the scientific literature for VHL.

NCI-60 update


The CGP is pleased to release mutation data for 24 known cancer genes on the NCI-60 series of cancer cell lines. These data should allow for greater power in interpretation of biological data using the lines as well as providing a genetic framework for evaluating response to the large series of compounds screened against this reference cell line set.


Microsatellite instability


Microsatellite instability occurs due to a defect in mismatch repair. This is usually a result of inactivation of MSH2, MLH1 or MSH6 due to a mutation or to reduced expression associated with promoter methylation. Analysis of microsatellite instability was carried out using the BAT markers as described by Rodriguez-Bigas et al. All samples were screened using the markers BAT25, BAT26, D5S346, D2S123 and D17S250. Details of this, when available, are posted on the sample overview page; an example can be seen at http://www.sanger.ac.uk/perl/genetics/CGP/cosmic?action=sample&id=905950


Curation of VHL


VHL mutation data from the literature is now available. We have curated 93 papers covering 3412 experiments. These experiments used 3386 samples, in which 879 mutations were recorded.


General Statistics for this release

Experiments 270623
Tumours 160594
Mutant samples 29154
Mutations 30439
Papers curated 3519
Genes 1338



v19 - 7th Jun 2006

COSMIC v19 released

This month's release of COSMIC includes the Cancer Genome Project screen of the GAP-GEF gene set and new information displays.

GAP-GEF Screen


This gene set, consisting of 173 genes, is comprised of proteins that function to regulate the activity of proteins with GTPase activities. GTPase activating proteins (GAPs) promote hydrolysis of GTP to GDP. Guanine nucleotide exchange factors (GEFs) promote GDP/GTP exchange. Both classes modulate the function of the small monomeric GTPases (including the RAS oncogene family) and other key signalling proteins that use the conversion of GTP to GDP as a molecular switch to regulate function. This system of GTPase/GAP/GEFs regulates a wide variety of cellular processes including growth, differentiation, survival and motility.


Web Improvements


Zygosity and somatic/germline status information are now available for mutations in COSMIC, CGP Resequencing and Cancer Cell Project websites. The somatic/germline status is listed on the sample detail page and in the export function with the following statuses:-


  • Not specified
  • Confirmed somatic mutant
  • Reported elsewhere as a somatic variant
  • Confirmed germline variant
  • Reported elsewhere as a germline variant
  • Variant of unknown origin

Zygosity information is available on the mutation detail page with the following statuses:-


  • Unknown
  • Homozygous
  • Heterozygous
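For anyone processing exported data, the two status vocabularies listed above could be modelled as enumerations. This is a hedged sketch only: the class names, member names and the `parse_somatic_status` helper are hypothetical, and the exact strings used in COSMIC export files are assumed to match the labels as they are written in these notes.

```python
from enum import Enum


class SomaticStatus(Enum):
    """Somatic/germline statuses, with labels copied from the release notes."""
    NOT_SPECIFIED = "Not specified"
    CONFIRMED_SOMATIC = "Confirmed somatic mutant"
    REPORTED_SOMATIC = "Reported elsewhere as a somatic variant"
    CONFIRMED_GERMLINE = "Confirmed germline variant"
    REPORTED_GERMLINE = "Reported elsewhere as a germline variant"
    UNKNOWN_ORIGIN = "Variant of unknown origin"


class Zygosity(Enum):
    """Zygosity statuses, with labels copied from the release notes."""
    UNKNOWN = "Unknown"
    HOMOZYGOUS = "Homozygous"
    HETEROZYGOUS = "Heterozygous"


def parse_somatic_status(label: str) -> SomaticStatus:
    """Map a label to its enum member; raises ValueError for unrecognised labels."""
    return SomaticStatus(label.strip())


print(parse_somatic_status("Confirmed somatic mutant"))
print(Zygosity("Heterozygous"))
```

Using enumerations means that an unexpected label fails loudly instead of silently becoming a new category.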

General Statistics for this release

Experiments 264296
Tumours 155902
Mutant samples 27732
Mutations 28859
Papers curated 3419
Genes 1337



v18 - 4th May 2006

COSMIC v18 released

The CGP Resequencing Studies Website is released this month, which will act as a repository for data from CGP resequencing efforts to identify novel somatic mutations in human cancer. The pages have their own distinctive red colour scheme to denote this. Data on sets of genes/samples systematically screened for mutations were previously integrated into the "blue" COSMIC pages. This will continue, with data now also being submitted, prepublication, to the new site and held there. This will allow users to browse, search and evaluate these data more effectively. The web resources that are now available are detailed below:-

  • COSMIC (Blue): All data screened from literature and CGP based projects.
  • CGP Resequencing Studies (Red): Somatic mutations from systematic large scale resequencing of genes in human cancers.
  • CGP Cancer Cell Line Project (Green): Resequencing of known cancer genes and other analyses of human cancer cell lines.

Curation of PTCH

37 papers from the scientific literature have been curated for the PTCH gene in this release, adding 897 experiments and 168 mutations.


General Statistics for this release (COSMIC)
Experiments 254544
Tumours 153528
Mutant samples 27426
Mutations 28534
Papers curated 3393
Genes 1176

COSMIC DAS track

Ensembl has recently moved to the NCBI 36 assembly of the human genome whilst COSMIC genes and mutations are currently mapped to build 35. This has caused some disparity with the COSMIC DAS track. Therefore we suggest only using the COSMIC DAS track on the most recent Ensembl archive site (http://feb2006.archive.ensembl.org/index.html). Provided below is a link that will open the appropriate website with the DAS source attached:

http://feb2006.archive.ensembl.org/Homo_sapiens/contigview?conf_script=contigview;c=7:139949999.5:1;w=200000;h=7;add_das_source=(name=COSMIC+url=http://das.ensembl.org/das+dsn=cosmic_ncbi_35+type=ensembl_location+color=blue+strand=b+labelflag=n+stylesheet=y+group=n+depth=0+score=n+active=1)



v17 - 4th Apr 2006

Cosmic v17 release

This month's release of COSMIC includes the Cancer Genome Project screen of small monomeric GTPases and mutation data from the scientific literature for MADH4.

CGP Small Monomeric GTPase Screen

The small monomeric GTPases function as key molecular switches impacting a large variety of cellular functions such as motility, cell signalling, transcription and the binding, hydrolysis and exchange of GTP/GDP. The RAS subfamily (HRAS, NRAS, KRAS) of small monomeric GTPases were amongst the first identified human oncogenes and are mutationally activated in a wide variety of human cancers.

Curation of MADH4

70 papers from the scientific literature have been curated for the MADH4 gene in this release, adding 2275 experiments and 259 mutations.

Gene Updates

Further data from the scientific literature for 9 genes, including KRAS and NRAS, has been added for this release. A detailed breakdown for each gene can be seen below.

Gene Name Additional experiments Additional mutations
CDKN2A 668 71
CTNNB1 140 8
EGFR 150 4
ERBB2 84 1
HRAS 54 1
KRAS 3050 647
NRAS 1611 203
PTEN 37 7
PTPN11 309 17

General Statistics for this release
Experiments 249331
Tumours 149251
Mutant samples 26574
Mutations 27637
Papers curated 3324
Genes 1175



v16 - 8th Mar 2006

COSMIC v16 released

Released for March are data from a kinase domain screen of malignant gliomas. These data cover approximately 400kb of sequence in each of 9 tumours, including data from recurrent/resistant tumours.

We have recently completed a screen for somatic mutations of the kinase domain encoding exons of the entire protein kinase family in a series of human malignant gliomas. The results are presented in this release of COSMIC. No commonly mutated kinase domain was found in these studies. However, as is the case with our other work in this area, deep sequencing data from human tumours is informative about the processes that have contributed to oncogenesis in the patient. Two gliomas recurrent after temozolomide (alkylator) chemotherapy, but not a third recurrent after XRT alone, had the highest mutation prevalence of any tumours we have analysed to date. These data suggest a link between mutation prevalence and recurrent/resistant brain tumours treated with alkylator chemotherapy.

Statistics
Experiments 235213
Tumours 143427
Mutant samples 25360
Mutations 26388
Papers curated 3207
Genes 1035



v15 - 7th Feb 2006

A COSMIC Expansion

The Catalogue Of Somatic Mutations In Cancer is two years old and has mutation data for over 1,000 genes, curated from over 3,000 published papers and unpublished data from the Cancer Genome Project.

The original aim of COSMIC continues with the curation of somatic mutation information from the literature for known cancer genes. During 2005 data for 9 genes was collected; ABL1, CDKN2A, EGFR, GATA1, JAK2, MSH6, NOTCH1, PTPN11 and SMO. In addition to this, genes that were curated in 2004 were updated as new data was published.

The number of genes in COSMIC expanded rapidly when the Cancer Genome Project at the Wellcome Trust Sanger Institute published 3 studies of somatic mutations in the protein kinase gene family (518 genes in total). This data provides a unique insight to the somatic mutations in breast, lung and testicular cancers.

More recently the Cancer Genome Project has been submitting unpublished somatic mutation data to COSMIC. The data comes from genes involved in apoptosis, DNA repair, maintenance and metabolism and the Inositol Polyphosphate Phosphatase and Heterotrimeric G-Protein families.

In another new departure the COSMIC software was used to create a new web site, the Cancer Cell Line Project. This separate site, with its own 'mint' colour scheme, contains the results from the sequence analysis of 14 known cancer genes in over 700 cancer cell lines. Initial sequence data for 4 genes analysed in the NCI-60 is also available. This work is in progress and more results will be posted in the coming months. What is more, the number of genes in this project will continue to increase, providing genetic data for this wide set of cancer cell lines.

There have been many enhancements to the web site over the past 12 months. A tissue overview provides a summary of mutations reported in a selected tissue. New pages were created to show more details of mutations and samples and give greater depth to the data. There are also links to other data such as genome copy number information.

COSMIC has been summarised in The British Journal of Cancer (Forbes et al, 2006).

This month sees the update of BRAF, CDKN2A, EGFR, ERBB2, HRAS, KRAS, NRAS, PTEN, PTPN11 and SMARCB1. In addition the Cancer Genome Project has submitted unpublished data for genes involved in apoptosis.

There are plans to continue the development of COSMIC in terms of data content and data presentation. We are always happy to receive feedback and suggestions (email: cosmic@sanger.ac.uk).

Statistics
Experiments 228,669
Tumours 142,569
Mutant samples 25,176
Mutations 26,194
Papers curated 3,013
Genes 1,035



v14 - 10th Jan 2006

COSMIC v14 released

The COSMIC team is proud to announce the release of COSMIC-14 with data for CDKN2A (p16) and more unpublished data from the CGP.

DNA REPAIR, MAINTENANCE AND METABOLISM

The Cancer Genome Project has released further unpublished somatic mutation data from a screen of 41 cancer cell lines. The 302 genes in this release are involved in or associated with DNA repair, maintenance and metabolism. The genes can be viewed together or in 5 subgroups: Telomerase Complex, SWI/SNF, DNA replication, Nucleotide Metabolism and DNA Damage Response and Repair. In total 119 somatic mutations were identified in this study.

CURATION OF CDKN2A

CDKN2A (also known as p16) is a tumour suppressor. It induces cell cycle arrest by inhibiting the phosphorylation of Rb by the cyclin-dependent kinases CDK4 and CDK6. So far 453 papers have been curated for this gene with 2,591 mutations recorded from 16,883 samples.

STATISTICS

Experiments 219,037
Tumours 140,212
Mutant samples 24,817
Mutations 2,637
Papers curated 3,379
Genes 870



v13 - 13th Dec 2005

COSMIC v13 released

Somatic mutation data from new gene families

In a major new departure the Cancer Genome Project is proud to release further somatic mutation data. The results from the sequencing of two gene families, Inositol Polyphosphate Phosphatases and Heterotrimeric G-Proteins, have been added to the data for the Protein Kinase genes. This data will be expanded in the future with the addition of further gene sets.

Updates to existing genes

Nine genes in COSMIC have been updated with further data: NRAS, RB1, ERBB2, HRAS, PTEN, TP53, KRAS, APC and CDKN2A.

New DAS data source

The Cancer Genome Project is pleased to announce the release of a DAS source devoted to the genes and mutations within COSMIC. Using this source you will be able to view the genes and mutations from COSMIC within a genome browser or the DAS client of your choice.

All 587 genes in COSMIC are exported as features. Each of these features displays the genomic 'footprint', which encompasses both exonic and intronic sequence between the start and end points of the CDS sequence. A link is attached to each feature, providing a mechanism for the client to link back directly to the gene entry on the COSMIC website.

In addition to the gene footprints, there are also a large number of unique mutations. These are also displayed as features, with links back to the mutation summary page in COSMIC. The database currently holds 2812 unique mutations, of which 1035 are currently exported. This subset is comprised of all the single nucleotide substitutions. More complex mutations will be included as the genomic coordinates are mapped.

The DAS source can be found at the following URI:

http://das.ensembl.org/das/cosmic_genomic

The easiest way to view this source is to place the following URI in your browser:

http://www.sanger.ac.uk/turl/6d8

This will attach the DAS source and display some of the mutations found in BRAF. Additional configuration can be performed on the track by clicking on the track name. For more information, see the help pages on the Ensembl website.
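A DAS client normally retrieves features by appending a `features` command and a `segment` argument to the source URI, as defined in the DAS 1.x specification. The sketch below only builds such a request URL and does not contact the server; the helper name and the example coordinates are hypothetical, and whether this historical source still responds is not something these notes can confirm.

```python
from urllib.parse import urlencode

# DAS source URI as given in the release notes.
DAS_SOURCE = "http://das.ensembl.org/das/cosmic_genomic"


def das_features_url(chromosome: str, start: int, end: int, source: str = DAS_SOURCE) -> str:
    """Build a DAS 1.x 'features' request URL for one genomic segment."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    # DAS segments are written as reference:start,stop; keep ':' and ',' unescaped.
    query = urlencode({"segment": f"{chromosome}:{start},{end}"}, safe=":,")
    return f"{source}/features?{query}"


# Illustrative coordinates only; they are not taken from COSMIC.
print(das_features_url("7", 139850000, 140050000))
```

The response to such a request would be an XML document listing the gene footprints and mutations that fall within the segment.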

COSMIC statistics
Experiments 190,576
Tumours 124,381
Mutant samples 23,232
Mutations 2,228
Papers curated 2,812
Genes 587



v12 - 1st Nov 2005

COSMIC version 12 released

The November release of COSMIC has further data on 9 known cancer genes.

GENE UPDATES

The genes with additional data are: BRAF, PTEN, RB1, EGFR, TP53, CDKN2A, NRAS, KRAS and PIK3CA.

VERSION

We have implemented a versioning system for the data in COSMIC. The current release is version 12 with a plan to release a new version every month.

CANCER CELL LINE PROJECT

There are additional mutations for the known cancer genes being sequenced through the cancer cell lines. Notably there is data for homozygous deletions in the CDKN2A gene.

COPY NUMBER DATA

The Cancer Genome Project has released more copy number data derived from the analysis of cancer cell lines and primary tumours using Affymetrix SNP microarrays. So far a total of 834 samples have been analysed consisting of 161 primary tumours and 673 cancer cell lines. This data is freely available from the CGP website. The primary tumours overlap with those being sequenced by the CGP while the cancer cell lines include those being sequenced in the Cancer Cell Line Project.

COSMIC STATISTICS
Tumours 124,367
Experiments 188,529
Mutations 23,157
Papers 2,224
Genes 538



v11 - 3rd Oct 2005

COSMIC Update

COSMIC has been updated with the addition of 2 new curated genes and new mutation descriptions.

MUTATION DESCRIPTIONS

COSMIC has adopted the Human Genome Variation Society sequence variation/mutation nomenclature for the bulk of the mutations in COSMIC. This represents a major upgrade with the aim of improving clarity and enables the listing of intronic variants for the first time.
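To illustrate what this style of description looks like in practice, the sketch below parses simple protein-level substitutions written with one-letter amino acid codes, such as the p.R1450* and V617F mutations mentioned elsewhere in these notes. It is a deliberately narrow, illustrative parser and not a full implementation of the HGVS nomenclature; the class and function names are hypothetical.

```python
import re
from typing import NamedTuple


class ProteinSubstitution(NamedTuple):
    reference: str  # one-letter code of the original amino acid
    position: int   # residue number in the peptide sequence
    variant: str    # one-letter code of the new residue, or '*' for a stop


# Optional 'p.' prefix, reference residue, position, then new residue or '*'.
_SUBSTITUTION = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z*])$")


def parse_protein_substitution(description: str) -> ProteinSubstitution:
    """Parse descriptions such as 'p.R1450*' or 'V617F'."""
    match = _SUBSTITUTION.match(description.strip())
    if match is None:
        raise ValueError(f"not a simple protein substitution: {description!r}")
    reference, position, variant = match.groups()
    return ProteinSubstitution(reference, int(position), variant)


print(parse_protein_substitution("p.R1450*"))
print(parse_protein_substitution("V617F"))
```

Insertions, deletions, frameshifts and nucleotide-level (c.) descriptions, including the intronic variants mentioned above, are outside the scope of this pattern and would raise ValueError.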

GENE UPDATES

Two genes have further data in COSMIC: EGFR and PTEN.

PROTEIN KINASE MUTATIONS IN TESTICULAR CANCER

The sequence analysis of the protein kinase gene family in human testicular germ-cell tumours of adolescents and adults has been published. The mutation data from this work was previously available in COSMIC and is now joined by the published analysis of the data.

CANCER CELL LINE PROJECT

There are additional mutations from the screening of known cancer genes through an extensive set of cancer cell lines.

STATISTICS FOR COSMIC
Tumours 123,197
Experiments 186,181
Mutations 22,711
Papers 2,157
Genes 537



v10 - 6th Sep 2005

COSMIC Update

COSMIC has been updated with the addition of 3 new curated genes: MSH6, NOTCH1 and PTPN11.

There is a new member of the COSMIC family: the Cancer Cell Line Project. This portal uses the COSMIC code to serve mutation data from the cancer cell lines being sequenced by the Cancer Genome Project at the Wellcome Trust Sanger Institute. The cell line data is presented in the same style as the COSMIC data with a unique colour scheme. There are links to jump from the Cancer Cell Line Project pages to view all of the data in COSMIC. At present there is data from 12 known cancer genes in the Cancer Cell Line Project database.

In addition the results from the screen of all 518 protein kinase genes in lung cancer, which were available in the previous release of COSMIC, have been published in Cancer Research.

NEW GENES IN COSMIC

  • MSH6 - is a member of the MutS homolog family and is required for DNA mismatch specific binding. Almost one third of tumours of the large intestine have somatic mutations in this gene.
  • NOTCH1 - has somatic small intragenic mutations in 60% of haematopoietic and lymphoid tumours.
  • PTPN11 - is a nontransmembrane protein-tyrosine phosphatase. Approximately 6% of haematopoietic and lymphoid tumours have mutations in this gene.

COSMIC STATISTICS

Experiments 186014
Tumours 123039
Mutations 22598
References 2153
Genes 537



v9 - 1st Sep 2005

Protein kinase mutations in lung cancer

The Cancer Genome Project has sequenced all protein kinase genes in lung cancer - the most common cause of cancer deaths worldwide

There are over 27,000 new cases of lung cancer in the United Kingdom each year. Protein kinases are frequently mutated in human cancer and inhibitors of mutant protein kinases have proven to be effective anticancer drugs. The Cancer Genome Project has screened the complete coding sequence of all 518 protein kinase genes in 33 lung cancers. This study, published in Cancer Research, is the largest survey reported to date of somatic mutations in lung cancer.

The Cancer Genome Project at the Wellcome Trust Sanger Institute was established in 2000. Its goal is to identify mutations that occur in cancer cells to enable the development of new diagnostics and new treatments and advance our understanding of the biology of cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes that might be implicated in cancer. The report is published in Cancer Research on Thursday 1st September 2005 and is also available through COSMIC.

The gene set chosen was a class called protein kinases, key controllers of cell growth and death. Members of this family have been shown to be important in cancer. However, the whole set has never been sequenced in a single set of lung tumours. The study generated over 40 million bases of DNA sequence (1.3 million for each sample).

This work identified 188 somatic mutations in 141 protein kinase genes. There was considerable variation in the number of mutations found in each tumour. The results indicate that several mutated protein kinases may be contributing to lung cancer development, but that mutations in each one are infrequent. Larger studies are warranted to further explore these initial findings. Cancer is a complex set of diseases that will affect 1 in 3 people. This work in the CGP is but one part of a global effort to further understanding of cancer and move towards better diagnosis and treatment.



v8 - 3rd Aug 2005

COSMIC Website Update

The COSMIC web site has been updated with additional data from the literature and unpublished data from the Cancer Genome Project.

SOMATIC MUTATION DATA FOR KNOWN CANCER GENES

Data for 3 genes has been curated from the literature and included in COSMIC: ABL1, GATA1 and SMO.

SOMATIC MUTATIONS OF THE PROTEIN KINASE GENE FAMILY

The screen of the protein kinase gene family by the Cancer Genome Project now includes two new tumour types: lung cancer and testicular germ-cell tumours. There are marked differences in the mutation prevalence between these two tumour types.

CANCER CELL LINE PROJECT

The mutation data for 9 further genes has been included on the web site giving a total of 550 mutations. The genes are APC, CDH1, CTNNB1, HRAS, MADH4, PIK3CA, PTEN, RB1 and STK11. The sequencing of these genes is not necessarily complete but the cell lines with mutations have been confirmed and the experiments will continue to finish this work.

COSMIC STATISTICS

179,563 Experiments
118,134 Tumours
22,005 Mutations
2,090 References
534 Genes



v7 - 23rd May 2005

Cosmic Update

COSMIC now includes data from a screen of all protein kinase genes in breast cancer and an update of mutation data from the literature.

New Data

The data in COSMIC has expanded to include a new data type and the number of known cancer genes has been extended with updates on some of the existing cancer genes.

A screen of the coding sequence of the protein kinase genes in breast cancer.

The Wellcome Trust Sanger Institute Cancer Genome Project and their collaborators have published the latest results of their survey of genes and their mutations in cancer. The report was published online in Nature Genetics on Sunday 22 May 2005. This data has been integrated with the existing data in COSMIC and made available through the web site.

New cancer genes in COSMIC

The mutation data for two further cancer genes has been curated from the scientific literature and added to COSMIC.

  • EGFR - mutations in the epidermal growth factor receptor (EGFR) have been reported in lung cancer and have been associated with the tumour response of patients receiving gefitinib.

  • JAK2 - A single somatic point mutation (V617F) has been identified in JAK2 in patients with polycythaemia vera. The mutation alters a highly conserved valine present in the negative regulatory JH2 domain, and is predicted to dysregulate kinase activity.

Updates to existing COSMIC genes

Further published data has been curated for 5 genes in COSMIC; BRAF, ERBB2, FGFR2, PDGFRA and PIK3CA.

Website

Home Page
  • We have created a new mailing list: COSMIC-announce, with a subscription link located at the bottom of the home page. As a subscriber to this list you will receive announcements about the latest COSMIC news and website releases.

Browsing by Gene

More improvements have been made to the gene selection pages. The alphabetical lists have been separated into 3 groups to reduce the amount of guesswork involved in finding your gene of interest.

  • Genes from the Cancer Gene Census: - This list contains genes that have been included in the Cancer Gene Census. All of the genes in previous releases of COSMIC are included in this census.

  • Other Genes with Mutations: - These genes are not in the census, but have been found during the curation of the literature and so are included in the database. All these genes have a documented mutation which is thought to be linked to cancer.

  • Other Genes without Mutations: - The final list contains all the other genes that have been recorded during curation. These do not have a documented mutation in the references found in COSMIC.

The karyotype has also been updated. Genes from the census can be located quickly by clicking on the red triangles. All other genes are indicated by blue lines across the chromosome.

Mutation Overview Page

Each mutation in COSMIC now has its own overview page containing information about the type of mutation and samples/tissues containing the mutation. This page can be reached by clicking on various links throughout the website.

  • Main Histogram:
  • Main Mutation table:
  • Sample Overview Page:

The overview page is divided into 8 main sections:

  • Mutation Id: This id is used to identify a mutation within the COSMIC database and is assigned as the mutation is curated.

  • Mutation type: The mutation type is used to describe the type of mutation that has occurred. This can be anything from a single base inframe substitution, to a frameshift deletion.

  • Mutation Location: Here, an image displays the location of the mutation within the peptide sequence.
    • The grey bar at the top of this section shows the full length sequence. Below this can be found a red box, which indicates the area around the mutation. At the bottom of the image, the red box has been expanded and the peptide sequence around the mutation is shown. Here you will find a red triangle which indicates the starting point of the mutation. Clicking on the triangle will produce a pop-up window showing the mutation at both the peptide and nucleotide level.
    • Additionally there is a link, 'Show all mutations in area', to the main histogram page for the gene. This link will show the gene histogram zoomed into the area displayed on this page. This allows you to see any other mutations that have been identified in the surrounding area.


  • Gene: The name of the gene in which the mutation was found. Clicking on the gene name will link to the summary page for that gene.

  • AA Mutation: This section details the change that has occurred in the peptide sequence as a result of the mutation. Formatting is as follows:
    • Substitutions - X(Y)Z
      Where X is the amino acid found in the wildtype sequence. Y is a number representing the position, within the peptide sequence, at which the mutation occurred. Finally, Z is the amino acid found in the mutant sequence.

    • Deletions - delY(Z)
      Where Y is a number representing the position at which the deletion starts and Z is the amino acid sequence which has been deleted.

    • Insertions - insY(Z)
      Where Y is a number representing the position at which the insertion begins and Z is the amino acid sequence that is inserted.


  • CDS Mutation: This section details the change that has occurred in the nucleotide sequence as a result of the mutation. Formatting is identical to the method used for the peptide sequence.

  • Tissue Distribution (Top 5): The top five tissues in which this mutation has been identified are described in the following bar chart.
    • Each bar represents the number of samples, for a specific tissue type, that have exhibited the selected mutation. A label indicating the name of the tissue type and the number of samples is located below each bar.

    • Clicking on one of the bars will take you to the tissue overview page for the selected tissue.


  • Associated Samples: A list showing all the samples, including their primary tissue types, that have the selected mutation. Clicking on a sample name will take you to the sample summary page for the selected sample. Clicking on the primary tissue type will take you to the tissue overview page.
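
The AA Mutation notation described above (substitutions, deletions and insertions) can be sketched as a small parser. This is only an illustration, not COSMIC's own code: it assumes the brackets in X(Y)Z, delY(Z) and insY(Z) are placeholders rather than literal characters (consistent with names such as V617F used elsewhere in these notes), and the function name `parse_aa_mutation` and the deletion and insertion strings in the usage note are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class AAMutation(NamedTuple):
    kind: str                 # "substitution", "deletion" or "insertion"
    position: int             # Y: position within the peptide sequence
    wild_type: Optional[str]  # residue(s) present in the wildtype sequence
    mutant: Optional[str]     # residue(s) present in the mutant sequence


# Assumed concrete forms of the three patterns (brackets treated as placeholders).
_SUBSTITUTION = re.compile(r"^([A-Z*])(\d+)([A-Z*])$")  # X(Y)Z   e.g. V617F
_DELETION = re.compile(r"^del(\d+)([A-Z*]+)$")          # delY(Z)
_INSERTION = re.compile(r"^ins(\d+)([A-Z*]+)$")         # insY(Z)


def parse_aa_mutation(text: str) -> AAMutation:
    """Split an amino acid mutation string into its type, position and residues."""
    text = text.strip()
    match = _SUBSTITUTION.match(text)
    if match:
        return AAMutation("substitution", int(match.group(2)), match.group(1), match.group(3))
    match = _DELETION.match(text)
    if match:
        # The deleted sequence is what the wildtype had; nothing replaces it.
        return AAMutation("deletion", int(match.group(1)), match.group(2), None)
    match = _INSERTION.match(text)
    if match:
        # The inserted sequence exists only in the mutant.
        return AAMutation("insertion", int(match.group(1)), None, match.group(2))
    raise ValueError(f"Unrecognised mutation format: {text!r}")
```

For example, `parse_aa_mutation("V617F")` gives a substitution at position 617 from V to F, while made-up strings such as `"del10AK"` or `"ins10GS"` would be read as a deletion or an insertion starting at position 10.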

Sample Overview

Two new sections have been added to this page:

  • Tumour Features: This section lists details about the tumour from which the sample was obtained, whenever they have been supplied by the reference source.

  • External Data Sources: Additional data sources, with information about the sample, are listed here when available. This includes information from some of the studies within the Cancer Genome Project.

References

COSMIC now includes review papers. There is a review section that can be found at the bottom of the reference overview page for each gene. This section includes references that review other works. As the data discussed in these reviews has already been curated from the original sources, it is not added again.

Statistics
529 Genes
114,300 Tumours
20,536 Mutations
1,894 References



4th Mar 2005

COSMIC Website Update

COSMIC presents 'Tissue Overview', another way to view somatic mutation data. The Tissue Overview page details the Top 5 Genes for any tissue / histology selection ranked by mutation frequency and data volume. In addition it lists other genes with and without mutations for the selection. From the Tissue Overview page you can click through to the specific details of the listed genes.

Website

Home Page
  • We have updated the entry point system.
    • Detailed Search - This has been the standard search pathway and has not changed from previous releases. Please continue to use this pathway to build complex queries if you are interested in specific subtissues or histologies.
    • Quick Search - This is a new pathway greatly reducing the number of steps required to access the tissue overview page and any subsequent pages. This increase in speed does however reduce the complexity of the available search to just primary tissues.
Tissue Overview

As stated above, this new page details all the genes that have samples for the tissues / histologies selected. It is split into three major sections, with the first section detailing what we feel are the most important genes, based on mutation frequency and data volume.

  • Section One: Top Genes With Sample Data
    • This section provides an interactive bar chart and table showing data for the highest ranked genes containing samples from the chosen tissues / histologies.
    • The coloured bars in the image represent:
      • All Samples
      • Samples With Mutations
    • Clicking on any portion of the bar or name associated with a particular gene will reveal a pop-up menu:
      • Sample Number: (count) - This indicates the number of samples that have been found with the selected tissue/histology type.
      • Mutated Samples: (count) - This is the number of the above samples that have shown mutations.
      • Go to Full Gene Display - Clicking on this link will take you to the histogram display for the selected gene.
    • Below the bar chart image is a table that displays all the information found in the image, in a tabular format.
  • Section Two: Other genes with mutations
    • This section contains a list of additional genes with mutated samples that didn't make it into the top 5. Each gene name is linked to the full histogram image.
  • Section Three: Other genes without mutations
    • This section contains a list of additional genes without mutated samples that didn't make it into the top 5. Again, each gene name is linked to the full histogram image.



v6 - 4th Feb 2005

COSMIC's first anniversary

The COSMIC database and web site have been updated and now have somatic mutation data from 21 genes.

New Data
  • CEBPA is mutated in 7% of haematopoietic and lymphoid tissue tumours. It arrests cell proliferation by inhibiting the kinases CDK2 and CDK4.
  • CTNNB1, or beta-catenin, is mutated in a variety of tumours. The gene encodes an adherens junction protein that is critical for the establishment and maintenance of epithelial layers.
  • KIT is characterised by two clusters of mutations in and around the kinase domain of the gene with frequent mutations in haematopoietic and lymphoid tissue tumours (19%) and soft tissue tumours (32%).
  • PTEN has mutations through the whole coding sequence with a hot spot at codon 130. Tumours of the central nervous system and endometrium frequently have mutations in this gene (19% and 34% respectively).
  • SRC is homologous to the v-src gene of the Rous sarcoma virus and has one mutation that has been found in 10 samples.
  • SUFU encodes a component of the sonic hedgehog/patched signaling pathway and is mutated in central nervous system tumours.
Statistics
21 genes
104,682 tumours
18,478 samples have mutations
1,755 unique mutations
1,672 papers have been curated



v5 - 17th Dec 2004

COSMIC Update

The COSMIC team is proud to release somatic mutation data for CSF1R, RB1, RET and SMARCB1. This information has been curated from the scientific literature. Somatic mutation data from 15 genes can be queried and viewed through the COSMIC web site.

Data
  • The 4 new genes in COSMIC give data on specific tumour types and increase the breadth of information that can be queried and displayed.
  • CSF1R, also known as the oncogene FMS, is a receptor kinase that is mutated in ~5% of myelodysplastic syndrome cases. Mutations in this gene have been associated with a predisposition to myeloid malignancy.
  • RB1 is mutated in more than 11% of the tumours that have been studied. It is frequently somatically mutated in cases of retinoblastoma (47%) while germline mutations predispose to the same disease.
  • RET, a tyrosine kinase receptor, is somatically mutated in 38% of thyroid medullary carcinomas. Germline mutations in the RET gene are associated with multiple endocrine neoplasia, type IIA and type IIB, medullary thyroid carcinoma and Hirschsprung disease.
  • SMARCB1, also known as SNF5/INI1, is frequently somatically mutated in soft tissue rhabdoid tumours (41%). These are highly malignant cancers that usually occur in young children.
Statistics
15 genes
73,767 tumours
13,420 samples have mutations
536 unique mutations
1,104 papers have been curated



v4 - 12th Nov 2004

COSMIC Update

The COSMIC team are proud to include somatic mutation data for FGFR2, FGFR3, FLT3, MET, PDGFRA and PIK3CA on the COSMIC web site.

Data

The number of genes with data in COSMIC has more than doubled in this release of the database. The additional data represents a set of genes that have a lower, but nevertheless important, mutation frequency in human cancer as a whole. In specific malignancies genes such as FLT3 do have a significant role as can be seen from the data collected in COSMIC.

Gene    Number of analysed samples  Number of samples with mutations
BRAF    5158                        736
ERBB2   714                         8
FGFR2   30                          2
FGFR3   1735                        481
FLT3    7610                        1499
HRAS    11876                       477
KRAS2   35716                       8302
MET     1081                        59
NRAS    13884                       1132
PDGFRA  146                         25
PIK3CA  396                         89
TOTAL   78346                       12810

Number of unique mutations 307

Number of curated papers 976
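
The mutation frequencies referred to above can be recomputed from the per-gene counts in the table. The sketch below is illustrative only: the counts are copied from the v4 table, while the names `V4_COUNTS`, `mutation_frequency` and `rank_by_frequency` are hypothetical, and frequency is assumed to mean samples with mutations divided by analysed samples.

```python
# (gene, analysed samples, samples with mutations), copied from the v4 table.
V4_COUNTS = [
    ("BRAF", 5158, 736),
    ("ERBB2", 714, 8),
    ("FGFR2", 30, 2),
    ("FGFR3", 1735, 481),
    ("FLT3", 7610, 1499),
    ("HRAS", 11876, 477),
    ("KRAS2", 35716, 8302),
    ("MET", 1081, 59),
    ("NRAS", 13884, 1132),
    ("PDGFRA", 146, 25),
    ("PIK3CA", 396, 89),
]


def mutation_frequency(analysed: int, mutated: int) -> float:
    """Fraction of analysed samples that carry a mutation (assumed definition)."""
    return mutated / analysed


def rank_by_frequency(rows):
    """Return (gene, frequency) pairs sorted from most to least frequently mutated."""
    ranked = [(gene, mutation_frequency(analysed, mutated)) for gene, analysed, mutated in rows]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for gene, frequency in rank_by_frequency(V4_COUNTS):
        print(f"{gene:<8}{frequency:6.1%}")
```

Genes with very few analysed samples (FGFR2 has only 30) give frequencies that should be read with caution.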

Website Changes

Home Page
  • We have added a link to an ATOM feed for those people with ATOM enabled news feed readers. Adding this link to your feeds list will allow you to see the latest news from the COSMIC site as and when it is available.
Distribution View
  • A totals column has been added to the 'Details' table to show the total number of mutated samples that are listed.
  • Links to show only negative data have been added to the 'More Details' links in the 'Details' table.
  • The Insertions and deletions table has been split to show different information for the two types of mutation.
Gene Selection
  • Genes can now be selected by chromosome, from the karyotype graphic, or as always from an alphabetical list.
References Page
  • The complete list of references for a specific gene can now be exported in a variety of formats including Excel.
Mutation Data Page
  • All pages listing more than 100 samples have been split to reduce their size. However, the export function will still export all the samples as selected.



v3 - 29th Sep 2004

COSMIC Update

We are pleased to announce an update to the COSMIC website. To coincide with the Nature paper on ERBB2 we have added all the data for this gene to COSMIC. There have also been a number of improvements to the interface that we hope you will find useful.

New Data

ERBB2

Today, Nature publish our recent findings, the first description of small intragenic ERBB2 mutations in human cancer. Primarily found in non-small cell lung adenocarcinomas, the mutations identified are suggestive of inappropriate activation of ERBB2 kinase activity.

This addition brings 8 new mutations and 714 new samples to the database, increasing the total number of mutant samples to 10655 and the total number of samples to 58032.

Website Changes

Distribution View
  • The summary table has been removed in favour of a new gene summary page, which contains all the data from this table plus much more.
  • The mutations tables have been expanded to show insertions, deletions and complex mutations.
  • Information about the negative samples is now available and can be viewed by clicking on the 'More Details' link in the Details table. Like the positive samples, this data can also be exported in various formats.
  • A new insertions and deletions track has been added to the main image. This will allow us to display a larger number of genes with more complex mutation sets.
  • A complex mutations track has also been added to display those mutations (multiple base substitutions) which don't quite fit into any of the other categories.
Gene Summary Page

This has grown from the original four row summary table, on the distribution page, into a full page overview of the information stored about a specific gene.

  • Mutation hot spots: The mutation summary shows those areas of the transcript that have a high density of mutations. This can be used to go directly to the area of interest on the mutation distribution view.
  • References: A quick glance will show the most recently published paper that was analysed by the COSMIC staff.
Sample Summary Page

Here you will find a page containing all the information about a particular sample. Some of the previously unavailable information, such as details about the individual, has been made available.

  • Genes Tested: Quickly identify all the genes in COSMIC that have been tested against the selected sample.
  • References: Locate all the references that have included the sample.
Reference Summary Page

For the first time in COSMIC you can see all the samples from one paper in one location. In addition to this there are also details about the genes screened and the mutations that were found.



v2 - 28th Jun 2004

Cosmic Update

We are pleased to announce a minor update to the COSMIC website. The user interface has been updated to include new features that we hope will make your experience with the site more productive and enjoyable.

Web Site Changes

  • Shorter URLs - These have been shortened to reduce the amount of text required to link to a specific page within the site. The old style identifiers have been replaced with shortened initials. For example, 'locus_name' has been replaced with 'ln'. The old style links still work and any existing bookmarks should not be affected by this change.
  • Nucleotide Tracks - All mutations can now be viewed with respect to the changes they would cause to the nucleotide sequence, in addition to the already present amino acid changes. The nucleotide views can be accessed by selecting 'cDNA' in the navigation menu on the main display pages. An example of this view can be seen here
  • Navigation & Selection Improvements - The selection process has been updated to allow users to select sub types across a range of up to five tissue types, adding a new level of refinement to the search process.
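
The shorter URL scheme described above can be illustrated with a small rewrite of query parameter names. This is a sketch under stated assumptions, not the site's implementation: only the 'locus_name' to 'ln' mapping is given in these notes, and the function name `shorten_query_keys` and the example.org address in the usage note are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Only this mapping is stated in the release notes; any others would be guesses.
SHORT_NAMES = {"locus_name": "ln"}


def shorten_query_keys(url: str, mapping=SHORT_NAMES) -> str:
    """Rewrite old style query parameter names to their shortened forms."""
    parts = urlsplit(url)
    pairs = [
        (mapping.get(key, key), value)  # unknown keys are left untouched
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(pairs)))
```

With a made-up address, `shorten_query_keys("http://example.org/cosmic/gene?locus_name=BRAF")` returns the same URL with `ln=BRAF` as the query string.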



23rd Jun 2004

COSMIC Detailed

The British Journal of Cancer have released an advance online version of an article describing COSMIC. Detailed information is provided about the curation and structure of the database, followed by a description of the facilities provided by the website.



6th May 2004

COSMIC Website Unavailable 8th May

On Saturday 8th May the COSMIC website, as part of the Sanger website, will be unavailable whilst major network upgrades and essential maintenance work are carried out. We apologise in advance for this loss of service.



20th Feb 2004

Nucleotide data available

COSMIC displays mutations at the amino acid level to show the potential implication of the mutations on the protein sequence. In addition to this COSMIC holds the mutations at the nucleotide level. This data is available through the Export function that can be found at the top of the Distribution figure (example) or at the bottom of the expanded Mutation Data tables (example).



v1 - 4th Feb 2004

COSMIC version 1 released

Wellcome Trust Sanger Institute launches Catalogue Of Somatic Mutations In Cancer. In the quest to develop rational approaches to treating cancer, researchers need efficient access to existing knowledge. COSMIC (Catalogue Of Somatic Mutations In Cancer), launched today by the Cancer Genome Project at The Wellcome Trust Sanger Institute, is a new tool that provides integrated genetic data from cancer genes, and will make research faster and easier.



3rd Feb 2004

BRAF V599E becomes V600E

The original BRAF mutations reported by Davies et al. were mapped to the DNA sequence NM_004333[gi;4757867] with the common BRAF mutation being V599E. On the 24th July 2003 this sequence was updated to NM_004333[gi;33188458] with the insertion of 3bp in the coding sequence. The net effect of this update was to increase the length of the BRAF protein by one amino acid and increase the position of all published mutations by one amino acid. The beginnings of both versions of the protein are:

MAALSGGGGGGAEPGQALFNGDMEPEAGAGR PAASSAADP    NM_004333[gi;4757867]
||||||||||||||||||||||||||||||  |||||||||
MAALSGGGGGGAEPGQALFNGDMEPEAGAGAGAAASSAADP    NM_004333[gi;33188458]

The BRAF mutations in COSMIC are mapped to the latest version of the cDNA and V599E has become V600E.
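
The renumbering described above amounts to adding one to the position of each published substitution. The following is a minimal sketch of that arithmetic, not the procedure the COSMIC team used: the function name `shift_substitution` and its `first_affected` parameter are hypothetical, and the default simply shifts every position, matching the statement that all published mutations moved by one amino acid.

```python
import re

# Substitution written as wildtype residue, position, mutant residue (e.g. V599E).
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def shift_substitution(mutation: str, offset: int = 1, first_affected: int = 1) -> str:
    """Renumber a substitution after residues were inserted upstream of it.

    Positions at or beyond `first_affected` (a hypothetical parameter marking
    where the reference sequence changed) move by `offset`.
    """
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"Not a simple substitution: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if position >= first_affected:
        position += offset
    return f"{wild_type}{position}{mutant}"


print(shift_substitution("V599E"))  # prints V600E
```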