Release Notes

v99 - 28th November 2023

Summary

v99 (November 2023): a focus on 7 expertly curated genes; 6 census genes and 8 cancer hallmark genes are updated, along with a new resistance gene-drug pair. In this release of COSMIC, we have 438,063 new genomic variants, over a million new coding mutations, 274,853 non-coding mutations, 6,810 new samples, and 1,303 new whole genomes. We have also curated 32 new systematic screen papers.

Our other products, Cancer Mutation Census (CMC) and COSMIC-3D, are also updated with the latest datasets. Actionability was updated with a new release, v10, in October 2023; more details can be found here.

Mutational Signatures has been updated with a new release v3.4.

 

Key Updates

  1. Gene-focused curation of 7 new genes: IKZF3, ARHGAP35, BAX, ASPM, GSK3B, NTRK2, SOX2
  2. New resistance gene-drug pair IDH1-Ivosidenib: 14 new samples, 9 new resistance mutations
  3. Cancer Gene Census: 3 new Tier 1 genes (HGF, RAD50, RRAS2) and 3 new Tier 2 genes (GSK3B, MUC6, RAP1B) have been added
  4. We have created cancer hallmark annotations for 8 Cancer Gene Census Tier 1 genes (SRC, SRSF2, STAT3, STAT5B, STK11, SUFU, TBX3, TNFRSF14)
  5. 32 new systematic screen papers.
  6. Cancer Mutation Census data is also updated to align with the latest COSMIC data (v99).
  7. COSMIC-3D is also updated to align with the latest COSMIC data (v99), with 279 new genes mapped to PDB structures, 1698 more mapped PDB structures and 9 new census gene structures.
  8. Updates to the new download files: 20 missing sample features (e.g. age, grade, drug response) have been added to the COSMIC_samples and Cell_lines sample files, and a mutation somatic status column has been added to the COSMIC and Cell Lines download files. A gentle reminder: we are supporting the legacy COSMIC downloads for a whole year (until May 2024).
  9. New features on the download web page: Scripted and Filtered downloads (similar to those for the archive downloads) are now available for the new download files, with an enhanced user experience.

 

 

Website Changes

We have added the newly styled Scripted and Filtered downloads onto the Download pages.

Scripted download - This feature allows the products to be downloaded programmatically using the command line or scripts. Once you select the "Scripted download" option in the pop-up window, detailed help with examples is provided on the web page to guide you through the process.

Filtered download - This feature allows a subset of a product's data to be downloaded. A product can be filtered on Gene symbol, Sample name, and Primary Site (Cancer), depending on which of these fields the product contains. For example, the Genome Screen Mutants product can be filtered on all 3 categories, whilst the Cancer Gene Census product can only be filtered on Gene symbol, as its file doesn't include Sample name or Primary Site. More details and help are provided on the Download pages.
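As a rough illustration of what a filtered download contains, the sketch below applies the same kind of filter locally to tab-separated text. The column names (GENE_SYMBOL, SAMPLE_NAME, PRIMARY_SITE), the sample rows and the function name are assumptions made for this example only; the readme supplied with each product describes the actual columns.

```python
import csv
import io

# Hypothetical column names and rows, used only for this illustration.
SAMPLE_TSV = (
    "GENE_SYMBOL\tSAMPLE_NAME\tPRIMARY_SITE\n"
    "BAX\tsample_1\tstomach\n"
    "SOX2\tsample_2\tlung\n"
    "BAX\tsample_3\tliver\n"
)


def filter_rows(tsv_text, **filters):
    """Return the rows that match every column=value filter.

    A filter on a column the file does not contain raises ValueError,
    since a product can only be filtered on fields it includes.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    columns = reader.fieldnames or []
    unsupported = [col for col in filters if col not in columns]
    if unsupported:
        raise ValueError(f"file has no column(s): {', '.join(unsupported)}")
    return [
        row for row in reader
        if all(row[col] == value for col, value in filters.items())
    ]


if __name__ == "__main__":
    # Combine two filters: gene symbol and primary site.
    for row in filter_rows(SAMPLE_TSV, GENE_SYMBOL="BAX", PRIMARY_SITE="stomach"):
        print(row["SAMPLE_NAME"])
```

Rejecting an unsupported filter, rather than silently ignoring it, makes it obvious when a product lacks the field being filtered on.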

In future releases, we are aiming to expand the options for filtered categories.

To help with the transition to the new files, we are supporting the legacy downloads for v99 until May 2024; thereafter these downloads will be discontinued. Until then, access to the legacy downloads is available from the “Archive Downloads” page.

We value your feedback on the new Download page and download files. Please help us as we work to improve the usability and accessibility of COSMIC data by sending your thoughts to cosmic@sanger.ac.uk.

 

Download File Updates

We have added missing data fields in the new download files.

  • 20 missing sample features - Age, Therapy Relationship, Sample Differentiator, Mutation Allele specification, MSI, Average ploidy, Sample Remark, drug response, Grade, Age at tumour recurrence, stage, cytogenetics, Metastatic site, Tumour remark, Ethnicity, Environmental variables, Germline mutation, Therapy, Family, Individual remark are added to Cosmic_Sample and CellLinesProject_Sample files.
  • Added Mutation somatic status column to Cosmic_CompleteTargetedScreensMutant, Cosmic_GenomeScreensMutant, Cosmic_MutantCensus, Cosmic_NonCodingVariants, Cosmic_ResistanceMutations, CellLinesProject_GenomeScreensMutant, CellLinesProject_NonCodingVariants

A complete list of changes in all the files is available on the Download page

 

Curated Genes

IKZF3 (Tier 1 Census Gene)

IKZF3 (IKAROS family zinc finger 3), a haemopoietic zinc finger DNA-binding protein, is a central regulator of lymphoid differentiation and is implicated in leukaemogenesis. IKZF3 was identified as a CLL driver gene, with recurrent L162R substitutions (11, 2.0%) targeting a highly conserved amino acid (COSP40730). Moreover, the same hotspot mutation has been identified in diffuse large B cell lymphoma (DLBCL) and mantle cell lymphoma, suggesting a critical role in the malignant transformation of B cells. Adult low-hypodiploid acute B-lymphoblastic leukaemia with IKZF3 deletion has also been reported. IKZF3 gene fusions have been identified in colorectal (PMID:29955133) and breast cancer (PMID:21247443).

ARHGAP35 (Tier 1 Census Gene)

ARHGAP35 codes for a Rho GTPase-activating protein and it ranks among the top ∼30 most significantly mutated genes in human cancers. ARHGAP35 is frequently mutated in epithelial tumours, and the high proportion of inactivating mutations coupled with functional evidence indicates that it is a tumour suppressor gene in endometrial carcinoma. We have 2,262 new samples tested for ARHGAP35, and 142 of these had mutations, mostly missense substitutions, followed by synonymous and nonsense.

BAX (Tier 1 Census Gene)

The BAX gene encodes the BCL2 Associated X, apoptosis regulator protein, a member of the BCL2 protein family. It forms a heterodimer with the BCL2 protein, with the ratio of BAX to BCL2 determining the death or survival of a cell following an apoptotic stimulus. BAX mutations have been implicated in many cancers but are particularly prevalent in colorectal cancer, endometrial cancer and haematopoietic and lymphoid neoplasms. Many mutations occur within a poly(G)8 tract in exon 3 and are associated with microsatellite instability (MSI): around 90% of the new mutations curated for BAX were insertions or deletions in this region, the majority in cancers of the stomach and intestines. However, missense mutations in other parts of the BAX gene have also been curated in a broader spectrum of cancers, including haematopoietic and lymphoid cancers, cancers of the skin (especially malignant melanomas), and liver and breast cancers.

NTRK2 (Tier 1 Census Gene)

NTRK2 encodes a receptor tyrosine kinase that is involved in neural cell differentiation, survival, and proliferation. A common mechanism of NTRK2 oncogenic activation involves fusion of the tyrosine kinase domain to an N-terminal portion donated by various partner genes, leading to the production of a chimaeric, constitutively activated receptor. NTRK2 gene fusions have been reported in glioma, HNSCC and lung adenocarcinoma. Some somatic NTRK2 SNVs have been detected in a variety of tumour types (COSP51285, 8958, 50832).

SOX2 (Tier 1 Census Gene)

SOX2 is a member of the Sox family of transcription factors that are essential for many aspects of mammalian development. Normally, SOX2 maintains the pluripotency of embryonic stem cells. In cancer, it functions as a tumour suppressor such that amplifications of SOX2 are known to induce hyperplasia leading to neoplasms. As the main driver mechanism for SOX2 is amplification rather than mutation, we have limited somatic mutation data for this gene; nevertheless, it is now manually curated as a Tier 1 gene.

ASPM (Tier 2 Census Gene)

ASPM encodes a large microtubule-binding protein that plays an important role in neurogenesis and cell proliferation. The gene is frequently affected by somatic mutation (predominantly missense) in lung adenocarcinoma (PMID:25079552) and endometrioid carcinoma (PMID:23636398). ASPM is also overexpressed in many types of cancer, where it correlates with tumour progression and poor clinical prognosis.

GSK3B (Tier 2 Census Gene)

GSK3B encodes a serine-threonine kinase that is a negative regulator of many physiological processes, including glycogen metabolism, neuronal function, and microtubule dynamics. We have 1,959 new tested samples for GSK3B; only 18 had mutations. Increased protein-level expression is frequently observed in several cancer types, including NSCLC (PMID:24618715), and decreased protein expression in cutaneous squamous cell carcinoma and basal cell carcinoma (PMID:17699780). Functional evidence suggests that the gene's role in cancer is cell type-specific. However, it has been implicated in cancers that are resistant to chemo-, radio-, and targeted therapy (PMID:21881296).

 

Drug Resistance

IDH1-Ivosidenib

IDH1-Ivosidenib; IDH2-Enasidenib

Mutations in IDH1 and IDH2 can drive several types of cancer, including leukaemias, lymphomas, gliomas and some bone cancers, amongst others. The most common driver mutations are IDH1 R132H and IDH2 R140Q/R172K. The small molecule inhibitor drugs ivosidenib and enasidenib target these mutations in IDH1 and IDH2, respectively. While these drugs tend to deliver better outcomes, it is inevitable that their use also drives the development of secondary resistance mutations. These predominantly include secondary IDH mutations and isoform switching between mutant IDH1 and IDH2. We have compiled data from multiple publications capturing patient response to treatment and the subsequent development of resistance, including several new resistance mutations.

 

Cancer Gene Census (CGC)


The Cancer Gene Census (CGC) has been compiled over a 19-year period and is subject to periodic review. This ensures that gene assignment to the Census reflects the latest evidence indicative of the strength of a causal association between a gene and one, or more, cancer types, and consistency in the application of the COSMIC inclusion criteria for CGC Tier 1 and Tier 2 assignment.

Following a recent review, TSHR has been re-assigned from Tier 1 of the Census to Tier 2, and its previous designation as an oncogene rescinded. TSHR is notably affected by recurrent missense mutation (in particular p.T632I and p.M453T) in ~33% of toxic thyroid adenoma, a benign cancer type which progresses to carcinoma in 1 - 10% of cases. Although 39% of the missense mutations occur in a mutation hotspot encompassing codons 630 - 633, there is a paucity of experimental evidence demonstrating functionally how TSHR may contribute to oncogenic transformation. In particular, ‘avoiding immune destruction’ is the only Hallmark of Cancer that wild type TSHR has been shown to promote.

New Census Genes (Tier 1)

HGF (Hepatocyte growth factor)

A growth factor for a broad spectrum of tissues and cell types, and a ligand for MET Proto-Oncogene, Receptor Tyrosine Kinase.

Somatic alterations: Amplified in 10.5%, and affected by missense mutation in 4.4%, of lung adenocarcinomas. Gained/amplified in 25.4% of breast tumours in pre-menopausal, and in 33.7% of post-menopausal, breast cancer patients. Promoter activity increase-associated truncating mutations occur in a promoter 30b poly(dA) transcriptional repressor sequence in 51.4% of African-American and 15.1% of European patient breast tumours. HGF-CACNA2D1 fusion in multiple myeloma.

Germline alterations: Germline promoter poly(dA) tract ≥3b truncating mutations affect 50% of bladder cancer patients and 24.2% of healthy controls.

Functional evidence:

NSCLC and breast carcinoma cells cultured in the presence of HGF display increased migration and invasion. Mammary epithelium-specific expression in transgenic mice (during pregnancy and lactation) leads to the development of mammary carcinomas (89.1% of mice), and lung metastases (21.8% of mice).

RAD50 (RAD50 double strand break repair protein)

Part of the MRN complex, involved in DNA double-strand break repair, recombination and telomere maintenance.

Somatic alterations: Encompassed in 5q11-35 deletions, associated with decreased mRNA expression, in 50% - 60% of basal-like subtype breast cancers.

Germline alterations: Predicted pathogenic germline variants (including nonsense and frameshift-indels) associated with breast and ovarian cancer susceptibility.

Functional evidence: Deletion in BRCA1/2 +-type ovarian cancers correlates with increased genome instability. Knockdown in an ovarian serous adenocarcinoma cell line causes irregular mitotic chromosome segregation and increases aneuploidy.

RRAS2 (RAS related 2)

A member of the R-Ras subfamily of Ras-like small GTPases. Involved in signal transduction within the MAPK signalling pathway.

Somatic alterations: Missense mutations (p.G23A/D/S/V, p.G24C/D) in 12.5 - 13.6% of intracranial germ cell tumour subtypes.

Functional evidence: Expression of p.G23 (A, C, S, V) and p.G24 (C, D, V) mutant genes transforms mouse NIH 3T3 embryonic fibroblasts. Systemic expression of RRAS2-p.Q72L (occurs in several cancer types, most frequently endometrioid carcinoma) leads to the development of multiple cancer types in knock-in mice.


New Census Genes (Tier 2):

GSK3B (Glycogen Synthase Kinase 3 Beta)

A serine-threonine kinase (constitutively active in the basal state, but inactivated by p.S9-phosphorylation by kinases in various signalling pathways) that is a negative regulator of many physiological processes, including glycogen metabolism.

Somatic alterations: Mutated in 3.6% of endometrial cancers. Increased level of protein expression and of active GSK3B-pY216 in colorectal cancer and pancreatic cancer. Decreased protein expression in cutaneous squamous cell carcinoma and basal cell carcinoma.

Functional evidence: Knockdown in pancreatic cancer leads to increased apoptosis, and decreases both mouse xenograft tumour growth and angiogenesis. Knockdown increases the proliferation of cholangiocarcinoma cells, and melanoma cells.

MUC6 (Mucin 6, oligomeric mucus/gel-forming)

A secreted 2,439 amino acid-glycoprotein that forms an insoluble mucous barrier to protect epithelial surfaces, including the gut lumen.

Somatic alterations: Mutated (frameshift and in-frame indels, missense) in 6.0 - 9.8% of non-hypermutated gastric cancers.

Germline alterations: Minisatellite MS5 allelic variants are associated with gastric and rectal cancer.

Functional evidence: Knockdown in foetal gastric epithelial cell line GES-1 increases cell migration and invasion. Expression in pancreatic ductal adenocarcinoma cell line MIA PaCa-2 decreases cell proliferation, migration and invasion.

RAP1B (RAP1B, member of RAS oncogene family)

A GTPase that (1) stimulates BRAF to activate MAPK signalling, (2) modulates adhesion and signalling functions of integrins and cadherins, and (3) positively regulates angiogenesis during development.

Somatic alterations: Amplified (and overexpressed) in a subset of high grade gliomas.

Functional evidence: Knockdown in glioma cells decreases cell proliferation and invasion, and increases apoptosis.


Hallmarks of Cancer

Hallmarks of cancer annotations summarise the effect of Cancer Gene Census Tier 1 genes on the phenotypic traits shared by cancers. COSMIC v99 includes Hallmark gene pages for an additional 8 genes (SRC, SRSF2, STAT3, STAT5B, STK11, SUFU, TBX3, TNFRSF14).

In addition to hallmarks of cancer annotations, each Hallmark gene page summarises the role of a gene in cancer, how it is affected by somatic and germline alteration in cancer, and how it affects other biological processes relevant to cancer.

SRC, SRSF2, STAT3, STAT5B, STK11, SUFU, TBX3, TNFRSF14

 

Systematic Screen Papers

Follow the links below to the 32 papers that are new in v99, or view the full table of papers here.

COSP40589, COSP40973, COSP41798, COSP42376, COSP43057, COSP43418, COSP43792, COSP44286, COSP45330, COSP46529, COSP47258, COSP49332, COSP49700, COSP49702, COSP49708, COSP49809, COSP50112, COSP50657, COSP50778, COSP50881, COSP50950, COSP50952, COSP51034, COSP51081, COSP51164, COSP51214, COSP51234, COSP51302, COSP51433, COSP51448, COSP51450, COSP51532

 

COSMIC Statistics

  • Total genomic variants (COSV): 24,292,168 (+438,063)
  • Genomic non-coding variants: 16,579,554 (+274,853)
  • Genomic mutations within exons (coding variants): 5,286,735 (+208,168)
  • Genomic mutations within intronic and other intragenic regions: 9,069,262 (+160,598)
  • Samples: 1,527,131 (+6,810)
  • Papers: 29,230 (+206)
  • Fusions: 19,428 (+0)
  • Whole genome screen samples: 43,822 (+1,303)
  • Copy number variants: 1,207,190 (+0)
  • Gene expression variants: 9,215,470 (+0)
  • Differentially methylated CpGs: 7,930,489 (+0)
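The bracketed figures are changes relative to the previous release, so subtracting them from the current totals recovers the v98 figures. A minimal sketch for a few of the rows above (the dictionary and function names are illustrative only):

```python
# Current v99 totals and their bracketed changes, copied from the list above.
V99_STATS = {
    "Total genomic variants (COSV)": (24_292_168, 438_063),
    "Genomic non-coding variants": (16_579_554, 274_853),
    "Samples": (1_527_131, 6_810),
    "Papers": (29_230, 206),
    "Whole genome screen samples": (43_822, 1_303),
}


def previous_total(current, delta):
    """Total in the previous release, given the current total and its change."""
    return current - delta


if __name__ == "__main__":
    for label, (current, delta) in V99_STATS.items():
        print(f"{label}: {previous_total(current, delta):,} in the previous release")
```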

 

COSMIC-3D


COSMIC-3D data has been updated for the v99 release. These are the key updates:

  • 9 Cancer Gene Census genes now have mapped structures (FAT4, HGF, MPL, NUTM1, RAD50, RAD51B, RRAS2, SDHC, SDHD)
  • 279 new genes map to a PDB structure, bringing the total number of genes with structures to 8214.
  • 1698 new PDB structures are also added, increasing the total number of mapped protein structures (PDB ids) from 53,312 to 55,010

 

Mutational Signatures

We are also thrilled to announce the release of COSMIC Mutational Signatures, version 3.4. In this release, we introduce the curation of mutational signatures from two new variant classes: structural variants (SV1-10) and RNA-SBS variants (RNA-SBS1 through RNA-SBS5). The former describes large genomic changes resulting from chromosome rearrangements, while the latter enables precise inference of the patterns of nucleotide changes due to RNA editing.

Additionally, we have expanded our existing catalogue with newly extracted signatures, including SBS signatures (SBS96-99), DBS signatures (DBS12-20), ID signatures (ID19-23), and CN signatures (CN25). Finally, we have refined our reference set of mutational signatures by splitting SBS22 into SBS22a and SBS22b, and SBS40 into SBS40a, SBS40b, and SBS40c. Consequently, SBS22 and SBS40 are deprecated in version 3.4.

For more details please see the signatures website.

 


v98 - 23rd May 2023

Summary

v98 (May 2023): a focus on rare skin tumours; 2 census genes and 8 cancer hallmark genes are updated. In this release of COSMIC, we have 410,000 new genomic variants, 585,000 new coding mutations, 290,000 non-coding mutations, 4,300 new samples and 1,358 new whole genomes. We have also curated 19 new systematic screen papers.

Our other products Cancer Mutation Census (CMC) and COSMIC-3D are also updated with the latest datasets.

 

Key Updates

  1. New focus curation on rare skin tumours.
  2. Gene-focused curation of the MUC6 gene.
  3. Cancer Gene Census: 1 new Tier 1 gene (NTRK2) and 1 new Tier 2 gene (ASPM) have been added.
  4. We have created cancer hallmark annotations for each of the 8 Cancer Gene Census Tier 1 genes (SETBP1, SETD2, SH2B3, SMARCA4, SMARCB1, SMO, SOCS1, SPEN) and so hallmarks of cancer annotations are now available for 346 Census Tier 1 cancer genes.
  5. 19 new systematic screen papers.
  6. Cancer Mutation Census data is also updated to align with the latest COSMIC data (v98).
  7. COSMIC-3D is also updated to align with the latest COSMIC data (v98), with 187 new genes mapped to PDB structures, 1496 more mapped PDB structures and 9 new census gene structures.
  8. New beta download files and webpage: the COSMIC download files have been revamped to increase interoperability, reduce redundancy of the same data across files, and make consistent use of COSMIC identifiers. They also introduce new file naming conventions, a handy readme with each file, support for 4 COSMIC release versions and support for different checksums. Alongside this, we are supporting the legacy COSMIC downloads for a whole year (until May 2024).

 

Website Changes

In this release we have launched our beta downloads for COSMIC, which are available on this page: New downloads.

This is a beta version of the new COSMIC Downloads page. The new page and the download files available here have been re-designed to improve useability and accessibility. It is now possible to browse by project and download complete datasets for all available products and genome versions for the current and 3 previous releases – COSMIC Core, Cell Lines Project (CLP), Actionability, and Cancer Mutation Census (CMC).

A detailed technical document is available in the change log. It lists all the changes in the new download files and all the COSMIC identifiers, and includes an ERD (Entity Relationship Diagram) explaining the links between the different products.

To help transition to the new files we are supporting the legacy downloads for a year (i.e. for v98 and v99), but thereafter these downloads will be discontinued. Until then access to the legacy downloads is available from the “Archive Downloads” page.

We value your feedback on the new Download page and download files. Please help us as we work to improve the useability and accessibility of COSMIC data by sending your thoughts to cosmic@sanger.ac.uk

The current beta version only supports whole-file downloads. In future releases, we will extend the download functionality to support scripted and filtered downloads.

 

Download File Updates

The new download files are available at the new beta website: New downloads. The development work aimed to revamp the current COSMIC download files.

The key changes and benefits of the beta download files:

  • To address interoperability by making the file more interconnected with internal and external identifiers.
  • To reduce the redundancy of the data in different files; we have reduced the number of identical columns used across multiple files such as the tissue classification and instead, they are replaced with a central COSMIC phenotype identifier (COSO). This identifier can further be linked to a detailed classification file, that contains more detailed information.
  • Some columns are renamed to better match the description of the content.
  • Consistent use of COSMIC identifiers- We have 10 COSMIC identifiers - COSMIC Phenotype Id (COSO), COSMIC Gene Id (COSG), COSMIC Sample Id (COSS), COSMIC Structural Id (COST), COSMIC CNV Id (COSCNV), COSMIC Fusion Id (COSF), Legacy Mutation Id (COSM/COSN), COSMIC Paper Id (COSP), COSMIC Study Id (COSU), COSMIC Genomic Mutation Id (COSV). All these identifiers are linked in the files where applicable data is listed.
  • A tar file (tarball) has been created for each product: It contains a data file with all the contents related to the product and a read-me file describing each of the columns in the data file. Each tarball has a standard naming convention.
    • Tar file naming (project name, product, format of the data, release version, assembly): [Project]_[Filename]_[format]_v[Release]_GRCh[assembly].tar.gz
    • Data file naming (project name, product, release version, assembly): [Project]_[Filename]_v[Release]_GRCh[assembly].[format].gz
    • Read-me file naming (product, release version, assembly): README_[Project]_[Filename]_v[Release]_GRCh[assembly].txt
  • The newly formatted download files are available for the current release (v98) and for v97; going forward, this will be extended to the typical 4 release versions.
  • Support for different checksums - md5sum, sha1sum, sha256sum
  • File changes:
    • CosmicMutantExport – deprecated and now replaced by Cosmic_GenomeScreensMutant and Cosmic_CompleteTargetedScreensMutant (excluding negative data)
    • CosmicCodingMuts.vcf has been split into two files Cosmic_GenomeScreensMutant_v97_GRCh37.vcf and Cosmic_CompleteTargetedScreensMutant_v97_GRCh37.vcf.
    • CosmicHGNC has been replaced with Cosmic_Gene
  • A few products and projects still need to be adapted to the new format: the CMC and Actionability projects, and the sample features for COSMIC, are still to be changed. All these products and projects are available via both the Beta download page and the legacy download page.
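The naming conventions listed above can be expressed as a small helper. This is a sketch that follows only the bracketed patterns given in these notes; the function names and the regular expression are illustrative and not part of any COSMIC tooling, and the format string is used exactly as passed in.

```python
import re


def tar_name(project, filename, fmt, release, assembly):
    """[Project]_[Filename]_[format]_v[Release]_GRCh[assembly].tar.gz"""
    return f"{project}_{filename}_{fmt}_v{release}_GRCh{assembly}.tar.gz"


def data_name(project, filename, fmt, release, assembly):
    """[Project]_[Filename]_v[Release]_GRCh[assembly].[format].gz"""
    return f"{project}_{filename}_v{release}_GRCh{assembly}.{fmt}.gz"


def readme_name(project, filename, release, assembly):
    """README_[Project]_[Filename]_v[Release]_GRCh[assembly].txt"""
    return f"README_{project}_{filename}_v{release}_GRCh{assembly}.txt"


# Assumes the project name itself contains no underscore.
DATA_NAME = re.compile(
    r"^(?P<project>[^_]+)_(?P<filename>.+)_v(?P<release>\d+)"
    r"_GRCh(?P<assembly>\d+)\.(?P<format>[^.]+)\.gz$"
)


def parse_data_name(name):
    """Split a data file name back into its parts, or raise ValueError."""
    match = DATA_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised data file name: {name}")
    return match.groupdict()


if __name__ == "__main__":
    name = data_name("Cosmic", "GenomeScreensMutant", "vcf", 97, 37)
    print(name)
    print(parse_data_name(name))
```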

A complete list of changes in all the files is available on the Beta download page
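With md5sum, sha1sum and sha256sum checksums supported, a downloaded file can be verified before use. The sketch below uses Python's standard hashlib module; the function names are illustrative, and how the expected checksum is published for each file is not described in these notes, so it is passed in as a plain hex string.

```python
import hashlib

# Algorithms corresponding to md5sum, sha1sum and sha256sum.
SUPPORTED = ("md5", "sha1", "sha256")


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """True if the file's digest matches the expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `verify("download.tar.gz", expected, "md5")` returns True only if the file on disk matches the expected md5 value; the file name here is a placeholder.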

 

Curated Genes

MUC6

Mucin 6, oligomeric mucus/gel-forming (MUC6) is a member of the mucin family of high molecular weight glycoproteins produced by many epithelial tissues. The protein encoded by the MUC6 gene is secreted and forms an insoluble mucous barrier that protects the gut lumen. MUC6 has been identified as defining a subset of H&N cancers, specifically Schneiderian low grade papillary sinonasal carcinoma. MUC6 has been assessed for inclusion in the Cancer Gene Census and assigned Tier 2 status (v99). The variant data for the gene has been curated from the literature over the last 3 releases. In addition to targeted sequencing, v98 releases 26 samples with exome sequencing data. MUC6 somatic mutations were detected in 9/15 triple negative breast cancers and 4/5 Wilms tumours. The mechanism of this gene in cancer is not clear.

 

Rare skin tumour focus

Rare skin tumour

Common skin cancers, such as basal cell carcinoma, squamous cell carcinoma and melanoma, are relatively overrepresented in the scientific literature. This reflects their frequency in the general population. For the v98 release of COSMIC, we sourced data about the rarer skin cancers to have them fairly represented in our database. We searched for publications about adnexal tumours, Merkel cell carcinoma, Kaposi sarcoma, dermatofibrosarcoma protuberans and extramammary Paget's disease.

This is a mixed group of skin manifestations and not a comprehensive list of all tumour types that were included in these publications. For example, adnexal tumours (tumours of the sweat glands, hair follicles and sebaceous glands) alone have 50 different tumour types in our histology classification system. This number includes many non-cancerous conditions, such as sebaceous adenoma or cylindroma. Some of them have the potential to develop into malignant tumours, or need to be differentiated from malignant tumours for treatment purposes. All the tumour types in COSMIC derive from samples that have been found to have somatic mutations in them.

In v98 of COSMIC, 776 new samples were curated from the publications and 25,236 new variants were found in the newly included rare skin tumours. 17 new skin tumour types or subtypes were added to the histology classification system.

You can explore the variant data and the sample metadata using the COSMIC Cancer Browser on the website. Most skin tumours can be found under Tissue Type = skin. Dermatofibrosarcoma protuberans is classified under soft_tissue > fibrous_tissue_and_uncertain_origin and Kaposi_sarcoma under soft_tissue > blood_vessel. The full spectrum of data linked to all sub tissues and sub histologies can be found in our download files.

 

Cancer Gene Census (CGC)


New Census Genes (Tier 1):

NTRK2 (Neurotrophic Receptor Tyrosine Kinase 2)

A neurotrophin receptor tyrosine kinase involved in peripheral and CNS development and maturation.

Somatic alterations: 3’-end (including tyrosine kinase domain) in-frame fusions to multiple genes in glioma, HNSCC and lung adenocarcinoma.

Functional evidence: QKI-NTRK2 (found in astrocytoma) expression in CDKN2A-deficient mouse astrocytes leads to glioma formation following intracranial transplantation into mice. SPECC1L-NTRK2 (occurs in anaplastic astrocytoma) expression enables IL-3 independent Ba/F3 cell growth.


New Census Genes (Tier 2):

ASPM (Assembly Factor For Spindle Microtubules)

An assembly factor for spindle microtubules, involved in cell spindle regulation.

Somatic alterations: Recurrent amplification, and associated increased expression, in LN metastasis-positive, triple negative invasive ductal breast carcinomas, and cutaneous melanoma metastases.

Germline alterations: A rare, predicted pathogenic germline variant segregates with disease in a nevoid basal cell carcinoma syndrome family.

Functional evidence: Overexpression in a weakly invasive acral lentiginous melanoma cell line increases cell migration in vitro. Knockdown in glioblastoma cells and fibroblasts decreases the NHEJ repair of X-ray-induced DNA double-strand breaks, and increases chromosomal aberrations, respectively. Knockdown reduces HR-mediated DNA double-strand break repair (osteosarcoma), and increases X-ray-induced chromosomal aberrations (cervical carcinoma).


Hallmarks of Cancer

Hallmarks of cancer annotations summarise the effect of Cancer Gene Census Tier 1 genes on processes that are relevant to cancer development and progression.

COSMIC v98 includes Hallmark gene pages for a further 8 genes (SETBP1, SETD2, SH2B3, SMARCA4, SMARCB1, SMO, SOCS1, SPEN), and so hallmarks of cancer annotations are now available for 346 Census Tier 1 cancer genes.

In addition to hallmarks of cancer annotations, each Hallmark gene page summarises the role of a gene in cancer, how it is affected by somatic and germline alteration in cancer, and how it affects biological processes relevant to cancer. Likely pathogenic germline mutations are described for 4 of the 8 genes with a new Hallmark gene page.

 

Systematic Screen Papers

Follow links below to the 19 papers which are new in v98, or view the full table of papers here.

COSP51008, COSP47088, COSP50829, COSP49033, COSP50776, COSP49928, COSP50704, COSP50803, COSP50775, COSP50069, COSP46651, COSP50677, COSP44696, COSP50715, COSP50702, COSP45527, COSP50548, COSP41209, COSP46024

 

COSMIC Statistics

  • Total genomic variants (COSV): 23,854,105 (+410,264)
  • Genomic non-coding variants: 16,304,701 (+289,190)
  • Genomic mutations within exons (coding variants): 5,078,567 (+40,586)
  • Genomic mutations within intronic and other intragenic regions: 8,878,333 (+160,598)
  • Samples: 1,520,321 (+4,356)
  • Papers: 29,024 (+144)
  • Fusions: 19,428 (+0)
  • Whole genome screen samples: 42,519 (+1,358)
  • Copy number variants: 1,207,190 (+0)
  • Gene expression variants: 9,215,470 (+0)
  • Differentially methylated CpGs: 7,930,489 (+0)

 

Actionability v9 - May 2023

COSMIC Actionability v9 includes 5 additional fully-curated genes:
ESR1, CCND1, CCND2, CCND3 and RB1

This means we have a total of 99 fully curated genes:
ABL1, AKT1, AKT2, AKT3, ALK, AR, ASXL1, ATM, ATR, BCL2, BCR, BRAF, BRCA1, BRCA2, BTK, CCND1, CCND2, CCND3, CD274(PD-L1), CD33, CDK12, CDK4, CDK6, CDKN2A, CEBPA, CHEK2, CTNNB1, DDR2, DNMT3A, EGFR, ERBB2, ERBB3, ERBB4, ESR1, ETV6, EZH2, FBXW7, FGFR1, FGFR2, FGFR3, FGFR4, FLT3, FOXL2, GNA11, GNAS, GNAQ, HRAS, IDH1, IDH2, JAK1, JAK2, JAK3, KIT, KMT2A, KRAS, MAP2K1(MEK1), MAP2K2, MDM2, MDM4, MET, MLH, MPL, MSH2, MSH6, MTOR, MYD88, NF1, NF2, NOTCH1, NPM1, NRAS, NTRK1, NTRK2, NTRK3, PALB2, PDGFRA, PDGFRB, PIK3CA, PIK3CB, PMS2, PTCH1, PTEN, PTPN11, RAF1, RB1, RET, ROS1, RUNX1, SF3B1, SMAD4, SMO, STK11, SYK, TERT, TET2, TP53, TSC1, VHL, WT1

To view the full list of curated genes visit the About page on the Actionability website. All previously-recorded clinical trials have been checked for new or updated results.

 

Statistics

  • Genes fully curated: 99 (+5)
  • Genes included: 372 (+15)
  • Drugs: 1835 (+100)
  • Treatment combinations: 4796 (+566)
  • Trials with results: 4610 (+221)
  • Trials with no results: 5746 (+315)
  • Total trials: 10356 (+948)
  • Evidence from clinical trials databases: 7364 (+643)
  • Evidence from PubMed and other: 3177 (+322)
  • Point mutations: 154 (0)
  • Total variants: 876 (+72)

 

COSMIC-3D

COSMIC-3D data has been updated for the v98 release. These are the key updates:

  • COSMIC-3D has been updated to align the mutations with COSMIC release 98.
  • 187 new genes map to a PDB structure, bringing the total number of genes with structures to 7925.
  • 1496 new PDB structures are also added, increasing the total number of mapped protein structures (PDB ids) from 51,816 to 53,312
  • 9 Cancer Gene Census genes now have mapped structures (ASXL1, BAP1, EXT1, EXT2, GNAS, GPC3, NTRK2, RBM15, TFEB)

 


v97 - 29th November 2022

Summary

v97 (Nov 2022): a focus on blood cancer; 4 census Tier 2 genes and 10 cancer hallmark genes are updated, along with resistance data. In this release of COSMIC, we have 44,000 new genomic variants, 127,000 new coding mutations, 27,000 non-coding mutations, 6,000 new samples and 1,435 new whole genomes. We have also curated 20 new systematic screen papers.

 

Key Updates

  • New focus curation on Blood cancers
  • Cancer Gene Census: 3 new Tier 2 genes (GOLPH3, FADD, SUB1) have been added and, following a recent review, 1 gene (SMARCD1) has been moved from Tier 1 to Tier 2
  • We have created cancer hallmark annotations for each of the 10 Cancer Gene Census Tier 1 genes (NFKBIE, NTRK3, PHF6, POLD1, POLE, PPP2R1A, PRDM1, PTCH1, RPL5, SALL4). By so doing, we are adding functional annotations for 10 genes causally associated with cancer, thereby providing an overview of how the genes contribute to tumour development, in regard to the hallmarks of cancer.
  • 20 new systematic screen papers
  • Data for drug resistance is updated
  • Cancer Mutation Census data is also updated to align with the latest COSMIC data (v97), along with ClinVar and gnomAD datasets
  • COSMIC-3D is also updated to align with the latest COSMIC data (v97), assembly update, new census gene mapped structures and around 1100 more mapped protein structures.
  • Actionability data has been fully updated. Many new trials have been added, the number of trials with results available has substantially increased, several new mutations are represented and 11 new fully curated genes have been added.
  • Actionability and CMC downloads are free for non-commercial use, files are available on the download page.
  • New download file to map missing significant variants in the Non-Coding region

 

Download File Updates

Actionability and CMC downloads

Actionability and CMC downloads are free for non-commercial use, files are available on the COSMIC download page. Please refer to our licensing page here to understand if you are a Non-Commercial or Commercial user and how to obtain a license.

New download file NCV CDS syntax mapping

Since the annotation system upgrade in v90, VEP is used to standardise and normalise all variant annotations (https://www.ensembl.org/info/docs/tools/vep/index.html).

One unintended consequence of using VEP is that it outputs genomic level (g.) annotations for many non-coding mutations in the 5' UTRs of genes, as well as for all mutations in intergenic regions. Sometimes these mutations are associated with a named gene and are known or predicted to be functionally significant, having well known CDS (c.) annotations reported in the scientific literature (eg TERT promoter mutations). Previously, these CDS (c.) annotations were shown in COSMIC, but since the v90 upgrade these are overwritten by the standardised VEP genomic annotations and, in the case of promoter mutations, any link to the gene is lost.

In order to maintain a standardised dataset, we will continue to show the VEP genomic annotations for all mutations, but we have now produced a mapping file to allow the non-coding variant (NCV) genomic annotations to be linked back to the CDS syntaxes.

The new mapping file NCV_CDS_syntax_mapping.tsv released in v97 can be cross referenced with the CosmicNonCodingVariants.vcf.gz or CosmicNCV.tsv.gz download files to link CDS syntaxes with LEGACY_ID or COSV identifiers.

Generally, on the website we focus on coding mutations, but non-coding variants are displayed on the Genome Browser and can also be viewed directly by searching for the COSN identifier eg: https://cancer.sanger.ac.uk/cosmic/search?q=COSN32285790

In v97, the new mapping file contains only TERT promoter mutations, but we plan to include non-coding mutation mapping for other genes in future releases.

This new file is available on the COSMIC download page.
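
As a rough illustration of the cross-referencing described above, the sketch below joins a mapping file to a non-coding variant table on a shared COSV identifier. The column names (GENOMIC_MUTATION_ID, CDS_SYNTAX, HGVSG) and the sample rows are assumptions made for this example, not the documented headers or real data; check the header row of the downloaded files before adapting it.

```python
import csv
import io

# Column names below are assumptions for illustration only; check the
# header row of the real download files before relying on them.
ID_COL = "GENOMIC_MUTATION_ID"   # assumed: holds the COSV identifier
CDS_COL = "CDS_SYNTAX"           # assumed: holds the CDS (c.) annotation


def load_cds_lookup(mapping_file):
    """Build {COSV identifier: CDS syntax} from a tab-separated mapping file."""
    reader = csv.DictReader(mapping_file, delimiter="\t")
    return {row[ID_COL]: row[CDS_COL] for row in reader}


def annotate_ncv(ncv_file, cds_lookup):
    """Yield each NCV row with the CDS syntax added where a mapping exists."""
    for row in csv.DictReader(ncv_file, delimiter="\t"):
        row[CDS_COL] = cds_lookup.get(row[ID_COL], "")
        yield row


# Tiny made-up stand-ins for the two files (placeholder values, not real data).
mapping_text = (
    "GENOMIC_MUTATION_ID\tCDS_SYNTAX\n"
    "COSV_PLACEHOLDER_1\tc.PLACEHOLDER_1\n"
)
ncv_text = (
    "GENOMIC_MUTATION_ID\tHGVSG\n"
    "COSV_PLACEHOLDER_1\tg.PLACEHOLDER_1\n"
    "COSV_PLACEHOLDER_2\tg.PLACEHOLDER_2\n"
)

lookup = load_cds_lookup(io.StringIO(mapping_text))
annotated = list(annotate_ncv(io.StringIO(ncv_text), lookup))
for row in annotated:
    print(row[ID_COL], row["HGVSG"], row[CDS_COL] or "(no CDS mapping)")
```

For the real downloads, the in-memory strings would be replaced by open file handles, for example `gzip.open(path, "rt")` from the Python standard library for the .tsv.gz file.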

 

Drug Resistance

NT5C2 - purine

Unique samples - 57, Unique Mutation - 81

FGFR2 - BGJ398

Unique samples - 9, Unique Mutation - 29

FGFR2 - pemigatinib

Unique samples - 8, Unique Mutation - 6

 

Blood Cancer

As part of release v97 we have focused on updating the expert-curated mutation data for blood tumours. Blood tumours in COSMIC are classified under haematopoietic and lymphoid tissue as haematopoietic neoplasms or lymphoid neoplasms, which include cancer types such as leukaemias, lymphomas and myelomas as well as myeloproliferative neoplasms. Seventy-six additional publications with mutation screening data in these tumour types are included in this release. The types of data range from whole genome studies and studies utilising large next generation sequencing panels to case reports with more unusual clinical details and novel treatments. Over 2,600 samples were curated and 24,356 new variants added from these samples. Release v97 also incorporates 9 new blood tumour types into COSMIC.

 

Cancer Gene Census (CGC)


New Census Genes (Tier 2):
GOLPH3, FADD, SUB1, SMARCD1

Hallmarks of Cancer

Based on the concept defined by D. Hanahan and R. A. Weinberg, COSMIC, in collaboration with Open Targets, integrates functional descriptions focused on Hallmarks of Cancer into the CGC. The Hallmark pages visually explain the role of a gene in cancer by highlighting which of the classic behaviours are displayed by the gene and whether they are promoted or suppressed.

NFKBIE, NTRK3, PHF6, POLD1, POLE, PPP2R1A, PRDM1, PTCH1, RPL5, SALL4

 

Systematic Screen Papers

Follow links below to the 20 papers which are new in v97, or view the full table of papers here.

COSP50547, COSP42471, COSP40854, COSP45619, COSP50467, COSP41068, COSP50471, COSP50438, COSP43701, COSP40675, COSP38483, COSP41453, COSP35787, COSP50319, COSP40730, COSP41722, COSP45615, COSP43650, COSP49463, COSP50070

 

COSMIC Statistics

23,443,841
Total genomic variants (COSV) (+44,671)
16,015,511
Genomic non-coding variants (+26,839)
5,037,981
Genomic mutations within exons (coding variants) (+35,377)
1,515,965
Samples (+6,287)
28,880
Papers (+186)
41,161
Whole genome screen samples (+1,435)
321,804
Genomic Rearrangements (+3,788)
19,428
Fusions
1,207,190
Copy number variants
9,215,470
Gene expression variants
7,930,489
Differentially methylated CpGs
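
The increments shown in brackets can be reproduced by subtracting the v96 totals (listed in the v96 COSMIC Statistics table later in these notes) from the v97 totals above. The following is a minimal sketch using only figures quoted in these notes; the shortened dictionary keys are labels chosen for this example.

```python
# Totals copied from the v96 and v97 COSMIC Statistics tables in these notes.
v96 = {
    "Total genomic variants (COSV)": 23_399_170,
    "Genomic non-coding variants": 15_988_672,
    "Coding variants": 5_002_604,
    "Samples": 1_509_678,
    "Papers": 28_694,
    "Genomic Rearrangements": 318_016,
}
v97 = {
    "Total genomic variants (COSV)": 23_443_841,
    "Genomic non-coding variants": 16_015_511,
    "Coding variants": 5_037_981,
    "Samples": 1_515_965,
    "Papers": 28_880,
    "Genomic Rearrangements": 321_804,
}


def increments(previous, current):
    """Return the per-statistic increase between two releases."""
    return {label: current[label] - previous[label]
            for label in current if label in previous}


deltas = increments(v96, v97)
for label, delta in deltas.items():
    print(f"{label}: {delta:+,}")
```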

 

Actionability Release v7

COSMIC Actionability v7 includes 11 additional fully-curated genes: CD274 (PD-L1), HRAS, MAP2K1 (MEK1), AR, GNA11, GNAQ, SMAD4, TSC1, DDR2, ETV6, FOXL2

This means we have a total of 72 fully curated genes: ABL1, AKT1, AKT2, AKT3, ALK, ASXL1, ATM, BCR, BRAF, BRCA1, BRCA2, BTK, CDK12, CDK4, CDK6, CEBPA, CTNNB1, DNMT3A, EGFR, ERBB2, ERBB3, EZH2, FGFR1, FGFR2, FGFR3, FGFR4, FLT3, IDH1, IDH2, JAK1, JAK2, JAK3, KIT, KMT2A (MLL), KRAS, MDM2, MDM4, MET, MLH, MPL, MSH2, MSH6, NF1, NF2, NPM1, NRAS, PDGFRA, PDGFRB, PIK3CA, PMS2, PTCH1, PTEN, RET, ROS1, RUNX1, SF3B1, SMO, STK11, TET2, TP53, WT1, CD274 (PD-L1), HRAS, MAP2K1 (MEK1), AR, GNA11, GNAQ, SMAD4, TSC1, DDR2, ETV6, FOXL2

To view the full list of curated genes visit the About page on the Actionability website.

All previously-recorded clinical trials have been checked for new or updated results.

Expressed/not category added to Patient Pre-screening: from v7 onwards the download file contains a new category, 'Expressed/not'. This is used for trials that compare patients who express a protein with those who don't, or that compare patients with high expression with those with low expression. In practice, there is usually a threshold expression level and the comparison is between patients above and below it. If our curator is able to find out the measure and threshold level that was used, it appears as part of the trial name. This new value is represented by the term Patient Pre-Screening in the column mutation_selected_dict:

  • positive/above threshold expression - patient number and results value recorded in treatment values column
  • negative expression/below threshold values - value will appear in the fields used for control values, as is the case for trials that compare a treatment in patients with/without a mutation.

There are several trials using this new category in v7.

Addition of Australian/New Zealand Clinical Trials Registry

Actionability v7 includes the addition of a new datasource: the Australian New Zealand Clinical Trials Registry (ANZCTR). This can be seen in the Source_Type column as a value of 9.
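
As a small illustration, trials drawn from ANZCTR could be selected from the download file by filtering on this column. The sketch below assumes a tab-separated file; Source_Type is the column named above, while Trial_ID and the sample rows are hypothetical placeholders used only to make the example runnable.

```python
import csv
import io

ANZCTR_SOURCE_TYPE = "9"  # value given in these notes for ANZCTR


def anzctr_rows(handle):
    """Return the rows whose Source_Type marks them as coming from ANZCTR."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader
            if row["Source_Type"].strip() == ANZCTR_SOURCE_TYPE]


# Made-up stand-in for the download file; Trial_ID is a hypothetical column.
sample = (
    "Trial_ID\tSource_Type\n"
    "TRIAL_PLACEHOLDER_A\t1\n"
    "TRIAL_PLACEHOLDER_B\t9\n"
)

rows = anzctr_rows(io.StringIO(sample))
print([row["Trial_ID"] for row in rows])
```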

A number prefixed with '+' after each statistic denotes the increase since the last release.

 

Actionability Statistics

72
Genes fully curated (+11)
311
Genes included (+25)
1,520
Drugs (+112)
3,943
Drug combinations (+362)
6,609
Evidence from trials databases (+481)
2,685
Evidence from PubMed and other sources (+447)
154
Point mutations (+2)
734
Total mutations (+35)
3,756
Trials with results (+361)
9,021
Total trials (+655)

 

Cancer Mutation Census (CMC)

Cancer Mutation Census data has been updated for v97 release. These are the key updates:

  • Cancer Mutation Census has been updated to align the mutations with COSMIC release 97
  • The ClinVar dataset has been updated to release 2022-08
  • gnomAD exome frequencies are from release v2.1.1 and contain data from 125,748 exome samples
  • gnomAD genome frequencies have been updated to release v3.1 containing 76,156 genome samples. This release also includes a new population - Middle Eastern (MID)
  • CMC data are free for non-commercial use, downloads are available on the COSMIC download page.

 

COSMIC-3D

COSMIC-3D data has been updated for v97 release. These are the key updates:

  • COSMIC-3D has been updated to align the mutations with COSMIC release 97
  • Switch from GRCh38 to GRCh37 human genome assemblies in line with the CMC data
  • 7 census genes now have mapped structures (ABI1, ARID2, ATP1A1, FOXA1, FOXL2, SS18, TLR5, TRRAP)
  • Increased total number of mapped protein structures (PDB ids) from 50,735 to 51,816

 

Mutational Signatures

COSMIC Mutational Signatures is a resource curated in partnership with COSMIC and Cancer Grand Challenges, and in close association with our collaborators at Wellcome Sanger Institute, the Pillay lab at University College London and the Alexandrov lab at University of California.

New for COSMIC Mutational Signatures release v3.3

We have added a novel collection of reference signatures to describe copy number variations; in total there are 24 CN signatures. Copy number signatures are defined using the 48-channel copy number classification scheme. The scheme incorporates loss-of-heterozygosity status, total copy number state, and segment length to categorise segments from allele-specific copy number profiles (given as major and minor copy number, i.e. non-phased profiles). The signatures displayed here were identified from 9,873 tumour copy number profiles obtained from The Cancer Genome Atlas (TCGA) SNP6 array data spanning 33 cancer types.
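
To make the 48-channel scheme more concrete, the sketch below enumerates one possible set of channels and assigns a segment to a channel from its major copy number, minor copy number and length. The bin boundaries and channel labels are assumptions recalled from the published description of the scheme and are not stated in these notes, as is the choice to treat upper size edges as inclusive; they should be checked against the Mutational Signatures documentation before use.

```python
import bisect

# Assumed bins (not given in these notes): five size classes for LOH and
# heterozygous segments, three for homozygous deletions.
SIZE_BINS = ["0-100kb", "100kb-1Mb", "1Mb-10Mb", "10Mb-40Mb", ">40Mb"]
SIZE_EDGES = [100_000, 1_000_000, 10_000_000, 40_000_000]
HOMDEL_SIZE_BINS = ["0-100kb", "100kb-1Mb", ">1Mb"]
HOMDEL_SIZE_EDGES = [100_000, 1_000_000]
LOH_TCN = ["1", "2", "3-4", "5-8", "9+"]   # total copy number states with LOH
HET_TCN = ["2", "3-4", "5-8", "9+"]        # total copy number states, heterozygous


def copy_number_channels():
    """Enumerate every channel label as 'total copy number:status:size'."""
    channels = [f"0:homdel:{s}" for s in HOMDEL_SIZE_BINS]
    channels += [f"{t}:LOH:{s}" for t in LOH_TCN for s in SIZE_BINS]
    channels += [f"{t}:het:{s}" for t in HET_TCN for s in SIZE_BINS]
    return channels


def tcn_bin(total):
    """Map a total copy number to its assumed bin label."""
    if total <= 2:
        return str(total)
    if total <= 4:
        return "3-4"
    if total <= 8:
        return "5-8"
    return "9+"


def classify_segment(major, minor, length_bp):
    """Assign one non-phased segment to a channel label."""
    total = major + minor
    if total == 0:
        status, edges, labels = "homdel", HOMDEL_SIZE_EDGES, HOMDEL_SIZE_BINS
    else:
        # Profiles are non-phased, so LOH means either allele count is zero.
        status = "LOH" if min(major, minor) == 0 else "het"
        edges, labels = SIZE_EDGES, SIZE_BINS
    size = labels[bisect.bisect_left(edges, length_bp)]
    return f"{tcn_bin(total)}:{status}:{size}"


channels = copy_number_channels()
print(len(channels))
print(classify_segment(2, 0, 5_000_000))
```

Under these assumed bins the enumeration yields 3 homozygous deletion channels, 25 LOH channels and 20 heterozygous channels.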

The SBS and DBS signatures have been enriched with more topographical data and graphs, across 7 new features. These are:

  • replication timing
  • nucleosome occupancy
  • CTCF occupancy
  • histone modifications
  • replication strand asymmetry
  • and differences in genic and intergenic regions

In adding these new topographical features we overhauled the existing transcriptional strand asymmetry feature and made it possible to view a feature's respective graph in a tissue specific as well as an aggregated manner.

Other changes include:

  • reprocessed signature data files to better handle situations where the percentage for a channel was over-zealously rounded to zero
  • addition of SBS95, a sequencing artefact signature
  • improvements to the interface to better compare graphs without reloading the page

 


v96 - 31st May 2022

Key Updates

Focused curation on rare head and neck cancers:

  • 56 papers
  • 2526 new variants
  • 129 new site-histology pairs with sequencing data added

Gene focused curation:

Updates to Cancer Gene Census

Updates to Hallmark Genes

Whole Genome data

  • 9 new papers containing whole genome/exome or RNA sequencing data

 

 

Author submission

COSMIC welcomes author contributions of data as they are invaluable in supporting us to identify new genes and trends in cancer research. We actively collaborate with authors who have their publications at a submission stage. Correctly formatted variant data ensures faster inclusion of the paper in COSMIC and dissemination of the data into the research community to further empower new research. An example of such collaboration was an author submission that highlighted POLR2A. As a result of this submission, POLR2A status as a cancer gene was reviewed and it was added to the Cancer Gene Census as a Tier 2 gene and the literature reporting POLR2A mutations across all cancers was comprehensively curated for this release.
Whether our submissions report previously undescribed cancer mutations or cancer genes or well known variants in new cancer types, all papers are triaged and prioritised. Some journals require a proof of submission to COSMIC as a pre-publication requisite. However, papers are released in COSMIC only after peer-review and publication to guarantee high quality and open access status of data.

Instructions on how to submit data to COSMIC can be found here: https://cancer.sanger.ac.uk/cosmic/submissions

 

Curated Genes

POLR2A

POLR2A, the gene encoding RNA polymerase II catalytic subunit A, is a key player in meningiomas (COSP 41827).

A subset of WHO grade I meningiomas are defined by the somatic hotspot mutation p.Q403K and by p.L438_H439 deletions. Germline mutations in POLR2A are associated with heterogeneous multi-system disorders and p.L438_H439del is associated with the most severe phenotype. POLR2A status as a cancer gene was reviewed and it was added to the Cancer Gene Census as a Tier 2 gene for its role as a potential oncogene in meningioma. The gene seems to be commonly deleted in cancer where recurrent mutations cause widespread changes in gene expression, although no definitive evidence was found that they cause cancer. Differential expression is enriched for genes involved in the cell cycle, apoptosis and cancer-associated signalling pathways. The literature reporting POLR2A mutations across all cancers was comprehensively curated.

PRKD1

The Protein kinase D1 (14q12) gene has been added to the Cancer Gene Census as a Tier 2 gene for its role in fusions found in cribriform adenocarcinomas of the salivary gland. It is a serine/threonine protein kinase involved in several signalling pathways and many cellular processes including cell migration and differentiation, cell survival and regulation of cell shape and adhesion. Our H&N curation focus included several papers reporting recurrent p.E710D mutations in the majority of polymorphous low grade adenocarcinomas (PAC), the second most common malignant tumour of minor salivary glands (COSP 46408, 49877, 49498, 49500, 49502). The p.E710D mutation is also found in a minority of cribriform adenocarcinomas of the salivary gland (CASG) (COSP 49877 & PMID 31492931), but not in more aggressive head and neck adenoid cystic carcinomas or pleomorphic adenomas (COSP 49498), nor in other solid tumours and leukaemias. A minority of PACs and a majority of CASGs carry fusions involving either PRKD1, PRKD2 or PRKD3. Other PRKD1 mutations are found at lower frequencies in a variety of other tumours including breast, leukaemia, lymphoma and gastric cancers.

 

Rare head and neck cancer focus

Head and neck (H&N) cancer is a relatively uncommon type of cancer. Around 12,400 new cases are diagnosed in the UK each year (NHS) and H&N cancer accounts for 3% and 4% of the total cancer incidence in the US and Europe respectively. v96 contains data from focused curation on less common H&N cancers, for example the ones that develop in the salivary glands, sinuses, or muscle and bone in the head and neck. Variant and patient data were curated from 56 publications. The focus of the papers ranged widely, including defining mutational profiles for the tumours, their aetiology, histopathology or treatment options, and finding actionable mutations for each tumour type. From this curation, 129 new site-histology pairs with sequencing data were added to COSMIC, and a new NCI Thesaurus code has been created for Sinonasal low-grade Schneiderian papillary carcinoma in collaboration with the NCIT.

17 further publications were evaluated and are listed on the COSMIC website but data from these could not be curated for quality reasons or because they were review type publications that don't report novel variant data.

 

Cancer Gene Census (CGC)


New Census Genes (Tier 1):

ACVR1B: Tumour suppressor gene

  1. Recurrent homozygous deletions in pancreatic adenocarcinomas
  2. Knockdown in pancreatic cancer cell lines increases cell proliferation in vitro and the growth of mouse xenograft tumours

New Census Genes (Tier 2):

CTNNA1: Tumour suppressor gene, fusion gene

  1. Links transmembrane cadherins to the cell actin cytoskeleton
  2. Hemizygous deletion in myelodysplastic syndrome and acute myeloid leukaemia, and fusion with RAF1 in cutaneous melanoma
  3. Germline truncating mutations in Hereditary Diffuse Gastric Cancer Syndrome
  4. Overexpression in a deletion-bearing acute myeloid leukaemia cell line leads to G0/G1 cell cycle arrest and apoptosis

POLR2A:

  1. Largest subunit of RNA polymerase II
  2. Recurrent missense mutation and in-frame deletion in WHO grade 1 meningiomas
  3. Knockdown in an acute myeloid leukaemia cell line increases apoptosis, and alters the expression of genes involved in the cell cycle, apoptosis and cancer-associated pathways

PRKD1: fusion gene

  1. Serine/threonine protein kinase
  2. Recurrent p.E710D in polymorphous adenocarcinoma of the palatal minor salivary glands, and fusions with PRKAR2A and SNX9 in cribriform adenocarcinoma of the minor salivary glands
  3. p.E710D increases kinase activity in a cell-free assay, whilst overexpression of the wild-type and p.E710D mutant genes in breast epithelial cells increases cell proliferation

Hallmarks of Cancer

Based on the concept defined by D. Hanahan and R. A. Weinberg, COSMIC, in collaboration with Open Targets, integrates functional descriptions focused on Hallmarks of Cancer into the CGC. The Hallmark pages visually explain the role of a gene in cancer by highlighting which of the classic behaviours are displayed by the gene and whether they are promoted or suppressed.

MAP2K4, MAX, MLH1, MPL, MSH2, MSH6, MYOD1, NCOA2, NCOR1, NTRK1

 

Systematic Screen Papers

Follow links below to the 9 papers which are new in v96, or view the full table of papers here.

COSP41122, COSP45420, COSP41827, COSP48664, COSP49502, COSP47849, COSP46408, COSP45485, COSP49477

 

COSMIC Statistics

23,399,170
Total genomic variants (COSV) (+5982)
15,988,672
Genomic non-coding variants (+3693)
5,002,604
Genomic mutations within exons (coding variants) (+4336)
8,704,304
Genomic mutations within intronic and other intragenic regions (+2635)
1,509,678
Samples (+3686)
28,694
Papers (+143)
19,422
Fusions
318,016
Genomic Rearrangements
1,207,190
Copy number variants
9,215,470
Gene expression variants
7,930,489
Differentially methylated CpGs

 


For reference, release notes for earlier versions are available on the Release Notes Archive page. However, please note that these versions are no longer available to download, are not supported, and the release notes may link or refer to pages which are now obsolete.