What is the Cell Lines Project?

COSMIC – the Catalogue of Somatic Mutations in Cancer – is the world's largest source of expert manually curated somatic mutation information relating to human cancers. COSMIC includes the COSMIC database and the Cell Lines Project, two separate but related entities. This page discusses the Cell Lines Project (green pages); please see our dedicated page for further information on COSMIC.

Overview

The Cell Lines site provides an interface for the Cell Lines Project, based at the Wellcome Sanger Institute. The Cell Lines Project:

  • Started with the aim of improving the utility of cancer cell lines through standardisation
  • Provides a systematic characterisation of the genetics and genomics of large numbers of cancer cell lines
  • Includes data from the full exome sequencing of 1020 cancer cell lines
  • Molecular characterisation includes:
    • Copy number variation (CNV)
    • Gene expression
    • CpG methylation (coming soon)
    • RNASeq (coming soon)
  • The cancer cell lines are from:
    • Major publicly-accessible repositories from around the world
    • A few lines which are not publicly available
  • Is designed to encompass a broad range of tumour types, and includes most cell lines that have been used extensively in cancer research, including the NCI-60 set
  • A description of the methodology used in the project is available

Website Access and Tools

Exploring the Cell Lines website can provide a useful insight into what data are available. Key aspects of the website include:

  • Free access for all users
  • Dedicated tools to help you explore the data, including:
    • Cell Lines Browser
      The cell lines browser allows you to explore the cell lines included in the project. It is a good way to see which cell lines are covered or to look at a specific cell line.
    • NCI-60 Browser
      The NCI-60 browser allows you to explore the cell lines from the NCI-60 panel.
    • Cancer Browser
      The cancer browser allows for mutations to be explored by tissue type and histology, in order to give a disease-specific perspective.
    • Genome Browser
      The genome browser provides a genome-wide perspective on cancer genomics. Different variant tracks can be turned on or off at the user's discretion.
    • CONAN
      CONAN (the copy number analysis tool) searches for loss of heterozygosity, homozygous deletions and amplifications across the COSMIC dataset. All samples have been analysed with PICNIC or ASCAT.
  • All aspects are available for both GRCh37 and GRCh38.

Download Access

Downloading cell lines data gives you the freedom to interrogate the data bioinformatically outside of the inherent restrictions of the website, and allows for integration with other tools or data.

Key aspects of the downloadable files include:

  • Free registration for academic use; commercial use requires a licence.
  • Complete, one-click file downloads and filtered files are available:
    • 11 files that divide the data into logical categories, such as ‘Complete Mutation Data’, ‘Non Coding Variants’ and ‘Gene Expression data’
    • Filter files by gene, tissue or sample of interest
    • Complete Oracle database
  • Available for both GRCh37 and GRCh38.
  • All files are provided with a description.
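
As an illustration of interrogating a downloaded file outside the website, the sketch below filters a tab-separated file by gene using only the Python standard library. It assumes the download is a tab-separated text file with a header row. The column names (gene, sample, tissue), the helper name filter_rows and the example rows are hypothetical placeholders, not the real file layout, so check the file descriptions for the actual column names before adapting it.

```python
import csv
import io


def filter_rows(handle, column, value):
    """Return the rows of a tab-separated file whose `column` equals `value`.

    `handle` is any open text file (or file-like object) with a header row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found in header")
    return [row for row in reader if row[column] == value]


# Tiny in-memory stand-in for a downloaded file.
# The column names and rows are made up for illustration only.
SAMPLE = (
    "gene\tsample\ttissue\n"
    "TP53\tLINE_A\tlung\n"
    "KRAS\tLINE_B\tpancreas\n"
    "TP53\tLINE_C\tbreast\n"
)

tp53_rows = filter_rows(io.StringIO(SAMPLE), "gene", "TP53")
print([row["sample"] for row in tp53_rows])  # prints ['LINE_A', 'LINE_C']
```

For a real file, pass an open file handle instead of the in-memory sample, for example open(path, newline="", encoding="utf-8"), where path points to a decompressed download.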

Download a sample of COSMIC data

We have made the first 100 lines of each of the download files freely available so you can try out the data. You can download the data sample from the "About" page. Full descriptions of the contents of the complete download files are also available.