Mutation

This page outlines the raw data and its subsequent processing for use within the Cancer Dependency Map at Sanger.


Raw Data

Dataset | Origin | Data Type | Model Type | Details | Link
Whole Exome Sequencing | Sanger | BAM | Cell Line | Illumina HiSeq 2000 | EGAS00001000978
Targeted Gene Sequencing | Sanger | CRAM | Organoid | Illumina HiSeq 4000 | EGAS00001002221
Whole Genome Sequencing | Sanger | CRAM | Organoid | Illumina HiSeq 4000 | EGAS00001002222

Processed Data

This section describes how the raw data was processed, including the algorithms and filtering applied. Processed datasets can be downloaded here; the active dataset can also be accessed using the DepMap web resources and API.


Whole Exome Sequencing Data (Cell Lines)

After sequencing, variants were identified by comparison to a reference genome*. Differences from the reference genome were identified using the CaVEMan and Pindel algorithms, which identify substitutions and small insertions/deletions, respectively (https://github.com/cancerit). The resulting variants were then screened against approximately 8,000 normal samples to remove sequencing artefacts and germline variants: 428 in-house normal exomes, 6,500 normal exomes from the NHLBI GO Exome Sequencing Project (20 June 2012 release), and the 1000 Genomes Project (29 March 2012 release). Variants present in the dbSNP database were also removed, but only those with an associated minor allele frequency.

*A matched lymphoblastoid line was used as the reference genome for a small subset (n=39) of the cell lines where this was available.
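The screening step described above can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function name, the tuple layout (chromosome, position, reference allele, alternate allele) and the in-memory lookup structures are all assumptions made for illustration. The only rules taken from the text are that variants seen in the normal panels are removed, and that dbSNP variants are removed only when they carry a minor allele frequency.

```python
def screen_against_normals(candidates, normal_panel, dbsnp_maf):
    """Return the candidates that survive normal-panel and dbSNP screening.

    candidates   : list of (chrom, pos, ref, alt) tuples (hypothetical layout)
    normal_panel : set of the same tuples, seen in any normal sample
    dbsnp_maf    : dict mapping a tuple to its minor allele frequency,
                   or to None when dbSNP records no frequency
    """
    kept = []
    for variant in candidates:
        # Seen in a normal sample: likely germline or a sequencing artefact.
        if variant in normal_panel:
            continue
        # In dbSNP with an associated minor allele frequency: removed.
        # dbSNP entries without a frequency are deliberately kept.
        if dbsnp_maf.get(variant) is not None:
            continue
        kept.append(variant)
    return kept


# Made-up example data, not real variants.
candidates = [
    ("1", 1000, "A", "T"),  # present in the normal panel
    ("2", 2000, "G", "C"),  # in dbSNP with a frequency
    ("3", 3000, "C", "T"),  # in dbSNP without a frequency
    ("4", 4000, "T", "G"),  # not seen anywhere
]
normal_panel = {("1", 1000, "A", "T")}
dbsnp_maf = {("2", 2000, "G", "C"): 0.12, ("3", 3000, "C", "T"): None}

remaining = screen_against_normals(candidates, normal_panel, dbsnp_maf)
print(remaining)
```

In this toy input, the third and fourth variants remain as putatively somatic.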

The remaining putatively somatic variants were classed as validated if they were present in other large-scale cell-line sequencing datasets: the CCLE targeted sequencing (Barretina et al., 2012), NCI60 exome sequencing, or previous capillary sequencing of 70 known cancer genes across 770 cell lines (Garnett et al., 2012). All such validated variants, together with other high-quality variants (read depth ≥15 and a mutant allele burden ≥15%, with no reads in the reference normal), were entered into the COSMIC cell line project database. Additional validation was carried out for putatively oncogenic ‘low confidence’ variants seen in genes listed in the Cancer Gene Census (v67). Several transcripts are listed in COSMIC for some genes, which results in duplication of variants when exported; such duplicates were removed from the dataset.
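The selection and de-duplication rules above can be sketched as follows. This is an illustration under stated assumptions, not the code used to build the dataset: the Variant record and its field names are hypothetical, "mutant allele burden" is interpreted here as the percentage of reads supporting the variant, and keeping the first transcript encountered is an arbitrary choice. Only the thresholds (depth ≥15, burden ≥15%, no reads in the reference normal) come from the text.

```python
from dataclasses import dataclass

MIN_DEPTH = 15           # read depth threshold quoted in the text
MIN_MUTANT_PERCENT = 15  # mutant allele burden threshold quoted in the text


@dataclass(frozen=True)
class Variant:
    # Hypothetical record layout, for illustration only.
    model: str
    gene: str
    transcript: str
    chrom: str
    pos: int
    ref: str
    alt: str
    depth: int                # total reads at the position in the cell line
    mutant_reads: int         # reads supporting the variant in the cell line
    normal_mutant_reads: int  # reads supporting it in the reference normal
    validated: bool = False   # seen in another large-scale dataset


def is_high_quality(v):
    """Depth >= 15, mutant burden >= 15%, and no mutant reads in the normal."""
    if v.depth < MIN_DEPTH or v.normal_mutant_reads != 0:
        return False
    # Integer comparison avoids floating-point rounding at the threshold.
    return v.mutant_reads * 100 >= MIN_MUTANT_PERCENT * v.depth


def select_variants(variants):
    """Keep validated variants plus any other high-quality variants."""
    return [v for v in variants if v.validated or is_high_quality(v)]


def drop_transcript_duplicates(variants):
    """Collapse rows that differ only by transcript, keeping the first seen."""
    seen = set()
    unique = []
    for v in variants:
        key = (v.model, v.chrom, v.pos, v.ref, v.alt)
        if key not in seen:
            seen.add(key)
            unique.append(v)
    return unique


# Made-up example rows; the transcript labels are placeholders.
rows = [
    Variant("M1", "TP53", "T1", "17", 100, "C", "T", 40, 20, 0),
    Variant("M1", "TP53", "T2", "17", 100, "C", "T", 40, 20, 0),  # duplicate
    Variant("M1", "KRAS", "T3", "12", 200, "G", "A", 10, 8, 0),   # low depth
    Variant("M2", "BRAF", "T4", "7", 300, "A", "T", 10, 2, 0, validated=True),
    Variant("M2", "EGFR", "T5", "7", 400, "G", "T", 100, 10, 0),  # 10% burden
]
final = drop_transcript_duplicates(select_variants(rows))
print([v.gene for v in final])
```

With these made-up rows, one TP53 entry and the validated BRAF entry are retained.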

Publication reference: Iorio et al. A Landscape of Pharmacogenomic Interactions in Cancer. Cell, 2016.


Targeted Gene Sequencing Data (Organoids)

Sequencing is performed on samples obtained from the established organoid model at the time of banking.

Somatic mutations from targeted pulldown sequencing of our v4 panel are called using our CaVEMan and Pindel algorithms. Sequencing data from a matched normal sample and a panel of unrelated normal samples are used as references (GRCh37) to discard germline variants and technology-specific artefacts.

Somatic variants reported by these algorithms are flagged by filters designed to detect common causes of false positives. This filtering is performed using cgpCaVEManPostProcessing (https://github.com/cancerit). Variants that pass all filters are given a PASS flag.

Only variants with the PASS flag are uploaded to the DepMap API.
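As an illustration of the PASS-only rule, the sketch below keeps VCF records whose FILTER column is exactly PASS. It assumes the flagged calls are available as standard tab-separated VCF text, where FILTER is the seventh column; the function name and the example records are made up, and FLAG_A and FLAG_B are placeholder names rather than real cgpCaVEManPostProcessing flags.

```python
def pass_records(vcf_lines):
    """Yield VCF data lines whose FILTER column (7th field) is exactly PASS."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information lines (##) and the header (#CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) > 6 and fields[6] == "PASS":
            yield line


# Made-up records; FLAG_A and FLAG_B are placeholder filter names.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tT\t.\tPASS\t.",
    "2\t2000\t.\tG\tC\t.\tFLAG_A\t.",
    "3\t3000\t.\tC\tT\t.\tFLAG_A;FLAG_B\t.",
]
kept = list(pass_records(example_vcf))
print(len(kept))
```

Records that carry one or more filter flags are dropped, so only the first data line in this example is retained.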


Whole Genome Sequencing Data (Organoids)

Sequencing is performed on samples obtained from the established organoid model at the time of banking.

Somatic mutations from whole genome sequencing are called using our CaVEMan and Pindel algorithms. Sequencing data from a matched normal sample and a panel of unrelated normal samples are used as references (GRCh38) to discard germline variants and technology-specific artefacts.

Somatic variants reported by these algorithms are flagged by filters designed to detect common causes of false positives. This filtering is performed using cgpCaVEManPostProcessing (https://github.com/cancerit). Variants that pass all filters are given a PASS flag.

Only variants with the PASS flag are uploaded to the DepMap API.


Dataset Annotation & Integration

For further documentation on the annotation of genomic datasets and on model authentication, see the links below.