Copy Number

This page outlines the raw copy number data and their subsequent processing for use within the Cancer Dependency Map at Sanger.

Raw Data

Dataset | Origin | Data Type | Model Type | Details | Link
Copy Number | Sanger | CEL | Cell Line | Affymetrix SNP6 | EGAS00001000978

Processed Data

This section describes how the raw data were processed, including the algorithms and filtering applied. Processed datasets can be downloaded here; the active dataset can also be accessed through the DepMap web resources and API.

Affymetrix SNP6 Data

Segment copy number data for 8,182 samples were downloaded from the TCGA (Cancer Genome Atlas Research Network et al., 2013) and analysed with ADMIRE (van Dyk et al., 2013). The COAD and READ cohorts were merged because of their high similarity in tissue type and response profile. The ADMIRE analysis results comprised copy number segments that differed statistically from expectation.

Filter criteria were defined to focus the analysis on potential driver segments. A segment had to include at least one, but no more than 100, protein-coding or antisense genes. Deletions had to include an exon (a proxy for gene disruption), and amplifications had to span a gene (as sub-genic amplifications are unlikely to be functional). The false discovery rate (FDR) controlled p-value had to be smaller than 0.05, and the segment had to be at recursion level two or higher unless it was a top-level segment. To ensure clinical relevance, a segment had to be affected in at least 2.5% of the subjects. This last criterion was evaluated at two levels: against the overall background variance, calculated on the log2 values not part of any identified segment (regardless of filtering), and against the local background variance, calculated on the recursion level below the identified segment.

Within each tumor type, the segments obtained after filtering (Table S2D) were further compacted by pruning overlapping segments so that only the shortest were retained, giving a concise set of segments per tumor type. The pan-cancer set of segments was derived from the entire collection of filtered cancer-specific segments, but among overlapping segments only the largest was retained (Table S2E).
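The filtering and pruning steps described above can be sketched in Python as follows. This is an illustration only, not the code used for the published analysis. The `Segment` fields, the function names, and the greedy pruning strategy are all hypothetical. The sketch also assumes half-open coordinates, a top-level flag that is stored separately from the recursion level, an overlap test that ignores whether a segment is an amplification or a deletion, and a single precomputed fraction of affected subjects (the text does not specify how the overall and local variance checks are combined).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # All field names are hypothetical; they mirror the criteria in the text.
    chrom: str
    start: int            # half-open interval [start, end), an assumption
    end: int
    kind: str             # "amp" or "del"
    n_genes: int          # protein-coding or antisense genes in the segment
    hits_exon: bool       # segment includes at least one exon
    spans_gene: bool      # segment spans at least one gene
    fdr: float            # FDR-controlled p-value
    level: int            # recursion level reported for the segment
    is_top_level: bool    # assumed to be flagged separately from `level`
    frac_affected: float  # fraction of subjects affected, assumed precomputed


def passes_filters(s, max_genes=100, max_fdr=0.05, min_level=2, min_frac=0.025):
    """Return True if the segment meets every filter criterion."""
    if not 1 <= s.n_genes <= max_genes:
        return False
    if s.kind == "del" and not s.hits_exon:
        return False
    if s.kind == "amp" and not s.spans_gene:
        return False
    if s.fdr >= max_fdr:
        return False
    if s.level < min_level and not s.is_top_level:
        return False
    return s.frac_affected >= min_frac


def overlaps(a, b):
    """True if two segments share at least one position on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def prune_overlaps(segments, keep="shortest"):
    """Greedy pruning (an assumed strategy): visit segments in order of
    length and keep each one only if it overlaps nothing kept so far."""
    ordered = sorted(
        segments,
        key=lambda s: s.end - s.start,
        reverse=(keep == "longest"),
    )
    kept = []
    for s in ordered:
        if not any(overlaps(s, k) for k in kept):
            kept.append(s)
    return kept
```

Under these assumptions, a per-tumor-type set would be `prune_overlaps([s for s in segments if passes_filters(s)], keep="shortest")`, and a pan-cancer set would apply `keep="longest"` to the pooled, filtered segments from all tumor types.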

Publication reference: Iorio et al. A Landscape of Pharmacogenomic Interactions in Cancer. Cell, 2016.