This page outlines the raw data and their subsequent processing for use within the Cancer Dependency Map at Sanger.

Raw Data

Dataset | Origin | Data Type | Model Type | Details             | Link
RNA-Seq | Sanger | BAM       | Cell Line  | Illumina HiSeq 2000 | EGAS00001000828
RNA-Seq | Broad  | BAM       | Cell Line  |                     |

Processed Data

This section describes how the raw data were processed, including the algorithms and filtering applied. Processed datasets can be downloaded here; the active dataset can also be accessed using the DepMap web resources and API.

RNA-Seq (Cell Lines)

RNA-seq data collated from the Wellcome Sanger Institute and the Broad Institute (1) were processed using the iRAP pipeline (2). The original datasets for each institute are available for download with read counts and FPKM (fragments per kilobase of transcript per million mapped reads) values.
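To illustrate how FPKM values relate to read counts, here is a minimal sketch using the standard textbook definitions of FPKM and of TPM (transcripts per million). It is not confirmed to be the exact computation performed by iRAP. The gene names, counts and lengths are hypothetical, and the total number of mapped fragments is approximated as the sum of the counts in the toy table.

```python
def fpkm(counts, lengths_bp):
    """Standard FPKM: count * 1e9 / (gene length in bp * total mapped fragments).

    Assumption: total mapped fragments is taken as the sum of all gene counts.
    """
    total = sum(counts.values())
    return {gene: c * 1e9 / (lengths_bp[gene] * total) for gene, c in counts.items()}


def fpkm_to_tpm(fpkm_values):
    """Rescale FPKM so that the values of one sample sum to one million."""
    total = sum(fpkm_values.values())
    return {gene: v / total * 1e6 for gene, v in fpkm_values.items()}


# Hypothetical toy data for a single sample (not taken from the real datasets).
counts = {"GENE_A": 500, "GENE_B": 1500, "GENE_C": 0}
lengths_bp = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}

fpkm_values = fpkm(counts, lengths_bp)
tpm_values = fpkm_to_tpm(fpkm_values)

for gene in counts:
    print(gene, counts[gene], fpkm_values[gene], tpm_values[gene])
```

GENE_A and GENE_B have different raw counts but the same FPKM, because the longer gene is expected to collect proportionally more fragments. TPM values always sum to one million within a sample, which makes them easier to compare across samples than FPKM.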

Data presented through the API and the Cell Model Passports website combine the Sanger and Broad datasets. Where a cell model has been sequenced at both institutes, the Sanger data are used in the merged dataset. Separate files for the Sanger and Broad data are also available for download. The data contain read counts, FPKM and TPM (transcripts per million) values.
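The merging rule described above (Sanger data take precedence when a model is present in both datasets) can be sketched as follows. This is an illustration only, not the code used to build the released files; the function name, model identifiers and data layout are hypothetical.

```python
def merge_expression(sanger, broad):
    """Combine per-model expression profiles, preferring Sanger on overlap.

    Both arguments map a model identifier to its expression profile.
    The returned mapping also records which institute each profile came from.
    """
    merged = {}
    # Start with the Broad profiles...
    for model_id, profile in broad.items():
        merged[model_id] = {"source": "Broad", "values": profile}
    # ...then let Sanger profiles overwrite any model present in both datasets.
    for model_id, profile in sanger.items():
        merged[model_id] = {"source": "Sanger", "values": profile}
    return merged


# Hypothetical toy inputs: MODEL_2 is present in both datasets.
sanger = {"MODEL_1": {"GENE_A": 10.0}, "MODEL_2": {"GENE_A": 3.5}}
broad = {"MODEL_2": {"GENE_A": 4.1}, "MODEL_3": {"GENE_A": 7.2}}

merged = merge_expression(sanger, broad)
for model_id in sorted(merged):
    print(model_id, merged[model_id]["source"], merged[model_id]["values"])
```

Keeping a `source` field alongside the values makes it straightforward to trace any merged profile back to the per-institute download file it came from.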

Publication reference (1): Garcia-Alonso L, Iorio F, Matchan A, et al. Transcription Factor Activities Enhance Markers of Drug Sensitivity in Cancer. Cancer Research. 2018 Feb;78(3):769-780.

Publication reference (2): Fonseca NA, Petryszak R, Marioni J, Brazma A. iRAP - an integrated RNA-seq Analysis Pipeline. bioRxiv; 2014.

Experimental Method (Cell Lines - Sanger Data)

For sequencing performed at the Sanger Institute, cell line pellets were collected during exponential growth in RPMI or Dulbecco’s Modified Eagle’s Medium/F12, lysed with TRIzol (Life Technologies) and stored at −70 °C. Following chloroform extraction, total RNA was isolated using the RNeasy Mini Kit (Qiagen). DNase digestion was followed by clean-up with the RNAClean Kit (Agencourt Bioscience). RNA integrity was confirmed on a Bioanalyzer 2100 (Agilent Technologies) prior to labeling using 3′ IVT Express (Affymetrix). Sequence libraries were prepared in an automated fashion on the Agilent Bravo platform using the stranded mRNA Library Prep Kit from KAPA Biosystems. Processing steps were unchanged from those specified in the KAPA manual, except for use of an in-house indexing set.

Publication reference: Picco, G., Chen, E.D., Alonso, L.G. et al. Functional linkage of gene fusions to cancer cell fitness assessed by pharmacological and CRISPR-Cas9 screening. Nature Communications (2019).