
Drug Sensitivity

This page provides descriptions and documentation for the drug sensitivity data within the Cancer Dependency Map at Sanger.

Drug Sensitivity Datasets

Dataset Name	Origin	Model Type	No. Models	No. Compounds	Assay	Duration
GDSC1	Sanger	Cell Line	970	403	Resazurin or Syto60	72 hours
GDSC2	Sanger	Cell Line	969	297	CellTiter-Glo	72 hours

Screening & Data Generation

Descriptions of how the screens were performed for each of the available datasets.

GDSC1

The GDSC1 dataset was generated jointly by the Wellcome Sanger Institute and Massachusetts General Hospital (MGH) between 2009 and 2015 using a matched set of cancer cell lines (the GDSC1000). GDSC1 was published by Iorio et al. (Cell 2016).

Compounds were provided by industry, academic collaborators or sourced from commercial vendors. Compounds were stored in aliquots at -80°C and were subjected to a maximum of 5 freeze-thaw cycles.

Cells are seeded at an optimised density in medium with 5% or 10% FBS and 1% penicillin/streptomycin. The optimal cell number for each cell line is determined to ensure that it is in growth-phase at the end of the assay and to maximise the dynamic range of endpoint measurements. Cells are treated with a dose titration of each compound 24 hours after plating, except for lines screened at MGH, where drugging occurs on the same day as plating. Following drugging, plates are returned to the incubator for assay at a 72-hour time point. Cell viability is determined using either a DNA dye (Syto60) or a metabolic assay (Resazurin). All screening plates are subject to stringent quality control measures.

Cells were seeded in 96-well or 384-well plates, and compound dose titrations were delivered using tip-based liquid handling apparatus. Cell viability was measured using either Syto60 or Resazurin. Drug treatments in this dataset used two formats:

9-point dose curve incorporating a 2-fold dilution step (256-fold range)
5-point dose curve incorporating a 4-fold dilution step (256-fold range)
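
As a quick check on the two formats above, the following minimal sketch generates a constant-fold titration and confirms that both cover a 256-fold range. The function name and the 10 µM top concentration are illustrative assumptions, not values taken from the screen.

```python
def dilution_series(max_conc, n_points, fold):
    """Concentrations from highest to lowest for a constant-fold titration."""
    return [max_conc / fold ** i for i in range(n_points)]

# Hypothetical top concentration of 10 (e.g. micromolar); real values vary by compound.
nine_point = dilution_series(10.0, 9, 2)  # 9 points, 2-fold steps
five_point = dilution_series(10.0, 5, 4)  # 5 points, 4-fold steps

# Fold range = highest concentration divided by lowest concentration.
range_nine = nine_point[0] / nine_point[-1]
range_five = five_point[0] / five_point[-1]
print(range_nine, range_five)
```

Eight 2-fold steps and four 4-fold steps both multiply out to 256, which is why the two formats share the same range despite the different number of points.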

GDSC2

GDSC2 has been generated at the Wellcome Sanger Institute since 2015 following improvements to the previous GDSC1 screen design and assay.

Compounds were provided by industry, academic collaborators or sourced from commercial vendors. Compounds are stored in Storage Pods (Roylan Developments) providing a moisture-free, low oxygen environment, and protection from UV damage.

Cells are seeded in 1536-well plates at an optimised density in RPMI or DMEM/F12 medium with 10% FBS and 1% penicillin/streptomycin (see below for detailed composition), and maintained at 37°C in a humidified atmosphere at 5% CO2. Cell lines are propagated in these two media in order to minimise the potential effect of varying the media on sensitivity to therapeutic compounds in the assay, and to facilitate high-throughput screening. The optimal cell number for each cell line is determined to ensure that it is in growth-phase at the end of the assay and to maximise the dynamic range of endpoint measurements.

24 hours after plating, cells are treated with a dose titration of each compound delivered using an Echo555 Acoustic Dispenser (Beckman). Drug treatments use one of the following dose response formats:

7-point dose curve incorporating a half-log dilution step (1000-fold range)
7-point dose curve with 2x 2-fold dilutions followed by 4x 4-fold dilutions (1024-fold range)
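
The ranges quoted for the two GDSC2 formats can be reproduced with a short sketch that walks down from the top concentration one dilution step at a time. The helper name, the 10 µM top concentration, and the assumption that the 2-fold steps are applied at the high-concentration end are all illustrative and not confirmed by the text above.

```python
import math

def series_from_steps(max_conc, steps):
    """Build a titration by dividing by each fold step in turn, highest first."""
    concs = [max_conc]
    for fold in steps:
        concs.append(concs[-1] / fold)
    return concs

# Half-log step: each dilution divides the concentration by 10 ** 0.5.
half_log = series_from_steps(10.0, [math.sqrt(10)] * 6)

# Mixed format: two 2-fold dilutions followed by four 4-fold dilutions
# (order relative to the top concentration is an assumption).
mixed = series_from_steps(10.0, [2, 2, 4, 4, 4, 4])

range_half_log = half_log[0] / half_log[-1]
range_mixed = mixed[0] / mixed[-1]
print(round(range_half_log), range_mixed)
```

Six half-log steps give 10^3 = 1000, and 2 x 2 x 4^4 = 1024, matching the stated ranges for 7-point curves.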

Following drugging, plates are returned to the incubator and cell viability is determined after 72 hours using a metabolic assay (CellTiter-Glo). All screening plates are subject to stringent quality control measures.

RPMI:

RPMI Base Media
10% FBS
1% PenStrep
4.5 mg/ml Glucose
1 mM Sodium Pyruvate

D/F12:

DMEM/F12 Base Media
10% FBS
1% PenStrep

Compound Annotation

The Target refers to the nominal therapeutic target(s) of a compound. In many, if not all, instances compounds have additional targets not listed. The Target Pathway has been manually curated based on our current understanding of cancer biology, therapeutic application and the biological processes in disease.

Compounds with Multiple Entries

Both the GDSC1 and GDSC2 datasets contain multiple results for the same compound associated with different Drug IDs.

For GDSC1 this is due to the screening being a combined effort at the Massachusetts General Hospital (Boston, USA) and the Wellcome Sanger Institute (Cambridge, UK). Some compounds were screened at both sites, resulting in two Drug IDs for the same compound.

Within the GDSC2 dataset multiple entries are present due to internal tracking procedures.

Data Processing & Analysis

Descriptions of the curve fitting and analysis performed.

Analysis

Datasets were analysed independently. Raw viability readouts were processed using the R package gdscIC50. Viability data are normalized per plate using available negative and positive controls:

GDSC1 - negative controls were treatments with media alone, and the positive controls were blank wells with media but no cells.
GDSC2 - negative controls were treatments of the cells with media + DMSO (the compound vehicle in most cases), and the positive controls were blank wells with media but no cells.
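
The per-plate normalisation described above can be illustrated with a minimal sketch. It assumes that controls are summarised by their mean and that viability is scaled so the negative controls map to 1 and the blank (positive control) wells map to 0. This is not the gdscIC50 API, and the package may summarise controls differently; the function name and intensity values are hypothetical.

```python
from statistics import mean

def normalise_plate(treated, negative_controls, positive_controls):
    """Scale raw intensities so negative controls ~ 1 and blank wells ~ 0."""
    neg = mean(negative_controls)  # untreated (or vehicle-treated) cells
    pos = mean(positive_controls)  # media only, no cells
    if neg == pos:
        raise ValueError("controls are indistinguishable; plate cannot be normalised")
    return [(x - pos) / (neg - pos) for x in treated]

# Hypothetical raw intensities for one plate.
viability = normalise_plate(
    treated=[1100.0, 600.0, 100.0],
    negative_controls=[1000.0, 1200.0],
    positive_controls=[90.0, 110.0],
)
print(viability)
```

Normalising within each plate means that plate-to-plate differences in signal intensity do not carry over into the fitted curves.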

Dose-response curves are fitted using the non-linear mixed effects model of Vis et al., incorporated in the gdscIC50 package. All available replicates for an individual experiment (cell model + compound) are used to fit each curve and obtain IC50 and AUC estimates.

Curve Fitting

Intensity data from screening plates for each dose response curve are fitted using the multilevel mixed effect model of Vis et al. The viability across the concentration dilution series is assumed to be sigmoidal, the classical dose-response S-shape. This function is fitted to all of the cell model - compound combinations screened. In the multilevel mixed effect model, two parameters are used to describe the sigmoidal curve. However, instead of fitting each dose-response series in isolation, the complete set of all cell model - compound combinations screened is fitted simultaneously. The shape parameter varies only across cell models, while the position parameter varies across cell models and compounds. This is a faithful and efficient representation of the data but, most importantly, it allows for borrowing strength by using all observations, which in turn allows for more accurate IC50 estimates.
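
To make the parameter sharing concrete, here is a minimal sketch of a two-parameter sigmoid in which the shape is looked up per cell model and the position per cell model - compound pair. It only evaluates the curve and does not perform the mixed effect fit. The logistic parameterisation, the function names, and all model, drug, and parameter values are assumptions for illustration; the exact form used by Vis et al. and gdscIC50 may differ.

```python
import math

def predicted_viability(ln_conc, position, shape):
    """Two-parameter sigmoid: viability falls from 1 towards 0 as dose rises."""
    return 1.0 / (1.0 + math.exp((ln_conc - position) / shape))

# Hypothetical parameters: one shape per cell model,
# one position per (cell model, compound) pair.
shape_by_model = {"MODEL_A": 1.0, "MODEL_B": 2.0}
position_by_pair = {
    ("MODEL_A", "DRUG_X"): 0.0,
    ("MODEL_A", "DRUG_Y"): 2.0,
    ("MODEL_B", "DRUG_X"): -1.0,
}

def predict(model, drug, ln_conc):
    """Look up the shared shape and the pair-specific position, then evaluate."""
    return predicted_viability(
        ln_conc, position_by_pair[(model, drug)], shape_by_model[model]
    )

# With this parameterisation, viability is 0.5 when ln(concentration) equals the position.
half_max = predict("MODEL_A", "DRUG_Y", 2.0)
print(half_max)
```

Under this assumed form the position parameter plays the role of the natural log of the IC50, and every curve for a given cell model shares the same steepness.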

Fitted IC50 Data Definitions

Definitions for the fields present in the GDSC1 & GDSC2 fitted data results files.

Field Name	Description
DATASET	Indicates the dataset name. Datasets are processed as a whole.
NLME_RESULT_ID	Identifier for the fitted results.
NLME_CURVE_ID	Identifier for the fitted dose response.
CELL_LINE_NAME	Model name.
SANGER_MODEL_ID	Sanger DepMap model identifier.
CANCER_TYPE	Sanger DepMap cancer type annotation.
DRUG_ID	Drug identifier.
DRUG_NAME	Primary drug name.
PUTATIVE_TARGET	Putative drug target.
PATHWAY_NAME	Pathway assigned to the putative drug target.
MIN_CONC	Minimum micromolar screening concentration of the drug within the dataset.
MAX_CONC	Maximum micromolar screening concentration of the drug within the dataset.
LN_IC50	Natural log of the fitted IC50.
AUC	Area Under the Curve for the fitted model. Presented as a fraction of the total area between the highest and lowest screening concentration.
RMSE	Root Mean Squared Error, a measurement of how well the modelled curve fits the data points. Curves with RMSE > 0.3 are excluded prior to release as part of quality control.
Z_SCORE	Z score of the LN_IC50 comparing it to the mean and standard deviation of the LN_IC50 values for the drug in question over all models treated.
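
The relationship between some of these fields can be sketched as follows. The field names come from the table above, but the rows are invented, and several choices are assumptions: the IC50 is taken to be in micromolar (as MIN_CONC and MAX_CONC are), the sample standard deviation is used for the Z score, and the RMSE filter is applied before the Z scores are computed. The released files may have been produced differently.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

RMSE_CUTOFF = 0.3  # curves with RMSE above this are excluded

# Hypothetical fitted results for a single drug across four models.
rows = [
    {"DRUG_ID": 1001, "SANGER_MODEL_ID": "MODEL_A", "LN_IC50": 1.0, "RMSE": 0.05},
    {"DRUG_ID": 1001, "SANGER_MODEL_ID": "MODEL_B", "LN_IC50": 2.0, "RMSE": 0.10},
    {"DRUG_ID": 1001, "SANGER_MODEL_ID": "MODEL_C", "LN_IC50": 3.0, "RMSE": 0.08},
    {"DRUG_ID": 1001, "SANGER_MODEL_ID": "MODEL_D", "LN_IC50": 9.0, "RMSE": 0.40},
]

# Quality control: keep only curves that fit the data well enough.
kept = [r for r in rows if r["RMSE"] <= RMSE_CUTOFF]

# Collect LN_IC50 values per drug so each drug gets its own mean and SD.
ln_ic50_by_drug = defaultdict(list)
for r in kept:
    ln_ic50_by_drug[r["DRUG_ID"]].append(r["LN_IC50"])

for r in kept:
    values = ln_ic50_by_drug[r["DRUG_ID"]]  # stdev needs at least two values
    r["Z_SCORE"] = (r["LN_IC50"] - mean(values)) / stdev(values)
    r["IC50_UM"] = math.exp(r["LN_IC50"])  # back-transform from natural log

for r in kept:
    print(r["SANGER_MODEL_ID"], r["Z_SCORE"], round(r["IC50_UM"], 3))
```

A negative Z score marks a model that is more sensitive to the drug than the average model screened, and a positive one marks a more resistant model.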
