Drug Sensitivity
This page provides descriptions and documentation for the drug sensitivity data within the Cancer Dependency Map at Sanger.
Drug Sensitivity Datasets
| Dataset Name | Origin | Model Type | No. Models | No. Compounds | Assay | Duration |
|---|---|---|---|---|---|---|
| GDSC1 | Sanger | Cell Line | 970 | 403 | Resazurin or Syto60 | 72 hours |
| GDSC2 | Sanger | Cell Line | 969 | 297 | CellTiter-Glo | 72 hours |
Screening & Data Generation
Descriptions of how the screens were performed for each of the available datasets.
GDSC1
The GDSC1 dataset was generated jointly by the Wellcome Sanger Institute and Massachusetts General Hospital (MGH) between 2009 and 2015 using a matched set of cancer cell lines (the GDSC1000). GDSC1 was published by Iorio et al. (Cell 2016).
Compounds were provided by industry, academic collaborators or sourced from commercial vendors. Compounds were stored in aliquots at -80°C and were subjected to a maximum of 5 freeze-thaw cycles.
Cells are seeded at an optimised density in medium with 5% or 10% FBS and 1% penicillin/streptomycin. The optimal cell number for each cell line is determined to ensure that it is in growth-phase at the end of the assay and to maximise the dynamic range of endpoint measurements. Cells are treated with a dose titration of each compound 24 hours after plating, except for cell lines screened at MGH, which are drugged on the same day as plating. Following drugging, plates are returned to the incubator for assay at a 72-hour time point. Cell viability is determined using either a DNA dye (Syto60) or a metabolic assay (Resazurin). All screening plates are subject to stringent quality control measures.
Cells were seeded in 96-well or 384-well plates and compound dose titrations were delivered using tip-based liquid handling apparatus. Drug treatments in this dataset used one of two formats:
- 9-point dose curve incorporating a 2-fold dilution step (256-fold range)
- 5-point dose curve incorporating a 4-fold dilution step (256-fold range)
GDSC2
GDSC2 has been generated at the Wellcome Sanger Institute since 2015 following improvements to the previous GDSC1 screen design and assay.
Compounds were provided by industry, academic collaborators or sourced from commercial vendors. Compounds are stored in Storage Pods (Roylan Developments) providing a moisture-free, low oxygen environment, and protection from UV damage.
Cells are seeded in 1536-well plates at an optimised density in RPMI or DMEM/F12 medium with 10% FBS and 1% penicillin/streptomycin (see below for detailed composition), and maintained at 37°C in a humidified atmosphere at 5% CO2. Cell lines are propagated in these two media to minimise the potential effect of media variation on sensitivity to therapeutic compounds in the assay, and to facilitate high-throughput screening. The optimal cell number for each cell line is determined to ensure that it is in growth-phase at the end of the assay and to maximise the dynamic range of endpoint measurements.
24 hours after plating, cells are treated with a dose titration of each compound delivered using an Echo555 Acoustic Dispenser (Beckman). Drug treatments use one of the following dose response formats:
- 7-point dose curve incorporating a half-log dilution step (1000-fold range)
- 7-point dose curve with 2x 2-fold dilutions followed by 4x 4-fold dilutions (1024-fold range)
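As a rough check on the stated ranges, the two GDSC2 formats can be expanded into concentration series. This is a minimal sketch: the function names and the top concentration of 10.0 are hypothetical, and it assumes the 2-fold steps are applied first when diluting down from the highest concentration, which the text does not specify.

```python
from math import sqrt


def dilution_series(max_conc, step_factors):
    """Return concentrations from highest to lowest.

    step_factors lists the fold-dilution applied at each successive step,
    so a series with n steps has n + 1 dose points.
    """
    concs = [max_conc]
    for factor in step_factors:
        concs.append(concs[-1] / factor)
    return concs


def fold_range(concs):
    """Ratio between the highest and lowest concentration in a series."""
    return max(concs) / min(concs)


HALF_LOG = sqrt(10)  # one half-log step is roughly a 3.16-fold dilution

# max_conc = 10.0 is an arbitrary illustrative value, not a GDSC setting.
half_log_series = dilution_series(10.0, [HALF_LOG] * 6)
# Assumption: the two 2-fold steps come first, starting from the top dose.
mixed_series = dilution_series(10.0, [2, 2, 4, 4, 4, 4])

print(len(half_log_series), round(fold_range(half_log_series)))
print(len(mixed_series), round(fold_range(mixed_series)))
```

Both series have seven dose points (six dilution steps), and the fold range is simply the product of the step factors.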
Following drugging, plates are returned to the incubator and cell viability is determined after 72 hours using a metabolic assay (CellTiter-Glo). All screening plates are subject to stringent quality control measures.
RPMI:
- RPMI Base Media
- 10% FBS
- 1% PenStrep
- 4.5 mg/ml Glucose
- 1 mM Sodium Pyruvate
D/F12:
- DMEM/F12 Base Media
- 10% FBS
- 1% PenStrep
Compound Annotation
The Target refers to the nominal therapeutic target(s) of a compound. In many, if not all, instances compounds have additional targets not listed. The Target Pathway has been manually curated based on our current understanding of cancer biology, therapeutic application and the biological processes in disease.
Compounds with Multiple Entries
Both the GDSC1 and GDSC2 datasets contain multiple results for the same compound, associated with different Drug IDs.
For GDSC1 this is due to the screening being a combined effort at the Massachusetts General Hospital (Boston, USA) and the Wellcome Sanger Institute (Cambridge, UK). Some compounds were screened at both sites, resulting in two Drug IDs for the same compound.
Within the GDSC2 dataset multiple entries are present due to internal tracking procedures.
Data Processing & Analysis
Descriptions of the curve fitting and analysis performed.
Analysis
Datasets were analysed independently. Raw viability readouts were processed using the R package gdscIC50. Viability data are normalised per plate using the available negative and positive controls:
- GDSC1 - negative controls were treatments with media alone, and the positive controls were blank wells with media but no cells.
- GDSC2 - negative controls were treatments of the cells with media + DMSO (the compound vehicle in most cases), and the positive controls were blank wells with media but no cells.
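The per-plate normalisation can be illustrated with a short Python sketch. It assumes the common convention of scaling each well between the mean positive control (no cells, viability 0) and the mean negative control (untreated cells, viability 1). The function name and example intensities are hypothetical, and the exact computation inside gdscIC50 may differ.

```python
from statistics import mean


def normalise_plate(intensities, negative_controls, positive_controls):
    """Scale raw well intensities to a viability fraction for one plate.

    Assumed convention (not taken from gdscIC50 itself):
      viability = (intensity - mean(positive)) / (mean(negative) - mean(positive))
    """
    neg = mean(negative_controls)  # untreated cells: maximum signal
    pos = mean(positive_controls)  # media only, no cells: background signal
    if neg == pos:
        raise ValueError("controls are identical; plate cannot be normalised")
    return [(x - pos) / (neg - pos) for x in intensities]


# Hypothetical raw intensities for three treated wells on one plate.
viabilities = normalise_plate(
    intensities=[1000, 550, 100],
    negative_controls=[1000, 1000],
    positive_controls=[100, 100],
)
print(viabilities)
```

Under this convention a well matching the negative controls maps to 1, a well matching the blank wells maps to 0, and noisy wells can fall slightly outside that interval.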
Dose-response curves are fitted using the non-linear mixed effects model of Vis et al., incorporated in the gdscIC50 package. All available replicates for an individual experiment (cell model + compound) are used to fit each curve and obtain IC50 and AUC estimates.
Curve Fitting
Intensity data from screening plates are fitted for each dose-response curve using the multilevel mixed effect model of Vis et al. The viability across the concentration dilution series is assumed to follow a sigmoid, the classical S-shaped dose-response function. This function is fitted to all of the cell model - compound combinations screened. In the multilevel mixed effect model, two parameters describe the sigmoidal curve: a shape parameter and a position parameter. Instead of fitting each dose-response series in isolation, the complete set of cell model - compound combinations screened is fitted simultaneously. The shape parameter varies only across cell models, while the position parameter varies across cell models and compounds. This is a faithful and efficient representation of the data and, most importantly, it borrows strength from all observations, which in turn allows for more accurate IC50 estimates.
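To make the role of the two parameters concrete, the sketch below evaluates a two-parameter logistic curve for a single cell model and compound. The parametrisation is an assumption chosen for illustration (viability falls from 1 to 0 as log concentration increases, and the position parameter equals the log IC50). It does not perform the mixed effect fit and is not the gdscIC50 implementation.

```python
from math import exp


def viability(ln_conc, position, shape):
    """Assumed two-parameter logistic: close to 1 at low dose, close to 0 at high dose.

    position is the log concentration at which viability is 0.5;
    shape controls how steeply the curve falls (smaller is steeper).
    """
    return 1.0 / (1.0 + exp((ln_conc - position) / shape))


def auc_fraction(position, shape, ln_min, ln_max, n=1000):
    """Trapezoidal area under the curve between ln_min and ln_max,
    expressed as a fraction of the total area (a rectangle of height 1)."""
    step = (ln_max - ln_min) / n
    ys = [viability(ln_min + i * step, position, shape) for i in range(n + 1)]
    area = step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return area / (ln_max - ln_min)


# Hypothetical parameter values for one cell model - compound pair.
position, shape = 0.0, 1.0
print(viability(position, position, shape))        # viability at the IC50
print(auc_fraction(position, shape, -3.0, 3.0))    # screening range centred on the IC50
```

In the model described above, the shape value would be shared by every compound screened on a given cell model, whereas each cell model - compound pair would have its own position value.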
Fitted IC50 Data Definitions
Definitions for the fields present in the GDSC1 & GDSC2 fitted data results files.
| Field Name | Description |
|---|---|
| DATASET | Indicates the dataset name. Datasets are processed as a whole. |
| NLME_RESULT_ID | Identifier for the fitted results. |
| NLME_CURVE_ID | Identifier for the fitted dose response. |
| CELL_LINE_NAME | Model name. |
| SANGER_MODEL_ID | Sanger DepMap model identifier. |
| CANCER_TYPE | Sanger DepMap cancer type annotation. |
| DRUG_ID | Drug identifier. |
| DRUG_NAME | Primary drug name. |
| PUTATIVE_TARGET | Putative drug target. |
| PATHWAY_NAME | Pathway assigned to the putative drug target. |
| MIN_CONC | Minimum micromolar screening concentration of the drug within the dataset. |
| MAX_CONC | Maximum micromolar screening concentration of the drug within the dataset. |
| LN_IC50 | Natural log of the fitted IC50. |
| AUC | Area Under the Curve for the fitted model. Presented as a fraction of the total area between the highest and lowest screening concentration. |
| RMSE | Root Mean Squared Error, a measurement of how well the modelled curve fits the data points. Curves with RMSE > 0.3 are excluded prior to release as part of quality control. |
| Z_SCORE | Z score of the LN_IC50 comparing it to the mean and standard deviation of the LN_IC50 values for the drug in question over all models treated. |
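The LN_IC50 and Z_SCORE fields can be related with a short Python sketch. The model names and values are hypothetical. The use of the sample standard deviation is an assumption (the released files may use a different estimator), as is the reading of the back-transformed IC50 as micromolar, which is inferred from the units of MIN_CONC and MAX_CONC.

```python
from math import exp
from statistics import mean, stdev


def z_scores(ln_ic50_by_model):
    """Z score of each LN_IC50 relative to all models treated with one drug.

    Assumption: the sample standard deviation is used.
    """
    values = list(ln_ic50_by_model.values())
    mu = mean(values)
    sd = stdev(values)
    return {model: (v - mu) / sd for model, v in ln_ic50_by_model.items()}


# Hypothetical LN_IC50 values for a single DRUG_ID across three models.
ln_ic50 = {"MODEL_A": 1.0, "MODEL_B": 2.0, "MODEL_C": 3.0}

scores = z_scores(ln_ic50)
print(scores)

# Back-transform LN_IC50 to an IC50 (assumed here to be in micromolar).
ic50 = {model: exp(v) for model, v in ln_ic50.items()}
print(ic50)
```

A negative Z score marks a model that is more sensitive to the drug than the average model treated, and a positive Z score marks a more resistant one.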