FAQ

Frequently Asked Questions

Where can I download the data?

Genomic datasets, model annotation and reference files can be accessed here: Downloads. Drug sensitivity data can be accessed via the GDSC site: Drug Sensitivity Data.

How do I access the raw data?

Links to repositories hosting the raw genomic data can be found in each dataset's dedicated documentation page.

How can I map SIDGs to the corresponding gene symbols?

A list of all genes used in the Dependency Map at Sanger, with HUGO, Ensembl, Entrez and RefSeq annotation, can be found here.
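As a minimal sketch of the mapping step, the snippet below assumes the gene list has been saved as a CSV file with a column holding the SIDG identifier and a column holding the gene symbol. The column names `gene_id` and `hgnc_symbol`, and the identifiers in the example rows, are assumptions made for illustration and should be checked against the header of the downloaded file.

```python
import csv
import io


def load_gene_map(handle, id_col="gene_id", symbol_col="hgnc_symbol"):
    """Build a dict from SIDG identifier to gene symbol from an open CSV file.

    The default column names are hypothetical; pass the real ones if they differ.
    """
    reader = csv.DictReader(handle)
    return {row[id_col]: row[symbol_col] for row in reader if row[id_col]}


def annotate(sidg_ids, gene_map):
    """Return (SIDG, symbol) pairs; identifiers with no match get None."""
    return [(sidg, gene_map.get(sidg)) for sidg in sidg_ids]


# Made-up rows standing in for the downloaded gene annotation file.
example_csv = io.StringIO(
    "gene_id,hgnc_symbol\n"
    "SIDG00001,GENE_A\n"
    "SIDG00002,GENE_B\n"
)
gene_map = load_gene_map(example_csv)
pairs = annotate(["SIDG00002", "SIDG99999"], gene_map)
```

With a real file, `load_gene_map(open(path, newline=""))` would replace the in-memory example. Identifiers that are absent from the list come back paired with `None`, so unmapped genes are easy to spot.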

Why do I see duplicate entries/rows in the mutation data?

There are two possible causes for this:

If a model has been sequenced more than once (for instance by WES and WGS) and the same mutation has been found by both methods, it will be recorded twice. This shows up as many duplicate mutations for a particular set of models.
If a model has multiple mutations in a single gene and none of them affects the RNA/protein sequence, they might appear as duplicates. These cases can be confirmed by checking that the RNA and amino acid columns are blank.
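The two causes above can be told apart with a rough heuristic, sketched here under stated assumptions: the mutation data is loaded as a list of dicts, and the column names `model_id`, `gene_symbol`, `rna_mutation` and `protein_mutation` are hypothetical stand-ins for the real headers. Rows that repeat with blank RNA and protein columns are flagged as the second cause; all other repeats are flagged as probable repeat sequencing.

```python
from collections import defaultdict

# Hypothetical column names; adjust to the header of the mutation file.
KEY_COLS = ("model_id", "gene_symbol", "rna_mutation", "protein_mutation")


def classify_duplicates(rows):
    """Group rows on KEY_COLS and label every group that occurs more than once."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[col] for col in KEY_COLS)].append(row)

    report = {}
    for key, members in groups.items():
        if len(members) < 2:
            continue  # not a duplicate
        _model, _gene, rna, protein = key
        if not rna.strip() and not protein.strip():
            # Blank RNA/protein columns: likely distinct mutations in the same
            # gene that do not change the RNA/protein sequence.
            report[key] = "blank_rna_protein"
        else:
            # Same annotated mutation seen again: likely found by more than
            # one sequencing method.
            report[key] = "repeat_sequencing"
    return report


# Made-up example rows.
rows = [
    {"model_id": "MODEL_1", "gene_symbol": "GENE_A", "rna_mutation": "r.100a>g", "protein_mutation": "p.K34R"},
    {"model_id": "MODEL_1", "gene_symbol": "GENE_A", "rna_mutation": "r.100a>g", "protein_mutation": "p.K34R"},
    {"model_id": "MODEL_2", "gene_symbol": "GENE_B", "rna_mutation": "", "protein_mutation": ""},
    {"model_id": "MODEL_2", "gene_symbol": "GENE_B", "rna_mutation": "", "protein_mutation": ""},
    {"model_id": "MODEL_3", "gene_symbol": "GENE_C", "rna_mutation": "r.5c>t", "protein_mutation": "p.A2V"},
]
report = classify_duplicates(rows)
```

The labels are only a starting point for manual inspection: a repeated row with blank columns could in principle also come from repeat sequencing, so any genomic coordinate columns in the file are worth comparing before rows are dropped.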

I can see multiple entries for a single model in the Sanger WES .bam files on the EGA: EGAD00001001039. Why is this?

This is because more than one lane was sequenced (multiplexed) for the same sample. These need to be merged after alignment.
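One possible way to organise the merge is sketched below. It assumes `samtools` is installed, that the per-lane BAM files are already aligned and coordinate-sorted, and that a list of (sample, BAM path) pairs has been assembled from the dataset metadata. The sample names, file names and the `.merged.bam` output suffix are hypothetical. The function only builds the commands; it does not run them.

```python
from collections import defaultdict


def build_merge_commands(lane_bams):
    """Group (sample, bam_path) pairs by sample and build one merge command each.

    Returns a dict mapping sample name to an argument list of the form
    ["samtools", "merge", <out.bam>, <in1.bam>, <in2.bam>, ...].
    """
    by_sample = defaultdict(list)
    for sample, bam_path in lane_bams:
        by_sample[sample].append(bam_path)

    commands = {}
    for sample, paths in by_sample.items():
        if len(paths) < 2:
            continue  # a single lane needs no merging
        out_bam = f"{sample}.merged.bam"  # hypothetical naming scheme
        commands[sample] = ["samtools", "merge", out_bam, *sorted(paths)]
    return commands


# Made-up lane files for two samples.
lanes = [
    ("SAMPLE_A", "SAMPLE_A_lane2.bam"),
    ("SAMPLE_A", "SAMPLE_A_lane1.bam"),
    ("SAMPLE_B", "SAMPLE_B_lane1.bam"),
]
commands = build_merge_commands(lanes)
```

Each argument list can then be passed to `subprocess.run(cmd, check=True)` once the paths have been verified.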

How do I map the cell lines used within the Dependency Map at Sanger to other resources?

Cell Model Passports contains identifiers for the Broad, Cellosaurus and COSMIC. These can be downloaded here: