Data Library

Are you looking to access preclinical data to further advance understanding of human disease and safety? We are providing access to data sets to enhance understanding of translation to human efficacy and safety.

We offer access to preclinical data sets on our early development compounds for data mining and research purposes.

Available data sets:

  • Preclinical safety data: in vivo data from standard models, for exploring relationships in the data to better understand preclinical safety profiles and their translation to human safety.
  • Oncology combinations data: over 11,000 data points from more than 100 oncology drugs tested in combination, for assessing and predicting drug combination synergies.

Interested investigators are invited to:

  • Learn more about the available data sets through the information on this site
  • Submit a brief proposal on how you intend to use the data set
  • Access our data once your application has been approved

Preclinical safety data

In order to enhance understanding of translation to human efficacy and safety, we offer access to preclinical data sets on our early development compounds for data mining and research purposes. 

Types of preclinical data available include in vivo animal studies in rodent and non-rodent species, which are common regulatory requirements to support progression to clinical studies. These comprise:

  • Standard acute and repeat dose studies with clinical observations
  • Serum chemistry and urinalysis measures
  • Gross pathology and histopathology observations
  • Administered doses and, where available, toxicokinetic measurements

Let's partner in preclinical safety

We invite toxicologists as well as data and computer scientists to partner with us, with each other or with public or private funding bodies to brainstorm ideas, mine the data and develop predictive models for preclinical safety. The aim is to generate greater insight into how we can better predict and measure the safety of novel medicines for the treatment of human disease. Typical activities could include the development of novel safety assays, translational models or drug repositioning ideas.

Oncology combinations data

To accelerate the understanding of oncology drug combination synergies, AstraZeneca is sharing over 11,000 preclinical pharmacology data points. These data enable you to explore fundamental traits that underlie effective combination treatments and synergistic drug behavior.

Available data:

  • Phenotypic (cell viability) data from over 11,000 experiments testing over 100 drugs paired at various dose combinations in up to 85 cancer cell lines, primarily colon, lung, and breast cancer, together with comprehensive monotherapy drug response data for each drug and cell line.
  • A “synergy score” comparing each drug combination with the respective monotherapy effects in each cell line.
  • Target and chemical properties of the drugs, including gene names of protein targets, molecular weight, H-bond acceptors, H-bond donors, cLogP, and Lipinski's rule of 5.
  • Ability to link to deep molecular profiles for respective cell panels in public resources such as GDSC/COSMIC and CCLE.
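
As an illustration of how the chemical property fields listed above relate to one another, the sketch below applies the commonly stated form of Lipinski's rule of 5 (molecular weight no greater than 500 Da, no more than 5 H-bond donors, no more than 10 H-bond acceptors, cLogP no greater than 5) to a single record. The field names and the example values are hypothetical and are not taken from the data set, and the tolerance of one violation is a common convention rather than something this page specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrugProperties:
    # Hypothetical field names; the real data set may label these differently.
    name: str
    molecular_weight: float  # daltons
    h_bond_donors: int
    h_bond_acceptors: int
    clogp: float


def rule_of_five_violations(drug: DrugProperties) -> List[str]:
    """Return a label for each rule-of-5 criterion that the drug breaks."""
    checks = {
        "molecular_weight > 500": drug.molecular_weight > 500,
        "h_bond_donors > 5": drug.h_bond_donors > 5,
        "h_bond_acceptors > 10": drug.h_bond_acceptors > 10,
        "clogp > 5": drug.clogp > 5,
    }
    return [label for label, violated in checks.items() if violated]


def passes_rule_of_five(drug: DrugProperties, max_violations: int = 1) -> bool:
    """Treat a drug as compliant if it breaks at most `max_violations` criteria."""
    return len(rule_of_five_violations(drug)) <= max_violations


if __name__ == "__main__":
    # Invented values, for illustration only.
    example = DrugProperties("example_compound", 612.7, 2, 11, 4.1)
    print(rule_of_five_violations(example))
    print(passes_rule_of_five(example))
```

A check like this could be used to group drugs by property profile before comparing their combination behavior, although any such grouping is an analysis choice for the investigator.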

Let's partner to uncover new oncology combinations

Are you a scientist striving to identify novel oncology drug combinations? Do you hope to understand the fundamental traits that underlie effective drug combinations? Do you aim to identify patients most likely to benefit from drug combinations? If so, we invite you to submit a proposal outlining how you think analysis of our data can help uncover new paths. We encourage you to incorporate your own background knowledge and data into the analysis.

Transcriptomic profiling data

We have completed a project in which a set of 32 compounds was assessed against two cell lines for their RNA signature profiles. This is only the start, as the power lies in the analytical interpretation of the data. We have therefore decided to share these data so that other groups can access them and carry out further modelling.

The available data comprise:

  • 32 compounds profiled against two cell lines at two concentrations (high dose and low dose)
  • The cell lines used were A549 and MCF7
  • Compounds were assessed as monotherapies
  • RNA-seq data; raw data files will be sent with corresponding identifiers

The data can be released as blinded or unblinded regarding the mechanism of action of the compounds, depending on the analysis request.

Let's collaborate in data analytics

Are you a data, analytical or computer science research group that applies algorithms to data sets to find patterns and link them to biological outcomes? We encourage you to combine these data with other data sets to broaden the biological insight. Our aim is to build wider insight into these data, the underlying mechanisms and the potential benefit to patients, and we invite you to submit a proposal outlining how your analysis can unlock new insight.