The Role of External Data in Oncology Drug Development
August 1, 2024
Randomized controlled trials (RCTs) remain the gold standard for evaluating the safety and effectiveness of a new treatment. In some settings, however, alternative approaches that leverage external data (i.e., data from outside the clinical trial), ranging from single-arm trials to augmented RCTs, can be appropriate. Here, we discuss how to incorporate external data in drug development, focusing on the use of external control arms and Bayesian borrowing.
Use of external data in drug development
In early phases of drug development, single-arm trials with external controls or comparisons can be used for efficacy signal-finding or dose selection to inform internal decision-making. In later phases, single-arm trials with external controls, or augmented trials, can be used to establish the natural history of disease, isolate the treatment effect, or provide supportive evidence of comparative efficacy.
For example, in rare diseases, rare genetic subtypes of a disease, or debilitating and life-threatening diseases, it may not be feasible or ethical to enroll the number of patients required for a conventional, well-powered RCT. In the post-approval setting, studies with external controls can be used for comparative effectiveness to inform reimbursement decisions and clinical guidelines.
Multi-step process to incorporate external data
Leveraging external data in drug development requires thoughtful evaluation and planning.
To begin, define the key objectives and assess the clinical rationale by asking the following questions:
- Can the clinical research objectives be reasonably achieved with a traditional RCT?
- Is it ethical and feasible to randomize patients to a control arm?
- Is it feasible to enroll the number of patients required for a powered RCT?
- Is it feasible for patients to complete the required follow-up period?
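For the enrollment question, a rough sense of scale can help. The sketch below uses Schoenfeld's approximation for the number of events needed by a two-arm log-rank comparison. The target hazard ratio, power, and event probability are illustrative assumptions rather than values from this article, and the function names are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, alloc_treated=0.5):
    """Schoenfeld approximation: events needed for a two-sided log-rank test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p = alloc_treated  # fraction of patients randomized to the treated arm
    return ceil((z_alpha + z_beta) ** 2 / (p * (1 - p) * log(hazard_ratio) ** 2))


def required_patients(events, event_probability):
    """Patients to enroll if each has this chance of an observed event."""
    return ceil(events / event_probability)


# Assumed design inputs (illustrative only): target HR 0.70, 80% power,
# and 60% of enrolled patients expected to have an event during follow-up.
events = required_events(hazard_ratio=0.70)
patients = required_patients(events, event_probability=0.60)
print(f"events needed: {events}, patients to enroll: {patients}")
```

Comparing the resulting number with the realistically recruitable population gives a first indication of whether a conventional RCT is feasible or whether external data should be considered.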
The next key question is whether external data are available and fit-for-purpose. If so, integrating external data requires developing a clinical trial protocol and a detailed statistical analysis plan. It is recommended to engage with regulatory authorities to align on these approaches before study initiation.
Augmented RCTs
In the context of clinical trials, borrowing means the use of external data (external with respect to data generated within the trial) to perform statistical inference or make decisions. Borrowing for a control arm is sometimes used in clinical trials as an effective means to reduce the number of patients randomized to the control arm. Single-arm trials are an extreme example of borrowing (100% borrowing); however, the absence of randomization can introduce bias. Therefore, concurrent randomization alongside borrowing could be a more appropriate strategy.
The Bayesian framework provides methods for different types of borrowing:
- Borrowing from data external to the current trial. This can be applied to a wide range of trials and relies on external data summarized in the form of an informative prior.
- Borrowing from other arms or periods within the same treatment, as seen in basket trials that investigate a drug across multiple indications.
- Borrowing from other treatment arms and periods, mainly used in platform trials.
Data from other RCTs, natural history studies, real-world data, and other external data can be summarized in the form of an informative prior, such as a robust meta-analytic predictive prior, a power prior, or a commensurate prior.
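As one concrete illustration, the sketch below applies a power prior with a fixed discount parameter to a binary endpoint, where the Beta distribution keeps the update in closed form. The counts, the discount values, and the function names are hypothetical. In practice the discount may be estimated or calibrated rather than fixed.

```python
from math import sqrt


def power_prior_posterior(x_cur, n_cur, x_hist, n_hist, a0, a_init=1.0, b_init=1.0):
    """Beta posterior for a control response rate under a fixed-a0 power prior.

    The historical likelihood is raised to the power a0, so a0 = 0 ignores the
    historical data and a0 = 1 pools it fully with the concurrent controls.
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    a = a_init + a0 * x_hist + x_cur
    b = b_init + a0 * (n_hist - x_hist) + (n_cur - x_cur)
    return a, b


def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd


# Hypothetical data: 30/100 responders in historical controls,
# 8/40 responders in the concurrent control arm.
for a0 in (0.0, 0.5, 1.0):
    a, b = power_prior_posterior(8, 40, 30, 100, a0)
    mean, sd = beta_mean_sd(a, b)
    print(f"a0={a0:.1f}: borrowed about {a0 * 100:.0f} patients, "
          f"posterior mean={mean:.3f}, sd={sd:.3f}")
```

The posterior standard deviation shrinks as the discount parameter grows, which is the gain in precision from borrowing. The price is sensitivity to any systematic difference between historical and concurrent controls.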
Robustifying the prior helps to reduce bias in the case of prior-data conflict, that is, a mismatch between the current trial data and the external data. Robustification can be achieved by adding a non-informative component to the prior, with weights chosen to obtain the desired frequentist type I error. The effective sample size of the historical data is thereby controlled through robustification.
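The behavior of a robust mixture prior under prior-data conflict can be sketched for a binary endpoint as follows. The informative component Beta(30, 70), the vague component Beta(1, 1), and the prior weight of 0.8 are assumptions chosen for illustration. In a real design the weight would be calibrated against operating characteristics such as type I error, which this sketch does not do.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update_mixture(components, x, n):
    """Update a mixture of Beta priors with x responders out of n patients.

    components: list of (weight, a, b). Each weight is rescaled by the
    beta-binomial marginal likelihood of the data under that component
    (the binomial coefficient is common to all components and cancels).
    """
    log_terms = [
        log(w) + log_beta(a + x, b + n - x) - log_beta(a, b)
        for w, a, b in components
    ]
    top = max(log_terms)
    unnorm = [exp(t - top) for t in log_terms]
    total = sum(unnorm)
    return [
        (u / total, a + x, b + n - x)
        for u, (_, a, b) in zip(unnorm, components)
    ]


def mixture_mean(components):
    """Mean of a mixture of Beta distributions."""
    return sum(w * a / (a + b) for w, a, b in components)


# Assumed prior: 80% weight on an informative Beta(30, 70), 20% on Beta(1, 1).
prior = [(0.8, 30.0, 70.0), (0.2, 1.0, 1.0)]

consistent = update_mixture(prior, x=12, n=40)  # observed rate 0.30
conflict = update_mixture(prior, x=28, n=40)    # observed rate 0.70

print(f"informative weight, consistent data: {consistent[0][0]:.3f}")
print(f"informative weight, conflicting data: {conflict[0][0]:.3f}")
print(f"posterior mean under conflict: {mixture_mean(conflict):.3f}")
```

When the current data agree with the external data, the informative component keeps most of the weight and borrowing is strong. Under conflict, the weight shifts to the vague component and the posterior follows the current trial data.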
External Control Arms
For regulatory submissions, comparisons for which no randomized clinical trial is available can be made through careful harmonization, patient selection, and adjustment for confounding factors. Comparisons with single-arm trials are used in rare diseases, for accelerated approval, when comparing non-randomized study arms, and when standards of care differ or are changing.
Several adjustment methods can be used to achieve balance between cohorts, such as matching or using weights to create a population where the distribution of measured baseline covariates is independent of treatment assignment. These methods are often based on the propensity score (probability of treatment assignment conditional on observed covariates).
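A minimal sketch of propensity score weighting is shown below. It uses simulated data, a hand-written logistic regression, and weights that target the trial population (external patients weighted by the odds of trial membership). The covariates, sample sizes, and distributions are invented for illustration, and a real analysis would use an established statistical package and a pre-specified covariate set.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, t, lr=0.5, n_iter=1000):
    """Fit P(trial | covariates) by gradient ascent; returns [intercept, *slopes]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, ti in zip(X, t):
            resid = ti - sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def mean(v, w=None):
    w = w or [1.0] * len(v)
    return sum(vi * wi for vi, wi in zip(v, w)) / sum(w)


def variance(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)


def smd(x_trial, x_ext, w_ext=None):
    """Standardized mean difference, with optional weights on the external cohort."""
    pooled_sd = math.sqrt((variance(x_trial) + variance(x_ext)) / 2)
    return (mean(x_trial) - mean(x_ext, w_ext)) / pooled_sd


# Simulated cohorts (age, poor performance status): the external cohort is
# older and sicker than the trial cohort.
random.seed(1)
trial = [(random.gauss(60, 9), float(random.random() < 0.3)) for _ in range(100)]
external = [(random.gauss(66, 9), float(random.random() < 0.5)) for _ in range(300)]

ages = [a for a, _ in trial + external]
age_mean, age_sd = mean(ages), math.sqrt(variance(ages))
X = [[(a - age_mean) / age_sd, e] for a, e in trial + external]
t = [1] * len(trial) + [0] * len(external)

beta = fit_logistic(X, t)
ps_ext = [sigmoid(beta[0] + beta[1] * x[0] + beta[2] * x[1]) for x in X[len(trial):]]
att_weights = [p / (1 - p) for p in ps_ext]  # odds of being a trial patient

age_trial = [a for a, _ in trial]
age_ext = [a for a, _ in external]
smd_before = smd(age_trial, age_ext)
smd_after = smd(age_trial, age_ext, att_weights)
print(f"age SMD before weighting: {smd_before:.2f}, after: {smd_after:.2f}")
```

Checking standardized mean differences before and after weighting is a common balance diagnostic. Balance on measured covariates says nothing about unmeasured confounders.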
Additionally, when sample sizes are limited, Bayesian borrowing can be employed to supplement target cohorts in an external control arm comparison using additional data sources.¹ The advantage of Bayesian borrowing for external control arms is its potential to increase the power and precision of the target comparison. It addresses a commonly encountered scenario in which no single fit-for-purpose data source has a large pool of eligible patients, necessitating the use of multiple data sources, each with its own advantages and disadvantages. Concepts regarding the specification of priors are analogous to those described in the previous section.
As with ECAs, whether a comparison can “emulate” an RCT depends on multiple factors, from the data to the methodology, and understanding of the therapeutic area or indication. Thus, great care should be exercised when designing such analyses, with appropriate caution in interpreting the results in the context of any challenges in the study.
Quantitative bias analysis: A key component for reliable external comparisons
Quantitative bias analysis (QBA) is a set of statistical techniques used to assess the potential impact of a wide range of systematic errors (biases) on the results of a research study. QBA goes beyond acknowledging the possibility of bias or describing it qualitatively; it allows researchers to quantify the magnitude and direction of the bias, giving a far more complete and nuanced picture of the uncertainty associated with a study.
Figure 2 shows two general concepts in QBA for a common scenario in which the hazard ratio (HR) is being estimated and there are both measured confounders and known unmeasured confounders. External adjustment (left) refers to methods that incorporate the effect of unmeasured confounders into the HR estimate. Tipping point analysis (right) refers to QBAs in which, after adjusting for measured confounders and estimating the HR, the strength of confounding required to tip the HR to 1 (for either the point estimate or the upper confidence limit) is quantified. The plausibility of such scenarios is then evaluated.
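Both ideas can be sketched with the classical bias formula for a single binary unmeasured confounder. Applying it to a hazard ratio is an approximation that works best when events are rare, and all inputs below (the observed HR, its upper confidence limit, and the confounder prevalences) are hypothetical. The function names are ours, not from a specific QBA package.

```python
def bias_factor(rr_confounder, prev_treated, prev_control):
    """Bias from a binary unmeasured confounder U.

    rr_confounder: effect of U on the outcome (ratio scale).
    prev_treated / prev_control: prevalence of U in each cohort.
    """
    numerator = rr_confounder * prev_treated + (1 - prev_treated)
    denominator = rr_confounder * prev_control + (1 - prev_control)
    return numerator / denominator


def externally_adjusted_hr(observed_hr, rr_confounder, prev_treated, prev_control):
    """External adjustment: divide the observed HR by the bias factor."""
    return observed_hr / bias_factor(rr_confounder, prev_treated, prev_control)


def tipping_point(observed_hr, prev_treated, prev_control):
    """Confounder-outcome effect at which the adjusted HR equals 1.

    Returns None when no positive confounder effect can move the HR to 1
    for the given prevalence imbalance.
    """
    numerator = (1 - prev_treated) - observed_hr * (1 - prev_control)
    denominator = observed_hr * prev_control - prev_treated
    if denominator == 0 or numerator / denominator <= 0:
        return None
    return numerator / denominator


# Hypothetical scenario: observed HR 0.70 (upper confidence limit 0.95); a
# poor-prognosis factor is present in 20% of trial patients and 50% of
# external controls.
adjusted = externally_adjusted_hr(0.70, rr_confounder=2.0,
                                  prev_treated=0.2, prev_control=0.5)
tip_point_estimate = tipping_point(0.70, prev_treated=0.2, prev_control=0.5)
tip_upper_limit = tipping_point(0.95, prev_treated=0.2, prev_control=0.5)
print(f"adjusted HR if the confounder doubles the hazard: {adjusted:.3f}")
print(f"confounder effect needed to tip the point estimate: {tip_point_estimate:.2f}")
print(f"confounder effect needed to tip the upper limit: {tip_upper_limit:.2f}")
```

The tipping values are then judged for plausibility: if the confounder would need an implausibly strong effect on the outcome to move the estimate to 1, the conclusion is more robust.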
External Data Strategy
Different types of external data sources can be used in clinical trials, including:
- Randomized clinical trials: Comparator or placebo groups from trials with very similar design, inclusion criteria and endpoints.
- Real-world data: Electronic health record databases, administrative claims, registries or prospective cohorts and retrospective studies.
Historical or contemporaneous controls can also be used. A historical control (based on data from patients treated earlier) may be appropriate when standards of care are well defined and disease management and outcomes have remained stable over time. Contemporaneous controls (based on data collected concurrently with the clinical trial) are more suitable in therapeutic areas where the standard of care is evolving rapidly.
It is important to plan a thorough fit-for-purpose assessment. Formal guidance on the use of external controls issued by regulatory and health technology assessment bodies emphasizes the need for a systematic and transparent process for data identification and selection, and the use of data suitability and reporting frameworks, such as SPIFD² or DataSAT³, is strongly encouraged.
Detailed clinical data are needed for building external comparators in oncology and rare diseases, for example, genetic biomarker data or data on disease progression and response. Advances in data technology, such as linkage through tokenization and the increasing use of clinical artificial intelligence and large language models that extract data from unstructured physician notes, have enabled the building of comprehensive datasets for identifying narrow patient cohorts that match clinical trial participants. Current fit-for-purpose data strategies should allow these new technologies to be considered, although further formal guidance from decision-making bodies is needed.
References:
[1] https://becarispublishing.com/doi/10.57264/cer-2023-0175
[3] https://www.nice.org.uk/corporate/ecd9/chapter/appendix-1-data-suitability-assessment-tool-datasat
ECAs have been widely applied in oncology and rare indications to contextualize evidence from single-arm trials in regulatory and HTA submissions. Bayesian borrowing methods can be used to reduce the sample size of a control arm, allowing the randomization of more patients to the active treatment, and can also be used to combine multiple data sources into a single control arm.
Natalia Mühlemann
Vice President, Clinical Development
Natalia Mühlemann brings over 20 years of experience in general management, clinical development, and business development in the life sciences industry to her role at Cytel.
Prior to joining Cytel in 2020, Natalia served as Global Category Head, Acute Care – Oncology – Devices at Nestle Health Sciences. She acts as an Expert Jury member for the European Commission’s Innovation Council.
Natalia holds an MD and an MBA (IMD) and professional certifications in statistics and data science. She continues to expand her education, focusing on AI/ML and CME.
Evie Merinopoulou
Senior Director, Real-World Evidence
Evie Merinopoulou is Senior Director, Real-World Evidence, at Cytel. She is a health economist and real-world data scientist working on applications of real-world evidence in support of regulatory and HTA decision-making.
Evie has worked in the healthcare consulting industry for over 10 years, currently leading the design and execution of observational research projects using global real-world data. She particularly focuses on projects involving real-world synthetic control arms, quantitative bias analysis, head-to-head comparisons using target trial emulation, and transportability analysis.

