Why Biomarkers Are Transforming How We Diagnose and Understand Alzheimer’s Disease

Alzheimer’s disease (AD) is the most common cause of dementia worldwide, accounting for about two thirds of cases. It is a progressive neurodegenerative condition that slowly erodes memory, thinking, and the ability to manage daily life. Globally, tens of millions of people live with Alzheimer’s today — numbers expected to grow sharply as populations age. The personal toll on individuals and families is profound, and the economic burden runs into hundreds of billions annually when considering healthcare costs, social care, and lost productivity.

Early diagnosis remains one of the biggest challenges. Historically, clinicians relied on observing cognitive decline — often when the disease is already advanced and irreversible brain damage has occurred. Emerging tools, particularly biomarkers, are beginning to change this picture.

 

The value of biomarkers in Alzheimer’s disease

Biomarkers are measurable indicators of biological processes, disease activity, or treatment response. In Alzheimer’s disease, they help detect characteristic brain changes — such as beta amyloid plaques and tau protein tangles — long before symptoms appear. These can be identified through imaging, cerebrospinal fluid tests, and, increasingly, blood-based biomarkers.

Research shows that:

  • Biomarkers can detect early pathological changes along the Alzheimer’s disease continuum, even in individuals without symptoms.1
  • Blood-based biomarkers (BBMs) such as plasma amyloid and tau provide a low cost, accessible alternative to PET scans and lumbar punctures.2
  • Updated diagnostic frameworks now define Alzheimer’s biologically, emphasizing that biomarker-detected pathology is sufficient to diagnose the disease.3

This shift opens opportunities for earlier intervention, improved trial recruitment, and more personalized care.

 

What recent biomarker research tells us

1. Alzheimer’s is now considered a “biological disease” detectable long before symptoms.

Updated clinical criteria emphasize that identifying Alzheimer’s related proteins — through imaging, cerebrospinal fluid, or blood — is sufficient to diagnose the disease, even in people who feel cognitively normal. This reflects decades of evidence showing pathology accumulates silently before memory loss begins.4

 

2. Blood tests are emerging as game‑changers.

Advances in ultra‑sensitive technologies now allow scientists to detect minute amounts of proteins that leak from the brain into the blood. These tests measure markers like amyloid‑β and phosphorylated tau — proteins central to Alzheimer’s disease. Because they require only a simple blood draw, they enable repeated testing over time, making disease monitoring easier and more accessible.5

 

3. Blood biomarkers may revolutionize primary care detection.

Many people with early cognitive impairment go undiagnosed, especially in community settings. New blood-based biomarker tests can be integrated into routine care to flag individuals at high risk earlier, long before they reach a specialist.6

 

4. Early diagnosis enables better care and more timely treatment.

Because Alzheimer’s pathology starts decades before symptoms, identifying the disease early can help clinicians initiate supportive measures, guide lifestyle interventions, and offer patients and families time to plan. It also allows people to access relevant clinical trials at the most impactful disease stage.7

 

5. Biomarkers deepen scientific understanding of Alzheimer’s progression.

Different biomarkers reveal different aspects of disease — amyloid reflects early accumulation, tau correlates more strongly with neurodegeneration, and neurofilament light chain indicates nerve cell damage. Using multiple tests helps clinicians and researchers understand where a person lies along the disease continuum.8

 

Focusing on Amyloid Beta (Aβ)

I, along with my co-authors, recently published a study that investigated whether changes in amyloid beta (Aβ) can reliably predict whether a treatment will help patients think and function better, i.e., whether Aβ is a surrogate marker. Many new Alzheimer’s drugs reduce Aβ levels, and some have even been approved based on this effect. But the big question remains: Does lowering Aβ actually translate into meaningful clinical benefit?

We collected data from 23 clinical trials of seven different anti-amyloid monoclonal antibody drugs.

These trials reported treatment effects on both:

  • Aβ levels (using brain scans such as PET SUVR or the Centiloid scale), and
  • Clinical outcomes, including:
    • Clinical Dementia Rating – Sum of Boxes (CDR‑SOB)
    • Mini‑Mental State Examination (MMSE)
    • Alzheimer’s Disease Assessment Scale–Cognitive Subscale (ADAS‑Cog)

We used a Bayesian meta-analysis, a statistical method that combines results across many studies to look for broad patterns.

The results showed that lowering amyloid‑beta often — but not always — leads to better clinical outcomes in Alzheimer’s disease. Aβ is a promising surrogate marker at the group level, but it is not reliable enough to predict benefit for individual drugs without additional evidence. Its use by health technology assessment agencies such as NICE and ICER to make decisions about the value of the new disease modifying treatments should take this into account.
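As a rough illustration of what surrogacy at the trial level means, the sketch below regresses each trial's clinical effect on its amyloid effect, weighting trials by precision. It is a simplified frequentist stand-in, not the Bayesian model used in the study, and every number in it is hypothetical rather than taken from the 23 trials.

```python
# Hypothetical trial-level effects (NOT data from the study). Each tuple is:
# (amyloid reduction in Centiloids, slowing of CDR-SOB decline, std. error of clinical effect)
trials = [
    (10.0, 0.05, 0.20),
    (25.0, 0.10, 0.15),
    (55.0, 0.35, 0.12),
    (60.0, 0.45, 0.10),
    (75.0, 0.40, 0.15),
]


def weighted_line(points):
    """Inverse-variance weighted least squares fit of y = intercept + slope * x."""
    weights = [1.0 / se ** 2 for _, _, se in points]
    total = sum(weights)
    x_bar = sum(w * x for w, (x, _, _) in zip(weights, points)) / total
    y_bar = sum(w * y for w, (_, y, _) in zip(weights, points)) / total
    s_xx = sum(w * (x - x_bar) ** 2 for w, (x, _, _) in zip(weights, points))
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, (x, y, _) in zip(weights, points))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


intercept, slope = weighted_line(trials)
print(f"clinical effect = {intercept:.3f} + {slope:.4f} * amyloid reduction")
```

A positive slope suggests that larger amyloid reductions go with larger clinical effects across trials, but the scatter around the line is what makes predictions for any individual drug uncertain.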

 

The need for continued research into Alzheimer’s biomarkers

Biomarkers are reshaping the landscape of Alzheimer’s disease, offering hope for earlier, more accurate diagnosis and more tailored therapeutic strategies. But despite these advances, more work is needed.

Continued research is vital to:

  • Improve the accuracy and reliability of blood-based tests
  • Ensure tests are validated across diverse populations
  • Link biomarker changes more precisely to clinical outcomes
  • Support equitable access in primary care and low resource settings

As we enter an era of disease-modifying therapies, biomarkers will be indispensable — guiding diagnosis, monitoring response, and helping patients receive the right treatment at the right time.

The future of Alzheimer’s care will be biomarker driven, and ongoing research is the key to making that future accessible to all.

 

Interested in learning more?

Read “Evaluating amyloid-beta as a surrogate endpoint in trials of anti-amyloid-beta drugs in Alzheimer’s disease: A Bayesian meta-analysis.”

ELEVATE-GenAI: A New Guideline for Reporting Generative AI in HEOR Workflows

Generative artificial intelligence (AI), particularly large language models (LLMs), is increasingly embedded in health economics and outcomes research (HEOR) workflows. Researchers are now using these tools to support activities such as systematic literature reviews, health economic modeling, and real-world evidence generation.

As adoption grows, so does a fundamental question for the HEOR community:

How should the use of generative AI be transparently and consistently reported within HEOR workflows?

To address this question, the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Working Group on AI has developed ELEVATE-GenAI — a reporting guideline specifically designed to document and communicate how generative AI is used in HEOR research.

 

Why a dedicated reporting guideline is needed

HEOR has a strong tradition of structured reporting, supported by well-established standards for systematic reviews, economic evaluations, and real-world evidence. However, the rapid integration of LLMs into HEOR workflows has outpaced the development of HEOR-specific guidance on how their use should be reported.

LLMs are now being applied to:

  • Screening and classifying abstracts in systematic literature reviews
  • Extracting data and assessing bias
  • Building or replicating health economic models
  • Transforming unstructured real-world data into analyzable formats

While these applications offer efficiency and scalability, they also introduce new challenges related to transparency, reproducibility, factual accuracy, bias, uncertainty, and data governance. Existing AI reporting guidelines do not fully address these challenges in the context of HEOR decision-making, regulatory review, or health technology assessment (HTA).

ELEVATE-GenAI was developed to fill this gap by providing clear, HEOR-specific guidance for reporting the use of generative AI within research workflows.

 

What is ELEVATE-GenAI?

ELEVATE-GenAI is a reporting framework and checklist intended for HEOR studies in which generative AI plays a substantive role in evidence generation, synthesis, or analysis. Its goal is not to evaluate the performance of specific AI tools or to prescribe how AI should be used, but rather to ensure that AI-assisted workflows are clearly described, interpretable, and reproducible.

The guideline is designed to support:

  • Authors, by clarifying what information should be reported
  • Reviewers and editors, by enabling consistent evaluation
  • HTA bodies and regulators, by improving transparency and trust

Importantly, ELEVATE-GenAI is not intended for studies that use AI only for minor tasks such as editing or formatting text. Instead, it applies when generative AI meaningfully influences HEOR outputs.

 

Reporting generative AI across HEOR workflows: The 10 ELEVATE domains

At the center of ELEVATE-GenAI is a set of 10 reporting domains that together describe how generative AI is integrated into HEOR workflows and how its outputs are assessed.

 

1. Model characteristics

This domain ensures clarity about what AI system was used. Authors are encouraged to report the model name and version, developer, access method, license type, architecture, and — where available — training and fine-tuning data sources.

 

2. Accuracy assessment

Accuracy reporting focuses on how closely AI-generated outputs align with expected or correct results, using task-appropriate benchmarks such as expert review, gold-standard datasets, or quantitative performance measures.
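For a screening task, one way to report accuracy against a gold standard is sketched below. The labels are hypothetical, and the function name is illustrative rather than part of the guideline; it simply compares LLM include/exclude decisions with human reviewer decisions.

```python
def screening_metrics(gold, predicted):
    """Compare predicted include/exclude decisions with gold-standard decisions."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)          # correctly included
    tn = sum(1 for g, p in pairs if not g and not p)  # correctly excluded
    fp = sum(1 for g, p in pairs if not g and p)      # wrongly included
    fn = sum(1 for g, p in pairs if g and not p)      # wrongly excluded
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical decisions for eight abstracts (True = include).
human = [True, True, True, False, False, False, False, True]
llm = [True, True, False, False, False, True, False, True]
metrics = screening_metrics(human, llm)
print(metrics)
```

In a literature review, sensitivity is usually the figure to watch most closely, because a missed relevant study is harder to recover than an extra abstract sent to full-text review.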

 

3. Comprehensiveness assessment

Comprehensiveness addresses whether AI outputs fully cover all relevant elements of a task — for example, whether all key studies were captured in a literature review or all required components were included in an economic model.

 

4. Factuality verification

This domain emphasizes verification of factual correctness, including identifying and correcting hallucinated citations, incorrect data, or unsupported claims generated by the model.

 

5. Reproducibility and generalizability

Authors are encouraged to document prompts, parameters, workflows, and model versions to support reproducibility, and to discuss whether the AI-assisted approach can be applied to similar HEOR questions or settings.

 

6. Robustness checks

Robustness reporting addresses how sensitive AI outputs are to changes in inputs, such as minor prompt variations, ambiguous wording, or typographical errors.

 

7. Fairness and bias monitoring

Where applicable, studies should assess whether AI outputs introduce or reinforce biases related to demographic or population characteristics relevant to HEOR analyses.

 

8. Deployment context and efficiency

This domain captures practical aspects of AI deployment, including hardware and software configurations, processing time, scalability, and resource requirements — factors that influence real-world feasibility.

 

9. Calibration and uncertainty

Calibration focuses on whether AI confidence aligns with actual performance and how uncertainty is handled, such as defining thresholds for human review in hybrid AI–human workflows.
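A minimal sketch of both ideas follows: a binned expected calibration error, and a confidence threshold that routes low-confidence outputs to a human reviewer. The bin count and the 0.8 threshold are illustrative assumptions, not values taken from the guideline.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the top bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece


def flag_for_review(confidences, threshold=0.8):
    """Return the positions of outputs whose confidence falls below the threshold."""
    return [i for i, conf in enumerate(confidences) if conf < threshold]


# Hypothetical model confidences and whether each output was judged correct.
print(expected_calibration_error([0.9, 0.9], [True, False]))
print(flag_for_review([0.95, 0.60, 0.85]))
```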

 

10. Security and privacy measures

Authors should describe how sensitive data, intellectual property, and regulatory requirements (e.g., GDPR or HIPAA) are addressed when generative AI is used in HEOR workflows.

 

Each domain is accompanied by reporting guidance and an assessment of metric maturity, recognizing that some areas — such as fairness and uncertainty — are still evolving.

 

From framework to practice: The ELEVATE checklist

To facilitate adoption, ELEVATE-GenAI includes a practical checklist that translates the 10 domains into concrete reporting questions. An optional scoring system allows authors and reviewers to summarize reporting completeness, while emphasizing that this score is not a measure of methodological quality or study validity.

The authors demonstrate the applicability of the guideline by retrospectively applying it to two published HEOR studies — one focused on systematic literature review automation and another on health economic modeling. These examples show how ELEVATE-GenAI can be used to consistently describe AI-assisted workflows across different HEOR applications and to identify areas where reporting can be strengthened.

 

Why ELEVATE-GenAI matters for HEOR

As generative AI becomes more deeply integrated into HEOR workflows, transparent reporting is essential to maintain scientific credibility and stakeholder trust. ELEVATE-GenAI provides a shared structure for documenting how AI is used, how outputs are evaluated, and what limitations may affect interpretation.

By establishing common expectations for reporting generative AI in HEOR, ELEVATE-GenAI supports responsible innovation while aligning with the needs of journals, HTA bodies, and regulators.

 

Final takeaways

ELEVATE-GenAI positions itself as a foundational guideline for reporting the use of generative AI in HEOR workflows. By focusing on transparency, reproducibility, and interpretability, it helps ensure that AI-augmented research can be critically assessed and confidently used in healthcare decision-making.

As a living guideline, ELEVATE-GenAI will continue to evolve alongside advances in generative AI — providing the HEOR community with a practical framework for integrating new technologies without compromising rigor or trust.

 

Interested in learning more?

Read the full paper: “ELEVATE-GenAI: Reporting Guidelines for the Use of Large Language Models in Health Economics and Outcomes Research: An ISPOR Working Group Report.”

External Control Arms in Drug Development: Methodological and Regulatory Considerations

Drug development is growing more complex, with compressed timelines and increasingly high expectations from regulators, payers, and health systems. In this setting, external control arms (ECAs) leveraging real‑world data (RWD) are emerging as a pragmatic approach to support clinical development and downstream commercial decision‑making.

Randomized controlled trials (RCTs) remain the gold standard for evidence generation. However, in many modern development programs, traditional randomized designs are not feasible or may raise ethical concerns. Sponsors increasingly encounter situations in which:

  • Patient recruitment is slow, limited, or not achievable
  • Randomization is ethically challenging
  • Development costs escalate rapidly
  • Competitive dynamics demand accelerated evidence generation
  • Patient populations are small or rapidly progressing
  • There is a high unmet medical need

 

These challenges are particularly acute in oncology, rare diseases, post‑approval expansion studies, and advanced or cell‑based therapies.

 

What is an external control arm?

An external control arm replaces or supplements a traditional control group by leveraging data from patients treated outside the clinical trial. These patients are drawn from routine clinical practice and reflect outcomes under standard‑of‑care treatment in real‑world settings.

External controls are typically constructed using real‑world data sources such as:

  • Electronic health records (EHRs)
  • Administrative and insurance claims
  • Disease and treatment registries

Unlike trial data, real‑world data reflect patterns of diagnosis, treatment, and follow‑up in everyday clinical care. The foundation of a well‑designed external control study is the use of fit‑for‑purpose data that are sufficiently complete, clinically relevant, and reliable to support robust and defensible analyses.

 

Strategic value of external control arms

When thoughtfully designed and appropriately governed, ECAs can provide meaningful strategic benefits, including:

  • Shortened development timelines
  • Improved feasibility of clinical studies
  • Evidence generation in small or rare populations
  • Stronger value narratives for payers and health technology assessment bodies
  • Support for lifecycle management and label expansion strategies

 

Methodological considerations and risks to manage

The credibility and acceptability of an external control arm depend heavily on methodological rigor.

Key considerations include the following:

1. Study design

External control studies should be designed to closely mirror the clinical trial, including:

  • Alignment of inclusion and exclusion criteria
  • Clear definition of index date and baseline
  • Comparable follow‑up periods and outcome assessment windows
  • Consistent treatment context and line of therapy

Pre-specification of the estimand and statistical analysis plan is critical to avoid post‑hoc decision‑making.

 

2. Patient selection and alignment

Ensuring comparability between trial participants and real‑world patients is one of the most critical aspects of ECA design. Sponsors should:

  • Use transparent, reproducible cohort selection algorithms
  • Apply consistent definitions for key demographic and clinical variables
  • Assess overlap and positivity between trial and external populations
  • Explicitly evaluate differences in baseline characteristics

Sensitivity analyses should be conducted to quantify the impact of residual differences where appropriate.
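One common way to evaluate differences in baseline characteristics is the standardized mean difference (SMD). The sketch below uses hypothetical ages; an absolute SMD above roughly 0.1 is a frequently cited rule of thumb for meaningful imbalance, not a regulatory requirement.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(trial_values, external_values):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(trial_values) + variance(external_values)) / 2)
    return (mean(trial_values) - mean(external_values)) / pooled_sd


# Hypothetical baseline ages, not real patient data.
trial_age = [60, 62, 65, 58, 61]
external_age = [68, 70, 66, 72, 69]

smd = standardized_mean_difference(trial_age, external_age)
print(f"SMD for age: {smd:.2f}")
```

Here the external patients are clearly older than the trial participants, so age would need to be handled in the adjustment step before outcomes are compared.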

 

3. Handling confounding and bias

Because external control arms lack randomization, confounding must be actively addressed. Common analytical approaches include:

  • Propensity score methods (matching, weighting, stratification)
  • Multivariable outcome regression
  • Doubly robust methods that combine weighting and modeling

Method selection should be driven by study objectives, data characteristics, sample size, and variable completeness, not by analytical convenience.
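To make propensity score weighting concrete, here is a deliberately small sketch. It assumes a single categorical covariate (a hypothetical severity stratum), estimates the propensity of being in the trial as the observed trial share within each stratum, and reweights external patients by the odds e/(1-e) so that their stratum mix resembles the trial's. Real analyses would use many covariates and a fitted model; all records below are invented.

```python
from collections import Counter

# Hypothetical records: (source, severity stratum, responded 0/1). Not real data.
records = [
    ("trial", 0, 1), ("trial", 0, 1), ("trial", 0, 0), ("trial", 1, 0),
    ("external", 0, 1), ("external", 0, 1),
    ("external", 1, 0), ("external", 1, 0), ("external", 1, 1), ("external", 1, 0),
]


def stratum_propensity(rows):
    """Estimate P(in trial | stratum) as the observed trial share per stratum."""
    total = Counter(stratum for _, stratum, _ in rows)
    in_trial = Counter(stratum for source, stratum, _ in rows if source == "trial")
    return {s: in_trial[s] / total[s] for s in total}


def weighted_external_response(rows):
    """Response rate in external controls, reweighted to resemble the trial."""
    propensity = stratum_propensity(rows)
    if any(p == 1.0 for p in propensity.values()):
        # A stratum with trial patients but no external patients cannot be matched.
        raise ValueError("positivity violated: a trial stratum has no external patients")
    numerator = denominator = 0.0
    for source, stratum, outcome in rows:
        if source != "external":
            continue
        e = propensity[stratum]
        weight = e / (1.0 - e)  # odds weight; externals unlike any trial patient get 0
        numerator += weight * outcome
        denominator += weight
    return numerator / denominator


crude = sum(o for s, _, o in records if s == "external") / 6
print(f"crude external response: {crude:.3f}")
print(f"weighted external response: {weighted_external_response(records):.4f}")
```

The crude and weighted rates differ because the external cohort contains more patients from the stratum with poorer outcomes than the trial does, which is the kind of imbalance weighting is meant to correct.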

 

4. Data quality and missingness

Real‑world data are inherently heterogeneous and incomplete. Methodological plans should address:

  • Data provenance, completeness, and validation
  • Handling of missing or partially observed variables
  • Measurement variability across providers, systems, or data sources
  • Differences in assessment timing and frequency

Imputation strategies and key assumptions should be explicitly documented and tested through sensitivity analyses.

 

5. Outcome definition and assessment

Endpoints derived from RWD must be clinically meaningful and aligned as closely as possible with trial definitions. Considerations include:

  • Use of validated real‑world endpoint definitions
  • Clear attribution and timing of outcomes
  • Consistency with regulatory‑recognized measures of clinical benefit
  • Avoidance of surrogate endpoints unless scientifically justified

Outcome misclassification remains a key risk and should be explicitly evaluated.

 

6. Sensitivity and robustness analyses

Regulators expect evidence that findings are robust under alternative assumptions. Analyses may include:

  • Variation in matching or weighting specifications
  • Alternative cohort definitions or look‑back periods
  • Use of negative control outcomes or exposures
  • Quantitative bias analyses where feasible

The objective is to demonstrate that conclusions are not driven by a single design or modeling decision.
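As one example of a quantitative bias analysis, the E-value expresses how strong an unmeasured confounder would need to be, on the risk ratio scale, to fully explain away an observed association. It is offered here only as an illustration of the category; other bias analyses may suit a given study better.

```python
from math import sqrt


def e_value(risk_ratio):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)), with RR taken >= 1."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1.0 / risk_ratio  # protective effects are inverted
    return rr + sqrt(rr * (rr - 1.0))


# Hypothetical observed risk ratio of 2.0 for treatment versus external control.
print(f"E-value: {e_value(2.0):.2f}")
```

A larger E-value means that only a strong unmeasured confounder could account for the result, which supports the claim that conclusions are not fragile.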

 

7. Transparency and documentation

Methodological transparency is essential for regulatory and payer review. Best practices include:

  • Prespecifying analysis plans and decision rules
  • Fully documenting data sources, algorithms, and assumptions
  • Providing traceability from raw data to final outcomes
  • Enabling reproducibility of key analyses

 

Regulatory outlook and expectations

Regulatory agencies and health technology assessment bodies, including the U.S. Food and Drug Administration (FDA), the European Medicines Agency (EMA), and the Canadian Agency for Drugs and Technologies in Health (CADTH), have recognized the potential role of external control arms under conditions of methodological rigor and transparency.

Regulatory agencies have not lowered evidentiary standards. Rather, they have:

  • Provided greater clarity on scenarios in which external control arms may be acceptable
  • More explicitly articulated methodological expectations
  • Encouraged early and proactive dialogue with sponsors

 

Successful regulatory submissions that incorporate ECAs typically:

  • Provide a clear scientific and ethical rationale for why randomization is not feasible or appropriate
  • Use high‑quality, fit‑for‑purpose real‑world data sources
  • Transparently define patient selection criteria and demonstrate alignment with the trial population
  • Show that findings are robust, reproducible, and minimally biased

Early engagement with regulators remains critical to aligning expectations and maximizing the likelihood of success.

 

Join Anupama Vasudevan and James Matcham on February 3 at 10 a.m. ET for an open office hours session on “Evidence Generation with External Control Arms.”

From Regulators to Reimbursement: What the EMA-FDA AI Principles Mean for HEOR

In January 2026, the European Medicines Agency (EMA), together with the U.S. Food and Drug Administration (FDA), took an important step by publishing the “Guiding Principles of Good AI Practice in Drug Development.” This document is more than a technical checklist: it is a clear signal that regulators are getting serious about how artificial intelligence (AI) should be developed, validated, governed, and, ultimately, trusted across the medicines lifecycle.

While the principles are formally framed around drug development, their implications go well beyond non-clinical and clinical domains. For Health Economics and Outcomes Research (HEOR), this guidance offers something the field has long needed: a credible regulatory blueprint for responsible AI use that could help agencies move from cautious experimentation to structured adoption.

 

Why this matters now

AI is already being used across HEOR — whether for real-world evidence generation, economic modeling, patient segmentation, or long-term outcome prediction. Yet, despite methodological innovation, acceptance by HTA bodies and payers remains uneven. One of the key barriers is not capability, but confidence: confidence in transparency, robustness, reproducibility, and governance.

By articulating shared principles for AI use, the EMA and its partners are laying the groundwork for that confidence. Importantly, they are doing so in a way that aligns closely with the questions HTA agencies ask every day: What is this model for? What risks does it introduce? Can we trust the outputs? And how do we manage it over time?

 

A bridge to HEOR: Learning from regulatory leadership

We have already seen how regulatory clarity can accelerate adoption. The UK, for example, has actively explored how AI can be used to support evidence generation and decision-making in health systems. EMA-FDA’s principles create an opportunity to extend this momentum across Europe and beyond — including into HEOR and HTA decision frameworks.

Although all ten principles are relevant, four stand out as particularly transformative for HEOR.

 

Four principles with outsized impact on HEOR

1. Human-centric by design

This principle explicitly anchors AI development in ethical and human-centric values. For HEOR, this is critical. Economic models and real-world analyses directly influence access, reimbursement, and, ultimately, patient outcomes.

A human-centric approach reinforces that AI in HEOR should support, not replace, expert judgement. It legitimizes hybrid workflows where analysts, clinicians, patients, and decision-makers remain central, while AI enhances scale, speed, and insight. This framing directly addresses common HTA concerns about “black box” decision-making.

 

2. Risk-based approach

Not all AI use cases carry the same consequences, and this principle explicitly recognizes this. For HEOR, this principle is particularly powerful.

Using AI to automate literature screening does not pose the same risk as using it to inform long-term survival extrapolations or pricing decisions. A risk-based approach allows proportionate validation, governance, and oversight — making AI adoption more realistic and scalable for both developers and agencies.

This is precisely the kind of nuance HTA bodies need to move beyond binary “acceptable/not acceptable” positions on AI.

 

3. Risk-based performance assessment

Closely linked, the EMA and FDA emphasize that performance assessment should consider the complete system, including human-AI interaction, and be tailored to the intended context of use.

For HEOR, this reframes validation away from abstract accuracy metrics and toward decision relevance. The key question becomes: Is this AI fit-for-purpose for the policy or reimbursement decision it supports? This aligns naturally with HTA thinking and opens the door to more pragmatic, decision-focused validation frameworks.

 

4. Life cycle management

Perhaps the most underappreciated principle in HEOR today is life cycle management. The EMA highlights the need for ongoing monitoring, re-evaluation, and management of issues such as data drift.

HEOR models are often treated as static artefacts, yet AI-enabled models evolve as data, clinical practice, and populations change. Recognizing AI as a living system — not a one-off submission — could fundamentally change how HTA agencies think about post-submission evidence generation, managed entry agreements, and reassessment over time.

 

From drug development to HTA: An opportunity not to miss

This guidance is explicitly focused on drug development, but its principles are intentionally broad and collaborative. They invite extension, adaptation, and harmonization across jurisdictions and evidence domains.

For HEOR, this is an opportunity. By aligning AI methods with regulatory expectations early — rather than waiting for explicit HTA-specific rules — the field can help shape how agencies evaluate AI-enabled evidence. In doing so, HEOR can move from being a passive recipient of regulation to an active contributor to responsible AI adoption.

 

Looking ahead

AI will not replace HEOR expertise — but it will increasingly shape how evidence is generated, synthesized, and interpreted. These guiding principles offer a shared language to discuss trust, risk, and value. If agencies apply similar thinking to HEOR, we may finally see a path toward consistent, transparent, and confident use of AI in reimbursement and access decisions.

In that sense, this guidance is not just about AI in drug development. It is about preparing the entire evidence ecosystem — including HEOR — for a future where intelligent systems are used responsibly, transparently, and in service of better patient outcomes.

 

Interested in learning more?

Watch our recent webinar, “AI in HEOR: Case Studies on Navigating Regulatory and HTA Guidance,” on demand, featuring experts Dalia Dawoud, Manuel Cossio, Sheena Singh, and Cale Harrison.

The What, When, and Why of the Changes to NICE Methods: Is the Devil in the Details?

Following weeks of anticipation, NICE officially announced in December that the recently rumored increase of its standard cost effectiveness threshold will take effect beginning April 2026.

 

What’s changing and when?

The standard cost effectiveness threshold range that NICE committees use to judge whether a medicine is cost effective will increase from 20–30K GBP per QALY gained to 25–35K GBP per QALY gained, a rise of 25% at the lower bound and about 17% at the upper bound.
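The relative size of the increase differs between the two ends of the range, which a quick calculation makes explicit. The figures are the threshold bounds quoted above; the helper name is illustrative.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


lower = percent_change(20_000, 25_000)  # lower bound of the range, GBP per QALY
upper = percent_change(30_000, 35_000)  # upper bound of the range, GBP per QALY
print(f"lower bound: +{lower:.1f}%, upper bound: +{upper:.1f}%")
```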

NICE stated in its webinar on December 3, 2025, that the Department of Health and Social Care (DHSC) will consult on powers to direct NICE to enact this change starting April 2026, in a targeted change to regulation. This consultation opened on December 9, 2025, and will close on January 13, 2026.

NICE stressed that this targeted change will not mean any broader intervention from government ministers in its methods or decisions. It also confirmed that it is proposing to the government that the new threshold applies across all NICE guidance (Digital, HealthTech, Guidelines) and was awaiting further details. NICE also confirmed in the webinar that it was not aware of any proposals to change the thresholds used to evaluate Highly Specialized Technologies (HSTs) for ultra-rare diseases.

However, the first proposal in the DHSC consultation document refers explicitly to all NICE guidance:

“Do you agree or disagree that the power to direct NICE about the standard cost-effectiveness threshold should apply to all NICE guidance that makes recommendations on health spending? This includes technology appraisal and highly specialised technology evaluation recommendations.”

As part of the timeline announced by NICE (see figure), which is subject to consultation, NICE confirmed that in early 2026 it will consult on how this change will be implemented.

 

Anticipated timeline to implement the announced changes (Source: NICE webinar on December 3, 2025)

 

In addition to an increase of its cost effectiveness threshold, NICE also announced it will start using a new EQ-5D-5L UK value set that has been developed by asking 1,200 members of the public to judge different health states and is anticipated to be published in a peer-reviewed publication by March 2026. This change, however, will follow the standard approach to making modular updates to its methods including a public consultation on the proposed change before its full implementation.

NICE’s announcement came in parallel with an announcement from the UK government about the successful closure of a trade deal with the US that includes this change, alongside an agreement regarding the tariff that UK pharmaceutical manufacturers will pay when exporting medicines to the US.

 

Why these changes?

NICE’s methods changes are anticipated to reshape the market access environment in the UK and beyond. The US-UK trade deal, of which this threshold change is part, may convince pharma companies to maintain their presence in the UK and to keep the UK’s position in the launch sequence. Companies had previously threatened to pull out of the UK market under pressure from the newly announced US tariffs and policies such as the Most Favoured Nation (MFN) external reference pricing policy.

According to the UK government’s press release announcing NICE threshold changes:

“This is supported by confirmation that — thanks to strong UK support for innovation — the UK has secured mitigations under the US’ ‘Most Favoured Nation’ drug pricing initiative so that we will continue to ensure access to the latest treatments. This will encourage pharmaceutical companies from around the world to prioritise the UK for early launches of their new medicines, meaning British patients could be among the first globally to access breakthrough treatments.”

 

The anticipated impact

These NICE methods changes will have far reaching impact on the assessment of cost effectiveness of medicines in the UK, with likely spillover effects on other countries’ practices as well.

The higher WTP threshold expands headroom for treatments near previous ICER cut-offs, improving the feasibility of charging higher prices for innovative therapies. However, the unchanged discount rate limits the full advantage of this increase. This means more flexibility on price, but continued pressure on future value. It remains to be seen whether this increased threshold will also apply to other NICE guidelines apart from its technology appraisal (TA) program. What has been confirmed is that the threshold change will not lead to any reviews of completed appraisals.
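To show how the threshold translates into price headroom, the sketch below computes an incremental cost-effectiveness ratio (ICER) and the maximum incremental cost that stays within a given threshold. The incremental cost and QALY gain are hypothetical, and discounting is ignored for simplicity.

```python
def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qalys <= 0:
        raise ValueError("incremental QALYs must be positive for this sketch")
    return delta_cost / delta_qalys


def max_incremental_cost(threshold, delta_qalys):
    """Largest incremental cost that keeps the ICER at or below the threshold."""
    return threshold * delta_qalys


# Hypothetical treatment: 0.5 extra QALYs at 16,000 GBP extra cost.
DELTA_QALYS = 0.5
DELTA_COST = 16_000
OLD_UPPER, NEW_UPPER = 30_000, 35_000  # upper bounds of the old and new ranges

value = icer(DELTA_COST, DELTA_QALYS)
headroom = max_incremental_cost(NEW_UPPER, DELTA_QALYS) - max_incremental_cost(OLD_UPPER, DELTA_QALYS)
print(f"ICER: {value:,.0f} GBP per QALY")
print(f"within old upper bound: {value <= OLD_UPPER}, within new upper bound: {value <= NEW_UPPER}")
print(f"extra incremental cost allowed: {headroom:,.0f} GBP")
```

In this invented case the treatment sits just above the old upper bound and falls inside the new one, which is the kind of near-threshold product the change is most likely to affect.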

NICE’s adoption of the EQ-5D-5L UK value set will also reshape patient-reported outcomes strategy. Utilities derived from EQ-5D directly influence QALY calculations and ICERs. By reflecting more nuanced health states, EQ-5D-5L supports a more accurate calculation of QALYs. Trials that currently collect EQ-5D-3L data may need a new mapping function to align with the new set. Future trials should prioritize EQ-5D-5L and ensure high completion rates for PRO instruments, as missing data will become even more critical.
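As a rough illustration of how utilities feed into QALYs, the sketch below sums yearly utility values with discounting. The utility values, the 3.5% rate, and the convention of leaving the first year undiscounted are assumptions for this example, not values taken from NICE’s value set or methods.

```python
def discounted_qalys(yearly_utilities, discount_rate):
    """Sum one utility value per year, discounting year t by (1 + rate) ** t.

    Year 0 is left undiscounted, which is an assumption of this sketch.
    """
    return sum(u / (1 + discount_rate) ** t
               for t, u in enumerate(yearly_utilities))


# Hypothetical utilities on a 0-1 scale for three years of follow-up.
comparator_utilities = [0.70, 0.70, 0.70]
treatment_utilities = [0.80, 0.80, 0.80]
rate = 0.035  # assumed annual discount rate

qaly_gain = (discounted_qalys(treatment_utilities, rate)
             - discounted_qalys(comparator_utilities, rate))
print(f"Discounted QALY gain: {qaly_gain:.3f}")  # slightly below the undiscounted 0.3
```

Because the QALY gain is the denominator of the ICER, small shifts in utility values from a different value set can move a treatment across a threshold.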

From a patient perspective, this means their lived experience is better represented in HTA decisions. For pharma companies, it means interventions that improve pain, anxiety, and functional independence can show their full value in cost-effectiveness models.

 

Regional impact

It is not clear how Europe will respond to these changes on both sides of the Atlantic. What is clear is that action will be needed to limit their effect on both the attractiveness of European countries as launch markets and the prices pharma companies charge there, both of which are likely to affect patient access to innovative medicines.

Further, we could speculate that this change could bring prices in the UK closer to those in France and Germany. The UK has been able to achieve low prices for three reasons: the powerful negotiating position of the NHS, the single centralized payer for the majority of UK healthcare; its deeply embedded health technology appraisal processes through NICE, which acts as the gatekeeper for the reimbursement of drugs; and long-standing price-control mechanisms that effectively cap the NHS’s spend on innovative medicines, the most recent iteration of which is the Voluntary Scheme for Branded Medicines Pricing, Access and Growth (VPAG), with a fallback Statutory Scheme. The current VPAG scheme requires UK manufacturers to pay an effective clawback rate of 23.5% to the UK Government on “newer medicines” (22.9% clawback plus 0.6% investment program funding, excluding new active substances), far higher than comparators such as France (5.7%), Germany (7%), and Spain (7.5%).

 

Have you considered these and other impacts and is your team ready for these changes?

2025 in Perspective: Reflections From Our Newest Colleagues

Every year brings new faces, fresh ideas, and inspiring stories to Cytel. In 2025, these colleagues joined us from across the globe, each bringing unique experiences and ambitions. As the year closes, we asked them to share what stood out, what they’ve learned, and how they see their work shaping something bigger. Their reflections tell a story of connection, growth, and purpose.

 

Joining Cytel: Memorable moments and settling in

For Kasum de Souza Mateus (Senior Biostatistician, FSP) the most memorable part of joining Cytel was simple yet meaningful: “Being able to meet colleagues and mentors in person.” That feeling of connection resonated with many new Cytelians, from Adish Jindal (Senior Recruiter), who described the joy of reconnecting with familiar faces, to Luke Hilliard (Event Manager), who fondly recalls a team meeting: “I really enjoyed the trip to Bruges. It was such a pleasure meeting everyone in person. We came away with some fantastic ideas that we’ve since put into action for our events.”

Others found their defining moments and success in challenges that brought people together. Kanchan Kulkarni (Manager, Accounting) stepped into her role during a major system transition: “One of my most memorable experiences has been leading the Global GL Accounting function across EMEA, APAC, and NA regions during our Oracle ERP transition. It wasn’t just about systems and numbers — it was about connecting people, aligning processes, and building something stronger together.” And for Scott Rogers (CFO), the most powerful moment came during a Town Hall: “I was very moved by the presentation where we heard directly from a patient and understood how our work helped him realize the benefits he was seeing.”

For Macarena Pazos Maidana (Senior Market & Business Development Manager) success came early: “During my third week, I successfully secured a key renewal with a major pharmaceutical client for the East Horizon™ platform. This achievement not only boosted my confidence but also reinforced my belief in the value our solutions bring to the industry.” And Hannes Engberg Raeder (Principal Biostatistician, FSP) found pride in collaboration: “I’m proud of having been able to support one of our partnerships through process improvements that helped strengthen collaboration and overall efficiency.”

 

Leaning on advice

Of course, starting something new means leaning on advice from colleagues or mentors, and some words of wisdom stuck. Nicole Sheridan (Manager, Talent Management) shared the famous mantra that shaped her approach: “‘Do or do not, there is no try.’ It’s simple, but it completely changed how I think about my work and even life outside of work. I realized it’s not about being perfect but it’s about showing up, committing, and seeing things through. That mindset has really helped me take initiative, stay resilient, and turn ideas into results.”

Damian Kowalski (Principal Statistical Programmer, FSP) emphasized collaboration: “Don’t be afraid to ask questions. Collaboration is our strength.” And Sydney Jenkins (Senior Employee Relations & Engagement Partner) shared a perspective that guides her work: “Trust your logic. That perspective reminds me to approach challenges with a clear, rational mindset, even under pressure!”

 

Growth and ambition

For our new Cytelians, however, this year was not only about settling into their roles. It was also a year of growth and achievement. Adish honed his global recruitment expertise: “One skill I’m particularly proud of developing in 2025 is my ability to manage global recruitment processes more effectively.” Monica Chaudhari (Associate Director, Biostatistics, FSP) shared a technical milestone: “My first study that I got assigned to was already closed. To help myself support the team through database lock, review of final outputs and drafting of the CSR, I created a swimmers plot summarizing all important endpoints on each subject’s trajectory that helped identify major deviations.”

Valeria Duque Mora (Project Coordinator, Resource Management) reflected on teamwork: “My current team has made a real difference in my daily work. They are the foundation of our success, always supporting each other and sharing new information with kindness and collaboration throughout every process.” For Dominika Wisniewska (Senior Statistical Programmer, FSP), the impact was deeply personal: “I am grateful that Cytel gave me the opportunity to work directly for our client where I work on research within rare diseases and neurology diseases. I am particularly interested in neuro because of personal reasons, and I am happy to participate in maybe discovering new treatments.” And Sankhyajit Sengupta (Senior Statistical Programmer, FSP) embraced learning: “In this very short period of time (three months), I’ve had the opportunity to gain exposure to R programming in live studies and also completed required trainings on R, an important step as the industry is moving in this direction.”

Looking ahead, our new colleagues are already thinking about how to make an even bigger impact in 2026. Kanchan hopes to drive automation and efficiency, Luke dreams of organizing a standalone event, and Ye Miao (Associate Director, Biostatistics, FSP) plans to deepen expertise in R programming to contribute more effectively to data analysis and reporting tasks in his FSP role. Sydney aims to strengthen policy awareness and consistency across the organization, while Macarena is focused on enhancing client retention and satisfaction. Each goal reflects a commitment to making an even bigger impact in year two.

 

Connecting to the bigger picture

Every role at Cytel connects to our mission of improving patient lives. Adish summed it up well: “As a Global Senior Recruiter, I help bring in the talent that powers our mission. Every great hire strengthens our culture, drives innovation, and helps the company achieve its goals globally.” Wyatt Gotbetter (Senior Vice President, Global Head Evidence, Value and Access) described the EVA team’s role: “I like to describe the work of EVA as the essential ‘last mile’ in our client’s drug development journey — after decades of scientific discovery, animal and human trials, and regulatory approvals, we play a vital role in helping ensure patients get access to needed therapies.” And Damian reminded us of the impact behind the data: “Every dataset we program and validate helps ensure reliable insights for clinical trials. It’s amazing to know that our work plays a role in bringing life-saving therapies to patients worldwide.”

 

The voices of our newest colleagues remind us that Cytel is more than a workplace. It’s a community driven by purpose, collaboration, and innovation. Here’s to their continued success and to another year of making a difference together.

Women’s Health Is Society’s Wealth: Unlocking Economic Value When Bridging the Gender Health Gap

The facts are loud and clear: investing in women’s health could unlock $1 trillion in global gross domestic product annually by 2040, prevent 24 million female life years lost to disability, and yield exponential returns to the economy for every investment across obstetrics and gynecology, female and maternal health, immunology, neurology, cardiology, and oncology.

 

Improving global health equity has been increasingly recognized as a strategic priority for different stakeholders in healthcare,1 including policymakers, industry, governments, investors, and global health organizations. Beyond an ethical and human rights imperative, reducing health inequities and ensuring that everyone has a fair opportunity to achieve their full health potential — independent of socioeconomic status, sex, gender, geography, or race/ethnicity — leads to economic and societal benefits and resilient healthcare systems.2 Although progress continues to be made toward improving general health outcomes, Cytel researchers have found that this has not translated equally for men and women.3

 

The gender health gap: A global health crisis

Women’s health inequities, and the potential economic impact of closing this health gap, remain largely unnoticed. The gender health gap (the long-standing, unfair differences in health outcomes between women and men) has only recently been recognized as a medical and healthcare issue. The underinvestment in female health research, the absence of systematic data collection to understand and document the unique biological needs of females and assess disparities, and biases in male-dominant clinical trial programs have all contributed to the neglect of women’s health issues. The survival paradox is documented: women outlive men but experience poorer general health, including mental health. Recent data showing that women live approximately five years longer than men do not adequately capture the fact that women spend more than one-quarter of their lives in poor health.4, 5 This health gap is a global health crisis that affects women of all ages to varying extents depending on geography and income levels.4

The gender health gap is generally defined by the conditions that affect women uniquely, differently, or disproportionately, and these are not limited to conditions related to sexual and reproductive health.4 For example, women in the general population are at significantly greater risk of mental health disorders (e.g., depression, suicide) than men, and women with type 2 diabetes mellitus have a disproportionately higher risk of adverse cardiac events, including mortality.6 Men, on the other hand, are significantly more likely to have adverse events after specific types of surgeries and higher mortality after COVID-19.6 Despite the fact that cardiovascular disease is the top cause of death for women in the US, males outnumber females two to one in related clinical trials.7

 

Quantifying the economic benefit of closing the women’s health gap

Quantifying the economic benefit of closing the women’s health gap for societies and economies is important for several reasons, and it makes visible the “invisible” topic of women and their health. When a measurable economic gain (such as productivity gains, increased workforce participation, or reduced healthcare spending) is attached to closing the gap, policymakers and investors can grasp the tangible impact on global economies and growth. As financial pressures restrain healthcare spending, areas where interventions yield the greatest returns to economies, such as women’s health, may move higher on the list of investment priorities.7 Therefore, systematic efforts to quantify the economic value of bridging the gender health gap will help reframe health equity as a driver of inclusive and sustainable growth, make it a strategic imperative for governments and businesses, and overturn the negligible investment in women’s health (only 5% in 2020).8

 

Our findings: The value of investment to improve women’s health

We conducted a comprehensive literature review to systematically investigate and summarize quantitative evidence on the economic impact of investments to close the women’s health gap globally. We identified robust evidence that investment in improving women’s health generates a return in the form of higher value to economies.

A recent report jointly published by the World Economic Forum and the McKinsey Health Institute, for example, highlights that investments in addressing the women’s health gap could not only extend life years and healthy life years, but also have the potential to boost the global economy by $1 trillion annually by 2040.4 These findings were supported by an additional impact analysis conducted by Women’s Health Access Matters across three indications: rheumatoid arthritis, coronary artery disease, and Alzheimer’s disease. Key findings indicated that an investment of $300 million in women’s health research across these three diseases would conservatively result in a $13 billion return to the US economy.7

Over the past 70 years, the influx of women into the workforce has been closely linked to economic growth.4 Since nearly half of the health burden affects women in their working years, this can have serious consequences for the income-earning potential of women, causing a ripple effect on society.4 Economic benefits in the same direction were also documented by simulation studies in other countries such as the United Kingdom whereas limited data were identified for low- and middle-income countries.

We are committed to standing at the forefront of assessing public policy trends and critical policy matters, highlighting emerging challenges and seizing opportunities for improving public health. Some examples include our environmental scan of publicly available data repositories to address disparities in healthcare decision-making,9 an umbrella review of the impacts of climate change on maternal health and birth outcomes,10 and blueprints for collective action to close the women’s health gap.

 

 

Interested in learning more?

Grammati Sarri, Lilia Leisle, and Jeffrey M. Muir will be at the upcoming ISPOR Europe conference in Glasgow, Scotland, where they will present “The Economic Case for Gender Equity: How Closing the Women’s Health Gap Benefits Healthcare Systems and Economies” on Wednesday, November 12, 2025, from 9 to 11:30 a.m. Register below to book a meeting or visit us at Booth #1024 to connect with our experts:

Generative AI in Evidence Synthesis: Harnessing Potential with Responsibility

The integration of AI into the healthcare research landscape is accelerating, with one obvious area of application being evidence synthesis. From early scoping reviews to comprehensive systematic literature reviews (SLRs), AI promises to reduce manual burden and save time. However, it is crucial to understand both the strengths and limitations of using AI in this broad context to ensure compliance, reliability, and scientific rigor.

 

Knowing where it works: A targeted approach

Artificial intelligence, including generative AI models, shines when used for targeted literature reviews (TLRs) or when generating summaries of scientific articles to support evidence-based decision-making at an early development stage. AI can synthesize large volumes of information quickly, offering valuable insights during exploratory or early-phase research.

However, it’s critical to distinguish these from regulatory-facing systematic literature reviews, especially those intended for payer or health technology assessment (HTA) submissions. In this context, SLR extractions have traditionally been completed by two independent human reviewers. This human oversight ensures objectivity and reproducibility, key elements of regulatory compliance.

 

Expertly trained models vs. generalist giants

The current landscape is filled with large generalist language models trained on diverse internet-scale data. While impressive, these models often exhibit hallucinations — the generation of plausible but incorrect or fabricated content — particularly in domain-specific applications like evidence synthesis.

This is why domain-trained expert models are preferred. These models are fine-tuned on biomedical and scientific corpora, ensuring higher reliability and reducing the risk of misinterpretation or erroneous conclusions. They understand field-specific terminology, data structures, and compliance requirements far better than their generalist counterparts.

 

The imperative of data traceability

In evidence synthesis, transparency is non-negotiable. Any AI-generated output must allow users to:

  • Highlight the exact source (i.e., sentence or section) of the original scientific article from which a conclusion or data point was extracted.
  • Compare the model’s interpretation with the source text to identify discrepancies or nuances that could affect meaning or validity.

Using structured tags to annotate key terms, qualifiers, and relationships can not only make these comparisons clearer and more systematic but also inform advanced search and retrieval activities. By surfacing subtle differences, tagging supports expert review, preserves contextual integrity, and strengthens the reliability and defensibility of the synthesized evidence.
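One way to picture traceability is a record that stores each extracted data point together with the verbatim text that supports it, so a reviewer can jump to the source or spot an unsupported claim. The sketch below is a minimal illustration; the `ExtractedFact` class, its fields, and the example article text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ExtractedFact:
    name: str    # what was extracted, e.g. "sample_size"
    value: str   # the extracted value as reported by the model
    quote: str   # verbatim sentence the model claims as its source


def locate(fact: ExtractedFact, article_text: str):
    """Return (start, end) character offsets of the quote, or None if it is absent."""
    start = article_text.find(fact.quote)
    if start == -1:
        return None
    return start, start + len(fact.quote)


# Hypothetical article text and model outputs.
article = "A total of 120 patients were enrolled. Median follow-up was 14 months."
supported = ExtractedFact("sample_size", "120",
                          "A total of 120 patients were enrolled.")
unsupported = ExtractedFact("follow_up", "24 months",
                            "Median follow-up was 24 months.")

for fact in (supported, unsupported):
    span = locate(fact, article)
    status = f"found at {span}" if span else "NOT FOUND, flag for human review"
    print(f"{fact.name} = {fact.value}: {status}")
```

An exact string match is a deliberately strict check. A production tool would likely need fuzzier matching, but a failed exact match is still a cheap signal that an extraction deserves a second look.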

 

Measuring what matters: Precision and beyond

Traditional evaluation metrics like precision, recall, and F1 score (the harmonic mean of precision and recall) remain foundational when assessing AI model performance in literature screening and data extraction.
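For readers less familiar with these metrics, here is a small sketch that computes them for a screening task, where each record is either included or excluded. The labels are made up for illustration.

```python
def screening_metrics(predicted, actual):
    """Return (precision, recall, f1) for boolean include/exclude decisions."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical decisions: True means "include".
model_decisions = [True, True, True, False, False, True]
human_decisions = [True, False, True, True, False, False]

precision, recall, f1 = screening_metrics(model_decisions, human_decisions)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In literature screening, recall usually deserves the most attention, because a missed relevant study is harder to recover from than an extra record sent to a human reviewer.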

But in generative contexts — where the task may be summarization, paraphrasing, or abstract reasoning — additional measures become valuable:

  • Answer correctness: Does the output convey a factual, verifiable point?
  • Semantic similarity: How closely does the AI output align in meaning with the ground truth?
  • BLEU, ROUGE, and BERTScore: These Natural Language Processing metrics offer quantitative insights into the quality of generated text, especially for summarization and content generation tasks.

Selecting the right mix of these metrics provides a comprehensive view of model performance and reliability.
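To show what an overlap-based text metric actually measures, the sketch below computes a simplified unigram-overlap F1 in the spirit of ROUGE-1. It is not the official ROUGE implementation (there is no stemming or special tokenization), and the sentences are invented.

```python
from collections import Counter


def unigram_overlap_f1(candidate: str, reference: str) -> float:
    """F1 over shared word counts between a candidate and a reference text."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared words
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "the drug reduced mortality by ten percent"
candidate = "the drug reduced mortality"

score = unigram_overlap_f1(candidate, reference)
print(f"unigram overlap F1: {score:.3f}")
```

The candidate scores fairly well even though it drops the effect size, which is why overlap metrics are best paired with a check on answer correctness.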

 

Where AI makes a difference: Screening and beyond

One of the most promising applications of generative AI in evidence synthesis is in literature screening, or the ability to assess whether a publication (abstract or full text) meets the criteria for inclusion. Studies and pilot implementations suggest that AI can reduce screening time by up to 40%, making it a powerful ally for research teams.

AI tools have been used to assign a probability of inclusion to a title, abstract, or full text, both to guide the screening process and to let researchers quickly understand how modifying search strategies affects yield. By automating this repetitive and time-consuming phase, organizations can reallocate expert human resources to higher-value tasks, such as:

  • Resolving ambiguous or context-dependent data extractions
  • Validating nuanced findings and offering insights into implications of these findings
  • Ensuring alignment with HTA submission standards

In this way, AI doesn’t replace human reviewers but augments them, driving efficiency without compromising accuracy.
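As a simple illustration of probability-guided screening, the sketch below orders records by a model’s inclusion probability and flags an uncertain middle band for closer human attention. The record IDs, scores, and the 0.2 and 0.8 cut-offs are arbitrary assumptions, not recommended settings.

```python
def prioritize(scores, low=0.2, high=0.8):
    """Rank record IDs by inclusion probability and list the uncertain ones.

    scores maps a record ID to a model-assigned probability of inclusion.
    Records with low <= probability <= high are treated as uncertain.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    order = [record_id for record_id, _ in ranked]
    uncertain = [record_id for record_id, p in ranked if low <= p <= high]
    return order, uncertain


# Hypothetical model outputs for four abstracts.
scores = {"rec-1": 0.95, "rec-2": 0.10, "rec-3": 0.55, "rec-4": 0.30}

order, uncertain = prioritize(scores)
print("Review order:", order)
print("Needs closer human review:", uncertain)
```

Nothing is excluded automatically here. The model only changes the order in which reviewers see records and where they focus their effort.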

 

AI with guardrails

Generative AI is reshaping the landscape of evidence synthesis, but its integration must be strategic, measured, and compliant. By combining domain-trained models, robust traceability, appropriate evaluation metrics, and human oversight, organizations can unlock the true value of AI — accelerating workflows without sacrificing quality or compliance.

When used thoughtfully, generative AI becomes more than just a tool — it becomes a partner in advancing scientific research.

 

Meet with us at ISPOR 2025!

Manuel Cossio and Nathalie Horowicz-Mehler will be in Glasgow for ISPOR Europe 2025! Click the link below to book a meeting, or stop by Booth #1024 to connect with our experts:

Breaking Barriers in Rare Disease Research with Generative AI and Synthetic Data

In healthcare innovation, one of the most pressing challenges lies in rare disease research. There are approximately 7,000 rare diseases affecting over 300 million people worldwide. Because any single rare disease may have only a handful of patients dispersed globally, gathering sufficient data to power robust clinical studies or predictive models is a monumental hurdle. However, a solution is emerging at the intersection of generative AI and real-world data (RWD): a novel approach with the potential to reshape possibilities and unlock insights to address unmet medical needs in rare diseases.

 

The rare disease data dilemma

In the U.S., rare diseases are defined as conditions affecting fewer than 200,000 people. Despite their low individual prevalence, rare diseases collectively impose a significant burden on both patients and healthcare systems.

Research and development in rare diseases often face a vicious cycle: low prevalence leads to data scarcity, and data scarcity in turn makes it harder to run the studies that would generate more data. Traditional clinical trials are often infeasible and/or statistically underpowered due to the limited pool of participants.

Meanwhile, RWD sources such as electronic health records (EHRs), insurance claims, registries, and patient-reported outcomes offer valuable, albeit messy and fragmented, glimpses into the patient journey. Yet even RWD struggles to paint a complete picture in rare diseases. This is where generative AI steps in.

 

Enter generative AI: Making data where there is none

Generative AI — especially models like Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and, more recently, large foundation models — has a transformative ability: it can learn patterns from limited datasets and generate synthetic yet realistic datasets.

How it works

  1. Learning from RWD: Even small datasets from rare disease patients can be used to train and fine-tune generative models. These models identify patterns, distributions, and time-dependent relationships present in the data.
  2. Synthesizing patients: Once trained, the model can create new, synthetic patient records that preserve the statistical properties and characteristics of the original data. These “digital patients” simulate disease progression, treatment responses, and comorbidities.
  3. Validating realism: Synthetic data must be validated to ensure it reflects the real-world data it was trained on. Techniques like distributional comparison, propensity scoring, and expert validation are used to ensure accuracy and utility.
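The three steps above can be sketched in a few lines. To keep the example self-contained, a simple bivariate normal model stands in for a GAN or VAE, so this illustrates the workflow rather than any particular generative architecture. The patient values are invented, and the standardized mean difference is just one basic form of distributional comparison.

```python
import math
import random
import statistics


def fit_bivariate_normal(rows):
    """Step 1: learn means, standard deviations, and correlation from real rows."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return mx, my, sx, sy, cov / (sx * sy)


def sample_synthetic(params, n, rng):
    """Step 2: draw synthetic rows that preserve the fitted correlation."""
    mx, my, sx, sy, rho = params
    residual = math.sqrt(max(0.0, 1 - rho ** 2))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append((mx + sx * z1, my + sy * (rho * z1 + residual * z2)))
    return rows


def standardized_mean_difference(real, synth):
    """Step 3: compare one variable across real and synthetic data."""
    pooled = math.sqrt((statistics.variance(real) + statistics.variance(synth)) / 2)
    return abs(statistics.fmean(real) - statistics.fmean(synth)) / pooled


# Hypothetical real cohort: (age in years, biomarker level).
real = [(34, 1.2), (41, 1.9), (29, 1.0), (52, 2.6),
        (47, 2.1), (38, 1.7), (45, 2.4), (31, 1.1)]

params = fit_bivariate_normal(real)
synthetic = sample_synthetic(params, 2000, random.Random(0))

smd_age = standardized_mean_difference([r[0] for r in real],
                                       [s[0] for s in synthetic])
smd_marker = standardized_mean_difference([r[1] for r in real],
                                          [s[1] for s in synthetic])
print(f"SMD age: {smd_age:.3f}, SMD biomarker: {smd_marker:.3f}")
```

A small standardized mean difference only shows that averages line up. Real validation would also look at full distributions, relationships between variables, and privacy risk.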

 

Why synthetic data matters for rare diseases

Synthetic data can enhance rare disease clinical research in many ways, including:

 

1. Augmenting small cohorts

Synthetic data can boost sample sizes for rare disease studies, enabling:

  • Simulation of clinical trials
  • Development of more robust predictive models
  • Generation of synthetic control arms where traditional controls are ethically or logistically impractical

 

2. Enhancing privacy

In rare diseases, the risk of patient re-identification is elevated because of unique phenotypes or genetic markers. Synthetic data helps protect patient privacy while preserving the utility of the data.

 

3. Facilitating global collaboration

Because synthetic data is deidentified, it facilitates data sharing across researchers, institutions, and borders, minimizing regulatory hurdles and fostering collaborative discovery.

 

4. Accelerating drug development

Pharma and biotech companies can use synthetic data to:

  • Test drug targeting strategies
  • Model long-term outcomes
  • Conduct in silico trials in the earliest stages of development

 

Challenges and considerations

While promising, this approach is not without its challenges:

  • Bias amplification: Synthetic data reflects the biases of its training data. If the RWD is incomplete or skewed, so will the synthetic outputs be. Strategies to handle bias are essential.
  • Regulatory acceptance: Regulatory bodies are still evaluating how to incorporate synthetic data into approval pathways.
  • Validation standards: There is a need for consistent benchmarks and best practices for validating synthetic data, in terms of both privacy and utility, and for broader generative AI applications in healthcare.

 

Looking ahead

The marriage of generative AI and RWD opens new doors for rare disease research. With the ability to synthesize patient data that preserves real-world complexity, we can begin to break free from the constraints of scarcity — generating insights, hypotheses, and interventions that were once out of reach.

As we move forward, interdisciplinary collaboration among clinicians, data scientists, regulatory bodies, and patient advocacy groups will be key to harnessing this potential ethically and effectively.

 

Interested in learning more?

Download our complimentary ebook, Rare Disease Clinical Trials: Design Strategies and Regulatory Considerations:

External Control Arms: A Powerful Tool for Oncology and Rare Disease Research

In clinical research, the randomized controlled trial (RCT) has been considered the gold standard. Yet in many areas — especially in oncology and rare diseases — running an RCT with a balanced control arm is not always possible. Patients, physicians, and regulators often face a difficult reality: how do we evaluate promising new therapies when traditional designs aren’t feasible?

This is where external control arms (ECAs) come into play. By carefully drawing on existing data sources and applying rigorous methodology, ECAs can help provide the context and comparative evidence needed to make better decisions.

Here, we will explore why ECAs are particularly valuable in oncology and rare diseases, how they support decision-making and study design, what data sources they can rely on, and which statistical methods are essential to reduce bias. We will also introduce the concept of quantitative bias analysis and conclude with why experienced statisticians are key to the success of this methodology.

 

Why external control arms matter in oncology and rare diseases

Oncology and rare disease research share several challenges that make traditional RCTs difficult:

  • Small patient populations: In rare diseases, the number of eligible patients is often extremely limited. Asking half of them to enroll in a control arm may make recruitment impossible.
  • High unmet need: In oncology, patients and families are eager for new options. Many consider it unacceptable to randomize patients to placebo or outdated standards of care.
  • Ethical constraints: For life-threatening conditions, denying patients access to an experimental therapy can be ethically challenging.
  • Rapidly changing standards of care: In oncology, new treatments are approved frequently. A control arm that was relevant when a trial began may become outdated by the time results are available.

In such contexts, single-arm studies (where all patients receive the experimental therapy) are common. But single-arm results alone are not sufficient. Without a comparator, how do we know if the observed survival or response rate truly reflects an advance? ECAs provide the missing context.

Even when a trial includes a control arm, unbalanced designs — such as smaller control groups or cross-over to experimental treatment — can limit the ability to make clean comparisons. External controls can augment these designs, helping to stabilize estimates and provide reassurance that results are robust.

 

Supporting internal and regulatory decision-making

ECAs serve multiple purposes:

  1. Internal decision-making:
    • Companies developing new therapies must decide whether to advance to the next trial phase, expand into new indications, or pursue partnerships.
    • ECAs help answer questions like: Is the observed benefit large enough compared to historical data? Do safety signals look acceptable in context?
  2. Regulatory decision-making:
    • Regulatory agencies such as FDA and EMA increasingly accept ECAs as part of submissions, especially in rare diseases and oncology.
    • While not a replacement for RCTs, ECAs can strengthen the evidence package and demonstrate comparative effectiveness in situations where randomization is not feasible.
  3. Helping the medical community:
    • Physicians, payers, and patients need to interpret trial results. A median overall survival of 18 months in a single-arm study may sound promising, but how does it compare to similar patients receiving standard of care?
    • ECAs help put numbers into perspective, allowing the community to better understand the true value of a new therapy.

 

Designing better studies with ECAs

External controls are not only a tool for analyzing results — they can also improve study design.

  • Feasibility assessments: By examining real-world data or prior trial results, sponsors can estimate expected event rates, patient characteristics, and recruitment timelines. This reduces the risk of under- or over-powered studies.
  • Endpoint selection: Understanding how endpoints behave in historical or real-world settings helps refine choices for the trial, ensuring relevance to both regulators and clinicians.
  • Eligibility criteria: RWD and earlier trial data can reveal which inclusion/exclusion criteria are overly restrictive. Adjusting them can broaden access while maintaining scientific rigor.
  • Sample size planning: By leveraging ECAs, trialists may reduce the number of patients required for an internal control arm, easing recruitment in small populations.

In other words, ECAs can shape trials from the start, rather than being seen only as a “rescue” option after the fact.

 

Sources of external control data

An ECA is only as good as the data it relies on. Broadly, there are three main sources:

  1. Other clinical trials:
    • Prior trials of standard of care treatments can serve as external comparators.
    • Individual patient-level data (IPD) is preferred, but often only summary data is available.
    • These data are typically high quality but may not perfectly match the new study population.
  2. Published studies:
    • Systematic reviews and meta-analyses of the literature can provide comparator data.
    • Useful when IPD is unavailable but limited by reporting standards and heterogeneity across studies.
  3. Real-world data (RWD):
    • Sources include electronic health records, registries, and insurance claims databases.
    • These capture routine clinical practice, reflecting the diversity of real patients.
    • However, RWD often suffers from missing data, variable quality, and lack of standardized endpoints.

Each source has strengths and weaknesses. Often, the best approach is to triangulate across multiple sources, ensuring that conclusions do not rest on a single dataset.

 

The value of earlier clinical trials

Earlier-phase trials (Phase I and II) can be particularly valuable in constructing ECAs. These studies can include control arms, and they typically come with detailed eligibility criteria and well-captured endpoints.

For rare diseases and oncology, earlier trials may be the only available benchmark. By carefully aligning populations and endpoints, statisticians can extract maximum value from these datasets.

The challenge, of course, is ensuring comparability. Patient populations may differ in prognostic factors, supportive care practices may evolve, and definitions of endpoints may shift over time.

This is where advanced statistical methods become essential.

 

Reducing bias with propensity scoring

One of the key criticisms of ECAs is the risk of bias. Without randomization, patients receiving the experimental therapy may differ systematically from those in the external control.

Propensity score methods are a powerful way to reduce this bias. The idea is simple:

  • For each patient, estimate the probability (the “propensity”) of receiving the experimental treatment based on baseline characteristics.
  • Match or weight patients in the external control group so that their distribution of covariates mirrors that of the trial patients.

This approach creates a “pseudo-randomized” comparison in which the measured variables are balanced between groups. It cannot address unmeasured confounding, but it makes the comparison fairer with respect to every baseline characteristic that was recorded.
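The two steps above can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not a recommended analysis pipeline: the data are synthetic, there is a single standardized baseline covariate, the propensity model is a plain logistic regression fitted by gradient descent, and external controls are weighted by e / (1 - e) so that they resemble the trial population. The helper names `fit_logistic`, `propensity`, and `weighted_mean` are hypothetical.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit a logistic regression (intercept + coefficients) by gradient descent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def propensity(w, xi):
    """Estimated probability of being a trial (experimental) patient."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def weighted_mean(values, weights):
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)


random.seed(0)
# Synthetic, illustrative data only: one standardized covariate (say, age).
trial = [random.gauss(-0.5, 1.0) for _ in range(200)]    # trial patients
external = [random.gauss(0.5, 1.0) for _ in range(200)]  # external controls

# Step 1: estimate each patient's propensity from the baseline covariate.
X = [[a] for a in trial + external]
y = [1] * len(trial) + [0] * len(external)
w = fit_logistic(X, y)

# Step 2: weight external controls by e / (1 - e); trial patients keep weight 1.
ext_weights = []
for a in external:
    e = propensity(w, [a])
    ext_weights.append(e / (1.0 - e))

trial_mean = sum(trial) / len(trial)
raw_gap = sum(external) / len(external) - trial_mean
weighted_gap = weighted_mean(external, ext_weights) - trial_mean
print(f"covariate gap before weighting: {raw_gap:.2f}")
print(f"covariate gap after weighting:  {weighted_gap:.2f}")
```

Comparing the covariate gap before and after weighting is a basic balance check. In practice this is done for every covariate in the model, usually on a standardized scale, before any outcome is compared.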

 

Quantitative bias analysis: Addressing the unmeasured

Even with careful propensity scoring, unmeasured confounding remains a concern. Clinical researchers often ask: What if there are factors we didn’t account for?

This is where quantitative bias analysis (QBA) enters. QBA does not eliminate bias but helps us understand its potential impact.

For example:

  • Analysts can model how strong an unmeasured confounder would need to be to explain away the observed treatment effect.
  • Sensitivity analyses can simulate scenarios with different assumptions about unmeasured variables.

By explicitly quantifying uncertainty, QBA provides transparency. Regulators and clinicians gain confidence that conclusions are robust — or at least, that limitations are clearly understood.
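As one concrete illustration of the first bullet above, the sketch below computes an E-value, a widely used QBA summary introduced by VanderWeele and Ding. The article does not prescribe this particular measure, so treat it as an assumed example. It takes an observed risk ratio and returns the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the observed effect. The input values are hypothetical.

```python
from math import sqrt


def e_value(rr):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)).

    Protective effects (RR < 1) are inverted first so the formula applies.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + sqrt(rr * (rr - 1))


# Hypothetical observed risk ratios, for illustration only.
for observed_rr in (0.5, 0.8, 1.5, 2.0):
    print(f"observed RR {observed_rr:.2f} -> E-value {e_value(observed_rr):.2f}")
```

A large E-value means only a very strong unmeasured confounder could account for the result, while a value close to 1 signals that even weak confounding could do so. The same calculation is often repeated for the confidence limit closest to the null.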

 

The need for experienced statisticians

Constructing an ECA is not a “plug-and-play” exercise. It requires expertise across multiple domains:

  • Data curation: Selecting fit-for-purpose datasets, cleaning and harmonizing variables, and aligning endpoints.
  • Study design: Defining eligibility, follow-up time, and analysis plans that minimize bias.
  • Statistical methodology: Applying techniques like propensity scoring, inverse probability weighting, Bayesian borrowing, and QBA.
  • Regulatory communication: Explaining assumptions, limitations, and sensitivity analyses in language that regulators and clinicians can understand.

In short, ECAs demand both technical skill and strategic judgment. Partnering with experienced statisticians ensures that external controls provide credible, decision-grade evidence rather than misleading comparisons.

 

Final takeaways

External control arms are rapidly becoming an indispensable tool in modern clinical research — especially in oncology and rare diseases, where traditional RCTs often fall short.

They offer:

  • Context for single-arm studies and unbalanced designs.
  • Support for both internal and regulatory decisions.
  • Guidance in study design and feasibility planning.

By leveraging diverse data sources — from earlier trials to real-world evidence — and applying rigorous methods such as propensity scoring and quantitative bias analysis, ECAs can bring clarity and credibility to difficult development programs.

But the value of ECAs depends on how well they are planned and implemented. Done poorly, they risk misleading decisions. Done well, they empower researchers, regulators, and clinicians to make better choices for patients.

As the field evolves, one thing is clear: the expertise of skilled statisticians is the cornerstone of successful ECAs.

 

Interested in learning more?

Join Alexander Schacht, Steven Ting, and Vahe Asvatourian for their upcoming webinar, “Beyond the Standard Clinical Trial in Early Development: When and Why to Consider External Controls” on Thursday, October 16 at 10 a.m. ET: