Collective Leadership Models Emerging in the Life Sciences
Collective Leadership at PHUSE APAC Connect and Beyond
In clinical research, structure defines much of how we operate. We work within protocols, regulatory frameworks, statistical hierarchies, and governance models. Accountability is clear. Escalation paths are defined. Ownership is documented. Structure gives us control, but participation gives us adaptability. And in today’s life sciences environment, adaptability matters more than ever.
As PHUSE APAC Connect kicks off its inaugural edition, what stands out is not simply its expansion into a new geography, but rather the way it is being built. It is being shaped collectively by stream leaders, contributors, presenters, and sponsors who have chosen to engage because they care about advancing clinical data and analytics.
PHUSE APAC Connect reflects something larger than an event. It reflects a shift in how influence works in our industry: influence is becoming more distributed, and leadership must evolve accordingly.
The community convenes not because it is instructed to, but because its members understand that progress in complex systems is co-created. Leadership through participation is no longer an abstract idea. It is how real progress happens.
Complexity has changed the rules
Over the past decade, the life sciences landscape has changed in meaningful ways:
- Clinical programs span continents
- Data volumes have expanded dramatically
- Regulatory expectations continue to evolve
- Digital transformation is no longer a roadmap; it is daily reality
We now operate within interconnected ecosystems rather than isolated silos. A trial design decision in one region influences submission strategy in another. An analytics innovation within one capability center can reshape processes globally. In such a system, centralized control has limits — contribution does not.
Participation is not symbolic; it has practical impact:
- It shortens decision cycles
- It enables faster knowledge sharing
- It strengthens collective memory
- It reduces vulnerability when complexity increases
Alignment in environments like ours cannot simply be mandated. It must be built. As Peter Drucker once said, “The best way to predict the future is to create it.” In our field, that creation happens through consistent collaboration. It happens when experienced professionals step forward, share openly, and help others navigate complexity.
Working together turns expertise into progress.
A parallel evolution: GCCs beyond arbitrage
In my recent white paper, “Beyond Cost Arbitrage: How Global Capability Centers Are Becoming Engines of Life Sciences Innovation,” I explored a transformation that closely parallels this shift.
Global Capability Centers (GCCs) were once primarily positioned around cost and scale. They were designed to optimize labor economics and expand operational capacity. That model delivered value in an earlier phase of globalization. Today, that view no longer captures the full picture.
Across life sciences, GCCs have matured into integrated capability hubs. They bring together clinical scientists, statisticians, regulatory specialists, advanced analytics teams, and digital engineers. They influence submission strategy, automation initiatives, and enterprise transformation efforts.
The most meaningful shift I observed was not structural. It was psychological. Leaders within these centers began to see themselves not as recipients of strategy, but as contributors to it. That shift changes the dynamic entirely.
When capability centers help shape standards, architecture, and innovation priorities, they move from supporting enterprise strategy to strengthening it. The center of gravity becomes more distributed, and with it, so does leadership.
That same redistribution of influence is visible in communities like PHUSE APAC Connect.
Collective stewardship in data standardization
Data standardization provides another perspective.
Standards do not evolve because they are declared. They evolve because experienced practitioners examine them, question them, refine them, and test them across real-world applications.
Respected contributors in this space, including colleagues such as Angelo Tinazzi, demonstrate how credibility is built over time through sustained engagement. Consistent participation in standards forums and industry dialogue reinforces an important principle. Influence in data and standardization is earned through contribution.
In global standardization efforts, credibility compounds gradually:
- Participation builds trust.
- Collaboration builds alignment.
- Alignment strengthens regulatory confidence.
Shared stewardship of standards is not an idealistic concept; it is central to ensuring submission quality and regulatory trust.
What this means for Cytel
For us at Cytel, this discussion is more than conceptual.
We operate at the intersection of science, statistics, and regulatory strategy. Our work shapes trial design decisions, submission readiness, analytical rigor, and ultimately patient outcomes.
In that context, expertise alone is not enough; engagement matters. Participating actively in communities like PHUSE helps us stay aligned with evolving expectations, exchange knowledge across regions, and contribute meaningfully to broader industry progress.
As capabilities become more globally distributed, leadership must become more inclusive and collaborative. Participation is not an extension of our strategy; it sits at its core.
Collective leadership strengthens resilience. It increases learning velocity and helps organizations adapt with confidence in an environment that continues to evolve.
From regional milestone to industry signal
PHUSE APAC Connect represents more than a regional milestone. It signals that APAC, supported by expanding GCC ecosystems and deep domain expertise, is not simply a delivery geography. It is an active contributor to global thought leadership.
When professionals volunteer their time to shape agendas, share implementation insights, and mentor emerging talent, they strengthen the connective tissue of the industry.
Leadership does not weaken when it is shared. It becomes more durable. In distributed systems, shared ownership strengthens outcomes.
Closing reflection
Across this industry, one observation continues to hold true: titles define reporting structures, but participation defines influence. In complex environments, authority may initiate progress, but contribution sustains it.
The future of clinical data and analytics will be shaped by those who consistently engage, who collaborate across boundaries, and who invest in strengthening the ecosystem around them.
Leadership is not something granted once. It is something practiced repeatedly, and participation is how it is practiced at scale.
Interested in learning more?
Download my new white paper, “Beyond Cost Arbitrage: How Global Capability Centers Are Becoming Engines of Life Sciences Innovation”:
SDTM IG 4.0 and SDTM 3.0: Celebrating the End of SUPP?
About five years after the release of CDISC IG 3.4, CDISC has just released CDISC IG 4.0 and SDTM 3.0 for public review. Comments are due April 6, with the final release expected later this year.
The public review includes the Conformance Rules version 3.0 as well as three draft Knowledge Base articles exploring some of the main changes expected with IG 4.0:
- NS-- Datasets: Why they were built as they were.
- Why change the structure of SDTMIG metadata?
- Why does the DC domain differ from what’s described in FDA’s TCG?
For a quick overview of the impact of these changes, see the CDISC Standards timeline webpage or the revision history available in the draft version wiki for public review.
Celebrating or regretting the end of SUPP?
We will be moving, for example, from something called SUPPAE to something called NSAE, with a less “normalized” structure. Will this be “a small step for a man, a giant leap for mankind”? “Ai posteri l’ardua sentenza”1 (“To posterity the arduous verdict”).
The change will require us to go from this:
to this:
The structure of these new datasets is “One record per related dataset record,” meaning that a many-to-one relationship, for example an NS record that applies to several records in the parent domain via --GRPID, will no longer be possible. That said, there is hope that this new structure will simplify metadata handling and potentially facilitate the adoption of future data exchange formats, such as CDISC Dataset-JSON.
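To make the structural change concrete, here is a minimal Python sketch that pivots SUPPAE-style records (one row per qualifier) into one row per parent AE record. It is only an illustration: the wide output layout, the function name supp_to_wide, and the example values are assumptions, and the actual NSAE variable set in the draft IG may differ.

```python
from collections import defaultdict

# Hypothetical SUPPAE rows: one record per qualifier (QNAM) per parent AE record.
EXAMPLE_SUPPAE = [
    {"STUDYID": "ABC-001", "RDOMAIN": "AE", "USUBJID": "ABC-001-0001",
     "IDVAR": "AESEQ", "IDVARVAL": "1", "QNAM": "AETRTEM", "QVAL": "Y"},
    {"STUDYID": "ABC-001", "RDOMAIN": "AE", "USUBJID": "ABC-001-0001",
     "IDVAR": "AESEQ", "IDVARVAL": "1", "QNAM": "AESOSP", "QVAL": "HOSPITAL VISIT"},
    {"STUDYID": "ABC-001", "RDOMAIN": "AE", "USUBJID": "ABC-001-0001",
     "IDVAR": "AESEQ", "IDVARVAL": "2", "QNAM": "AETRTEM", "QVAL": "N"},
]


def supp_to_wide(supp_rows):
    """Pivot normalized SUPP-- rows into one record per related parent record.

    Each QNAM becomes a column. This assumed layout only illustrates the
    "one record per related dataset record" idea; it is not the draft IG spec.
    """
    wide = defaultdict(dict)
    for row in supp_rows:
        key = (row["STUDYID"], row["USUBJID"], row["IDVAR"], row["IDVARVAL"])
        wide[key][row["QNAM"]] = row["QVAL"]

    records = []
    for (studyid, usubjid, idvar, idvarval), qualifiers in sorted(wide.items()):
        record = {"STUDYID": studyid, "USUBJID": usubjid, idvar: idvarval}
        record.update(qualifiers)  # non-standard variables as plain columns
        records.append(record)
    return records


if __name__ == "__main__":
    for rec in supp_to_wide(EXAMPLE_SUPPAE):
        print(rec)
```

Note that a SUPP record keyed by a group identifier rather than a sequence number has no direct equivalent in this wide layout, which is the many-to-one limitation mentioned above.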
Three new domains
Three new proposed domains have been introduced:
- DC (Demographics for Multiple Participations)
- GI (Gastrointestinal System Findings)
- EA (Event Adjudication)
DC has been around, unofficially, for some time, following the requirements introduced by the FDA in its Study Data Technical Conformance Guide (see my previous blog). This domain supports the representation of multiple enrollments within the same study. Along with DC, SUBJID has been added to all subject-level domains to differentiate the data “generated” by each of a subject’s participations.
Compared with FDA requirements, SDTM IG 4.0 also covers scenarios in which the same subject is enrolled multiple times, not only multiple screenings.
Identification of “Primary Enrollment,” and therefore how DM variables are populated, is left to the sponsor’s discretion. However, in cases where a subject experiences one or more screen failures before finally enrolling, the successful enrollment should clearly be considered the primary one.
EA, a Findings About domain, provides a common structure for studies requiring independent, peer-reviewed endpoint adjudication. In my view, it partially solves the issue of representing study endpoints where more complex “adjudication” is required, for example in oncology studies with efficacy based on tumor response.
Changes in metadata
Several new metadata attributes have been introduced, along with some changes to existing ones. The goal is to improve understanding of variables and their intended use, without impacting the metadata included in a submission, e.g., define.xml.
So, when looking at the new SDTM IG, you will notice the following key differences among others:
- Controlled Terms, Codelist or Format is now split into three separate columns
- Variable Group has been added to group variables, for example Results Unit or Results Value
- Some information previously included in the “CDISC Notes” column is now reported in the “Examples” column
Other Changes
New versions of IGs are also an opportunity to fix issues (such as typos) and to clarify implementation points that previously caused misunderstandings. For example, there is additional guidance on which Specimen-based Findings domain to use under specific circumstances, such as clarifying that anti-microbial antibody testing data should be mapped to the IS domain rather than MS.
Some standard variables have been deprecated, such as --BLFL (Baseline Flag) for Findings domains, and others have been added. One notable addition is the --CLASI variable, particularly useful for classifying Protocol Deviations to support the requirements of the “ICH E3 Q&As (R1).” This variable is now officially part of the DV domain as DVCLASI, e.g., MINOR/MAJOR. More details on planned new and deprecated variables in all Observation Classes can be found in the CDISC Wiki.
Rumors about deprecating the PP domain appear to be unfounded, as PP is still there.
Want to know more?
You can participate in the public review and explore the details yourself. Check here.
My former colleague Varun Debbeti has also done an excellent job on his clinstandards webpage.
A more in-depth discussion of the expected changes will also be presented at the upcoming CDISC-EU Interchange in May, this time in my hometown, Milan, co-chaired by my colleague Silvia Faini.
Cytel will be present with two oral presentations and one poster:
- “It Got Worse Than Expected: Three Years of Retrospective CBER Requests on SDTM, ADaM, and TFLs” by Mark Malayas and Angelo Tinazzi
- “Authenticity Matters: Preserving Standards Integrity from Clinical Data Models to Tiramisù” by Angelo Tinazzi
- “JSON and CORE Unlocking Adoption” by Silvia Faini, Sebastià Barceló, Hugo Signol, and Angelo Tinazzi
See the full draft agenda here.
We look forward to reconnecting with colleagues from around the world, meeting new peers, and exchanging ideas at the 2026 CDISC + TMF EU Interchange.
See you in Milan?
Parkinson’s Disease Through a Statistical Lens
Parkinson’s disease, a progressive movement disorder of the nervous system, affects more than 1.1 million people in the US (and over 11 million globally), with an estimated 90,000 new diagnoses each year, making it the second-most common neurodegenerative disease after Alzheimer’s disease.1,2 The prevalence and rise of Parkinson’s disease have led to robust investment in understanding and treating this disorder.3
Here, we provide a brief overview of Parkinson’s disease and discuss common endpoints used in clinical trials with an illustrative case study on how those endpoints may be analyzed.
An introduction to Parkinson’s disease
Parkinson’s disease is a progressive movement disorder of the nervous system.4 It causes nerve cells (neurons) in parts of the brain to weaken, become damaged, and die, leading to symptoms that include problems with movement, tremor, stiffness, and impaired balance. As symptoms progress, people with Parkinson’s disease (PD) may have difficulty walking, talking, or completing other simple tasks.
The rate of PD progression and the particular symptoms differ among individuals. The four primary/hallmark symptoms of PD are tremor, rigidity, bradykinesia, and postural instability.
Other problems related to PD may include mental and emotional health problems, speech changes, dementia or other cognitive problems, pain, and fatigue.
On and Off states/periods
The On state is when PD medications are effective and motor and non-motor symptoms are controlled. The Off state is when PD symptoms return between medication doses or in the morning before the first dose.
Measuring Parkinson’s disease severity: Two evaluation methods
MDS-UPDRS: Evaluating motor and non-motor symptoms
The MDS-UPDRS (Movement Disorder Society–Unified Parkinson’s Disease Rating Scale) was developed to evaluate various aspects of PD, including daily non-motor and motor experiences and motor complications.5, 6
It is the most frequently used outcome in clinical trials, though it can also be employed in the clinical setting. It consists of four parts with 50 items in total, with each item rating the impairment with scores from 0 (normal) to 4 (severe). A patient’s global impairment is calculated as the total sum of these scores, with a higher score indicating greater impairment. Missing values might be imputed by the worst-case value of 4 (severe) if sufficient items are scored; otherwise, the total score is set to missing. Each part can be analyzed separately as well.
MDS-UPDRS:
Parts of the MDS-UPDRS can be assessed during the On and Off states to evaluate the differences between the two.
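The total-score rule described above (sum of item scores, with worst-case imputation when enough items are scored) can be sketched in a few lines of Python. The threshold for “sufficient items” is not stated here, so min_scored_fraction is a hypothetical parameter, and its default of 0.8 is purely illustrative.

```python
def mds_updrs_total(item_scores, min_scored_fraction=0.8, worst=4):
    """Sum MDS-UPDRS item scores, imputing missing items with the worst value.

    item_scores: list of ints in 0..4, with None marking a missing item.
    min_scored_fraction: hypothetical cut-off for "sufficient items scored";
        the real rule should come from the analysis plan.
    Returns the total score, or None when too few items were scored.
    """
    scored = [s for s in item_scores if s is not None]
    if any(not 0 <= s <= worst for s in scored):
        raise ValueError("item scores must lie between 0 and 4")
    if len(scored) < min_scored_fraction * len(item_scores):
        return None  # too many missing items: total is set to missing
    n_missing = len(item_scores) - len(scored)
    return sum(scored) + worst * n_missing


if __name__ == "__main__":
    print(mds_updrs_total([1, 2, None, 0, 3]))   # one item imputed as 4
    print(mds_updrs_total([None, None, 1]))      # too few items scored
```

The same function can be applied to a single part (for example the Part III items only) when each part is analyzed separately.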
PDQ-39: A patient-reported health status questionnaire
The PDQ-39 (Parkinson’s Disease Questionnaire) is a 39-item patient-reported measure that assesses Parkinson’s disease–specific health-related quality of life.7, 8
It asks patients to grade how often they experienced difficulties over the past month. Each item is scored on a scale from 0 (never) to 4 (always or cannot do at all, if applicable), with lower scores indicating better status. Items are grouped into eight dimension subscales.
PDQ-39:
PDQ-39 subscale scores range from 0 to 100, with 0 representing perfect health for the dimension and 100 representing worst health for the dimension. A PDQ-39 total score — the PDQ-39 Summary Index (PDSI) — can be computed as the mean of the eight PDQ-39 subscale scores providing an overall score reflecting the impact of Parkinson’s on quality of life.
When values are missing, one possible approach is to impute them with the mean of the available items in the same subscale, provided that fewer than 50% of the items in that subscale are missing.
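As a rough sketch of the scoring just described, the Python below computes each subscale on the 0 to 100 scale, applies the mean-imputation rule, and averages the eight subscales into the PDSI. The subscale formula (sum of item scores divided by the maximum possible sum, times 100) is the standard PDQ-39 convention; returning a missing PDSI when any subscale is missing is an assumption made here for simplicity.

```python
# Number of items in each of the eight PDQ-39 dimensions (39 in total).
PDQ39_DIMENSIONS = {
    "mobility": 10,
    "activities_of_daily_living": 6,
    "emotional_well_being": 6,
    "stigma": 4,
    "social_support": 3,
    "cognition": 4,
    "communication": 3,
    "bodily_discomfort": 3,
}


def subscale_score(items):
    """Score one subscale on 0..100; None marks a missing item.

    Missing items are replaced by the mean of the available items, but only
    if fewer than 50% of the items are missing; otherwise the score is None.
    """
    available = [x for x in items if x is not None]
    n_missing = len(items) - len(available)
    if n_missing >= 0.5 * len(items):
        return None
    # Mean imputation leaves the item mean unchanged, so the score reduces to
    # (mean of available items / maximum item score of 4) * 100.
    return sum(available) / len(available) / 4 * 100


def pdq39_summary_index(responses):
    """PDSI: mean of the eight subscale scores (assumed None if any is None)."""
    scores = [subscale_score(responses[name]) for name in PDQ39_DIMENSIONS]
    if any(score is None for score in scores):
        return None
    return sum(scores) / len(scores)


if __name__ == "__main__":
    example = {name: [2] * n for name, n in PDQ39_DIMENSIONS.items()}
    print(pdq39_summary_index(example))
```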
LED (Levodopa Equivalent Dose)
The dose of antiparkinsonian medication is standardized to the LED, in mg, based on predefined conversion factors.
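The conversion itself is a weighted sum of daily doses. The Python sketch below shows the arithmetic only: the factor table is illustrative, not the one used in any particular study, and a real implementation would take its factors and its rules for medication combinations from the study’s medical experts.

```python
# Illustrative conversion factors (mg of levodopa equivalent per mg of drug).
# These values are assumptions for demonstration; replace them with the
# factors agreed for the study at hand.
ILLUSTRATIVE_LED_FACTORS = {
    "levodopa_immediate_release": 1.0,
    "levodopa_controlled_release": 0.75,
    "pramipexole": 100.0,
    "ropinirole": 20.0,
}


def total_led(daily_doses_mg, factors=ILLUSTRATIVE_LED_FACTORS):
    """Return the total daily LED in mg for a dict of drug -> daily dose (mg).

    Combination rules (drugs that modify the effect of levodopa rather than
    adding to it) are deliberately not handled in this sketch.
    """
    unknown = set(daily_doses_mg) - set(factors)
    if unknown:
        raise ValueError(f"no conversion factor defined for: {sorted(unknown)}")
    return sum(dose * factors[drug] for drug, dose in daily_doses_mg.items())


if __name__ == "__main__":
    print(total_led({"levodopa_immediate_release": 300, "pramipexole": 1.5}))
```

Raising an error for a drug without a factor, rather than silently skipping it, keeps unmapped medications visible for medical review.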
A confirmatory Parkinson’s study: Statistical analysis and adaptive design
Our team partnered with a large biotech and biomedical engineering company to conduct the statistical analysis of a multi-center, open-label (one-arm) adaptive confirmatory study that used a device providing deep brain stimulation for Parkinson’s patients. The efficacy and futility boundaries of the adaptive design were computed using Cytel’s East Horizon™ platform.
The study had the following endpoints:
- Primary endpoint: MDS-UPDRS (part III)
- Secondary and exploratory endpoints: Other parts of MDS-UPDRS, PDQ-39, Clinical Global Impression of Change (CGI), Schwab and England ADL (Activities of Daily Living), antiparkinsonian medication use
Statistical analysis and its challenges
MDS-UPDRS (part III) score, PDQ-39, and antiparkinsonian medication use were analyzed using the paired t-test and CGI was analyzed using the non-parametric Wilcoxon signed-rank test. The Schwab and England ADL scale was analyzed with an ANOVA.
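For readers less familiar with the paired t-test, here is a minimal Python sketch of the test statistic for a one-arm, before-versus-after comparison. The scores are made-up numbers, not study data, and the p-value step (which needs the t distribution, for example from a statistics library) is left out.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(baseline, follow_up):
    """Return (t, degrees of freedom) for paired measurements.

    The statistic is the mean within-patient change divided by its
    standard error.
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need at least two complete pairs")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    n = len(changes)
    t = mean(changes) / (stdev(changes) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical Part III scores for four patients (lower is better).
    baseline = [40, 35, 50, 45]
    follow_up = [30, 30, 42, 35]
    print(paired_t_statistic(baseline, follow_up))
```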
The first challenge was to understand the differences between the Off and On states. We also had to deal with missing data. It was decided that missing values at the visit level would be imputed by the worst response observed among all participants (primary analysis), with sensitivity analyses employing baseline observation carried forward (BOCF) and multiple imputation (MI) using Markov chain Monte Carlo (MCMC) methods.
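The two single-imputation rules mentioned above are simple to express in code. The Python sketch below assumes that a higher score means a worse response (as with MDS-UPDRS) and that the worst response is taken across participants at the same visit; both are assumptions for illustration, and the values are made up. Multiple imputation is not shown.

```python
def impute_worst_observed(visit_values):
    """Replace missing values (None) with the worst value observed at this visit.

    Assumes higher scores are worse, so "worst" is the maximum observed value.
    """
    observed = [v for v in visit_values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    worst = max(observed)
    return [worst if v is None else v for v in visit_values]


def impute_bocf(baseline_values, visit_values):
    """Baseline observation carried forward: use each patient's own baseline."""
    return [
        base if value is None else value
        for base, value in zip(baseline_values, visit_values)
    ]


if __name__ == "__main__":
    baseline = [40, 35, 50, 45]        # hypothetical baseline scores
    visit = [30, None, 42, None]       # hypothetical follow-up with gaps
    print(impute_worst_observed(visit))
    print(impute_bocf(baseline, visit))
```

Under BOCF a patient with a missing visit shows no change from baseline, whereas worst-observed imputation can assign a change in either direction depending on that patient’s baseline.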
Another, more challenging aspect was understanding and programming the antiparkinsonian medication use (analyzed as a secondary endpoint), which is calculated in LED. This task required close collaboration with the sponsor’s medical experts to define the conversion factors and correctly handle special cases of medication combinations.
An adaptive design with four interim analyses
The study was designed to include four interim analyses and one final analysis, using the Lan-DeMets group sequential method with the O’Brien-Fleming α-spending function and the Pocock β-spending function. The O’Brien-Fleming boundaries preserve a nominal significance level at the final analysis that is close to that of a single test procedure, so they are very conservative at the earlier interim analyses.9 The Pocock β-spending function uses approximately equal cutoffs for each analysis.
The efficacy and futility boundaries were computed via Cytel’s EAST software, which is integrated into the East Horizon™ platform. For the interim analyses, the efficacy and futility boundaries had to be recalculated based on the actual sample sizes.
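To give a feel for how differently the two spending functions behave, the Python sketch below evaluates the standard Lan-DeMets O’Brien-Fleming-type and Pocock-type spending functions at five looks. The equally spaced information fractions and the total error rates (0.025 and 0.20) are assumptions for illustration, not the study’s values. The sketch reports only how much error is spent at each look; turning that into efficacy and futility boundaries requires numerical integration, which is what dedicated design software does.

```python
from math import e, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obrien_fleming_spend(t, total_error):
    """Lan-DeMets O'Brien-Fleming-type cumulative spend at information fraction t."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must be in (0, 1]")
    z = STD_NORMAL.inv_cdf(1 - total_error / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(t)))


def pocock_spend(t, total_error):
    """Lan-DeMets Pocock-type cumulative spend at information fraction t."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must be in (0, 1]")
    return total_error * log(1 + (e - 1) * t)


def incremental_spend(spend_fn, fractions, total_error):
    """Error spent at each look (differences of the cumulative spend)."""
    cumulative = [spend_fn(t, total_error) for t in fractions]
    previous = [0.0] + cumulative[:-1]
    return [now - before for now, before in zip(cumulative, previous)]


if __name__ == "__main__":
    looks = [0.2, 0.4, 0.6, 0.8, 1.0]   # assumed: four interims plus final
    print(incremental_spend(obrien_fleming_spend, looks, 0.025))  # alpha
    print(incremental_spend(pocock_spend, looks, 0.20))           # beta
```

Because the spending functions depend only on the information fraction, recomputing the spend when the actual sample size at an interim differs from the plan is just a matter of passing the observed fractions.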
Final takeaways
Parkinson’s disease is a lifelong, progressive, degenerative disease with multiple symptoms that affects millions worldwide. The treatment is highly individualized and depends on the disease stage and severity of motor and non-motor symptoms. When symptoms become bothersome, current therapies primarily focus on symptom management, with pharmacological options such as levodopa and dopamine agonists forming the cornerstone of care. For those whose symptoms don’t respond well to medication in later stages, advanced options like deep brain stimulation (DBS), which can provide relief for tremors and reduce dyskinesias, offer hope.
The adaptive design of the case study offered a flexible, efficient, and ethical approach without compromising the validity and integrity of the study. The design was implemented in the East Horizon™ platform, which offers a comprehensive tool for trial design during all stages of development.
Why Biomarkers Are Transforming How We Diagnose and Understand Alzheimer’s Disease
Alzheimer’s disease (AD) is the most common cause of dementia worldwide, accounting for about two thirds of cases. It is a progressive neurodegenerative condition that slowly erodes memory, thinking, and the ability to manage daily life. Globally, tens of millions of people live with Alzheimer’s today — numbers expected to grow sharply as populations age. The personal toll on individuals and families is profound, and the economic burden runs into hundreds of billions annually when considering healthcare costs, social care, and lost productivity.
Early diagnosis remains one of the biggest challenges. Historically, clinicians relied on observing cognitive decline — often when the disease is already advanced and irreversible brain damage has occurred. Emerging tools, particularly biomarkers, are beginning to change this picture.
The value of biomarkers in Alzheimer’s disease
Biomarkers are measurable indicators of biological processes, disease activity, or treatment response. In Alzheimer’s disease, they help detect characteristic brain changes — such as beta amyloid plaques and tau protein tangles — long before symptoms appear. These can be identified through imaging, cerebrospinal fluid tests, and, increasingly, blood-based biomarkers.
Research shows that:
- Biomarkers can detect early pathological changes along the Alzheimer’s disease continuum, even in individuals without symptoms.1
- Blood-based biomarkers (BBMs) such as plasma amyloid and tau provide a low-cost, accessible alternative to PET scans and lumbar punctures.2
- Updated diagnostic frameworks now define Alzheimer’s biologically, emphasizing that biomarker-detected pathology is sufficient to diagnose the disease.3
This shift opens opportunities for earlier intervention, improved trial recruitment, and more personalized care.
What recent biomarker research tells us
1. Alzheimer’s is now considered a “biological disease” detectable long before symptoms.
Updated clinical criteria emphasize that identifying Alzheimer’s-related proteins through imaging, cerebrospinal fluid, or blood is sufficient to diagnose the disease, even in people who feel cognitively normal. This reflects decades of evidence showing that pathology accumulates silently before memory loss begins.4
2. Blood tests are emerging as game‑changers.
Advances in ultra‑sensitive technologies now allow scientists to detect minute amounts of proteins that leak from the brain into the blood. These tests measure markers like amyloid‑β and phosphorylated tau — proteins central to Alzheimer’s disease. Because they require only a simple blood draw, they enable repeated testing over time, making disease monitoring easier and more accessible.5
3. Blood biomarkers may revolutionize primary care detection.
Many people with early cognitive impairment go undiagnosed, especially in community settings. New blood-based biomarker tests can be integrated into routine care to flag individuals at high risk earlier, long before they reach a specialist.6
4. Early diagnosis enables better care and more timely treatment.
Because Alzheimer’s pathology starts decades before symptoms, identifying the disease early can help clinicians initiate supportive measures, guide lifestyle interventions, and offer patients and families time to plan. It also allows people to access relevant clinical trials at the most impactful disease stage.7
5. Biomarkers deepen scientific understanding of Alzheimer’s progression.
Different biomarkers reveal different aspects of disease — amyloid reflects early accumulation, tau correlates more strongly with neurodegeneration, and neurofilament light chain indicates nerve cell damage. Using multiple tests helps clinicians and researchers understand where a person lies along the disease continuum.8
Focusing on Amyloid Beta (Aβ)
I, along with my co-authors, recently published a study that investigated whether changes in amyloid beta (Aβ) can reliably predict whether a treatment will help patients think and function better, i.e., whether Aβ is a surrogate marker. Many new Alzheimer’s drugs reduce Aβ levels, and some have even been approved based on this effect. But the big question remains: Does lowering Aβ actually translate into meaningful clinical benefit?
We collected data from 23 clinical trials of seven different anti-amyloid monoclonal antibody drugs.
These trials reported treatment effects on both:
- Aβ levels (using brain scans such as PET SUVR or the Centiloid scale), and
- Clinical outcomes, including:
- Clinical Dementia Rating – Sum of Boxes (CDR‑SOB)
- Mini‑Mental State Examination (MMSE)
- Alzheimer’s Disease Assessment Scale–Cognitive Subscale (ADAS‑Cog)
The team used a Bayesian meta-analysis, a statistical method that combines results across many studies to look for broad patterns.
The results showed that lowering amyloid‑beta often, but not always, leads to better clinical outcomes in Alzheimer’s disease. Aβ is a promising surrogate marker at the group level, but it is not reliable enough to predict benefit for individual drugs without additional evidence. Health technology assessment agencies such as NICE and ICER should take this into account when using Aβ to make decisions about the value of new disease-modifying treatments.
The need for continued research into Alzheimer’s biomarkers
Biomarkers are reshaping the landscape of Alzheimer’s disease, offering hope for earlier, more accurate diagnosis and more tailored therapeutic strategies. But despite these advances, more work is needed.
Continued research is vital to:
- Improve the accuracy and reliability of blood-based tests
- Ensure tests are validated across diverse populations
- Link biomarker changes more precisely to clinical outcomes
- Support equitable access in primary care and low-resource settings
As we enter an era of disease-modifying therapies, biomarkers will be indispensable — guiding diagnosis, monitoring response, and helping patients receive the right treatment at the right time.
The future of Alzheimer’s care will be biomarker-driven, and ongoing research is the key to making that future accessible to all.
FDA’s Bayesian Guidance: Strategic Considerations for Sponsors
The FDA’s January 2026 draft guidance, “Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products,” clarifies how the Agency expects sponsors to justify Bayesian approaches, especially when an informative prior borrows external information to support primary inference. As a draft guidance, it is nonbinding and not for implementation.
This blog highlights strategic considerations that should inform development planning, protocol/SAP design, and FDA engagement.
Type I error control is not the only path
The guidance notes that calibrating Bayesian success criteria to a Type I error rate “may not be applicable or appropriate” when borrowing external information. In those settings, sponsors may instead define success using posterior probability criteria (e.g., Pr(d>a)>c) and, where appropriate, benefit-risk or decision-theoretic frameworks.
At the same time, the draft guidance also recognizes that Bayesian methods are often used within an overall frequentist framework (e.g., to facilitate complex adaptive designs), where Type I error calibration can remain appropriate. Regardless of the framework, success criteria should be pre-specified and justified.
Strategic implication:
When the FDA and sponsor agree that a design does not need to be calibrated to the Type I error rate (often discussed in pediatrics and rare diseases), the draft guidance describes alternative operating characteristics such as Bayesian power (probability of success averaged over a prior) and the probability of a correct decision (akin to positive predictive value). That flexibility increases the premium on a well-justified analysis prior, credible simulations, and early FDA alignment.
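As a small illustration of these ideas, the Python sketch below evaluates a posterior probability success criterion of the form Pr(d>a)>c and estimates Bayesian power by averaging the chance of success over a prior on the true effect. It assumes a normally distributed effect estimate with known standard error and normal priors; all function names and numbers are hypothetical, and real designs will usually need more detailed models.

```python
import random
from statistics import NormalDist


def posterior_normal(prior_mean, prior_sd, estimate, se):
    """Conjugate normal update: posterior mean and sd of the effect d."""
    w_prior = 1 / prior_sd ** 2
    w_data = 1 / se ** 2
    post_var = 1 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * estimate)
    return post_mean, post_var ** 0.5


def is_success(prior_mean, prior_sd, estimate, se, a=0.0, c=0.975):
    """Success criterion: posterior Pr(d > a) exceeds c."""
    post_mean, post_sd = posterior_normal(prior_mean, prior_sd, estimate, se)
    return 1 - NormalDist(post_mean, post_sd).cdf(a) > c


def bayesian_power(analysis_prior, design_prior, se, a=0.0, c=0.975,
                   n_sims=2000, seed=1):
    """Probability of success averaged over a prior on the true effect.

    analysis_prior and design_prior are (mean, sd) tuples; the design prior
    generates plausible true effects, the analysis prior is used at analysis.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    wins = 0
    for _ in range(n_sims):
        true_effect = rng.gauss(*design_prior)
        estimate = rng.gauss(true_effect, se)
        wins += is_success(*analysis_prior, estimate, se, a, c)
    return wins / n_sims


if __name__ == "__main__":
    print(bayesian_power(analysis_prior=(0.0, 1e6), design_prior=(3.0, 0.5), se=1.0))
```

Separating the design prior from the analysis prior makes it possible to ask how the chosen analysis behaves if the optimistic assumptions behind the design turn out to be wrong.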
Prior specification is now a regulatory deliverable
The draft guidance recommends that sponsors pre-specify and justify the prior in the protocol, document external information sources (including exclusions), and quantify prior influence metrics. For informative priors, the FDA emphasizes a systematic, transparent review of the totality of relevant evidence — effectively bringing evidence-synthesis discipline into prior construction.
Key expectations:
- Pre-defined source selection criteria before searching for external data
- Patient-level data preferred over published summary statistics
- Randomized controlled evidence is generally preferred over single-arm or observational sources
- Documentation of sources considered and excluded, with rationale
Strategic implication:
Prior construction cannot be a post-hoc exercise. Build the evidence base for your prior prospectively — ideally while Phase 2 is ongoing — and then plan early for patient-level data access and any needed re-analyses to align primary estimand/estimators/strategies for handling intercurrent events. If patient-level data from prior studies are not accessible, negotiate data-sharing early or design natural history studies with Bayesian use in mind.
Dynamic discounting provides protection — with complexity
The draft guidance discusses both static and dynamic discounting approaches for borrowing external information. Dynamic approaches (e.g., commensurate/supervised power priors, mixture priors, Bayesian hierarchical models, elastic priors) can reduce borrowing when prior-data conflict emerges. These approaches can improve robustness but introduce additional parameters and assumptions that need justification. The FDA also notes the applicability of discounting methods is case-by-case and should be discussed with the Agency.
Strategic implication:
For rare diseases with uncertain external data relevance, dynamic discounting is often an important safeguard. For common diseases with robust and highly relevant prior data, simpler (static) discounting may suffice and can simplify the regulatory narrative. Either way, determine the discounting approach while still blinded to the results of the trials that will be borrowed — per the guidance’s explicit recommendation — and support the choice with simulations that span plausible degrees of prior data conflict.
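To show what “reducing borrowing when prior-data conflict emerges” looks like numerically, here is a minimal Python sketch of a two-component normal mixture prior: an informative component built from external data and a vague component centered at the same value. The function name, the 0.8 prior weight, and all numbers are assumptions for illustration; this is one simple form of dynamic discounting, not a recommended specification.

```python
from math import sqrt
from statistics import NormalDist


def informative_weight_after_data(estimate, se, inf_mean, inf_sd,
                                  vague_sd, prior_weight=0.8):
    """Posterior weight on the informative component of a mixture prior.

    Each component's weight is updated by how well it predicted the observed
    estimate (its marginal likelihood), so the informative component loses
    weight when the new data conflict with the external information.
    """
    like_informative = NormalDist(inf_mean, sqrt(inf_sd ** 2 + se ** 2)).pdf(estimate)
    like_vague = NormalDist(inf_mean, sqrt(vague_sd ** 2 + se ** 2)).pdf(estimate)
    numerator = prior_weight * like_informative
    return numerator / (numerator + (1 - prior_weight) * like_vague)


if __name__ == "__main__":
    common = dict(se=0.3, inf_mean=0.5, inf_sd=0.2, vague_sd=2.0)
    for observed in (0.5, 0.0, -1.0):   # increasing conflict with the prior
        weight = informative_weight_after_data(observed, **common)
        print(f"observed effect {observed:+.1f}: informative weight {weight:.3f}")
```

A static approach would instead fix the degree of discounting in advance, regardless of how the new data turn out, which is simpler to explain but offers no automatic protection against conflict.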
Effective sample size is a central metric — Not Type I error inflation
The draft guidance recommends against using Type I error inflation to measure prior influence, calling it “philosophically inconsistent.” Instead, it highlights Effective Sample Size (ESS) and other metrics (e.g., the prior-only estimate) as more interpretable ways to quantify borrowing. The guidance also notes that multiple ESS calculation methods exist, and that ESS can exceed the source-study sample size when the variability in the target population is higher.
Strategic implication:
Quantify and present ESS across a plausible range of outcomes, including summary statistics such as maximum and mean values. For dynamic methods, show how ESS changes with different degrees of prior-data agreement. Be prepared to explain why ESS may differ from the original study’s nominal sample size, and reassess influence after trial completion when dynamic priors are used.
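One simple way to see why ESS can differ from the source-study sample size is the variance-ratio definition for a normal mean, sketched below in Python. It is only one of several ESS methods, and the function names, the static discount a0, and the numbers are assumptions for illustration.

```python
from math import sqrt


def power_prior_sd(sigma_source, n_source, a0=1.0):
    """Prior sd for a mean built from source data discounted by a0 in (0, 1]."""
    if not 0 < a0 <= 1:
        raise ValueError("a0 must be in (0, 1]")
    return sigma_source / sqrt(a0 * n_source)


def ess_variance_ratio(prior_sd, sigma_target):
    """ESS as the number of target-population patients worth of information.

    One target patient contributes variance sigma_target**2, so a prior with
    variance prior_sd**2 is worth sigma_target**2 / prior_sd**2 patients.
    """
    return sigma_target ** 2 / prior_sd ** 2


if __name__ == "__main__":
    prior_sd = power_prior_sd(sigma_source=10.0, n_source=100)
    print(ess_variance_ratio(prior_sd, sigma_target=10.0))  # same variability
    print(ess_variance_ratio(prior_sd, sigma_target=12.0))  # noisier target
```

With equal variability the ESS matches the 100 source patients; when the target population is more variable, the same prior is worth more than 100 target patients, which is the situation the guidance describes.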
Simulation standards are now explicit
The draft guidance recommends providing a comprehensive simulation report (including code, implementation details, and results) across pre-specified, plausible scenarios, including pessimistic assumptions about treatment effect. Simulations should address statistical parameters (e.g., variance, background rate, intercurrent events) as well as operational assumptions such as accrual rate. For MCMC-based analyses, computational settings (warmup/burn-in, iterations, chains, convergence diagnostics) and any other important algorithm-specific settings should be documented for reproducibility.
Strategic implication:
Treat simulations and computational reproducibility as submission-grade deliverables, not just internal design exploration. Establish reproducible computational workflows from the start. Pre-specify scenarios and decision rules, and define contingency procedures for implementation issues (e.g., MCMC non-convergence) before the first interim look and before the final analysis.
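As a toy version of a pre-specified scenario grid, the Python sketch below computes the probability of meeting a posterior probability success criterion under several assumed true effects, including a pessimistic no-effect scenario. It relies on a normal effect estimate with known standard error and a normal prior, so the result is available in closed form; the scenario names and all numbers are hypothetical, and realistic designs generally require full simulation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def success_threshold(prior_mean, prior_sd, se, a=0.0, c=0.975):
    """Smallest observed estimate for which posterior Pr(d > a) exceeds c."""
    w_prior = 1 / prior_sd ** 2
    w_data = 1 / se ** 2
    post_var = 1 / (w_prior + w_data)
    z_c = STD_NORMAL.inv_cdf(c)
    return ((a + z_c * sqrt(post_var)) / post_var - w_prior * prior_mean) / w_data


def prob_success(true_effect, prior_mean, prior_sd, se, a=0.0, c=0.975):
    """Chance of meeting the success criterion if the true effect is known."""
    threshold = success_threshold(prior_mean, prior_sd, se, a, c)
    return 1 - STD_NORMAL.cdf((threshold - true_effect) / se)


# Pre-specified scenarios (hypothetical), including a pessimistic one.
SCENARIOS = {"no effect": 0.0, "modest effect": 0.5, "hoped-for effect": 1.0}


def scenario_table(prior_mean, prior_sd, se):
    """Probability of success for each scenario under the given prior."""
    return {
        name: prob_success(effect, prior_mean, prior_sd, se)
        for name, effect in SCENARIOS.items()
    }


if __name__ == "__main__":
    for label, prior in {"vague prior": (0.0, 1e6), "informative prior": (1.0, 0.5)}.items():
        print(label, scenario_table(*prior, se=1.0))
```

Keeping the scenarios in a single named table makes it easy to show that they were fixed before the results were generated and to rerun them unchanged.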
Early FDA engagement is essential
The draft guidance states that “the time needed for FDA and the sponsor to align on an appropriate prior should be considered in the development of the intended trial” and recommends submitting information “as early as possible to ensure sufficient time for FDA feedback prior to initiation.” The draft guidance also states that sponsors should have early discussions with the Agency about the planned estimands, estimators, and approaches for handling missing data in the analyses of external data that will be borrowed, and any differences relative to the approaches planned for the prospective trial data.
Strategic implication:
Use early interactions (e.g., Pre-IND or End-of-Phase 2 meetings and, where applicable, the Complex Innovative Trial Design (CID) program) to align on prior specification, success criteria, operating characteristics, and simulation strategy before protocol finalization. Include detailed design comparisons in meeting packages: the draft guidance explicitly recommends comparing proposed Bayesian designs against alternative designs, including simpler ones.
Interim analyses: Design the decision points upfront
The guidance emphasizes that in trials with interim decision-making (e.g., group sequential designs), success criteria should be specified for each decision point. When Bayesian success criteria are calibrated to Type I error rate, interim criteria can be constructed to preserve overall control of the family-wise error rate across looks.
For designs not calibrated to Type I error rate, operating characteristics are calculated relative to the prior and can be especially sensitive when the sample size is small — or when an early interim look makes the effective sample size small. The guidance also notes that skeptical (or enthusiastic) priors can be used in adaptive settings to temper early stopping behavior for efficacy (or futility), but the resulting decision framework should be demonstrated via simulation.
Key interim analysis considerations:
- Pre-specify what decisions can be made at each look (e.g., stop for efficacy, stop for futility, adapt) and the exact posterior or predictive criteria that trigger each action.
- Simulate interim timing under realistic accrual, endpoint maturation, and missing data patterns — not just idealized information fractions.
- Plan prior sensitivity and robustness checks targeted at early looks, where prior influence is greatest (e.g., alternative priors and alternative borrowing strengths).
- Operationalize Bayesian computation for interim timelines: reproducible pipelines, diagnostic thresholds, locked code/versioning, and contingency plans for non-convergence.
- Protect safety and benefit-risk interpretability: consider minimum exposure or follow-up requirements even if an early efficacy threshold is met.
Strategic implication:
Treat interim analyses as part of the regulatory-facing Bayesian package, with pre-specified decision rules, simulations that stress-test early looks, and an execution plan that can be reproduced under tight timelines.
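The first and last considerations in the list above (pre-specified criteria for each action, and a minimum exposure requirement) can be written down as a small, locked decision function. This is a sketch under stated assumptions: a single-arm binary endpoint, a Beta(1, 1) prior, posterior-probability thresholds, and hypothetical values for every threshold. The names `LookRule` and `decide` are invented for illustration.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class LookRule:
    """Pre-specified criteria for one interim look (all values hypothetical)."""
    efficacy_threshold: float  # stop for efficacy if P(p > p0 | data) >= this
    futility_threshold: float  # stop for futility if P(p > p0 | data) <= this
    min_followup_n: int        # patients with required follow-up before an efficacy stop


def posterior_prob_exceeds(y: int, n: int, p0: float, a: int = 1, b: int = 1) -> float:
    """P(p > p0 | y responders in n) under a Beta(a, b) prior with integer a, b,
    computed as a binomial sum via I_x(A, B) = P(Binomial(A + B - 1, x) >= A)."""
    A, B = a + y, b + n - y
    m = A + B - 1
    return sum(comb(m, j) * p0**j * (1 - p0) ** (m - j) for j in range(A))


def decide(rule: LookRule, y: int, n: int, n_followed: int, p0: float) -> str:
    """Map interim data to exactly one pre-specified action."""
    prob = posterior_prob_exceeds(y, n, p0)
    if prob >= rule.efficacy_threshold:
        # Efficacy bar met, but an early stop also needs the minimum follow-up.
        return "stop_efficacy" if n_followed >= rule.min_followup_n else "continue"
    if prob <= rule.futility_threshold:
        return "stop_futility"
    return "continue"


# Hypothetical interim look after 25 patients, null response rate p0 = 0.30.
INTERIM = LookRule(efficacy_threshold=0.995, futility_threshold=0.05, min_followup_n=20)
for y, followed in ((16, 25), (16, 10), (9, 25), (2, 25)):
    action = decide(INTERIM, y=y, n=25, n_followed=followed, p0=0.30)
    print(f"y = {y}/25, followed = {followed} -> {action}")
```

Keeping the thresholds in a frozen structure under version control means the rule applied at the interim is the one that was simulated, and every action traces back to a pre-specified criterion.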
Rare vs. Common Disease Considerations
| Consideration | Rare Diseases | Common Diseases |
| --- | --- | --- |
| Justification for borrowing | Often straightforward: document infeasibility of a conventionally powered randomized trial (small populations and/or ethics) and explain how borrowing supports interpretable benefit-risk. | Higher burden: efficiency gains alone may not suffice; clearly demonstrate relevance, address potential bias, and explain why non-borrowing alternatives are not adequate. |
| Prior data availability | Often limited; may rely on natural history studies/registries, small prior trials, and/or structured expert elicitation. | Typically richer: Phase II/earlier indications, external trials, and real-world data may be available, but heterogeneity and relevance must be managed. |
| Recommended approach | Dynamic discounting and robust priors; success criteria not calibrated to Type I error may be appropriate when FDA and sponsor agree; plan extensive sensitivity analyses. | Bayesian methods embedded in a Type I calibrated framework when appropriate; borrowing (if used) is typically limited and carefully justified; pediatric extrapolation handled via a separate extrapolation plan. |
| Key success factor | Prospective natural history characterization and early alignment on estimand definition and strategies to make external data relevant. | Early data-sharing to enable patient-level review; alignment on estimand definition, strategies, and covariance adjustment; and a clear relevance narrative with a drift/bias mitigation plan. |
The bottom line
This draft guidance provides a clearer regulatory pathway for Bayesian methods, but that pathway requires substantial upfront investment in prior construction, estimand definition, strategies for handling intercurrent events, simulations, and submission-quality documentation. The strategic question is not whether Bayesian methods are acceptable in principle; it is whether the efficiency gains justify the additional complexity and review burden for your specific program.
For rare diseases, the answer is often yes. Bayesian borrowing may be the only viable path to interpretable and approvable evidence. For common diseases, the calculus is more nuanced; borrowing typically needs a stronger relevance argument and may be most defensible when embedded in a Type I calibrated framework. Either way, the strategic decisions about prior specification, discounting method, and operating characteristics should be made early, documented thoroughly, and aligned with FDA before pivotal trial initiation.
What’s clear is that biostatisticians must now be prepared to operate in both paradigms:
- To calibrate Bayesian designs to Type I error when appropriate, and
- To construct and defend fully Bayesian alternatives (including borrowing) when circumstances warrant.
The January 2026 draft guidance does not eliminate the traditional framework; it expands the toolkit. Using that expanded toolkit effectively will require new skills, new conversations, and new ways of thinking about evidence.
The statistical methodology exists. FDA expectations are clearer. The challenge is execution.
Interested in learning more?
Cytel invites you to an interactive Office Hours session with Melissa Spann and Savina Jaeger on Wednesday, March 4 at 9 am ET, where you will have the opportunity to ask questions about the FDA’s Draft Guidance for Industry: Use of Bayesian Methodology in Clinical Trials of Drugs and Biological Products.




