
Evaluating Safety and Efficacy in Phase III Alzheimer’s Disease Trial: Endpoints and Statistical Analysis Methods

In clinical trials studying Alzheimer’s disease — a complex neurodegenerative condition that gradually impairs cognitive functions — cognitive performance and functional abilities are often assessed together. Understanding these dimensions and how they’re measured in clinical trials is essential in shaping Cytel’s statistical analyses.

Here, we discuss our experience working with a sponsor on a Phase III clinical trial evaluating the safety and efficacy of monotherapy in patients with Alzheimer’s disease and the statistical model we used to analyze the repeated measurements on two co-primary endpoints.

 

Alzheimer’s disease

Alzheimer’s disease is a complex neurodegenerative condition that gradually impairs cognitive functions. Its onset and progression are influenced by a range of risk factors; the most well-established include age, gender, family history, genetic predisposition, and underlying health conditions.

The disease unfolds in distinct stages, each reflecting a different level of cognitive and functional decline. These stages range from mild cognitive impairment to severe dementia, with symptoms worsening as the disease advances.

 

Evaluation of Alzheimer’s disease in clinical trials

In clinical trials, the severity of impairment is evaluated using various scales, each addressing distinct aspects of cognitive and functional decline. The most effective approach combines both cognitive and functional assessments, as functional abilities are closely tied to cognitive performance.

Understanding these dimensions and how they’re measured in clinical trials is essential in shaping the statistical analyses used. Multiple discussions between stakeholders and the sponsor need to take place to reach a consensus on the appropriate endpoints and statistical methods to be used for the analyses.

 

Investigating safety and efficacy of monotherapy in patients with Alzheimer’s disease

We recently collaborated with a small biotech company specializing in Alzheimer’s research on a Phase III clinical trial investigating the safety and efficacy of monotherapy in participants with Alzheimer’s disease, followed by a 12-month open-label treatment. This study has been the subject of complementary analyses exploring biomarkers (p-tau181 and p-tau217) and additional comparative effectiveness analyses with external control arms.

 

Two primary endpoints: ADAS-Cog11 and ADCS-ADL23

To evaluate treatment efficacy in the Phase III trial, we focused on two co-primary endpoints: the ADAS-Cog11 and the ADCS-ADL23, measured at multiple timepoints throughout the study.

 

ADAS-Cog11: The cognitive assessment

The ADAS-Cog11 is a cognitive subscale that assesses key domains such as memory, praxis, orientation, and language. Scores range from 0 to 70, with higher scores indicating greater cognitive impairment. A more refined version of the ADAS-Cog11, known as the ADAS-Cog13, includes two additional items that assess memory and attention. This new version provides additional sensitivity to change in cognition at earlier stages of AD.

For the primary analysis, ADAS-Cog11 was retained as the cognitive co-primary endpoint. This decision was guided by its use in previous studies evaluating the same investigational product, ensuring consistency and comparability across trials. The ADAS-Cog13 was also analyzed as an exploratory efficacy variable to assess its added value and provide deeper insights into cognitive outcomes.

 

ADCS-ADL23: The functional perspective

The ADCS-ADL23 scale complements the ADAS-Cog11 by providing a functional perspective that reflects the impact of cognitive decline. It evaluates the ability to perform daily living activities, with scores ranging from 0 to 78, where higher scores reflect better functional ability and thus less impairment.
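Because the two scales run in opposite directions, the sign of a change from baseline means different things for each. The short Python sketch below illustrates this using the score ranges quoted above; the function name and the data structure are illustrative assumptions, not part of the trial's actual programming.

```python
# Score ranges and directions as stated in the text.
SCALES = {
    "ADAS-Cog11": {"min": 0, "max": 70, "higher_is_worse": True},
    "ADCS-ADL23": {"min": 0, "max": 78, "higher_is_worse": False},
}


def change_from_baseline(scale, baseline, post):
    """Return (raw change, interpretation) for one subject at one visit."""
    spec = SCALES[scale]
    for value in (baseline, post):
        if not spec["min"] <= value <= spec["max"]:
            raise ValueError(
                f"{scale} score {value} is outside {spec['min']}-{spec['max']}"
            )
    change = post - baseline
    if change == 0:
        return change, "no change"
    # An increase is a worsening only on scales where higher scores are worse.
    worsened = (change > 0) == spec["higher_is_worse"]
    return change, "worsening" if worsened else "improvement"


adas = change_from_baseline("ADAS-Cog11", 20, 24)  # increase on ADAS-Cog11
adl = change_from_baseline("ADCS-ADL23", 60, 55)   # decrease on ADCS-ADL23
```

In this made-up example both subjects worsen, even though one change is positive and the other negative.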

 

Cytel’s approach: Analysis with Mixed Models for Repeated Measures (MMRM)

To analyze the repeated measurements on the co-primary endpoints, we employed Mixed Models for Repeated Measures (MMRM). This approach allows the comparison of cognitive and functional changes over time across treatment arms in a robust and flexible way.

In our models, several key risk factors are included to ensure a well-adjusted analysis. These include baseline disease severity, as measured by the Mini-Mental State Examination (MMSE), prior use of standard AD treatments, and geographic region, as fixed effects. Adjustment for baseline values of the ADAS-Cog11 or ADCS-ADL23 scores is considered to account for differences between subjects at baseline. This helps improve the precision of treatment effect estimates and correct for any imbalances between treatment groups. We also include the treatment group indicator along with its interactions with visit timing to capture if and how treatment effects evolve over time.

This method is particularly valuable for several reasons. First, it allows us to control for variables that could influence the observed outcomes, such as known risk factors, so that the treatment effect can be estimated more accurately. Additionally, mixed-effects models account for both between-subject and within-subject variability over time, which is especially important in a heterogeneous condition like Alzheimer’s. Finally, one of the key strengths of MMRM is its ability to handle incomplete data: it uses all available observations and, under a missing-at-random assumption, accounts for missing values without requiring imputation.
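As a rough sketch, a model of the kind described above could be written as follows. The notation is an assumption made for illustration: the response is taken to be the change from baseline for subject i at visit j, region is shown with a single coefficient for brevity (a multi-level factor would need one coefficient per non-reference level), and the within-subject covariance matrix is left general because the text does not name its structure.

```latex
Y_{ij} = \beta_0 + \beta_1\,\mathrm{base}_i + \beta_2\,\mathrm{MMSE}_i
       + \beta_3\,\mathrm{prior}_i + \beta_4\,\mathrm{region}_i
       + \gamma\,\mathrm{trt}_i + \tau_j + \delta_j\,\mathrm{trt}_i
       + \varepsilon_{ij},
\qquad
\boldsymbol{\varepsilon}_i = (\varepsilon_{i1}, \dots, \varepsilon_{iJ})^\top
  \sim N(\mathbf{0}, \boldsymbol{\Sigma})
```

With the interaction term for the reference visit set to zero, the treatment difference at visit j is \gamma + \delta_j, which is how the model lets the treatment effect evolve over time.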

The MMRM method supports the generation of individual and group profile graphs over time. These visualizations offer a clear and intuitive way to observe the evolution of treatment effect. They make it easier to compare trends across groups or subjects, and communicate findings in a straightforward manner, both to scientific audiences and to stakeholders who may not be familiar with the statistical details.

 

Final takeaways

Alzheimer’s disease is the most prevalent neurodegenerative disease and remains one of the most complex challenges in clinical research, requiring robust methodologies to capture both cognitive and functional decline over time. Complementary and adapted clinical scales are essential tools for assessing disease progression, and advanced statistical methods offer a robust and flexible interpretation of the treatment effect.

By leveraging adaptive models, mixed-effects approaches, and sensitivity analyses, we help sponsors generate reliable insights that drive decision-making in neurodegenerative drug research.

A Preview of Cytel’s Contributions at PHUSE EU 2025

I can’t believe it has already been a year since we wrapped up PHUSE EU Connect 2024. In two weeks we will be gathering for another exciting PHUSE EU Connect conference, only a few kilometers from Heidelberg, where everything started twenty years ago with the very first PHUSE event. I was one of the couple of hundred lucky attendees and now, twenty years later, I have the great honor of supporting Jennie McGuirk and Jinesh Patel as Conference Co-chair for this year’s edition.

With a promising agenda featuring about 190 presentations, 34 posters, 9 hands-on workshops, 2 panel discussions, and 3 inspiring keynote speakers, this year we are going to the city of Hamburg for the 21st PHUSE EU Connect. The agenda is full of topics looking toward the future, with about 40 talks and posters referring to AI in their titles, and once again open source will be the confirmed leitmotif.

Cytel will make a significant contribution this year, perhaps more than ever, with six presentations, one poster, active participation in both panel discussions, and co-chairing the “Scripts, Macros and Automation” and “People Leadership & Management” streams.

 

Monday topics: Agile code writing, extracting metadata from R OOP functions, and leadership

The week kicks off on Monday with Kamil Foltynski, who will present “Overcoming Challenges in Collaborative Spreadsheet Editing with Shiny, SpreadJS and JSON-Patch” in the Application Development stream at 11:30 am. Kamil will provide a technical deep dive into enabling real-time spreadsheet editing within Shiny applications, using tools such as SpreadJS, and will share key lessons learned so far. Following Kamil’s presentation, Eswara Satyanarayana Gunisetti will present “Micro-Decisions, Macro Impact: The Role of Agile Thinking in Every Line of Code” in the “Coding Tips & Tricks” stream at 12 pm. See his recent blog on the topic. Eswara will share how an agile “mindset” can positively influence the way we write code.

In the same stream, a few hours later at 2 pm, another colleague, Edward Gillian, in collaboration with Sanofi, will present “Risk.assessr: Extracting OOP Function Details,” discussing strategies for extracting metadata from R Object-Oriented Programming functions. Between Eswara’s and Edward’s sessions, at 1:30 pm, Kath Wright will moderate the Interactive People Leadership & Management session “Invisible Glue: Trust, Influence and The Architecture of Teamwork.” In this live workshop, attendees will engage in practical exercises to learn how to identify barriers to trust, evaluate influence dynamics, and apply evidence-based strategies to strengthen collaboration in both physical and virtual environments.

 

Tuesday topics: Industry trends, extracting macro usage and dependency information from SAS programs, and integrating ECA data into CDISC-compliant datasets

Tuesday also brings two presentations and one poster. Right after lunch at 1:30 pm, Cedric Marchand will join other industry leaders in the panel discussion “Reimagining Statistical Programming: AI, Standards & the Talent of Tomorrow.” The panel will explore how current industry trends, such as AI, open source, and the evolution of data standards, will influence the next generation of statistical programmers.

The afternoon continues at 4 pm with my young and talented colleague Marie Poupelin, who will present “From Zero to Programming Hero: How Internships Shape Statistical Programmers in a CRO” in the “Professional Development” stream. Marie is a great example of the success of our internship program, and she will share her journey from having “zero” statistical programming experience to becoming an industry-ready programmer. Thirty minutes later, at 4:30 pm, Guido Wendland will present “Which Macros Are Used in the Study?” in the “Scripts, Macros and Automation” stream, a stream co-led this year for the first time by my colleague Sebastià Barceló. Guido will discuss techniques to extract macro usage and dependency information from SAS programs; this is particularly useful for identifying potential issues or estimating the impact of macro updates.

Later, in the traditional Tuesday evening poster session, you can join my colleague Cyril Sombrin in discussing “Our Journey in Integrating External Control Arms (ECAs) and RWD for Rare Disease Trials.” There you can discuss real-world case studies on integrating ECA data into CDISC-compliant datasets, exploring the unique challenges and solutions when aligning real-world data with CDISC standards.

 

Wednesday topics: Real-time spreadsheet editing within Shiny applications and real-time validation and streamlined submissions

On Wednesday at 12 pm, Hugo Signol, another talented young Cytel statistical programmer and a product of our internship program, will present his talk “From XPT to Dataset-JSON: Enabling Real-Time Validation and Streamlined Submissions.” Building on Cytel’s experience from the CDISC Dataset-JSON-Viewer Hackathon, Hugo will demonstrate a Shiny application that supports interactive exploration and real-time validation through API-based checks.

 

Meet us there!

Cytel will be at Booth 9 at the conference, where you can engage in discussions with our team or meet any of us throughout the week.

I hope I didn’t miss anyone, or anything! We look forward again to reuniting with colleagues and friends from around the world and meeting new acquaintances.

See you all in Hamburg!

Statistical Insight for Strategic Impact: How Statisticians Help Medical Affairs Make the Most of Their Data

Medical affairs is a critical function in the drug development and commercialization process. It ensures that scientific and medical information about a drug is accurate, balanced, and clearly communicated to healthcare providers, medical professionals, patients, and other stakeholders.

Statisticians work closely with Medical Affairs teams to help make the most of their data.

In this blog, we will introduce medical affairs, discuss the role of statisticians in supporting their teams, and share a case study illustrating our collaboration with a sponsor for a neurology drug.

 

Medical Affairs: A bridge between clinical research and the broader healthcare community

Medical affairs guides drug development strategy and clinical communication in several ways:

 

Post-marketing studies

In the post-marketing phase of a drug’s lifecycle, Medical Affairs plays a key role by supporting:

  • Phase IV studies designed to gather additional information on a drug’s effectiveness and safety in the real-world setting.
  • Observational studies aimed at understanding treatment patterns, patient outcomes, and disease epidemiology.
  • Local affiliate studies driven by country teams to address specific market needs, regulatory requirements, or strategic priorities.

 

Exploratory analyses

Medical affairs teams perform exploratory analyses of clinical trials data to gather new insights in addition to those available at the time of the drug approval. These analyses could support:

  • Label expansion, by extending efficacy or safety results to other populations.
  • Scientific publications that help the activities in the medical and scientific community.
  • Real-world relevance, by contextualizing trial results with clinical practice or patient needs.

Exploratory medical affairs analyses also help by exploring tertiary endpoints from completed clinical trials or by looking further at endpoints such as quality of life or biomarkers.

 

Analyses for publications

In addition to exploratory analyses, medical affairs conducts data analyses that are used to support:

  • Medical education materials
  • Congress abstracts, posters, and oral presentations
  • Manuscripts

 

The role of the statistician in medical affairs

Statisticians (like those of Cytel’s Project-Based Services (PBS) team) work closely with medical affairs to support scientific and strategic objectives. Their role includes:

 

Deep understanding of the therapeutic context

Statisticians ensure analyses are relevant and aligned with clinical objectives by building a strong foundation of medical and scientific knowledge.

 

Support of medical affairs studies

Statisticians are hands-on, analyzing and interpreting medical affairs studies, helping generate real-world evidence and actionable insights.

 

Interdisciplinary collaboration
Working with clinical and medical writing teams ensures that outputs are robust and meet the latest standards in data communication.

 

Close collaboration with clients
Open communication helps align priorities, understand timelines, and ensure that deliverables support both scientific communication and business strategy.

 

Clear communication of statistical results
Statisticians translate complex analyses into accessible messages that can be shared with healthcare professionals and internal stakeholders.

 

Familiarity with regulatory submissions
Knowing the key findings from submission dossiers allows for continuity between regulatory and post-marketing activities.

 

Case Study: Supporting a neurology drug from submission to post-marketing

A pharmaceutical company conducting neurology studies needed support for the regulatory approval process and post-approval work. Cytel’s specialized team of statisticians first collaborated with the sponsor at the regulatory submission stage by:

  • Supporting the regulatory submission package (e.g., developing the statistical analysis plans for individual study and integrated safety and efficacy analyses, and providing input on the clinical study report (CSR)).
  • Supporting the regulatory queries (e.g., conducting sensitivity checks and long-term safety and efficacy analyses).

 

Key contributions

The team’s familiarity with the data, as well as their ability to quickly adapt to evolving project needs and meet tight deadlines, enabled them to help the sponsor transition seamlessly from the submission stage to the post-marketing stage. Here, key contributions included:

  • Exploratory analyses of clinical data to support scientific discussions.
  • Continuous data interpretation and communication, transforming complex results into clear, impactful messages for medical teams.
  • Support of the medical communication team in the development of congress abstracts by ensuring that statistical results are clearly and accurately presented. This includes creating high-quality graphical displays (e.g., Sankey plot, Volcano plot) and many other standard and unusual visualizations that can be directly used in congress posters.
  • Creation of slide decks for external use, tailored to different audiences such as steering committees or advisory boards. These presentations require visually engaging graphics and a clear, concise interpretation of results to effectively communicate key findings.

 

Building effective collaboration

This collaboration with the client was highly effective, built on a foundation of mutual trust, communication, and shared scientific ambition. It was based on the following principles:

  • Flexibility
    We adapt quickly to evolving project needs, changing timelines, and emerging priorities.
  • Rapid Turnaround
    Our teams, composed of statisticians and statistical programmers, are structured to deliver high-quality outputs under tight deadlines, which is critical for time-sensitive materials like congress abstracts.
  • Ongoing alignment with current data
    We adjust/re-run analyses as new data becomes available, helping Medical Affairs teams stay up to date with the latest evidence.
  • Specialized teams
    Our statisticians and programmers are experienced in preparing data for scientific communication. They are not only familiar with a wide range of data presentations that are typically used in the medical literature, but can also bring in creative solutions on how to make complex results accessible and relevant for both scientific and non-specialist audiences, including graphical data displays.
  • Regular meetings and flexible communication
    Cytel PBS and client statisticians and programmers working on the same drug hold regular (usually monthly) meetings to ensure alignment and efficient progress. They embrace flexible and dynamic communication methods, including quick ad hoc meetings or written messages. This collaborative approach fosters transparency, improves cross-functional communication, and supports timely delivery of project milestones.

 

Final takeaways

This unique partnership enables high-quality deliverables and meaningful scientific engagement. Successful collaboration with medical affairs teams is driven not only by technical excellence, but also by trust, deep commitment, and cross-functional synergy.

Analyzing Endpoints in Multiple Sclerosis Clinical Trials: Statistical Considerations

Clinical trials studying multiple sclerosis (MS) — a chronic, inflammatory, progressive, autoimmune disease affecting the central nervous system — employ various common endpoints. These typically target frequency of relapses, progression of disability, and MRI activity, as well as “no evidence of disease activity” (NEDA), which is a concept/composite endpoint combining the prior three components. Analyzing these can present several statistical challenges.

Here, we provide an overview of the common endpoints (including their definitions) in MS clinical trials and key statistical considerations together with the statistical modeling techniques to analyze them, as well as considerations of how to overcome several statistical challenges that we encountered in this indication.

 

Frequency of relapses

A key clinical feature of MS is the occurrence of relapses, i.e., episodes of new or progressing neurological dysfunction, lasting for a period, followed by periods of remission. Distinguishing a relapse from other clinical conditions may not be straightforward; therefore, an accurate definition should be included in the protocol.  A typical endpoint here is the number of relapses occurring within one year, i.e., the annualized relapse rate (ARR).

From a statistical point of view, this constitutes count data and thus, we analyze it using:

  • Poisson regression model, or
  • Negative binomial model. This model better accounts for a high number of zero counts (i.e., zero inflation) and overdispersion (i.e., greater variability than the Poisson model expects, with the variance larger than the mean).

Both models are often adjusted for MS prognostic factors.
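A quick descriptive check can suggest which of the two models is more plausible before any model is fitted. The sketch below compares the sample variance of the counts with their mean, since a Poisson model expects the two to be roughly equal. The data are made up for illustration, and the check deliberately ignores exposure time and covariates.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 under a Poisson model."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the index is undefined")
    return variance(counts) / m


# Made-up relapse counts for ten patients (illustrative only).
relapse_counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 5]

index = dispersion_index(relapse_counts)
zero_share = relapse_counts.count(0) / len(relapse_counts)

print(f"dispersion index: {index:.2f}, share of zeros: {zero_share:.0%}")
```

An index well above 1, as in this example, is the pattern that usually motivates the negative binomial model.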

In recent years, we’ve seen a decrease in the number of relapses, largely due to earlier MS diagnoses and the widespread use of high-efficacy disease-modifying therapies. When patients have no relapses, the derivation of ARR can go wrong. In quality checks of sponsors’ analyses, we have seen that, especially when an unadjusted approach is followed, the exposure or observation time of patients with no relapses is mistakenly not accounted for.
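The role of exposure time is easiest to see in the unadjusted, group-level ARR: total relapses divided by total person-years of observation. The sketch below uses made-up data and assumed function names. It shows how leaving out the observation time of relapse-free patients inflates the rate.

```python
DAYS_PER_YEAR = 365.25


def annualized_relapse_rate(relapse_counts, days_on_study):
    """Unadjusted ARR: total relapses divided by total person-years."""
    if len(relapse_counts) != len(days_on_study):
        raise ValueError("one exposure time is needed per patient")
    person_years = sum(days_on_study) / DAYS_PER_YEAR
    if person_years <= 0:
        raise ValueError("total exposure must be positive")
    return sum(relapse_counts) / person_years


# Made-up data: four patients, three of them relapse-free.
relapses = [0, 0, 2, 0]
days = [365.25, 182.625, 730.5, 91.3125]

# Correct: every patient contributes exposure, with or without relapses.
arr_all = annualized_relapse_rate(relapses, days)

# Mistaken: exposure of relapse-free patients is dropped.
relapsers = [(r, d) for r, d in zip(relapses, days) if r > 0]
arr_relapsers_only = annualized_relapse_rate(
    [r for r, _ in relapsers], [d for _, d in relapsers]
)

print(f"ARR with all exposure: {arr_all:.2f}")
print(f"ARR ignoring relapse-free patients: {arr_relapsers_only:.2f}")
```

Here the correct rate is about 0.53 relapses per year, while the mistaken derivation gives 1.0.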

Another common method is time to first relapse, analyzed with survival analysis methods. At Cytel, our teams have also explored recurrent event analysis using the Mean Cumulative Function (MCF) method, though such analyses are limited by the evolving nature of relapse patterns.
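For readers unfamiliar with the MCF, the sketch below shows a standard nonparametric estimator: at each relapse time, the curve rises by the number of relapses at that time divided by the number of patients still under observation. The data layout and names are assumptions for illustration, and this is not necessarily the implementation our teams used.

```python
def mean_cumulative_function(event_times, followup_end):
    """
    event_times:  dict of patient id -> list of relapse days
    followup_end: dict of patient id -> last day of observation
    Returns a list of (day, estimated mean cumulative number of relapses).
    """
    # Keep only relapses that occurred while the patient was observed.
    all_events = sorted(
        t
        for patient, times in event_times.items()
        for t in times
        if t <= followup_end[patient]
    )
    curve, total = [], 0.0
    for t in sorted(set(all_events)):
        at_risk = sum(1 for end in followup_end.values() if end >= t)
        total += all_events.count(t) / at_risk
        curve.append((t, total))
    return curve


# Made-up data: patient C leaves the study at day 180.
followup = {"A": 360, "B": 360, "C": 180, "D": 360}
events = {"A": [90, 270], "B": [], "C": [120], "D": [200]}

mcf = mean_cumulative_function(events, followup)
for day, value in mcf:
    print(f"day {day}: {value:.3f}")
```

Unlike time to first relapse, this estimate uses every relapse a patient has, which is why it is attractive when relapses recur.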

 

Progression of disability

The most widely used measurement tool to describe disease progression in patients with MS is the Expanded Disability Status Scale (EDSS).1 The EDSS includes a neurological evaluation of seven functional systems (plus “other”) in conjunction with observations and information concerning gait and use of assistive devices to rate the level of disability, resulting in a single score.

While EDSS is a widely accepted measure, it has been criticized for certain limitations. For example, a 1-point increase at lower EDSS levels (e.g., 2.0 to 3.0) may reflect different functional implications than the same increase at higher levels (e.g., 6.0 to 7.0).

To address these limitations, we commonly use Confirmed Disability Progression (CDP), which is based on an increase in the EDSS score (e.g., 0.5, 1.0, or 1.5 points) that is confirmed after a specified period (e.g., 3 or 6 months), depending on the baseline EDSS value.

It’s important to note that neither the terminology nor the definition is standardized; there are several variations in its application across different sponsors.

When Cytel analyzes CDP as a binary endpoint (yes/no), we typically use logistic regression adjusting for relevant MS prognostic factors. One of the challenges encountered with this approach is incomplete data: the assessment needed to confirm disease progression may not be available:

  • Some patients withdraw from the study.
  • In other cases, the EDSS assessments are not frequent or consistent enough to confirm progression. For example, if confirmation of CDP is required at 6 months, the study protocol should define the corresponding visit frequency, and the statistical analysis plan should specify the minimum time interval required for the confirmatory assessment, e.g., 6 months × 30 days − 14 days = 166 days, so that the confirmation is not missed by a few days.

Both scenarios result in patients being classified with “unknown” status, which can complicate the interpretation and robustness of the analysis.
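The derivation logic can be sketched in a few lines. Everything specific below is an assumption for illustration: the thresholds by baseline EDSS are hypothetical (definitions vary by sponsor, as noted above), the rule that the increase must be sustained at every intermediate visit is one possible choice, and the 166-day minimum interval is taken from the example in the text.

```python
MIN_CONFIRMATION_DAYS = 6 * 30 - 14  # 166 days, the example interval from the text


def required_increase(baseline_edss):
    # Hypothetical thresholds for illustration only; definitions vary by sponsor.
    if baseline_edss == 0:
        return 1.5
    if baseline_edss <= 5.5:
        return 1.0
    return 0.5


def cdp_status(baseline_edss, assessments, min_gap=MIN_CONFIRMATION_DAYS):
    """
    assessments: list of (study_day, edss) for post-baseline visits.
    Returns "confirmed", "not progressed", or "unknown".
    """
    threshold = required_increase(baseline_edss)
    visits = sorted(assessments)
    pending = False
    for i, (onset_day, edss) in enumerate(visits):
        if edss - baseline_edss < threshold:
            continue  # this visit is not a candidate onset of progression
        for later_day, later_edss in visits[i + 1:]:
            if later_edss - baseline_edss < threshold:
                break  # increase not sustained; look for a later onset
            if later_day - onset_day >= min_gap:
                return "confirmed"
        else:
            # Never refuted, but no visit late enough to confirm it.
            pending = True
    return "unknown" if pending else "not progressed"


confirmed = cdp_status(3.0, [(90, 3.0), (180, 4.0), (346, 4.0)])
missed_by_days = cdp_status(3.0, [(90, 3.0), (180, 4.0), (340, 4.0)])
withdrawn = cdp_status(3.0, [(180, 4.0)])
```

In the second case the confirmatory visit falls 160 days after onset, six days short of the window, so the patient ends up with "unknown" status just like the patient who withdrew.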

Another approach is to analyze time to first CDP via survival data analysis methods.

 

Magnetic resonance imaging (MRI)

Relapses or EDSS may not be a sufficient indicator of MS activity. The inflammation caused by MS does not always result in a relapse or any visible symptoms and may only be seen with an MRI.

The most common MRI-related endpoints are T2 lesions count or volume, active (i.e., new or enlarging) T2 lesion count, and T1 (Gd+ / hypointense) lesions count or volume.

The expectation from treatment with an MS drug is that MRI activity is also “suppressed” (broadly speaking, we do not observe new or enlarging T2 lesions, new T1 Gd+ lesions do not appear, etc.). This usually happens at an early stage of the study, in line with the expected onset of action of the drug. However, new lesions may eventually appear, or others may grow in size.

For the statistical analysis, MRI-derived endpoints reflect MRI activity, such as counts of new or enlarging lesions. The counts can be further expressed on a:

  • binary scale where at least one new or enlarging lesion is present vs none (coded as 1/0 for lesion count: >0/ =0).
  • continuous scale, e.g. changes in MRI activity such as counts of lesions that are new or enlarging compared to baseline (or any other relevant visit).

Such data is challenging, but can be handled using parametric approaches, including:

  • counts via Poisson regression or, in case of zero inflation, via a generalized linear model assuming a negative binomial distribution, adjusted for MS prognostic factors.
  • binary scale via the McNemar test, since a shift across visits from lesions present to none is frequently of interest.
  • continuous endpoint (change from baseline) via a linear regression model (a mixed model if random effects are included), adjusted for MS prognostic factors.

(The non-parametric methods are not discussed here.)
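To illustrate the binary-scale analysis listed above, the sketch below implements an exact McNemar test on paired presence/absence data. Only the patients whose status changes between the two visits (the discordant pairs) carry information. The data and names are made up for illustration.

```python
from math import comb


def mcnemar_exact(pairs):
    """
    pairs: list of (active_at_first_visit, active_at_second_visit) booleans.
    Returns (present_to_none, none_to_present, two-sided exact p-value).
    """
    b = sum(1 for before, after in pairs if before and not after)
    c = sum(1 for before, after in pairs if not before and after)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, nothing to test
    # Under the null hypothesis each discordant pair is equally likely
    # to go in either direction, i.e. Binomial(n, 0.5).
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Made-up data for 20 patients.
pairs = (
    [(True, False)] * 8     # lesions present -> none
    + [(False, True)] * 1   # none -> lesions present
    + [(True, True)] * 5
    + [(False, False)] * 6
)

present_to_none, none_to_present, p_value = mcnemar_exact(pairs)
print(present_to_none, none_to_present, round(p_value, 4))
```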

The analysis is, however, much more complex due to the nature of data collection for lesion counts: the assessment of the lesions may be performed multiple times per visit (such as for T1 Gd+ lesions), or in reference to a previous MRI scan to detect new or enlarging lesions. This is reflected in the analysis by random effects, standardization, or an offset in the count models that accounts for the number of scans or the time since baseline.
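In a count model with a log link, the offset mentioned above enters the linear predictor with a fixed coefficient of 1. In the sketch below, the symbols are assumptions for illustration: Y_i is the lesion count for patient i, t_i is the number of scans or the time since baseline, and x_i holds the treatment indicator and MS prognostic factors.

```latex
\log \mathrm{E}[Y_i] = \log(t_i) + \mathbf{x}_i^\top \boldsymbol{\beta}
\quad\Longleftrightarrow\quad
\frac{\mathrm{E}[Y_i]}{t_i} = \exp\!\left(\mathbf{x}_i^\top \boldsymbol{\beta}\right)
```

The coefficients therefore describe lesions per scan (or per unit of time) rather than raw counts, which keeps patients with different numbers of scans comparable.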

 

NEDA

No evidence of disease activity (NEDA), also referred to as freedom from disease activity, is one of the composite endpoints taking clinical and imaging endpoints into account.

NEDA is defined by the absence of:

  • Relapse
  • CDP
  • MRI activity

NEDA is typically analyzed as a binary outcome (yes/no), using logistic regression adjusted for MS prognostic factors.
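Deriving the binary outcome from its three components is simple in principle, as the sketch below shows. The handling of a component that could not be assessed (returning "unknown") is an assumption made here for illustration; in practice this rule has to be defined in the statistical analysis plan.

```python
def neda_status(relapse, cdp, mri_activity):
    """
    Each argument is True (activity observed), False (absence documented),
    or None (component could not be assessed).
    Returns "yes", "no", or "unknown".
    """
    components = (relapse, cdp, mri_activity)
    if any(c is True for c in components):
        return "no"  # any observed activity rules out NEDA
    if any(c is None for c in components):
        return "unknown"  # absence of activity cannot be claimed
    return "yes"


patients = {
    "P01": (False, False, False),
    "P02": (False, False, True),   # MRI activity only
    "P03": (False, None, False),   # CDP could not be confirmed
    "P04": (True, None, False),    # relapse observed, CDP missing
}
results = {pid: neda_status(*flags) for pid, flags in patients.items()}
```

Note that patient P04 is classified as "no" despite the missing CDP component, because one observed relapse is enough to rule out NEDA.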

Because most MS disease-modifying drugs (DMDs) aim to reduce relapse frequency, relapses tend to be less common in treated people with MS. As a result, the two remaining components of NEDA, CDP and MRI activity, may have a greater impact on its outcome.

However, there is no single agreed definition of the CDP and MRI endpoints (frequency and parameter selection), and thus the NEDA results reported by different studies might not be comparable:

  • We observe that studies present 3-month, 6-month, or even 12-month CDP,
  • MRI activity may be defined in various ways, including
    • frequency of MRI scans,
    • selection of relevant MRI readouts (e.g. presence of T1 Gd+ lesions or new or enlarging T2 lesions), etc.

The more frequently the measures are taken, the higher the likelihood of identifying progression. In post-marketing studies that reflect real-world clinical practice (half-yearly or yearly visits), the EDSS measures necessary for the CDP definition and MRI scans are often not collected at the necessary frequency.

In addition, for assessments that are scheduled infrequently, each missing assessment may affect NEDA heavily and, in our experience, there is no harmonized approach to handling such missing data across different companies. In such situations, NEDA might be analyzed differently, including time to first NEDA using survival analysis methods. Overall, it is important that potential challenges (whether related to data collection or analytical methods) are carefully considered at the early stages of study planning.

 

Final takeaways

While these endpoints provide valuable frameworks for assessing disease progression and treatment efficacy of MS patients, statistical challenges remain. Addressing these challenges, in close collaboration with medical experts, is essential to ensure that the analyses remain both scientifically sound and clinically meaningful.

From Metadata to Submission: Rule-Based Robotic Process Automation for Statistical Programming Excellence

In the race to modernize data operations in clinical research and regulatory submissions, Robotic Process Automation (RPA) powered by rule-based systems has emerged as a dependable and high-impact solution. These systems offer clarity, control, and reproducibility — critical traits for industries like biopharma where regulatory compliance and data integrity are non-negotiable.

Here, we discuss rule-based RPA as the foundation for a scalable and auditable standards automation pipeline.

 

Rule-based automation: Transparent, trusted, and tunable

Unlike more probabilistic models, rule-based systems operate on deterministic logic. Every output is traceable back to an explicit rule, which enhances trust and simplifies troubleshooting. This transparency is particularly valuable when the processes must be easily explained to stakeholders and auditors.

Key strengths of rule-based RPA include:

Transparency

Each step in the workflow is rule-driven, making the logic easy to inspect, validate, and justify. This ensures regulatory reviewers can clearly understand how data was transformed or outputs generated — vital in submission contexts.

Consistency

Standard rules applied across studies generate consistent outputs. For example, Cytel’s ALPS system creates SDTM and ADaM code from structured specifications, producing reliable results that hold up across different projects and teams.

Customizability

Rule-based systems are modular. Teams can easily adapt existing rules to accommodate study-specific needs without overhauling the entire system. Tools like Prism allow this by applying both generic rules and study-specific layers for enriched metadata processing.
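The three properties above can be illustrated with a toy rule engine. This is a hypothetical sketch, not the implementation of ALPS or Prism: the rule identifiers, the variable metadata, and the actions are all invented for illustration. Each decision returns the identifier of the rule that produced it, and study-specific rules are layered on top of generic ones.

```python
# Each rule: (rule id, condition on the variable's metadata, resulting action).
GENERIC_RULES = [
    ("R001", lambda v: v["type"] == "date", "format as ISO 8601"),
    ("R002", lambda v: v["type"] == "numeric", "summarize with n, mean, SD"),
]

# Hypothetical study-specific layer that overrides a generic rule.
STUDY_RULES = [
    ("S001", lambda v: v["name"] == "EDSS", "summarize with median and range"),
]


def apply_rules(variable, study_rules=STUDY_RULES, generic_rules=GENERIC_RULES):
    """Return (action, rule id); study-specific rules take precedence."""
    for rule_id, matches, action in list(study_rules) + list(generic_rules):
        if matches(variable):
            return action, rule_id  # the rule id makes the output traceable
    return None, None  # no rule applies: needs manual programming


edss = apply_rules({"name": "EDSS", "type": "numeric"})
age = apply_rules({"name": "AGE", "type": "numeric"})
visit_date = apply_rules({"name": "VISITDT", "type": "date"})
free_text = apply_rules({"name": "COMMENT", "type": "text"})
```

The same input always yields the same output and the same rule identifier, which is the property that makes this style of automation easy to explain to an auditor.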

 

Cytel’s metadata-driven RPA workflow in action

Our internal automation pipeline demonstrates the power of rule-based RPA. It’s built on a modular architecture where each tool performs a specific, rules-driven task:

  • ALPS: Converts metadata specifications into ready-to-run SAS code for SDTM and ADaM datasets, reducing manual programming and minimizing error risks.
  • Lighthouse: Enables biostatisticians to build mock shells using reusable templates, ensuring consistency in table and listing structures.
  • Prism: Extracts metadata from mock shells and transforms it into XML-format ARMs (Analysis Results Metadata), enriching it through rules and generating code for up to 60% of standard safety outputs.
  • TAB Macros and CytelDocs: Automate the creation of summary tables and documentation, saving hours of effort and ensuring compliance with standardized formats.

This end-to-end pipeline reduces manual touchpoints, maintains high quality, and boosts team efficiency.

 

Where generative AI complements RPA

While rule-based systems are ideal for tasks requiring consistency and auditability, generative AI can complement these systems — particularly in areas where variability is acceptable and outputs don’t require deterministic reproducibility. For example, Gen AI can assist with:

  • Drafting exploratory narratives or documentation
  • Suggesting code for non-critical outputs
  • Enhancing user interfaces with intelligent prompts
  • Enriching the set of study-specific rules to be used

However, these AI-driven capabilities are best applied where hallucinations won’t compromise integrity, and outputs don’t demand rigid consistency.

 

Business and quality benefits of rule-based RPA

By relying on rule-based RPA for core data workflows, we’ve realized several tangible gains:

  • Time efficiency: Standard code is generated automatically, freeing time for custom analysis.
  • Reduced redundancy: Developers no longer rewrite common code across projects.
  • Improved QA: Outputs are independently validated and built on rigorously tested rule sets.
  • Collaboration at scale: Uniform rules simplify onboarding and knowledge transfer.
  • Focus on what matters: Teams can concentrate on non-standard elements that require expertise.

 

Final takeaways

Rule-based RPA systems provide the transparency, structure, and adaptability required for high-stakes data environments. At Cytel, we’ve found them indispensable in our mission to expedite regulatory submissions without compromising on quality or compliance. As AI continues to evolve, generative technologies may enrich this foundation — but rule-based automation remains the core engine that ensures accuracy, accountability, and speed.

Master Protocols in Oncology Trials

A master protocol is defined as a protocol designed with multiple sub-studies, which may have different objectives and involve coordinated efforts to evaluate one or more investigational drugs in one or more disease subtypes within the overall trial structure. Master protocol trials include three trial designs: basket trials, umbrella trials, and platform trials.

FDA guidance released in March 2022 provides recommendations for master protocol trials.

In this blog, we discuss master protocol trial designs, challenges and best practices, and the benefit of these innovative designs in oncology trials.

 

Types of master protocol trials

Basket trials

Basket trials are designed to test a single investigational drug or drug combination in different populations defined by different cancers, disease stages for a specific cancer, histologies, number of prior therapies, genetic or other biomarkers, or demographic characteristics.

 

Umbrella trials

Umbrella trials are designed to evaluate multiple investigational drugs administered as single drugs or as drug combinations in a single disease population.

 

Platform trials

Platform trials are master protocols in which arm(s) can be dropped or added based on knowledge gained from previously evaluated parts of the trial.

 

Figure 1: Basket Trials, Umbrella Trials, and Platform Trials

Image credit: Park, J. J. H., Siden, E., Zoratti, M. J., Dron, L., Harari, O., Singer, J., Lester, R. T., Thorlund, K., & Mills, E. J. (2019). Systematic review of basket trials, umbrella trials, and platform trials: A landscape analysis of master protocols. Trials, 20.

 

Key challenges with master protocol trials

Master protocol trials are inherently complex due to their expansive scope and varied components. Let’s look at these challenges in more detail:

 

Data management and analysis

  • Large amounts of data need efficient integration and processing.
  • Basket trials involve multiple indications, and endpoint definitions and/or response criteria may vary across indications.
  • Umbrella trials have multiple drugs, leading to complex exposure and safety summaries.
  • Platform trials continuously add new treatment arms, generating a dynamic dataset that requires real-time integration and analysis. This necessitates robust data management systems capable of handling evolving data structures and ensuring consistency across various cohorts.

 

Safety profile considerations

  • Variability in drug effects requires tailored safety monitoring strategies.
  • Adverse events of special interest might need to be defined for each drug separately.

 

Biomarker data complexity

  • Data can be relatively large and complex.
  • Having the data transfer specifications at an early stage is important to ensure that the correct data will be received and in the expected format.
  • Intensive discussion might be needed with biomarker data specialists to define the rules for deriving the biomarker/genomic profile of interest.
  • Mapping those data from raw data to SDTM can also be challenging.

 

Statistical Analysis Plan (SAP) and shell development

  • Potential additional complexity for statistical inference (e.g., adaptive features, multiplicity, and Bayesian methods).
  • The team needs to focus on the main objectives of the study; otherwise, the SAP and shells can become very extensive.
  • The number of tables, figures, and listings can grow significantly, making prioritization essential.
  • Layout complexities arise when numerous columns need to be displayed across multiple cohorts.

 

Operational and reporting challenges

  • Each cohort may follow different timelines, complicating interim and final analyses.
  • Frequent reporting requires good planning.
  • CSR(s) strategy (e.g., separate CSR for each cohort versus single CSR) should be defined sufficiently early.

Staying focused on the key study objectives is crucial to prevent data overload and inefficiencies in reporting. Exploratory analyses can be planned in a second step.

 

Comparative Overview: Basket vs. Umbrella vs. Platform Trials


 

Final takeaways

Master protocol trials represent a transformative shift in clinical research — enabling the simultaneous evaluation of multiple therapies or disease subtypes under a unified framework. While designs like basket, umbrella, and platform trials offer flexibility and efficiency, they also introduce significant operational, statistical, and data management complexities.

Success is built on early planning, early discussion with safety and biomarker teams, and a focus on core study objectives to ensure meaningful insights and readiness.

Implementing RECIST/iRECIST in Oncology Clinical Trials

The majority of clinical trials evaluating cancer treatments for objective response in solid tumors are using RECIST, or Response Evaluation Criteria in Solid Tumors. RECIST is crucial for evaluating the effectiveness of cancer therapies, but it’s not without its challenges.

In this blog, we detail RECIST, how it’s used in statistical analysis, the development of iRECIST for immunotherapy trials, statistical and clinical challenges with RECIST/iRECIST, and best practices for implementing RECIST/iRECIST in oncology trials.

 

What is RECIST 1.1 and why is it important in oncology?

RECIST (Response Evaluation Criteria in Solid Tumors) 1.1 is a standardized set of rules used to measure tumor response to treatment using imaging. It helps determine whether a tumor is shrinking, stable, or growing, which is crucial for evaluating the effectiveness of cancer therapies. As of today, the majority of clinical trials evaluating cancer treatments for objective response in solid tumors are using RECIST.

 

What are the key response assessments in RECIST 1.1?

The overall response for a given timepoint is the combination of target lesion response relying on unidimensional measurements, non-target lesion response, and presence/absence of new lesions.

  • Complete Response (CR): Disappearance of all target lesions and non-target lesions and no new lesions. Any pathological lymph nodes must have a reduction in short axis to <10mm.
  • Partial Response (PR): At least a 30% decrease in the sum of diameters of target lesions, taking as reference the baseline sum diameters, and no progression of non-target lesions and no new lesions.
  • Stable Disease (SD): Neither sufficient shrinkage to qualify for PR nor sufficient increase to qualify for PD, taking as reference the smallest sum diameters while on study, and no new lesions.
  • Progressive Disease (PD): At least a 20% increase in the sum of diameters and an absolute increase of ≥ 5mm, taking as reference the smallest sum of diameters on-study, or progression of non-target lesions or appearance of new lesions.
  • Not Evaluable (NE)
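The target-lesion part of these definitions can be sketched in a few lines. This is a simplified illustration, not a validated derivation: the function name and arguments are hypothetical, it covers target lesions only (the overall timepoint response also depends on non-target and new lesions), and the complete-disappearance flag is assumed to come from the reader.

```python
def target_lesion_response(baseline_sum, nadir_sum, current_sum,
                           complete_disappearance=False):
    """Classify the target-lesion response at one timepoint (sums in mm).

    nadir_sum is the smallest sum of diameters recorded on study so far,
    baseline included. complete_disappearance is a reader-supplied flag:
    all non-nodal target lesions gone and all nodal targets < 10 mm.
    """
    if current_sum is None:
        return "NE"  # target lesions not measured at this visit
    if complete_disappearance:
        return "CR"
    increase = current_sum - nadir_sum
    # PD needs both a relative (>= 20% of nadir) and an absolute (>= 5 mm) increase.
    if increase >= 0.20 * nadir_sum and increase >= 5:
        return "PD"
    # PR is judged against baseline: at least a 30% decrease.
    if current_sum <= 0.70 * baseline_sum:
        return "PR"
    return "SD"
```

Note the two different reference points: PD is checked against the nadir and before PR, so a patient whose sum is still 30% below baseline but has grown by 20% and 5 mm from the nadir is classified as PD.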

 

How is RECIST used in statistical analysis?

RECIST is used to derive key endpoints like:

  • Objective Response Rate (ORR)
  • Disease Control Rate (DCR)
  • Progression-Free Survival (PFS)
  • Duration of Response (DOR)
  • Time to Response (TTR)

 

RECIST 1.1 criteria state that confirmation of response (CR or PR) is required for non-randomized trials with a response primary endpoint to ensure responses identified are not the result of measurement error. However, in all other circumstances, i.e., in randomized trials (phase II or III) or studies where stable disease or progression are the primary endpoints, confirmation of response is not required since it will not add value to the interpretation of trial results.

The FDA generally expects a confirmed response for ORR in single-arm trials where it is the primary endpoint, especially for accelerated approval. The FDA/EMA may also request confirmation if ORR is a primary or key secondary endpoint or if imaging intervals are long. Nevertheless, this point should be discussed with Health Authorities, as the additional confirmatory scan is usually requested 4 weeks later and the protocol might not plan for it; such analyses cannot be conducted ad hoc if the confirmatory assessment is not initially planned in the protocol.
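As an illustration of what confirmation means at the data level, the sketch below derives a confirmed best response from a patient's timepoint responses. It is a deliberately simplified rule, not the full RECIST 1.1 confirmation table: a CR or PR counts as confirmed when a later assessment at least 28 days afterwards is at least as good, with no PD in between. The function name and the 28-day default are assumptions for this example.

```python
from datetime import date, timedelta

RANK = {"PR": 1, "CR": 2}  # only CR and PR can be confirmed responses


def confirmed_best_response(assessments, min_gap_days=28):
    """assessments: chronologically sorted list of (date, response) tuples.

    Returns "CR", "PR", or None when no response is confirmed.
    """
    best = None
    for i, (d1, r1) in enumerate(assessments):
        if r1 not in RANK:
            continue
        for d2, r2 in assessments[i + 1:]:
            if r2 == "PD":
                break  # progression ends the search for a confirmation
            if (d2 - d1).days >= min_gap_days and RANK.get(r2, 0) >= RANK[r1]:
                if best is None or RANK[r1] > RANK[best]:
                    best = r1
                break
    return best


start = date(2024, 1, 1)
visits = [(start + timedelta(days=56), "PR"),
          (start + timedelta(days=84), "PR")]
print(confirmed_best_response(visits))
```

If the second scan in this example were only 14 days after the first, or showed PD, the function would return `None`, which is why the confirmatory scan has to be planned in the protocol.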

 

What are the statistical and clinical challenges with RECIST 1.1?

Inter-reader variability

Despite the use of standardized RECIST 1.1 criteria for response, different radiologists may interpret imaging results differently, especially when measuring borderline lesions. This can introduce measurement bias and affect response classification (e.g., PR vs. SD) and therefore impact trial outcomes. As an example, the average discrepancy rate at the patient level was found to be 59.2% in lung cancer trials using RECIST 1.1.¹

Lesion selection and measurement errors

  • RECIST 1.1 limits the number of target lesions (up to 5 total, max 2 per organ).
  • The same target and non-target lesions must be followed across all timepoints; otherwise, the patient-level response will not be valid.
  • Small errors in measuring lesion diameters can significantly impact response categorization.

Non-measurable disease

When the patient has only non-measurable disease, the increase must be substantial to lead to an overall response of PD, a judgment that is relatively subjective.

Handling non-target and new lesions

Non-target lesions are assessed qualitatively, which introduces subjectivity.

The appearance of new lesions automatically triggers PD, even if the overall tumor burden is decreasing. Therefore, the finding of a new lesion should be unequivocal, i.e., not attributable to differences in scanning technique, change in imaging modality, or findings thought to represent something other than a tumor.

RECIST criteria are based on anatomical size, not functional or viable tumor volume

RECIST focuses on unidimensional measurements, regardless of internal characteristics like necrosis or cavitation (common in lung or liver metastases).

Other criteria (e.g., Choi criteria for GISTs) may be more appropriate when necrosis is a key feature of response.

In some tumor types or trials, modified criteria (e.g., mRECIST for hepatocellular carcinoma) are used, which do consider viable tumor (e.g., arterial enhancement) rather than total size.

RECIST does not capture atypical responses

Especially in immunotherapy, tumors respond differently compared with chemotherapy, raising questions about the assessment of changes in tumor burden. In particular, for immunotherapy, RECIST 1.1 may misclassify pseudoprogression as PD.

This has led to the development of iRECIST, but many trials still rely on RECIST 1.1.

Time-to-event endpoint challenges

PFS and DOR depend on accurate and timely assessments.

Delays in imaging or inconsistent scan intervals can lead to informative censoring or biased survival estimates.

Missing or incomplete data

Patients may miss scans or drop out, leading to missing data that complicates statistical modeling. Interval censoring can be used as a sensitivity analysis in that case.

Imputation is difficult due to the non-linear and categorical nature of RECIST outcomes.

Impact on interpretation

Low concordance between Independent Central Review and the Investigator would call the reliability of results into question.

 

Why was iRECIST developed and how does it differ from RECIST 1.1?

Traditional RECIST criteria may misclassify immune-related responses as progression. iRECIST was developed to:

  • Reflect atypical response patterns in immunotherapy
  • Allow continued treatment beyond initial progression
  • Improve consistency in trial design and data interpretation

 

iRECIST is an adaptation of RECIST 1.1 designed for immunotherapy trials. It accounts for pseudoprogression, where tumors may initially appear to grow before shrinking due to immune cell infiltration. iRECIST introduces:

  • Unconfirmed Progressive Disease (iUPD)
  • Confirmed Progressive Disease (iCPD)

This two-step confirmation helps avoid prematurely stopping effective immunotherapy.
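The two-step logic can be sketched as a small state machine. This is a simplified illustration rather than the full iRECIST algorithm: it assumes each visit has already been reduced to a RECIST-style category plus a reader-supplied flag saying whether the disease worsened further since the previous iUPD, and it does not enforce the timing of the confirmation scan. The function name and input layout are hypothetical.

```python
def irecist_labels(visits):
    """visits: chronological list of (category, worsened_since_iupd) tuples.

    category is "CR", "PR", "SD", or "PD" (RECIST 1.1 criteria met or not);
    worsened_since_iupd is only meaningful when an iUPD is pending.
    """
    labels = []
    pending_iupd = False
    for category, worsened in visits:
        if category != "PD":
            labels.append("i" + category)
            pending_iupd = False  # response or stability resets the bar
        elif pending_iupd and worsened:
            labels.append("iCPD")  # progression confirmed at a later scan
        else:
            labels.append("iUPD")  # first or not-yet-confirmed progression
            pending_iupd = True
    return labels


# Pseudoprogression: apparent growth, then shrinkage at the next scan.
print(irecist_labels([("PD", False), ("PR", False)]))
# True progression: growth confirmed by further worsening.
print(irecist_labels([("SD", False), ("PD", False), ("PD", True)]))
```

Under RECIST 1.1 both patients would be recorded as PD at the first apparent growth; here only the second one ends with a confirmed progression.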

 

What are the statistical challenges with iRECIST?

Delayed treatment effects

Immunotherapies may show delayed clinical benefits, which violate the proportional hazards assumption used in standard survival analysis (e.g., Cox models). This can complicate sample size estimation, primary analysis, and, in particular, hazard ratio interpretation.

Pseudoprogression and confirmation requirements

iRECIST introduces iUPD and requires a follow-up scan to confirm progression as iCPD, which delays the determination of progression and requires more complex modelling of iPFS. This also introduces interval censoring and time-dependent bias.

The exact time of progression is not precisely known — it lies between the iUPD and iCPD scans. Uncertainty around the exact date of progression, which is already present with RECIST, is larger with iRECIST, given that the second scan is needed to confirm the PD. A specific method like the interval censoring method might be more appropriate than the Kaplan-Meier and Cox models.

Patients who survive long enough and/or are still in the study to get a confirmation scan are not randomly selected; they may be somewhat healthier. This may introduce selection bias and time-dependent confounding.

Endpoint ambiguity

Common endpoints like PFS and ORR are harder to define, which can lead to inconsistent endpoint definitions across trials:

  • Should PFS be based on iUPD or iCPD?
  • How should iDOR be calculated?
  • What if patients drop out before confirmation?
  • The SAP should clearly define the derivations that answer these questions.

Data interpretation and trial comparability

Trials using iRECIST are not directly comparable to those using RECIST 1.1.

Meta-analyses and pooled analyses become more difficult.

The protocol/SAP may plan for both RECIST and iRECIST analyses, increasing complexity.

Increased risk of missing data

Patients may discontinue before confirmation scans for progression.

Imaging schedules may not align with iRECIST requirements: iRECIST requires a follow-up scan (typically within 4–8 weeks) after an initial iUPD to determine if the progression is real or pseudoprogression. However, in many clinical trials or treatment protocols, imaging is scheduled every 8–12 weeks, which may not fit with the expected confirmation window and increase the risk of missing data.

This leads to informative censoring and missing not at random (MNAR) data, which are hard to handle statistically.

Limited validation and standardization

iRECIST is still considered exploratory, especially for phase III trials (as per the guidelines).

There is no consensus on how to incorporate iRECIST endpoints into pivotal trials.

Validation requires large-scale data sharing, which is still limited.

 

Best practices for implementing RECIST/iRECIST in trials

  • Follow published guidelines.
  • Ensure the CRF appropriately collects the data (e.g., date of new lesions). Examples are available on the RECIST website.
  • Ensure standardized imaging schedules and methods.
  • Train radiologists and clinicians on RECIST/iRECIST criteria.
  • Consider blinded independent central review to reduce variability, when relevant.
  • Plan for additional scans to confirm progression with iRECIST.
  • Ensure the response criteria used are clear in the SAP, outputs, CSR, and manuscripts.

 

Where can I learn more or access the guidelines?

Full RECIST guideline

Full iRECIST guideline

RECIST Questions and Clarifications

iRECIST

 

Final takeaways

RECIST 1.1 is the standard tool for evaluating tumor response in oncology trials, offering a consistent framework based on anatomical measurements. While it has brought uniformity to clinical research, it comes with some limitations, such as subjectivity in lesion selection and an inability to capture atypical responses, especially with immunotherapies. To address these challenges, iRECIST was introduced as an adaptation that accounts for immune-related phenomena like pseudoprogression. However, it also brings statistical complexity and remains exploratory and not yet fully reliable, with limited validation for pivotal trials.

This is precisely where Cytel can bring value to sponsors. By combining deep statistical expertise with operational insight, Cytel helps design and implement robust RECIST and iRECIST strategies — from endpoint definition to handling complex censoring and missing data. Cytel supports sponsors in navigating regulatory expectations, ensuring that trial results are both scientifically sound and submission-ready.

The Estimand Framework in Oncology Trials

Oncology clinical trials are complex due to the nature of cancer progression, long follow-up times, start of further therapies, and ethical considerations. The estimand framework introduced in ICH E9(R1) provides a structured approach to align the clinical question with endpoints, intercurrent events, and analysis strategies.

 

Understanding the estimand framework in oncology

The estimand framework helps define what exactly a trial aims to measure, especially in the presence of intercurrent events (ICEs) that occur after treatment initiation and affect either the interpretation or existence of the outcome (like treatment discontinuation or new therapies).

Estimands need to be clearly defined in both the protocol and the Statistical Analysis Plan (SAP) using the five attributes outlined in the ICH E9(R1) addendum: population, variable (endpoint), treatment, intercurrent events and handling strategies, and population-level summary.

ICEs can complicate the estimation of treatment effects in oncology trials. Among these, the start of further anticancer therapy is particularly complex, especially when evaluating endpoints like Progression-Free Survival (PFS) and Overall Survival (OS).

Among all ICE handling strategies, two strategies are often used to handle the start of further anticancer therapy:

 

Hypothetical strategy

Estimate treatment effect in a world where further anticancer therapy would not exist.

  • Implementation: Typically involves censoring patients at the time they start further anticancer therapy or using advanced statistical methods.
  • Could be more meaningful from the patient’s and prescriber’s perspective if subsequent therapies are not yet approved drugs and thus do not reflect clinical practice.
  • May require additional data on baseline and/or time-dependent covariates to support modeling.

 

Treatment policy

Estimate treatment effect regardless of any further anticancer therapy, aiming to reflect real-world clinical practice.

  • Implementation: Includes all events regardless of further anticancer therapy.
  • Often considered the most relevant by regulatory authorities and other stakeholders if subsequent therapies are already approved and reflect clinical practice.
  • Tends to dilute the treatment effect.
  • Assessments must continue beyond start of subsequent therapy.
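To show how the two strategies change the derived PFS record for the same patient, here is a minimal sketch. It implements the hypothetical strategy in its simplest form (censoring at the last assessment on or before the start of further therapy), not the advanced methods mentioned above. The function name, the day-based inputs, and the convention of censoring at day 1 when no assessment exists are assumptions for this example.

```python
def derive_pfs(assessment_days, event_day, new_therapy_day, strategy):
    """Return (time, event_flag) for one patient; days are study days.

    event_day: day of progression or death, or None if not observed.
    new_therapy_day: start of further anticancer therapy, or None.
    strategy: "hypothetical" (censor at new therapy) or "treatment_policy".
    """
    if strategy not in ("hypothetical", "treatment_policy"):
        raise ValueError(f"unknown strategy: {strategy}")
    switched_first = new_therapy_day is not None and (
        event_day is None or event_day > new_therapy_day)
    if strategy == "hypothetical" and switched_first:
        # Censor at the last assessment on or before the new therapy.
        prior = [d for d in assessment_days if d <= new_therapy_day]
        return (max(prior) if prior else 1, 0)
    if event_day is not None:
        return (event_day, 1)  # event counted regardless of new therapy
    return (max(assessment_days) if assessment_days else 1, 0)


# Same patient: new therapy on day 120, progression documented on day 168.
print(derive_pfs([56, 112, 168], 168, 120, "hypothetical"))
print(derive_pfs([56, 112, 168], 168, 120, "treatment_policy"))
```

The treatment policy result only exists because the day-168 scan was performed after the new therapy started, which is why assessments must continue beyond that point.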

 

Regulatory landscape

Historically, the FDA’s 2007 guidance leaned toward censoring at the start of new anticancer therapy — aligning with the hypothetical strategy. However, more recent guidance (2015, 2018) acknowledges both strategies, and the EMA’s 2012 guidance implicitly supports the treatment policy approach by recommending that progression should be considered even when observed after new anticancer treatment.

In Acute Myeloid Leukemia (AML), the FDA’s 2022 guidance is particularly clear: subsequent treatments like HSCT or anti-AML drugs should be considered part of the overall treatment regimen and not censored in the primary analysis.

 

When the hypothetical strategy may be preferable

In trials where conditions diverge significantly from routine clinical practice — such as early crossover or use of unapproved therapies — a hypothetical strategy may better capture the true clinical question.

Advanced methods like Rank Preserving Structural Failure Time (RPSFT) and Inverse Probability of Censoring Weighting (IPCW) can help estimate what would have happened without treatment switching, but they come with assumptions and complexity.

 

Handling missing data in oncology

Effectively addressing missing data is essential for ensuring the reliability and integrity of statistical analyses in oncology trials. With regulatory agencies embracing the estimand framework, it’s essential to distinguish between ICEs and missing values, and to navigate their implications for primary and sensitivity analyses.

There is no consensus yet regarding how to handle missing tumor assessments in the primary analysis of PFS. Here’s a snapshot of key regulatory viewpoints:

According to the FDA’s 2018 guidance, “We recommend assigning the progression date to the earliest time when any progression is observed without prior missing assessments and censoring at the date when the last radiological assessment determined a lack of progression.”

The 2015 FDA NSCLC guidance offers case-based examples where progression events after two or more missed assessments are either censored or considered as events depending on the context, illustrating a cautious approach to ensure data robustness.

The 2012 EMA oncology guidelines advise against censoring for missed assessments: “The time of the progression or recurrence event is determined using the first date when there is documented evidence that the criteria have been met, even in situations where progression is observed after one or more missed visits, treatment discontinuation, or new anti-cancer treatment.”

Those different censoring rules can deeply impact PFS estimates, especially when early dropout rates are imbalanced between treatment arms.

Depending on the approach retained for the primary analysis, sensitivity analyses should be considered to assess the impact of missing tumor assessments. These may include a set of censoring rules that differs from the FDA guidance, as well as interval-censoring methods.
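A small sketch can make the effect of these censoring rules tangible. It contrasts counting a progression as an event whenever it is documented with censoring it when it follows two or more missed assessments. The function name, the 7-day visit window, and the way missed assessments are inferred from the gap between scans are assumptions for illustration; the actual rules belong in the SAP.

```python
def pfs_missed_visit_rule(last_adequate_day, pd_day, interval_days,
                          window_days=7, censor_after_two_missed=True):
    """Return (time, event_flag) for a patient with documented progression.

    last_adequate_day: last adequate tumor assessment before progression.
    interval_days: scheduled time between tumor assessments.
    """
    gap = pd_day - last_adequate_day
    # A gap longer than two scheduled intervals (plus the visit window)
    # means at least two consecutive assessments were missed.
    two_or_more_missed = gap > 2 * interval_days + window_days
    if censor_after_two_missed and two_or_more_missed:
        return (last_adequate_day, 0)  # censor at last adequate assessment
    return (pd_day, 1)  # event at the documented progression date


# Scans every 56 days; last adequate scan on day 56, PD found on day 224.
print(pfs_missed_visit_rule(56, 224, 56))
print(pfs_missed_visit_rule(56, 224, 56, censor_after_two_missed=False))
```

The same patient contributes either a censored time of 56 days or an event at 224 days, which is how imbalanced dropout between arms can shift the PFS estimate.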

 

Sensitivity and supplementary analyses

Understanding how different analyses relate to the primary estimand is critical for drawing robust and credible conclusions from clinical trial data. Two important analysis categories — sensitivity and supplementary analyses — serve distinct purposes and must be thoughtfully pre-specified in the SAP.

 

Sensitivity analyses: Testing the estimand’s foundations

According to ICH E9(R1), a sensitivity analysis is “a series of analyses conducted with the intent to explore the robustness of inferences from the main estimator to deviations from its underlying modeling assumptions and limitations in the data.”

Purpose: To verify that conclusions drawn from the primary analysis remain valid under alternative assumptions or data limitations. These analyses also probe key risks to inference, such as missing data or model specification.

Examples:

  • Using an unstratified Cox model instead of a stratified one.
  • Comparing investigator-assessed PFS with blinded independent central review (BICR)-assessed PFS.
  • Applying alternative censoring rules (e.g., censoring after ≥2 missed tumor assessments), or interval-censored models for PFS.
  • Using Restricted Mean Survival Time (RMST) to explore robustness under non-proportional hazards.
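For readers less familiar with RMST, the sketch below computes it as the area under a Kaplan-Meier curve up to a truncation time tau. It is a plain-Python illustration with hypothetical function names; a real analysis would rely on validated statistical software and would also report a confidence interval.

```python
def km_curve(times, events):
    """Kaplan-Meier steps as a list of (time, survival) at each event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_t = sum(1 for tt, _ in data if tt == t)            # leaving risk set at t
        d = sum(1 for tt, e in data if tt == t and e == 1)   # events at t
        if d > 0:
            surv *= 1 - d / n_at_risk
            steps.append((t, surv))
        n_at_risk -= n_t
        i += n_t
    return steps


def rmst(times, events, tau):
    """Area under the Kaplan-Meier curve from 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_curve(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)  # rectangle under the current step
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Four patients, two censored (event flag 0), truncation at 8 months.
print(rmst([2, 4, 6, 8], [1, 0, 1, 0], tau=8))
```

Because RMST is a difference in mean event-free time rather than a ratio of hazards, it stays interpretable when the curves cross or separate late.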

 

Supplementary analyses: Exploring beyond the estimand

While less explicitly defined, ICH E9(R1) describes supplementary analyses as: “Other analyses that are conducted in order to more fully investigate and understand the trial data.”

Purpose: To explore different strategies or assumptions that may be clinically or scientifically relevant.

Example: Using a different intercurrent event (ICE) strategy than the primary estimand.

 

Final takeaways

There’s no one-size-fits-all approach. Regulatory expectations continue to evolve, and sponsor decisions should balance regulatory guidelines, clinical practice norms, relevance to prescribers and patients, and feasibility of continued assessments.

Engaging in early discussions to align estimands with trial objectives and regulatory requirements is critical to ensuring efficient drug development and timely delivery to patients.

Blinded Independent Central Review in Oncology Trials: Key Challenges

Blinded independent central review (BICR) is a process used in clinical trials in which a group of independent experts reviews trial data, such as radiographic images, without access to information on patients’ treatment assignments. BICR of radiographic images is frequently conducted in oncology trials to address the potential bias of local evaluation by investigators (INV) of endpoints such as progression-free survival (PFS) and objective response rate (ORR).

 

What is the aim of BICR?

The BICR process serves several purposes. These include:

  • Reducing bias: An investigator can be influenced in his or her assessment by prior knowledge of the treatment assignment in the case of an open-label study or patient toxicities. Blinded review enhances objectivity.
  • Reducing measurement variability across sites and readers: Tasking a small number of central reviewers with expertise in a specific area with reviewing imaging may lead to more accurate and reliable assessments compared to local site reads. This is particularly important in multi-center trials.
  • Ensuring standardization: Centralized review ensures the standardized application of response criteria (e.g., RECIST 1.1, iRECIST).
  • Improved data quality: Centralized monitoring allows for regular quality control, helping to identify issues such as inconsistent imaging techniques or poor-quality scans.
  • Enhanced regulatory confidence: Regulatory agencies like the FDA and EMA often prefer or require BICR for pivotal oncology trials, in particular for open-label studies or those with higher bias risk. This strengthens the credibility of primary endpoints derived from tumor assessments.

 

What are the limitations of BICR?

While the BICR process can help achieve the above aims, there are limitations. For example:

  • Operational complexity, time, and cost considerations: BICR requires a lot of coordination between multiple stakeholders (sponsors, imaging CROs, radiologists, adjudicators), resources, and logistics (e.g., reader’s training and data blinding, transfer, storage, and tracking).
  • Informative censoring: BICR may introduce bias due to informative censoring, which results from having to censor unconfirmed locally determined progressions. Indeed, once a patient has progressed according to the local assessment, the patient might discontinue the study, and further imaging is unlikely to occur. As a result, determining the BICR progression time may be impossible. This type of censoring will be informative: patients who progress according to local review (but not according to central review) will be more likely to progress by the next scheduled scan than patients who have not been determined to progress by local review. One way to mitigate this issue is to request at least one additional scan beyond progression assessed by investigators. Even when this is required in the protocol, it may be difficult to implement.
  • Regulatory agencies often expect BICR in pivotal oncology trials, in particular for open-label studies. However, inconsistencies between local and central results can complicate data interpretation and submission.

 

How is BICR data used?

As the primary endpoint (e.g., PFS-BICR): All patients need to be reviewed by BICR either in real time or by regular batch (to be agreed upon).

For sensitivity analysis: All patients need to be reviewed by BICR either in real time or by regular batch (to be agreed upon) or as retrospective BICR. Retrospective BICR can be implemented only if the trial is positive based on local assessments (INV). Ideally, images should be collected and archived even after progression, to allow for retrospective BICR, if needed.

As an audit tool: In this context, a random subset of patients can be reviewed, with concordance between BICR and INV assessed on this subset.

 

Practical aspects in BICR implementation

Imaging Charter

The Imaging Charter needs to be written in accordance with the protocol to ensure consistent methodology between investigators and independent review assessments.

Readers involved need to be specified in the charter; in general, two primary readers and one adjudicator are involved.

Adjudication paradigms should be detailed, in particular:

  • Are the primary readers compared at the patient level or visit level?
  • Which criteria are used to consider whether assessments from primary readers are different? For example:
    • At the patient level: the date of progression and/or best response are different
    • At the visit level: the sum of diameters on target lesions > xx mm

The choice of criteria impacts the number of cases that go to adjudication. Indeed, studies have shown that discrepancies between two readers can be substantial. For example, in lung cancer trials using RECIST 1.1, the average discrepancy rate at the patient level was around 59.2%,¹ with adjudications often required to resolve differences. These discrepancies may stem from medically justifiable differences in interpretation or from errors, both of which can affect trial outcomes. Training and monitoring by a Central Imaging vendor can mitigate a large portion of the commonly encountered reading errors and therefore reduce variability.

  • If adjudication occurs, which reader is the accepted one? (e.g., one of the primary readers (forced adjudication) and/or a new reader by adjudication (open adjudication, less common)).
  • If adjudication does not occur, which reader is the accepted one? (e.g., reader 1 as default).
  • The timing of readers and adjudication should be defined in the Charter (e.g., real-time review or review only once the patient discontinues the treatment, adjudication only once per patient or once per study, etc.).
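As a rough illustration of how such a paradigm translates into derivation logic, the sketch below applies a patient-level trigger (different progression date or different best response) with forced adjudication and reader 1 as the default when no adjudication is needed. The field names, function names, and the choice of these particular options are assumptions; the Imaging Charter defines the real rules.

```python
def needs_adjudication(read1, read2):
    """Patient-level trigger: the two primary reads disagree on the
    progression day or on the best overall response."""
    return (read1["progression_day"] != read2["progression_day"]
            or read1["best_response"] != read2["best_response"])


def accepted_read(read1, read2, adjudicator_pick=None):
    """Return the read used for analysis, or None if adjudication is pending.

    adjudicator_pick: 1 or 2 under forced adjudication (the adjudicator
    endorses one of the two primary readers).
    """
    if not needs_adjudication(read1, read2):
        return read1  # reads agree: reader 1 is the default
    if adjudicator_pick not in (1, 2):
        return None  # discrepancy exists but has not been adjudicated yet
    return read1 if adjudicator_pick == 1 else read2


r1 = {"progression_day": 112, "best_response": "PR"}
r2 = {"progression_day": 168, "best_response": "PR"}
print(needs_adjudication(r1, r2), accepted_read(r1, r2, adjudicator_pick=2))
```

Returning `None` for pending cases makes the adjudication backlog explicit, which matters later when BICR events are counted for analysis timing.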

 

Data transfers

The independent review must remain independent; imaging results should not be shared from the site to the BICR or vice versa.

The timing/frequency of the transfer needs to be defined:

  • If only adjudication data are included, there might be a greater backlog for the tracking of BICR events.

Data reconciliation. Review of general consistency between the INV and BICR:

  • Check whether patient populations, set of scans, visits, dates, and method of assessments (if needed) are consistent between BICR and INV datasets.
  • Check for visits with an investigator tumor assessment but a missing BICR assessment.

Concordance between the INV and BICR results can be assessed during data review and at time of analysis.

  • For PFS: Concordance of Occurrence and Timing of disease progression
  • For ORR: Concordance of occurrence of Complete Response/Partial Response
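A simple version of the PFS concordance check can be sketched as follows. The two summary measures used here (agreement on whether progression occurred, and agreement on its timing among patients with progression by both assessors) are one possible choice, and the function name and input layout are hypothetical.

```python
def pfs_concordance(inv, bicr):
    """inv, bicr: dicts mapping patient ID to progression day (None = no PD).

    Only patients present in both sources are compared.
    """
    patients = sorted(set(inv) & set(bicr))
    same_occurrence = [p for p in patients
                       if (inv[p] is None) == (bicr[p] is None)]
    both_pd = [p for p in patients
               if inv[p] is not None and bicr[p] is not None]
    same_timing = [p for p in both_pd if inv[p] == bicr[p]]
    return {
        "occurrence_agreement":
            len(same_occurrence) / len(patients) if patients else None,
        "timing_agreement":
            len(same_timing) / len(both_pd) if both_pd else None,
    }


inv = {"P1": 100, "P2": None, "P3": 60, "P4": 120}
bicr = {"P1": 100, "P2": None, "P3": None, "P4": 150}
print(pfs_concordance(inv, bicr))
```

In this toy data set, the two assessors agree on occurrence for three of four patients and on timing for one of the two patients with progression by both.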

 

Impact of tracking events and prediction of analysis timing

When the primary endpoint is PFS-BICR, interim or primary analyses are triggered by the number of BICR-assessed events. In such context:

  • Monitoring BICR events may be more challenging than monitoring INV events. BICR events are rarely monitored in real time: reviews are subject to some backlog, and adjudication time must also be taken into account.
  • For event projections:
    • PFS-INV can be used as a first surrogate; if the observed concordance is relatively high, it should provide a relatively good estimate. The study team could consider estimating the number of BICR-assessed events from an estimated ratio to investigator-assessed events (e.g., xx% of INV events).
    • Adjudication must be initiated early to ensure that the number of BICR-assessed events is accurate. However, in some instances, the study may have pending adjudications at the time of data transfer for event projection.
    • BICR readers should read the full set of scans done up to a certain cutoff date and then transfer the data. Predictions can also be made for each reader separately if the adjudication is not yet completed.
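The ratio-based projection described above can be sketched with simple arithmetic. The function names and all counts below are hypothetical placeholders; the ratio would in practice be estimated from the study's own reconciled INV and BICR data.

```python
def estimate_bicr_events(current_inv_events: int,
                         observed_bicr_events: int,
                         observed_inv_events: int) -> int:
    """Project BICR-assessed events from the current INV event count.

    The ratio observed_bicr_events / observed_inv_events comes from a
    data cut in which both assessments are available.
    """
    if observed_inv_events <= 0:
        raise ValueError("observed_inv_events must be positive")
    return round(current_inv_events * observed_bicr_events / observed_inv_events)


def inv_events_needed(target_bicr_events: int,
                      observed_bicr_events: int,
                      observed_inv_events: int) -> int:
    """INV events expected to be needed to reach a BICR event target."""
    if observed_bicr_events <= 0:
        raise ValueError("observed_bicr_events must be positive")
    # Integer ceiling division, so the target is not underestimated.
    return -(-target_bicr_events * observed_inv_events // observed_bicr_events)


# Placeholder numbers: 85 BICR events were observed for 100 INV events.
projected = estimate_bicr_events(200, 85, 100)
needed = inv_events_needed(300, 85, 100)
print(projected, needed)  # 170 353
```

Such a projection is only as reliable as the assumed ratio, so it is worth re-estimating at each data transfer as adjudications are completed.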

 

Final takeaways

BICR enhances the objectivity and regulatory credibility of oncology trial endpoints by minimizing bias and standardizing radiographic assessments.

However, its implementation introduces operational challenges and may add complexity to data analysis (with the risk of informative censoring), to data interpretation when concordance between INV and BICR is poor, and to the prediction of analysis timing.

How CDISC and CDASH (CRF Standards) Streamline Clinical Trials

In today’s global research landscape, clear and consistent communication is more than a necessity — it’s a strategic advantage. It is particularly critical in clinical trials, where data must speak a universal language across teams, geographies, and regulatory frameworks.

The CDISC (Clinical Data Interchange Standards Consortium) and CRF (Case Report Form) standards serve as the universal language of clinical trials, ensuring consistency, clarity, and collaboration across the entire study lifecycle. By implementing these essential frameworks, organizations can optimize data collection, management, and submission — driving cost efficiency and accelerating medical advancements.

Here, we discuss CDISC and CRF standards and how they support the design, execution, and analysis of clinical trials.

 

The need for standardization

Overall, ensuring consistent and reliable data across multiple clinical studies requires the standardization of processes, procedures, and data collection methods. This uniformity can improve data quality, facilitate data sharing and analysis, and ultimately enhance the efficiency and validity of clinical research.

There are many benefits to utilizing CDISC and CRF standards, such as:

  • Improved data quality and reliability
  • Enhanced data sharing and integration
  • Increased efficiency
  • Improved communication and collaboration
  • Support for regulatory compliance
  • Scalability and repeatability

Let’s take a closer look at how CDISC and CDASH standards help create a foundation for data collection, presentation, and submission in clinical trials.

 

CDISC Foundational Standards

CDISC (Clinical Data Interchange Standards Consortium), a global non-profit organization, develops and promotes standards for data exchange in clinical research. The CDISC Foundational Standards support end-to-end clinical and non-clinical research processes, focusing on the core principles for defining data standards, and include models, domains, and specifications for data representation.

 

FDA guidance on CDISC standards

In recent years, the FDA has clearly stated its preference for receiving both clinical and analysis data formatted in compliance with CDISC standards. This has been communicated through a series of guidance documents, correspondence with sponsors, and presentations at conferences. As a result, CDISC models have become the de facto standard for submitting data to the FDA.

As of today, the FDA requires the following CDISC standards:

  • Controlled terminology
  • SEND
  • SDTM
  • ADaM
  • Define-XML

 

CDASH: Maximizing data quality

CDASH (Clinical Data Acquisition Standards Harmonization), a foundational standard developed by CDISC, focuses on harmonizing data collection in clinical trials, providing guidance on how to design and populate case report forms (CRFs) to ensure consistent data collection across studies. These standards help maximize data quality in order to streamline processes across the entire spectrum of medical research, from crafting clinical research protocols to reporting and regulatory submissions.

CDASH Model v1.3 — the latest version — was released in September 2023.

 

Key features of CDASH

  • Provides guidance on designing and populating CRFs/eCRFs, covering all therapeutic areas and phases of clinical trials
  • Specifies standard field names, meanings, and how to fill them
  • Characterizes fields as highly recommended, conditional, or optional
  • Includes a CDASH Model and CDASH Implementation Guide
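To illustrate how field designations might be used in practice, the sketch below defines a small adverse-event form and flags highly recommended fields that were left empty. The field names follow CDASH-style naming, but the designations assigned here, the `CrfField` class, and the helper function are illustrative assumptions rather than content taken from the CDASH Implementation Guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrfField:
    name: str    # standard field name
    label: str   # meaning shown on the CRF
    core: str    # "HR" = highly recommended, "RC" = conditional, "O" = optional


# Illustrative form definition; designations are assumptions for this example.
AE_FORM = [
    CrfField("AETERM", "Adverse event term", "HR"),
    CrfField("AESTDAT", "Start date", "HR"),
    CrfField("AEENDAT", "End date", "RC"),
    CrfField("AEYN", "Any adverse events?", "O"),
]


def missing_highly_recommended(form, record):
    """Return names of highly recommended fields that are absent or empty."""
    return [f.name for f in form if f.core == "HR" and not record.get(f.name)]


record = {"AETERM": "Headache", "AEENDAT": "2024-03-02"}
print(missing_highly_recommended(AE_FORM, record))  # ['AESTDAT']
```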

 

The benefits of CDASH

Instead of following bespoke standards, sponsors can use CDASH’s guidelines for CRFs/eCRFs to collect data consistently across studies. This in turn makes it easier to produce data in SDTM format for submission and allows regulators to review submission packages more accurately and efficiently, so that concerns are identified, or approvals granted, sooner. In addition, consistent data can reduce the duplication of trials and post-marketing evaluations, improving patient centricity.

CDASH standards also provide guidance for developing data collection tools that are clear, understandable, and precise. Following CDASH standards ensures traceability of trial data from the time the data is collected at the site until it is ready for final analysis and regulatory submission. This maintains the integrity of the source data supporting the trial’s outcomes and findings.

Sponsors that follow CDASH standards can also save time when setting up new studies, as most of the data collection and associated programming can be standardized across studies.

 

CRF libraries

A Case Report Form (CRF) library in clinical trials is a collection of standardized, reusable CRFs designed to streamline data collection and management. These libraries, whether electronic or physical, offer templates and guidelines for collecting data across different trials and therapeutic areas. They ensure uniformity, accuracy, and efficiency in data collection, ultimately benefiting trial conduct and analysis.

CRF libraries can reduce the cost and time budgeted for the clinical trial database preparation by:

  • Streamlining processes
  • Reducing training
  • Accelerating clinical trials
  • Using resources more efficiently
  • Improving adaptability and consistency
  • Focusing on design

 

Final Takeaways

CDISC standards, including CDASH and CRF standards, have revolutionized the way clinical data is managed, presented, and submitted, enhancing its integrity and efficiency in clinical research and drug development. Conformance to these standards is thus a critical aspect of clinical studies to ensure uniform data collection and submission processes, ultimately bringing quality treatments to patients faster.

 

Interested in learning more?

Watch our on-demand webinar, “Boosting Efficiency with CRF and CDISC Standards”: