Career Perspectives: A Conversation with Joe Maginnity

In this edition of our Career Perspectives series, we had the pleasure of speaking with Joe Maginnity, Biostatistician II at Cytel. With a background in biological sciences, Joe shares insights into his professional journey, the collaborative nature of his role as a biostatistician in Data Monitoring Committees (DMC), and how biostatistics is evolving alongside advances in AI and machine learning. He also reflects on the importance of communication, remote work strategies, and the value of maintaining balance beyond the screen.

 

Can you give us a little background on your career so far? What inspired you to pursue a degree in biostatistics and a career as a biostatistician?

After graduating with a degree in Biological Sciences from the University of California, Davis, I originally considered pursuing a career as a physician, but ultimately discovered the great field of biostatistics. I wanted to apply my knowledge of both medicine and mathematics, and biostatistics was the perfect fit. I graduated from the Ohio State University with my MS in Biostatistics in 2020 and was hired by Cytel in March 2021 as a Biostatistician I. The following year, I was promoted to Biostatistician II. Over the past four years, I have grown into a more independent role within the DMC and have been the lead biostatistician on multiple projects.

 

Can you walk us through what a typical day looks like in your role? What kinds of tasks do you usually focus on, and how closely do you work with clients?

I am based in Seattle, Washington, and my clients are spread across the United States and Europe. I usually start my workday early to stay in contact with clients in Europe, with the remainder of my morning reserved for meetings. Then I arrange my day around my high-priority work. In addition to daily tasks such as QC reports, report deliveries, and minutes reviews, I also attend DMC meetings, working very closely with clients beforehand to ensure everything runs smoothly and all bases are covered.

 

Are there any common misconceptions about being a biostatistician in clinical trials?

I think a common misconception is that biostatisticians only work on data analysis and statistics. However, to be a successful biostatistician in clinical trials, communication is very important. It is a huge part of this job. You have to complete many time-sensitive tasks to ensure that you are producing high-quality deliverables and providing insightful statistical knowledge for many different clients. Without the ability to communicate effectively and perform tasks in a timely manner, you would not be able to execute the tasks required of a biostatistician here.

 

What makes for a successful collaboration between statisticians and other members of a clinical trial team?

Successful collaboration is built primarily on great communication. Having a complete understanding of the work expected from us, and being able to tell the clinical trial team when we need clarification or more statistical insight, goes a long way. I always try to be as communicative and clear as possible with all the clinical trial teams and DMCs I work with in order to build a strong and successful partnership.

 

In your thesis research, you used machine learning methods and statistical model building. How do you see the role of biostatistics evolving in the next 5–10 years, especially with the increased use of AI and machine learning?

I think in the next 5–10 years, biostatistics will likely become more intertwined with AI and machine learning, leading to new biostatistics roles and the redefinition of existing ones. The increasing demand for AI-powered tools and data analysis will most likely require biostatisticians to expand their expertise in these areas. This includes using AI to improve risk prediction, identify patterns in large datasets, and personalize treatment plans. In using machine learning, biostatisticians may become more proficient in analyzing complex data and making statistical predictions.

 

As a remote employee, how do you maintain a healthy work-life balance? What strategies work for you, and do you feel supported by Cytel in this regard?

My home is my office, so I enjoy creating a fun workspace that keeps me motivated and focused. I have a standing desk where I do most of my work, and it is located next to my record player. Throughout the day — when I am not in a meeting, of course — I like to listen to different types of records, as it requires me to take breaks when one side of a record is done playing. It helps me stay focused while also reminding me to take small breaks away from the computer screen.

Being remote also gives me the privilege of working while I am traveling. This has allowed me to visit friends and family in many different cities while saving up vacation time for when I want to travel, but not work. I feel very supported by my manager and team. I just need to give them enough notice of where I may be working remotely from, especially when the time zones are very different.

 

What are your main interests outside of work?

Being in Seattle, there are so many amazing activities in this lively city. I really enjoy going to live music concerts. I probably attended 50 concerts last year alone! I also enjoy baking for my friends — and they all enjoy eating baked goods, especially my chocolate chip cookies. Seattle also has many different record stores, and I like browsing all their different varieties of music. And as you may have noticed earlier, I especially love traveling, both within the United States and internationally. I recently visited Japan, and this summer I plan to travel to Europe for 6 weeks, visiting places like London, Dublin, Oslo, Copenhagen, and Amsterdam.

 

Finally, what’s one piece of career advice you wish you had received earlier?

Set boundaries early and stick to them. I give 100% of myself when I’m at work, and I give 100% of myself to myself, my family, and my friends after work.

 

Best Practices for Ensuring Data Quality in Clinical Trials

Good data is essential for successful clinical trials. It helps ensure accurate analysis, guides important decisions, and supports the approval and safe use of new treatments. As trials become more complex with remote setups, many data sources, and stricter rules, keeping data quality high is more important than ever.

In this post, we’ll look at simple, effective ways to protect the accuracy and trustworthiness of data in clinical trials.

 

Create a strong data management plan

A good Data Management Plan (DMP) is the first step to quality data. It explains how data will be collected, checked, cleaned, and stored during the trial. It also helps everyone involved know their role.

A strong DMP should include:

  • Clear roles and responsibilities
  • Information about study setup, including the electronic data capture (EDC) system used and the audit trail
  • Step-by-step instructions for entering and handling data
  • Data cleaning process and details
  • Management of Serious Adverse Event (SAE) reconciliation and medical coding within the study

If you start your DMP early and keep it up to date, it will help avoid confusion and keep the trial consistent.

Use standardized data collection methods

Data collection should follow a consistent approach. It starts with designing a smart Case Report Form (CRF) that only asks for the necessary information and matches the trial goals. Using standard forms (like CDASH) across studies makes data easier to manage and review.

Other ways to keep data collection consistent:

  • Use standard medical terms (e.g., MedDRA, WHO Drug Dictionary)
  • Train staff on correct data entry
  • Use reliable electronic systems with built-in checks to catch errors
  • Avoid the comment or text field as much as possible
  • Do not collect data twice (duplicated data)

These strategies will reduce mistakes and save time fixing issues later.
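
As an illustration of the "built-in checks" mentioned above, here is a minimal sketch of three common kinds of edit check: a required-field check, a range check, and a cross-field consistency check. The record layout, field names, and numeric limits are hypothetical placeholders, not taken from any particular EDC system or from clinical guidance.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VitalsRecord:
    """One hypothetical CRF record; field names are illustrative only."""
    subject_id: str
    visit_date: Optional[date]
    informed_consent_date: Optional[date]
    systolic_bp: Optional[int]  # mmHg


def run_edit_checks(rec: VitalsRecord) -> list[str]:
    """Return a list of human-readable issues found in one record."""
    issues = []
    # Required-field check
    if rec.visit_date is None:
        issues.append("visit_date is missing")
    # Range check (limits are placeholders, not clinical guidance)
    if rec.systolic_bp is not None and not 60 <= rec.systolic_bp <= 250:
        issues.append(f"systolic_bp {rec.systolic_bp} outside 60-250 mmHg")
    # Cross-field consistency check
    if (rec.visit_date and rec.informed_consent_date
            and rec.visit_date < rec.informed_consent_date):
        issues.append("visit_date is before informed_consent_date")
    return issues


if __name__ == "__main__":
    record = VitalsRecord("001", date(2025, 1, 10), date(2025, 1, 15), 300)
    for issue in run_edit_checks(record):
        print(record.subject_id, issue)
```

In a real system, checks like these would typically fire at data entry so the site can correct the value immediately rather than through a later query.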

 

Monitor data actively

To keep data quality high, you need to prevent problems and catch them early. Active monitoring, whether remote or risk-based, can help spot problems before they get worse.

Examples of active monitoring:

  • Dashboards that show missing or unusual data
  • Review of top priority data for the primary analysis
  • Regular review of key items like side effects or medication use
  • Focus monitoring on high-risk sites and processes

Finding and fixing issues early keeps your data reliable. It also helps the site avoid repeating the same issues.
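
The figure behind a "missing data" dashboard can be as simple as a per-site missing rate. The sketch below assumes records arrive as plain dictionaries with a hypothetical `site` key; the 10% review threshold is an arbitrary placeholder that a study team would set for itself.

```python
from collections import defaultdict


def missing_rate_by_site(records, required_fields):
    """Return {site: fraction of required field values that are missing}."""
    missing = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        site = rec["site"]
        for field in required_fields:
            total[site] += 1
            # Treat both None and empty strings as missing
            if rec.get(field) in (None, ""):
                missing[site] += 1
    return {site: missing[site] / total[site] for site in total}


def sites_to_review(rates, threshold=0.10):
    """Sites whose missing-data rate exceeds the (placeholder) threshold."""
    return sorted(site for site, rate in rates.items() if rate > threshold)


if __name__ == "__main__":
    records = [
        {"site": "A", "weight": 70, "ae_term": "Headache"},
        {"site": "A", "weight": None, "ae_term": ""},
        {"site": "B", "weight": 80, "ae_term": "Nausea"},
    ]
    rates = missing_rate_by_site(records, ["weight", "ae_term"])
    print(rates, sites_to_review(rates))
```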

 

Handle queries quickly and clearly

Resolving data queries (questions or issues) takes time, so it’s important to manage this well.

Tips for efficient query handling:

  • Use automated checks to catch simple issues
  • Focus manual review on complex or safety-related data
  • Keep clear records of how each query is resolved by adding a comment
  • Follow up with sites on queries that have been open for several days to find out why

Good query management keeps the trial moving and ensures the data is clean and complete.
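
Following up on long-open queries is easy to automate with a small aging report. This is a sketch only: the query fields (`opened_on`, `closed_on`) and the seven-day limit are assumptions chosen for illustration.

```python
from datetime import date


def overdue_queries(queries, today, max_open_days=7):
    """Return still-open queries older than max_open_days, oldest first."""
    overdue = [
        q for q in queries
        if q["closed_on"] is None
        and (today - q["opened_on"]).days > max_open_days
    ]
    return sorted(overdue, key=lambda q: q["opened_on"])


if __name__ == "__main__":
    queries = [
        {"id": "Q1", "site": "A", "opened_on": date(2025, 3, 1), "closed_on": None},
        {"id": "Q2", "site": "B", "opened_on": date(2025, 3, 10), "closed_on": None},
        {"id": "Q3", "site": "A", "opened_on": date(2025, 2, 20),
         "closed_on": date(2025, 2, 25)},
    ]
    for q in overdue_queries(queries, today=date(2025, 3, 12)):
        print(q["id"], q["site"], q["opened_on"])
```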

 

Combine data from different sources carefully

Today’s trials often use data from many places like labs, apps, devices, and imaging systems. Keeping this data consistent is key.

Best practices include:

  • Creating a Data Transfer Agreement (DTA) that details the data transfer specifications, such as how the data will be transmitted, how often, and which data will be included
  • Checking and validating all incoming data
  • Setting up checks to make sure sources agree (e.g., comparing lab data with system data)

Good data integration helps you understand results more clearly and trust the final data.
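
A check that two sources agree usually starts with a key-based reconciliation. The sketch below compares hypothetical EDC sample records with a lab vendor transfer on subject and sample date; the key names are assumptions, and a real reconciliation would normally compare further fields once the keys match.

```python
def reconcile(edc_rows, lab_rows, key=("subject_id", "sample_date")):
    """Compare two sources on a shared key and report disagreements."""
    edc_keys = {tuple(row[k] for k in key) for row in edc_rows}
    lab_keys = {tuple(row[k] for k in key) for row in lab_rows}
    return {
        # Sample recorded at the site but absent from the lab transfer
        "in_edc_not_lab": sorted(edc_keys - lab_keys),
        # Result received from the lab with no matching site record
        "in_lab_not_edc": sorted(lab_keys - edc_keys),
    }


if __name__ == "__main__":
    edc = [
        {"subject_id": "001", "sample_date": "2025-01-10"},
        {"subject_id": "002", "sample_date": "2025-01-11"},
    ]
    lab = [
        {"subject_id": "001", "sample_date": "2025-01-10"},
        {"subject_id": "003", "sample_date": "2025-01-12"},
    ]
    print(reconcile(edc, lab))
```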

 

Follow regulatory guidelines

High data quality also means following the rules. Agencies like the FDA and EMA expect clean, traceable, and well-documented data.

To be compliant:

  • Have clear procedures and test your systems
  • Run regular data audits
  • Make sure your data follows ALCOA+ principles (Attributable, Legible, Contemporaneous, Original, Accurate, and more)

Meeting these rules protects your study and shows the data is trustworthy.

 

Staff training and communication

Even with great tools, skilled people are essential. Train all team members regularly so they understand their role in protecting data quality. Make sure communication is clear among everyone involved: sites, teams, and vendors. Write eCRF Completion Guidelines and record a training video for the sites and the investigator, explaining how the system works and how to perform data entry.

Sharing knowledge and working together helps build a culture where quality comes first.

 

Final thoughts

Keeping data quality high in clinical trials takes planning, careful checks, and teamwork. By following best practices like clear data collection, active monitoring, and smart integration, you can ensure your data is accurate and ready for review.

As clinical trials continue to evolve, one thing stays the same: quality data is key to faster approvals and better treatments for patients.

From Data Standards to Open Source and Beyond with AI

Key takeaways from CDISC EU Interchange and PHUSE-CSS

As clinical data science evolves rapidly, the CDISC EU Interchange and PHUSE-CSS conferences offer a glimpse into the future of regulatory submissions, standardization, and the rise of open-source tools and AI in drug development. In May, I had the privilege of attending both events, in Geneva and in Utrecht. I’d like to share some highlights from both conferences here.

 

Data submission in Europe: EMA delays

As anticipated in my previous blog, we were waiting for further announcements from the EMA regarding the outcome of their pilot raw data submission project, for which an interim report was published last year.

Those, like me, who were expecting a final announcement were likely disappointed. The requirement for data submission to the EMA in support of drug approval has been postponed to 2028. The EMA, which was well represented at PHUSE-CSS, needs to further evaluate factors such as tools and technological impacts more broadly. At PHUSE-CSS, they showed particular interest in topics such as Dataset-JSON, BIMO, the use of tools such as R-Shiny and the {teal} framework, as well as more advanced topics still under development such as the “Analysis Concept.” The pilot continues, and the EMA is seeking more volunteers. It was guaranteed that submitting data will not negatively affect or delay your drug approval!

It was clear, while discussing with EMA representatives, that a number of stakeholders within the agency still need to be convinced of the benefits of receiving datasets in addition to PDF documents and reports. Some appeared concerned about the additional time and effort required to assess submitted datasets. As we all know, updating regulations, as well as releasing new standards, requires a great deal of “diplomacy” and consensus among multiple stakeholders.

 

Open source: {teal} and R-Shiny adoption

The “open source” revolution continues to gain momentum in our industry. At PHUSE-CSS, I attended the “{teal} Success Stories” workshop, where various sponsors, including J&J, Sanofi, Novartis, Boehringer, and Roche, shared their experiences.

I was fascinated by the solutions those sponsors have already implemented using {teal}, and how straightforward it seems to develop R-Shiny applications using the framework provided by this R package, which was originally developed by Roche.

For a deeper insight into the capabilities of this package, I recommend reading the paper presented by Roche at PHUSE US 2024.

 

Dataset-JSON pilot update

Another interesting workshop I attended at PHUSE-CSS was on Dataset-JSON, where we reviewed and contributed to a consolidated set of comments and feedback in response to the “FDA Requests for Public Comment on CDISC Dataset-JSON Standard,” which closes next week on June 9, 2025.

While the benefits of such a standard were widely acknowledged, particularly in accelerating drug approval and improving overall interoperability, the discussions also highlighted potential risks and implementation challenges. These included concerns about numeric precision when converting between Dataset-JSON and SAS, as well as the handling of special characters.

We therefore emphasized the need for the FDA to provide additional guidance to support future adoption; there was also interest in possible future extensions of Dataset-JSON, such as the inclusion of more metadata and the potential to embed define.xml.
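
To make the numeric precision concern concrete, here is a small, tool-agnostic sketch. It does not reproduce the behavior of SAS or of any Dataset-JSON converter; it only shows the general effect that a floating-point value written to text with too few significant digits may not read back as the same value. The function name and digit counts are illustrative assumptions.

```python
import json


def round_trips_exactly(value: float, significant_digits: int) -> bool:
    """Write a finite float as JSON number text with a limited number of
    significant digits, read it back, and compare with the original."""
    text = format(value, f".{significant_digits}g")
    restored = json.loads(text)
    return restored == value


if __name__ == "__main__":
    one_third = 1 / 3
    # 17 significant digits are enough to recover any 64-bit double
    print(round_trips_exactly(one_third, 17))
    # A shorter representation silently changes the stored value
    print(round_trips_exactly(one_third, 12))
```

A round-trip test of this kind, run on real study data, is one way a team could check its own export and import chain before relying on it.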

 

BIMO

BIMO was the focus of another PHUSE-CSS workshop. Among the topics discussed, such as the presentation of the BIMO PHUSE template reviewer guide, it was particularly interesting to learn that PHUSE will soon develop a dedicated FAQ to support sponsors and CROs on the gray areas of the BIMO requirements, such as the definition of the “major” studies to which those requirements currently apply.

 

CDISC 360i reboot: Toward an end-to-end digital pipeline

The CDISC 360 initiative is back, and stronger than before, with a major shift toward a fully digitalized and standard-driven clinical development lifecycle. The goal is to break down silos through the application of standards such as the USDM and Biomedical Concepts. The mission is ambitious, but unlike when CDISC 360 was first launched, we now have more mature standards and technology to support it.

 

Use of AI to support clinical standards

AI remains a hot topic, and as in 2024, a full session was dedicated to it at the CDISC EU Interchange. The common theme across most presentations was the use of generative AI to support the implementation of data standards, such as AI acting as a subject matter expert (SME) for study teams. Although many of the showcased solutions from Argenx, SGS, and AstraZeneca are still in beta, they clearly demonstrate how proper model training can enhance search and navigation within complex data standards libraries, or help manage complex, multidimensional data (e.g., omics, wearable biosensors). Other AI use cases were also featured in several posters at PHUSE-CSS; for example, the application of AI to generate synthetic data or automate local lab ranges.

 

Other topics

For topics such as Digital Data Flow and USDM, I’ll refer you to the LinkedIn newsletter “View From The Coffee Shop,” curated by my friend Dave Iberson-Hurst. In it, he regularly shares insightful thoughts and updates on the ongoing digitalization efforts within our industry. He also summarized some key takeaways from both the CDISC EU Interchange and PHUSE-CSS.

I also had the opportunity to see good use cases of Analysis Results Standards (ARS) at the CDISC EU Interchange, showing that this relatively new standard has been well received by sponsors as well as vendors.

On the regulatory side, aside from the news from the EMA, I found the presentation from Sanofi and GSK particularly interesting. It covered a cross-industry initiative aimed at harmonizing vaccine regulatory submissions to FDA-CBER by sharing experience with this unique division, which often has its own set of sometimes unexpected requirements (see also my previous blog on submission experience with FDA-CBER).

For content from other conference sessions, see the official CDISC posts.

 

Ongoing innovation

Overall, both conferences continue to showcase ongoing innovation in our industry. It’s clear that change is happening at a pace I have never seen before in my 30-year career, and that’s good for patients, as well as an exciting time for those of us working in biometrics.

 

Interested in learning more?

Download Angelo’s new ebook, The Good Data Submission Doctor on Data Submission and Data Integration to the FDA, a collection of Angelo’s most critical insights on clinical data standards submission to the FDA, including key updates from the new FDA Study Data Technical Conformance Guide:

Strategies to Streamline the MHRA Inspection Process

The UK’s Medicines and Healthcare products Regulatory Agency (MHRA) plays a critical role in ensuring the safety, quality, and efficacy of medicines and medical devices. MHRA inspections are often a key step in bringing new therapies to market, but poor preparation can result in delays or regulatory setbacks.

Here, we outline the types of MHRA inspections, provide an overview of the essential steps and documents, and discuss challenges and how to overcome them to streamline your MHRA inspection process.

 

Types of MHRA inspections

There are two main types of MHRA inspection:

 

  • Statutory Good Clinical Practice (GCP) Inspection (a “routine” inspection): This inspection is performed as part of the risk-based compliance program and can be either systems-based or trial specific. Inspectors examine how an organization’s trial procedures are applied, considering previous inspection history, organizational changes, or intelligence from other external sources. Sponsors are usually notified six months ahead of time.

 

  • Triggered Inspection: Triggered inspections are initiated by concerns regarding a clinical trial’s conduct, often from sources like serious breach notifications, whistleblowers, or other MHRA departments. The nature of the information determines the level of notice provided, which may be short or none at all.

 

Essential steps and documents

The MHRA inspection process consists of three primary phases: planning, inspection, and reporting.

 

  • In the planning phase, sponsors receive an “Advance Notice of Statutory Inspection” notification, prepare an Inspection Dossier, and develop an Inspection Plan.

 

  • The inspection phase involves the main site inspection.

 

  • The reporting phase comprises issuing an Inspection Report and identifying Corrective and Preventive Actions (CAPA).

 

Strategies to streamline the inspection process

Apply the framework for engagement

Define clear duties and responsibilities, emphasize collaboration, and foster productive dialogue. This approach includes four considerations: steering group and communication; resource management and flexibility; documentation; and preparation and strategic output. (We will discuss this framework in detail in our upcoming webinar; click the link below to register.)

 

Prepare supportive documents

Create supportive documents that enable quick access to detailed and precise information during high-pressure situations, enhance understanding of the procedures and trial materials under review, and prepare tools and responses for anticipated critical discussion points.

 

Ensure staff is adequately trained

Staff must be trained on both current and historical versions of quality assurance documents, including SOPs, WIs, and related tools, so that they understand not only current processes but also those in place when the deliverables for each inspected study were produced.

 

Use live demos or “show and tell” sessions

Live demos can help the inspector visualize and understand the process, and they provide an opportunity to delve into the details.

 

Final takeaways

Navigating MHRA inspections requires a proactive approach, strategic preparation, and a deep understanding of evolving regulatory expectations. By leveraging innovative strategies, organizations can streamline their inspection readiness and enhance compliance outcomes. Equally crucial is the establishment of a well-defined framework that fosters effective collaboration among all stakeholders — sponsors and key vendors alike — ensuring a streamlined and coordinated approach to the inspection process.

 

Want to learn more?

Join Stephanie Dontenville and Nicolas Rouillé for their upcoming webinar to gain practical insights from Cytel’s MHRA inspection experience. Learn how to prepare thoroughly, execute precisely, and turn post-inspection feedback into innovation.

Advancing Clinical Trials Through Shared Expertise and Collaboration

Working across a range of Phase I, II, and III trials as well as numerous possible indications, our Project-Based Services (PBS) teams have specialists in many areas. In order to enhance expertise among colleagues, foster knowledge-sharing, and stay up to date on recent developments in the field, Cytel has developed a collaborative workstream initiative that brings together experts with shared statistical and therapeutic area interests.

Michaela Šedová, Biostatistics Director with PBS, moderates the neurology workstream. In this interview, Michaela discusses how these initiatives enable colleagues to share and grow their experience and skills, and how this behind-the-scenes work benefits sponsors.

 

Michaela, what inspired the creation of Cytel’s new workstreams that focus on specific therapeutic areas and methodologies?

We are a team of biostatisticians working within Cytel PBS. Most of our contracts are limited to specific projects, typically involving one or several related studies. This means that we handle a diverse range of trials, whether Phase I, II, or III, and in any possible indication. Our work includes trials conducted by small biotech companies that lack in-house statisticians and require substantial methodological and statistical input from our team, as well as trials conducted by large pharmaceutical companies with more specific and focused requirements.

Given the wide range of disease areas, statistical methodologies, and operational aspects we encounter, it’s impossible to be experts in every domain. Instead, we tend to specialize in certain areas. The workstream initiative brings together colleagues with shared statistical and therapeutic area interests and provides a platform to share recent developments, enhance expertise, and foster knowledge-sharing within the company.

Companies sometimes overlook the wealth of experience and skills their employees possess, simply because they lack opportunities to use them. At Cytel, our workstream initiative helps us uncover such talents and foster expertise in specialized methodological and therapeutic areas.

 

Could you walk us through the focus of these new workstreams, and share some of the key activities involved?

Cytel PBS has set up workstreams focused on varied disease indications (neurology, oncology, type 2 diabetes/cardiovascular) and methodological aspects or regulatory know-how (statistical methodology, rare diseases, Phase I studies, submissions of new drug applications).

The objectives of the workstreams vary depending on their focus, though they all follow recent developments in their field and corresponding regulatory guidelines. Some workstreams have started creating internal standards (e.g., SAPs, table shells) and trainings to support other statisticians, or have documented experiences with specific statistical methods. The objective is to centralize the information and build a structured way of knowledge-sharing.

Workstream meetings and groups are an ideal forum to discuss specific topics and share opinions with colleagues. Some workstreams are also involved in sales and optimization initiatives. Externally, workstream members may attend and prepare presentations for conferences or workshops.

 

How do these workstreams integrate with the existing clinical biometrics services we offer?

Each statistician within PBS has the opportunity to contribute to one or more workstreams. It is a platform that helps to provide peer-to-peer support among individuals involved in specific projects. Overall, the workstreams aim to support and to improve our existing services.

 

How do our clients benefit from these workstreams?

At the request-for-proposal stage, clients are often interested in our experience with a specific topic. The workstreams not only help us summarize and provide feedback on such experience, but also give us a platform to develop and maintain shared knowledge at the highest level within that particular area. We can more easily ensure stability, suitable study-specific assignments, and swift back-ups, if needed. Additionally, less-experienced statisticians, whether junior or new to the field, gain guidance and broader support from the workstream. Consequently, our clients benefit from this direct and indirect therapeutic area and methodology expertise.

 

Now, you moderate the neurology workstream. Can you share some of your activities?

The neurology workstream has two main areas of interest:

  • Multiple sclerosis (MS)
  • Alzheimer’s disease/Parkinson’s disease

These diseases are complex, which is why it’s essential to understand their etiology, development, and symptoms to comprehend typical endpoints and analyze them appropriately. Therefore, we have prioritized mapping, maintaining, and expanding the expertise we have gained through collaborations with various clients, primarily well-established pharmaceutical companies that have already marketed several products and are developing other compounds, as well as smaller biotech companies. The neurology workstream has created a few sets of training slides for biostatisticians who are new to the indication. We also collaborate with Cytel’s business development department on a better description of our capability for clients.

 

Could you be more specific and share examples?

For now, the focus is on MS and Alzheimer’s/Parkinson’s disease endpoints and statistical methodologies used to analyze them. Team members often bring varied experiences to each focus area. For instance, some colleagues have conducted numerous analyses on relapses, which are recurrent events that may require “qualification” or can have different competing risks — aspects that potentially need to be considered in the analysis. Another example is composite endpoints, such as NEDA (no evidence of disease activity), where different clients may adopt slightly different approaches. Additionally, we work with tools designed to measure levels of disability, cognitive function, and patient-reported outcomes (PROs) collected through questionnaires. These are typically repeated measures, analyzed based on specific manuals.
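
To illustrate why composite endpoints such as NEDA can differ between clients, here is a minimal sketch in which the composite is simply "no activity on any component" and the component list is supplied by the caller. The three component names shown are an assumed, commonly used set; the actual components and their definitions are specified per study.

```python
def no_evidence_of_disease_activity(components: dict[str, bool]) -> bool:
    """Return True only if none of the supplied activity components is True.

    `components` maps a component name to whether activity was observed.
    Which components are included is a study-level decision.
    """
    return not any(components.values())


if __name__ == "__main__":
    # Hypothetical patient assessed on an assumed three-component definition
    patient = {
        "relapse": False,
        "disability_progression": False,
        "new_mri_activity": True,
    }
    print(no_evidence_of_disease_activity(patient))
```

Passing the components in as data, rather than hard-coding them, keeps the same function usable when a sponsor adds or redefines a component.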

 

Can you share an example of a client engagement where your workstream made an impact?

The workstreams operate behind the scenes. For instance, we have long-term clients for which we run numerous exploratory analyses in the medical affairs area. These are often requested on short notice and are difficult to plan. The workstream enables us to remain flexible, distributing the workload among several team members and involving colleagues who would otherwise require significantly more time for onboarding.

A Preview of Cytel’s Contributions at the 2025 CDISC + TMF EU Interchange

This week, on May 14–15, the CDISC + TMF EU Interchange 2025 will take place, just a few steps from Cytel’s Geneva office.

This year at Cytel, we’re making the event even more special! We will be hosting a pre-conference, open to anyone able to arrive in Geneva before the event begins.

But that’s not all. Together with my Cytel colleagues, we will have three presentations and one poster, where we will share insights from our work with CDISC standards, including the Trial Master File Standards Model (CDISC-TMF). Starting with the contribution of my colleague Caroline Terril, here is a preview of what the four Cytel contributions will be about.

 

Key Considerations for Biometrics CROs Not Managing the TMF — The Journey So Far

Caroline Terril, Thursday, May 15, 10:00–10:30 — Session 5E: TMF Management (TMF Track)

If you’ve ever asked yourself after these last three years what really matters when it comes to managing the TMF with CDISC — especially for biometric CROs that don’t directly manage TMFs — then my colleague Caroline Terril might have the answer. In her presentation, she will delve into our journey so far in trying to adopt the CDISC-TMF standard.

 

The Curious Case of External Controlled Arms (ECA): Practical Solutions for External and RWD Integration

Gautham Selvaraj (co-authored by me), Wednesday, May 14, 09:30–10:00 — Session 5C: Real World Data

At Cytel, we’ve seen increasing use of external control arms (ECA) in sponsor projects. In this presentation, Gautham Selvaraj will walk through two real-world case studies on integrating ECA data into CDISC-compliant datasets, exploring the unique challenges and solutions in aligning real-world data with CDISC standards.

 

Governing the Ungovernable: Can a CRO Effectively Govern Its Standards?

Angelo Tinazzi, Thursday, May 15, 14:30–15:00 — Session 7C: Applied Standard Governance

Are you a CRO struggling with different sponsor interpretations of data standards? Or perhaps with standards that vary across multiple therapeutic areas or indications? Hard life, isn’t it?

Spoiler alert: although there’s no “magic” in my presentation, and no AI involved, I will offer practical insights into the complexities of CRO data standard governance. Sponsors are also welcome to join to see what life looks like from the other side of the barricades!

 

Managing SDTM Mapping Challenges in Multi-Study Portfolios: A Guide to Standards and Consistency

Jing Zhang and Marianne Dutfoy, Wednesday, May 14, 12:30–13:30 — Poster Session

In their poster, Jing Zhang and Marianne Dutfoy offer guidance for navigating SDTM mapping across multi‐study portfolios. They’ll address challenges such as inconsistent CRFs, variations in source data, and the hurdles of aligning historical studies with newer versions of standards.

 

Interested in learning more?

Download Angelo Tinazzi’s new ebook, “The Good Data Submission Doctor on Data Submission and Data Integration to the FDA”:

From Toplines to Triumph: Visualizing the Pathways to Regulatory Approval

Achieving positive topline results in a clinical trial marks a critical milestone in the drug development process, yet it is far from the end of the submission journey. Instead, it signals the start of a complex, fast-paced effort to prepare for regulatory submission and navigate the FDA’s multi-stage review. The final “regulatory defense” stage demands rigorous collaboration, meticulous planning, and adaptability to meet the expectations of regulatory agencies.

Here we discuss the main stages of the post-topline journey, exploring key milestones, unexpected challenges, and best practices for ensuring a strong submission and a smooth path to approval.

 

1. The Preparation: Post-topline readiness and strategic planning

The preparation phase begins immediately after topline results are available. During this critical window — often lasting several months — cross-functional teams shift their focus to assembling the final submission package. Statisticians and programmers play a central role here, finalizing the tables, listings, and figures (TLFs) that will populate the Clinical Study Report (CSR) and preparing submission-ready datasets following CDISC standards, including ADaM, SDTM, and associated documentation.

In parallel, a pre-BLA or pre-NDA meeting with the FDA is typically scheduled to align on expectations, identify potential concerns, and set the foundation for a smoother review process. This phase is not just about document generation; it’s about establishing a strategy, anticipating regulatory scrutiny, and ensuring the submission is both complete and compelling. The quality of the groundwork laid here often dictates the ease — or difficulty — of the phases that follow.

 

2. The Submission: Crossing the threshold to regulatory review

Once the submission is filed, the process transitions into a more structured phase governed by the FDA’s review protocols. The agency begins with a 60-day filing review to assess whether the BLA or NDA is complete and acceptable for full review. If so, the sponsor receives a Day 74 Letter, which provides early feedback, flags any immediate concerns, and confirms the Prescription Drug User Fee Act (PDUFA) date — typically 10 months post-filing for standard reviews or 6 months for priority reviews. Although this phase may seem procedural, its significance is high. A clean, well-organized submission can streamline the review process, limit questions, and reduce the risk of delays. This is also the point where rolling submissions, if applicable under Fast Track designation, can offer a tactical advantage by accelerating document delivery and potentially shortening review timelines.
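
The review clock described above can be sketched as simple date arithmetic. This is a rough planning aid rather than an official calculator: the function names are hypothetical, Day 74 is assumed to be counted from the submission date, and the PDUFA goal is assumed to be counted from the end of the 60-day filing review, following the "post-filing" wording above. Actual goal dates depend on the application type and on what the FDA communicates.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day of month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def review_milestones(submitted: date, priority: bool = False) -> dict:
    """Estimate review milestones from the submission date (assumptions in the lead-in)."""
    filing = submitted + timedelta(days=60)            # end of the 60-day filing review
    day_74 = submitted + timedelta(days=74)            # Day 74 Letter
    goal = add_months(filing, 6 if priority else 10)   # PDUFA date, counted from filing
    return {
        "filing_decision": filing,
        "day_74_letter": day_74,
        "pdufa_goal": goal,
    }


milestones = review_milestones(date(2025, 1, 15))
for name, when in milestones.items():
    print(f"{name}: {when.isoformat()}")
```

For a hypothetical submission on January 15, 2025, this places the filing decision in mid-March 2025 and the standard-review goal date in January 2026. Passing `priority=True` moves the goal date to September 2025.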

For statistical and programming teams, this is not a time to sit back and relax — it’s an opportunity to ensure internal alignment and anticipate questions the FDA may raise based on known data complexities. Strong documentation and traceability within datasets and outputs are essential at this point, helping to support any needed follow-up. Proactive communication and readiness during this phase help lay the groundwork for the more intensive regulatory engagement that follows.

 

3. The Regulatory Defense: Responding, clarifying, and defending your data

The regulatory defense phase is where the bulk of agency interaction occurs — and where flexibility and responsiveness become essential. During this time, the FDA may issue multiple information requests (IRs), asking for clarification on statistical methodology, specific data points, or safety and efficacy outcomes. Mid-cycle communications, typically occurring around months 4–5 for standard reviews, offer a formal opportunity to assess the review’s progress and surface any significant concerns.

In some cases, the agency may convene an Advisory Committee (AdCom) meeting to gather expert input, particularly when there are outstanding safety questions or complex benefit-risk considerations. Throughout this phase, the ability to quickly respond to ad hoc requests, provide high-quality data outputs, and maintain close collaboration across functions is critical. It’s a high-stakes stage where well-prepared teams can help preserve timelines and ensure the submission stays on track.

 

4. The Unexpected: Adapting to setbacks and charting a new course

In some cases, the regulatory journey doesn’t lead directly to approval. If the FDA identifies significant deficiencies in the initial submission — whether related to clinical data, statistical interpretation, manufacturing, or safety — it may issue a Complete Response Letter (CRL). This marks a temporary halt in the process, requiring the sponsor to address the concerns before resubmission. Depending on the scope of the deficiencies, the resubmission may fall under Class I (minor issues, reviewed in 2 months) or Class II (major issues, reviewed in 6 months).

For statisticians and programmers, this could mean conducting additional analyses, integrating new data, or adjusting the structure and presentation of the submission package. While a CRL can be a setback, it’s also an opportunity to recalibrate, seek additional guidance from the FDA, and improve the likelihood of approval in the next cycle. The key is to approach this phase with transparency, strategic thinking, and a readiness to adapt and respond.

 

Final takeaways

The path from topline results to regulatory approval is rarely linear. Timelines can range from as little as 12 months in expedited reviews to over 30 months in cases involving major deficiencies and resubmissions. Success in this post-unblinding phase hinges on proactive planning, adaptable resourcing, and the ability to respond quickly and thoroughly to regulatory needs. Equally important is collaboration across functions — clinical, regulatory, biostatistics, programming, and operations must work closely and cohesively to anticipate challenges, align timelines, and respond efficiently to agency requests. Whether following a standard or accelerated route, the shared priority is a comprehensive, high-quality submission that stands up to regulatory scrutiny — and ultimately supports timely access to new therapies for patients.

 

Interested in learning more?

Watch Jasperlynn Kao and Florence Le Maulf’s recent webinar, “From Toplines to Triumph: Visualizing the Pathways to Regulatory Approval”:

Data Submission to Health Authorities: Current Practices and Future Directions

How far is 2041? Update on data submission to health authorities

Back in the summer of 2023, I was invited to present “Standards and Open-Source Hand-in-Hand: Leveraging Automation to Expedite Drug Market Request Review Process” at PharmaSUG-China. I tried to imagine the future of data submission, travelling to 2041 and envisioning how AI could support and expedite the regulatory drug submission process, and how it could enhance the preparation and review of data submission packages. I then brought the discussion back to the present, sharing some reflections on the journey ahead, one that will inevitably require better use of standards, open-source adoption and solutions, and collaborative industry initiatives.

About 18 months later, the topic of AI became predominant in our industry. This is clearly reflected by the growing number of AI-related presentations at conferences, including the recent PHUSE US Connect Conference held last March in Orlando, and the upcoming CDISC-EU Interchange this May, just a few steps from our offices here in Geneva.

Here, I would like to provide a brief overview of the latest updates on data submission requirements, as well as industry initiatives aimed at improving how we create clinical data packages for submission to health authorities in support of market drug approval.

 

FDA data submission requirements update

Regulatory data submission requirements, more specifically those from the US FDA, have been refined through various updates of their guidance. Since my January 2024 summary of the latest changes, the following additional requirements have been added:

 

  • Submit an LC dataset, a copy of LB that uses US conventional units as the standard units (March 2024)
  • Place viral load results in the MB domain; this clarification suggests that laboratory-related domains such as LB, IS, and MB are still being misused (October 2024)
  • The US conventional unit requirement was recently extended to ADaM, with the ADLC dataset (March 2025)
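
To make the LC requirement more concrete, here is a minimal sketch of deriving an LC-style record from an LB record. Several things are assumptions for illustration only: that LC variables carry an `LC` prefix in place of `LB`, the helper name `lb_to_lc`, the sample record, and the two conversion factors, which should be checked against a validated source before any real use.

```python
# Illustrative SI -> US conventional conversions, keyed by (test code, SI unit).
# The factors are commonly quoted values, not an authoritative table.
CONVERSIONS = {
    ("GLUC", "mmol/L"): ("mg/dL", 18.016),     # glucose
    ("CREAT", "umol/L"): ("mg/dL", 1 / 88.4),  # creatinine
}


def lb_to_lc(lb_record: dict) -> dict:
    """Return an LC-style copy of one LB record, with standard results in conventional units."""
    # Rename the LB prefix to LC on every LB variable (assumed naming convention).
    lc_record = {
        ("LC" + name[2:] if name.startswith("LB") else name): value
        for name, value in lb_record.items()
    }
    lc_record["DOMAIN"] = "LC"

    key = (lb_record["LBTESTCD"], lb_record["LBSTRESU"])
    if key in CONVERSIONS:
        unit, factor = CONVERSIONS[key]
        lc_record["LCSTRESN"] = round(lb_record["LBSTRESN"] * factor, 2)
        lc_record["LCSTRESC"] = str(lc_record["LCSTRESN"])
        lc_record["LCSTRESU"] = unit
    # Records without a known conversion keep their original standard results.
    return lc_record


lb = {
    "STUDYID": "ABC-001", "DOMAIN": "LB", "USUBJID": "ABC-001-0001",
    "LBTESTCD": "GLUC", "LBORRES": "5.5", "LBORRESU": "mmol/L",
    "LBSTRESN": 5.5, "LBSTRESC": "5.5", "LBSTRESU": "mmol/L",
}
lc = lb_to_lc(lb)
print(lc["LCTESTCD"], lc["LCSTRESN"], lc["LCSTRESU"])
```

Keeping the conversion table separate from the copying logic means the factors can be reviewed and versioned on their own, which matters more than the code itself.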

 

See the latest (March 2025) version of the FDA Study Data Technical Conformance Guide.

It’s also worthwhile to mention the FDA’s “Protocol Deviations for Clinical Investigations of Drugs, Biological Products, and Devices,” which provides various recommendations around the management of protocol deviations. This includes some specific recommendations for SDTM mapping, such as including a variable in the DV domain that provides the sponsor’s determination of whether the protocol deviation was important.

 

EMA data submission requirements

While the EMA has not made data submission mandatory — nor specified a required data format — the European Medicines Agency launched the “Raw Data” pilot proof-of-concept project about two years ago. In this initiative, selected applicants were invited to submit structured clinical trial data as part of their initial applications and post-authorization procedures. Clinical trial data in this context refers to individual patient-level data, including:

 

  • Clinical laboratory results
  • Images
  • Medical records

 

The aim of the pilot is to assess whether the use of structured clinical trial data can help speed up and improve the drug assessment process.

An initial outcome of the project was published in a report released last October. It summarizes lessons learned from five data submissions received between September 2022 and December 2023, out of the ten originally planned. Among the key learnings and outcomes, CDISC standards, namely SDTM and ADaM with define.xml and a data reviewer’s guide, were confirmed as suitable formats for data review. The software tools being explored included SAS and R for statistical analysis, and SAS JMP Clinical for visualization. While SAS XPT files were required, other transport formats such as XML or JSON were also accepted, upon mutual agreement between EMA and the applicant.

Although these standards and formats are not yet mandatory, additional guidance has been provided in a Q&A document (e.g., regarding maximum data package size). Since then, the EMA has decided to extend the project’s duration. Final recommendations are expected in 2025 — potentially with some early updates to be shared at the upcoming CDISC EU Interchange in May.

 

Industry initiatives update

Since my speech at PharmaSUG-China, the industry initiatives I discussed there have progressed quite rapidly:

 

•   The R Pilot Submission Experience: All four planned pilots have been completed, and in February a fifth pilot was announced. This time, the goal is to establish the new Dataset-JSON format as a CDISC standard for clinical data submissions (see the report from a successful pilot that submitted data in the new format to the FDA).

•   R Packages for SDTM and ADaM: Both the SDTM (oak) and ADaM (admiral) R packages are now widely used in our industry for submission projects.

•   The Analysis Results Standard (ARS): The first version of the ARS was released in April 2024, along with a new initiative, the eTFL Portal, which shares examples and templates for the most common TFLs.

•   The CORE Project: The project continues its mission to develop Open Conformance Rules, alongside a growing number of Open Source initiatives.

 

Interested in learning more?

Get your copy of Angelo Tinazzi’s latest ebook, “The Good Data Submission Doctor on Data Submission and Data Integration to the FDA”:

Expediting the Regulatory Submission Process with Automated Tools

In the biopharmaceutical industry, expediting regulatory submissions is crucial for timely access to life-saving medications. As a statistical programming team, our role involves accelerating the drug approval process by meticulously preparing Electronic Common Technical Document (eCTD) packages, including the statistical review and the programming work of mapping SDTM, deriving ADaM, and generating TLFs.

Here we discuss the process and benefits of the metadata-driven approach. From mapping to report, this approach enhances the efficiency in attaining results and generating submission packages promptly by reducing manual interventions.

 

What are eCTD packages and how are they prepared?

The eCTD is the “standard format for submitting applications, amendments, supplements, and reports to FDA’s Center for Drug Evaluation and Research (CDER) and Center for Biologics Evaluation and Research (CBER).”1 It facilitates the electronic submission of dossiers for market approval requests, such as for a new drug (NDA).

Among files stored in the eCTD, there are some key components related to Biometrics deliverables:

  • SDTM Dataset: The Study Data Tabulation Model (SDTM) is one of the most important CDISC data standards. It’s a framework used for organizing source data collected in human clinical trials.
  • ADaM Datasets: Analysis datasets are created to enable statistical and scientific analysis of the study results. CDISC Analysis Data Model (ADaM) specifies the fundamental principles and standards to ensure that there is clear lineage from data collection to analysis.
  • TLF: Analytical outputs, in the form of tables or figures, are used to summarize the analysis required for the submission to the regulatory agencies. These outputs are supported by listings that display the actual data at all the data points.
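
The lineage between these three deliverables can be illustrated with a toy example. The records and the age cut-off below are made up for illustration, and the derivation is deliberately simplistic. The variable names (`AGEGR1`, `TRT01P`) follow common ADaM ADSL conventions.

```python
from collections import Counter

# Hypothetical SDTM DM records (tabulation of collected data).
dm = [
    {"USUBJID": "001", "AGE": 34, "ARM": "Drug A"},
    {"USUBJID": "002", "AGE": 71, "ARM": "Placebo"},
    {"USUBJID": "003", "AGE": 66, "ARM": "Drug A"},
]


def derive_adsl(dm_records):
    """Derive an ADSL-style analysis dataset, keeping source values for traceability."""
    adsl = []
    for rec in dm_records:
        adsl.append({
            "USUBJID": rec["USUBJID"],
            "AGE": rec["AGE"],                                # carried over from DM
            "AGEGR1": "<65" if rec["AGE"] < 65 else ">=65",   # derived age group
            "TRT01P": rec["ARM"],                             # planned treatment
        })
    return adsl


def summary_table(adsl):
    """Count subjects per treatment and age group (the body of a simple table)."""
    return Counter((row["TRT01P"], row["AGEGR1"]) for row in adsl)


adsl = derive_adsl(dm)
table = summary_table(adsl)
for (treatment, age_group), n in sorted(table.items()):
    print(f"{treatment:8} {age_group:5} n={n}")
```

Each count in the table can be traced back through `AGEGR1` in the analysis dataset to `AGE` in the tabulation data, which is the kind of lineage the ADaM principles are meant to guarantee.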

 

The need for automation

When working on any project or analysis, certain elements remain unchanged regardless of the study design. Standardizing and automating their production can therefore improve efficiency, ensure consistency, and reduce the overall time required for submission. Automating these items also reduces manual intervention, thereby minimizing the chances of human error.

This approach has several benefits, including:

  • Efficiency: Since the team can focus more on the non-standard parts of the outputs, the overall efficiency of the team is increased.
  • Consistency: Since automated tools generate standard code based on a set of rules, the resulting code remains highly consistent across various projects. This makes it easier to understand and debug (in case of any updates).
  • Quality: Since the tools have been rigorously tested, they produce extremely high-quality and reliable outputs.
  • Reduced manual intervention: Since manual intervention is limited, the possibility of human error is minimized. As long as the specifications are correctly drafted, the output generated by the standard code should be error-free.

A metadata-driven approach

Many companies, including Cytel, have adopted a metadata-driven approach to accelerate tasks such as SDTM, ADaM, and TLF code generation. The goal of this approach is not to automate 100% of the final code but rather to generate as much standardized and structured code as possible. This approach enhances efficiency while simplifying modifications when needed.

While a Metadata Repository (MDR) can maximize automation in the long run, currently available MDR tools remain cumbersome.2 For this reason, while still assessing the benefit of MDR solutions, Cytel has taken a different approach — extracting metadata from existing documents that statistical programmers already use in their daily work. Without adding extra workload, this metadata is stored in a structured format, allowing us to apply automated rules to enrich it. From there, we can generate SDTM, ADaM, and TLF code efficiently.

For example, metadata can be extracted from ODM.xml files or raw datasets to streamline SDTM specification mapping. These specifications can then be leveraged to generate SAS or R code automatically. Similarly, metadata from study mock shells — such as titles, footnotes, table headers, and table body structure and content — can drive the creation of TLFs with minimal manual intervention.
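
As a rough illustration of the first step, the sketch below reads `ItemDef` elements from a small ODM fragment and turns them into the skeleton of a mapping specification. The XML is hand-written for this example, and the output field names (`source_variable`, `target_variable`, and so on) are hypothetical; this is not a description of Cytel's tooling.

```python
import xml.etree.ElementTree as ET

ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# A hand-written, minimal ODM fragment for illustration only.
ODM_XML = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="ST.DEMO">
    <MetaDataVersion OID="MDV.1" Name="Demo metadata">
      <ItemDef OID="IT.BRTHDAT" Name="BRTHDAT" DataType="date"/>
      <ItemDef OID="IT.SEX" Name="SEX" DataType="text" Length="1"/>
    </MetaDataVersion>
  </Study>
</ODM>"""


def extract_items(odm_text: str) -> list[dict]:
    """Collect one spec row per ItemDef found in the ODM metadata."""
    root = ET.fromstring(odm_text)
    items = []
    for item in root.iterfind(".//odm:ItemDef", ODM_NS):
        items.append({
            "source_variable": item.get("Name"),
            "data_type": item.get("DataType"),
            "length": item.get("Length"),   # None when the attribute is absent
            "target_variable": "",          # left blank for the programmer to map
        })
    return items


spec_rows = extract_items(ODM_XML)
for row in spec_rows:
    print(row)
```

The resulting rows are only a starting point: the programmer still decides the SDTM target for each item, but no one has to retype names, types, and lengths by hand.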

Another key advantage of this metadata-driven approach is its language agnosticism. By structuring metadata independently of the programming language, the same metadata can be used to generate both SAS and R code. This ensures consistency, facilitates the transition for SAS programmers moving to R, and maintains quality without impacting project timelines.
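
A small sketch can show what language agnosticism means in practice. The mapping metadata names a transformation rule rather than SAS or R syntax, and each language has its own template per rule. The spec layout, rule names, and dataset names are assumptions made up for this example, and the generated code is intentionally minimal.

```python
# Language-neutral mapping metadata: each rule names a transformation,
# not SAS or R syntax. All names here are hypothetical.
SPEC = {
    "target_dataset": "dm",
    "source_dataset": "demog",
    "variables": [
        {"target": "USUBJID", "rule": "copy", "source": "subjid"},
        {"target": "SEX", "rule": "upcase", "source": "gender"},
        {"target": "DOMAIN", "rule": "constant", "value": "DM"},
    ],
}

# One template per rule and per output language.
SAS_RULES = {
    "copy": "{target} = {source};",
    "upcase": "{target} = upcase({source});",
    "constant": '{target} = "{value}";',
}
R_RULES = {
    "copy": "{target} = {source}",
    "upcase": "{target} = toupper({source})",
    "constant": '{target} = "{value}"',
}


def render_sas(spec: dict) -> str:
    """Render the spec as a SAS data step."""
    lines = [f"data {spec['target_dataset']};", f"  set {spec['source_dataset']};"]
    lines += ["  " + SAS_RULES[v["rule"]].format(**v) for v in spec["variables"]]
    lines.append("run;")
    return "\n".join(lines)


def render_r(spec: dict) -> str:
    """Render the same spec as a dplyr::mutate() call."""
    body = ",\n".join("  " + R_RULES[v["rule"]].format(**v) for v in spec["variables"])
    return (
        f"{spec['target_dataset']} <- dplyr::mutate(\n"
        f"  {spec['source_dataset']},\n{body}\n)"
    )


print(render_sas(SPEC))
print(render_r(SPEC))
```

Because both renderers read the same `SPEC`, a change to the mapping is made once and flows into both outputs. Adding a new rule means adding one template per language, not rewriting study programs.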

 

Final takeaways

In line with the premise that “one solution does not fit all,” CROs can maximize the value of metadata within clinical trial delivery by leveraging the metadata already embedded in study artifacts. If you can define a way to extract as much metadata as possible from the documents you already use, and then transform that metadata into real deliverables, you can obtain a lot of value.

This metadata-driven approach is sensitive to the fact that CROs must accommodate a multitude of sponsor standards and delivery requirements, without sacrificing the benefits of automation in an ecosystem rich in interdependencies between regulatory authorities, industry consortia, sponsors, CROs, and other third-party technology vendors.

 

1 US FDA. (4 October, 2024). Electronic Common Technical Document (eCTD).

2 PHUSE White Paper (2 October, 2024). Best Practices in Data Standards Implementation Governance.

 

Interested in learning more?

Watch Manish Deole and Sebastià Barceló’s on-demand webinar, “Expediting Regulatory Submissions through Automation”:

Getting Your Data Strategy Right: Seven Tips for Balancing Science, Efficiency, and Patient Centricity

In today’s clinical trial landscape, the sheer volume of data collected is both a blessing and a curse. While advances in data collection and analysis offer unprecedented insights into drug development, they also bring logistical challenges, increasing costs, and burdens on patients and research sites.

In the coming year and beyond, an effective approach to data will be more and more critical. Clinical research organizations (CROs) and sponsors must craft data strategies that are not only scientifically robust but also operationally efficient and patient-centric.

Here, we explore how to get your data strategy right by focusing on key principles and practical approaches that balance scientific objectives, operational realities, and participant well-being.

 

1. Define clear objectives: Focus on what matters most

An effective data strategy starts with clarity about what the trial is designed to achieve. The endpoints — whether efficacy, safety, or exploratory — should drive every decision about data collection. Too often, protocols become bloated with “just in case” data points, which can increase complexity without adding meaningful insights.

  • Prioritize critical endpoints: Identify and align on the primary and secondary endpoints that are essential for regulatory approval and decision-making.
  • Stakeholder collaboration: Work closely with sponsors, regulators, patient advocacy groups, and key stakeholders to define the minimum viable dataset required for success.
  • Eliminate non-essential data: Conduct feasibility assessments to identify redundant or low-value data points and exclude them from the protocol.

By narrowing the focus to critical data, you can reduce trial complexity, improve operational efficiency, and ease the burden on sites and patients.

 

2. Streamline safety data collection

Safety monitoring is a cornerstone of clinical trials, but it is also one of the most resource-intensive components. Collecting excessive safety data can overwhelm both sites and patients, delaying timelines and inflating costs. However, reducing safety data collection must be done carefully to ensure participant well-being is not compromised.

  • Timing and frequency: Align safety assessments with the drug’s pharmacokinetics and expected adverse event timelines to avoid unnecessary data collection.
  • Remote monitoring: Wearable devices, mobile apps, and telemedicine can be used to collect safety data in real time, reducing the need for site visits.
  • Simplify reporting: Limit detailed reporting to serious adverse events (SAEs) and high-priority concerns while streamlining processes for common, low-severity events.

By leveraging these approaches, trials can maintain high safety standards while reducing unnecessary data collection and operational overhead.

 

3. Optimize operational feasibility

Even the most scientifically sound protocol can fail if it is operationally impractical. Clinical trial designs must account for the practicalities at research sites and the realities of patient participation.

  • Site workload: Avoid overwhelming sites by simplifying data collection processes and limiting unnecessary assessments.
  • Patient-centric protocols: Minimize the burden on participants by reducing visit frequency, consolidating procedures, and using remote or decentralized trial models.
  • Stakeholder input: Engage site investigators and patients during protocol development to identify pain points and refine processes before trial launch.

Operational feasibility isn’t just about reducing site and patient burden; it’s also critical for ensuring data quality. Overly complex protocols can lead to errors, incomplete datasets, and costly delays.

 

4. Leverage real-world evidence (RWE)

Real-world evidence offers a powerful way to supplement trial data and reduce the need for redundant or duplicative collection. By tapping into existing data sources, such as electronic health records (EHRs), claims databases, and patient registries, CROs can streamline trial operations while gaining valuable insights.

  • Historical comparisons: Use RWE to establish baseline safety and efficacy data, reducing the need for extensive data collection in the trial itself.
  • Synthetic control arms: Replace traditional placebo or control groups with synthetic arms derived from RWE, reducing the number of participants required.
  • Patient stratification: Leverage RWE to refine inclusion and exclusion criteria, ensuring trials target the right populations from the outset.

When integrated thoughtfully, RWE can significantly enhance efficiency while maintaining scientific rigor.

 

5. Harness technology for smarter data collection

Digital tools and advanced analytics are transforming how data is collected, managed, and analyzed in clinical trials. These innovations can help streamline processes, reduce redundancies, and improve data quality.

  • AI and machine learning: Apply predictive algorithms to identify critical data points and flag potential safety concerns, reducing the reliance on exhaustive datasets.
  • Decentralized trials: Implement decentralized models that allow participants to complete assessments remotely, improving accessibility and reducing dropout rates.
  • Wearable devices: Collect real-time physiological data through wearables, reducing the need for manual measurements and frequent site visits.

The right technology can make data collection more efficient while enhancing patient convenience and trial outcomes.

 

6. Engage regulators early

Regulatory expectations often drive the scope of data collection in clinical trials. Engaging with regulators early in the design process can help ensure that your data strategy meets compliance requirements without unnecessary over-collection.

  • Regulatory guidance: Familiarize yourself with evolving guidance, such as FDA’s initiatives on patient-focused drug development and real-world data.
  • Pre-submission meetings: Use pre-submission meetings to discuss and align on the minimum data required for approval.
  • Streamline post-market plans: Shift exploratory safety and efficacy data collection to post-market surveillance or Phase IV trials where appropriate.

By aligning with regulators upfront, sponsors can avoid unnecessary rework and streamline approval timelines.

 

7. Analyze and learn from past trials

Every completed trial offers a wealth of information about what worked and what didn’t. By analyzing past protocols, sponsors can refine their data strategies and avoid repeating mistakes.

  • Post-trial reviews: Identify data points that were collected but not used in analysis and eliminate them from future designs.
  • Feedback loops: Create systems for gathering feedback from sites, patients, and operational teams to inform future trial strategies.
  • Benchmarking: Compare your trial performance against industry benchmarks to identify areas for improvement.
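
The post-trial review in the first bullet can start from something as simple as a set comparison. The sketch below uses made-up trial names and variable inventories, and counts how often each data point was collected but never used in analysis across past trials.

```python
from collections import Counter

# Hypothetical inventories: what each past trial collected vs. what its analyses used.
trials = {
    "TRIAL-A": {
        "collected": {"AGE", "SEX", "WEIGHT", "ECG_QT", "SLEEP_DIARY", "HBA1C"},
        "used": {"AGE", "SEX", "WEIGHT", "HBA1C"},
    },
    "TRIAL-B": {
        "collected": {"AGE", "SEX", "SLEEP_DIARY", "HBA1C"},
        "used": {"AGE", "SEX", "HBA1C"},
    },
}


def unused_counts(trial_inventories: dict) -> Counter:
    """Count, per data point, the number of trials where it was collected but not used."""
    counts = Counter()
    for inventory in trial_inventories.values():
        counts.update(inventory["collected"] - inventory["used"])
    return counts


counts = unused_counts(trials)
for variable, n in counts.most_common():
    print(f"{variable}: unused in {n} trial(s)")
```

A data point that goes unused in trial after trial is a strong candidate for removal from future protocols, although the final decision still needs clinical and regulatory input.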

Learning from experience and improving continuously are key to optimizing data strategies over time.

 

Final takeaways

Getting your data strategy right is about finding the sweet spot between collecting enough data to meet scientific and regulatory goals and avoiding the pitfalls of over-collection. By focusing on clear objectives, leveraging technology and RWE, streamlining safety data, and designing trials with operational feasibility and patient needs in mind, sponsors and CROs can achieve this balance.

As the clinical trial landscape continues to evolve, a thoughtful, optimized, and patient-focused data strategy will be essential for success. By prioritizing efficiency without compromising quality, the industry can deliver better results — for sponsors, sites, and, most importantly, patients.