Expediting the Regulatory Submission Process with Automated Tools
April 1, 2025
In the biopharmaceutical industry, expediting regulatory submissions is crucial for timely access to life-saving medications. As a statistical programming team, our role involves accelerating the drug approval process by meticulously preparing Electronic Common Technical Document (eCTD) packages, including the statistical review and programming work of mapping SDTM, deriving ADaM datasets, and generating TLFs.
Here we discuss the process and benefits of a metadata-driven approach. From mapping to reporting, this approach reduces manual intervention, improving efficiency and allowing submission packages to be generated more promptly.
What are eCTD packages and how are they prepared?
The eCTD is the “standard format for submitting applications, amendments, supplements, and reports to FDA’s Center for Drug Evaluation and Research (CDER) and Center for Biologics Evaluation and Research (CBER).”1 It facilitates the electronic submission of dossiers for market approval requests, such as a New Drug Application (NDA).
Among the files stored in the eCTD, several key components relate to Biometrics deliverables:
- SDTM Datasets: The Study Data Tabulation Model (SDTM) is one of the most important CDISC data standards. It provides a framework for organizing the source data collected in human clinical trials.
- ADaM Datasets: Analysis datasets are created to enable statistical and scientific analysis of the study results. The CDISC Analysis Data Model (ADaM) specifies the fundamental principles and standards that ensure clear lineage from data collection to analysis.
- TLFs: Analytical outputs, in the form of tables or figures, summarize the analyses required for submission to the regulatory agencies. These outputs are supported by listings that display the actual data for each data point. A simplified sketch of this lineage appears after this list.
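To make the lineage concrete, here is a minimal, hypothetical sketch in R (not an excerpt from any production tool): a two-subject SDTM-style DM domain is used to derive a few ADaM-style ADSL variables, which in turn feed a simple TLF-style summary. All dataset contents, variable choices, and derivations are illustrative only.

```r
# Minimal, hypothetical sketch of the SDTM -> ADaM -> TLF lineage (illustrative data only)
library(dplyr)

# SDTM-style DM (Demographics): one record per subject
dm <- data.frame(
  USUBJID = c("STUDY1-001", "STUDY1-002"),
  ARM     = c("Drug A", "Placebo"),
  RFSTDTC = c("2024-01-10", "2024-01-12"),   # reference start date (ISO 8601)
  RFENDTC = c("2024-03-10", "2024-02-25")    # reference end date (ISO 8601)
)

# ADaM-style ADSL: analysis variables derived with traceable lineage back to DM
adsl <- dm %>%
  mutate(
    TRT01P  = ARM,                            # planned treatment
    TRTSDT  = as.Date(RFSTDTC),               # treatment start date
    TRTEDT  = as.Date(RFENDTC),               # treatment end date
    TRTDURD = as.numeric(TRTEDT - TRTSDT) + 1 # treatment duration in days
  )

# TLF-style summary: treatment duration by planned treatment arm
adsl %>%
  group_by(TRT01P) %>%
  summarise(n = n(), mean_duration = mean(TRTDURD), .groups = "drop")
```

In a real study these derivations are considerably richer (treatment dates typically come from exposure records, for instance), but the principle of traceable, rule-based derivation from collected data to analysis output is the same.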
The need for automation
When working on any project or analysis, certain elements remain unchanged regardless of the study design. Standardizing and automating their production can therefore improve efficiency, ensure consistency, and reduce the overall time required for submission. Automating these items also reduces manual intervention, thereby minimizing the chance of human error.
This approach has several benefits, including:
- Efficiency: Since the team can focus on the non-standard parts of the outputs, overall efficiency increases.
- Consistency: Since automated tools generate standard code from a set of rules, the resulting code remains highly consistent across projects, which makes it easier to understand and debug when updates are needed.
- Quality: Since the tools have been rigorously tested, they produce high-quality, reliable outputs.
- Reduced manual intervention: Since manual intervention is limited, the possibility of human error is minimized. As long as the specifications are correctly drafted, the output generated by the standard code should be error-free.
A metadata-driven approach
Many companies, including Cytel, have adopted a metadata-driven approach to accelerate tasks such as SDTM, ADaM, and TLF code generation. The goal of this approach is not to automate 100% of the final code but rather to generate as much standardized and structured code as possible. This approach enhances efficiency while simplifying modifications when needed.
While a Metadata Repository (MDR) can maximize automation in the long run, currently available MDR tools remain cumbersome.2 For this reason, while continuing to assess the benefits of MDR solutions, Cytel has taken a different approach: extracting metadata from the documents that statistical programmers already use in their daily work. This metadata is stored in a structured format without adding to programmers' workload, allowing us to apply automated rules to enrich it. From there, we can generate SDTM, ADaM, and TLF code efficiently.
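As a simplified illustration of what "structured metadata plus automated enrichment rules" can look like (the columns, variables, and rule below are hypothetical, not Cytel's actual implementation), variable-level metadata can be held in an ordinary data frame and standard attributes filled in from CDISC naming conventions:

```r
# Hypothetical sketch: variable-level metadata in a structured form,
# enriched by simple rules based on CDISC naming conventions
library(dplyr)

# Metadata as it might be extracted from documents programmers already maintain
var_meta <- data.frame(
  domain   = c("AE", "AE", "VS"),
  variable = c("AESTDTC", "AETERM", "VSORRES"),
  label    = c("Start Date/Time of Adverse Event",
               "Reported Term for the Adverse Event",
               "Result or Finding in Original Units")
)

# Automated enrichment: infer standard attributes from the variable name
var_meta_enriched <- var_meta %>%
  mutate(
    data_type       = if_else(grepl("DTC$", variable), "ISO 8601 character", "character"),
    needs_date_code = grepl("DTC$", variable)  # flag variables that need date-handling code
  )
```

Because the metadata lives in a simple tabular structure, the same enrichment rules can be reused across studies and sponsors.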
For example, metadata can be extracted from ODM.xml files or raw datasets to streamline SDTM specification mapping. These specifications can then be leveraged to generate SAS or R code automatically. Similarly, metadata from study mock shells — such as titles, footnotes, table headers, and table body structure and content — can drive the creation of TLFs with minimal manual intervention.
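For instance, a mapping specification can be seeded directly from a raw dataset's own structure. The sketch below is hypothetical (the raw variables, domain assignment, and spec columns are invented for illustration); it simply shows the idea of letting existing artifacts pre-populate the specification that later drives code generation:

```r
# Hypothetical sketch: seeding an SDTM mapping specification from a raw dataset
raw_vs <- data.frame(
  SUBJID = c("001", "002"),
  SYSBP  = c(120, 135),   # systolic blood pressure (illustrative)
  DIABP  = c(80, 85)      # diastolic blood pressure (illustrative)
)

# Pre-populated specification: one row per source variable, to be completed
# and reviewed by a statistical programmer
seed_spec <- data.frame(
  source_variable = names(raw_vs),
  source_type     = vapply(raw_vs, function(x) class(x)[1], character(1)),
  target_domain   = "VS",           # proposed domain, subject to review
  target_variable = NA_character_   # filled in during mapping
)
seed_spec
```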
Another key advantage of this metadata-driven approach is its language agnosticism. By structuring metadata independently of the programming language, the same metadata can be used to generate both SAS and R code. This ensures consistency, facilitates the transition for SAS programmers moving to R, and maintains quality without impacting project timelines.
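A toy example of what language agnosticism means in practice (the metadata fields and template functions below are hypothetical simplifications, not an actual generator): the same mapping metadata can be rendered into either SAS or R syntax, so the choice of language is deferred until code generation.

```r
# Hypothetical sketch: rendering the same mapping metadata as SAS or as R code
mapping <- list(target = "VSORRES", source = "SYSBP")

render_sas <- function(m) sprintf("%s = %s;", m$target, m$source)
render_r   <- function(m) sprintf("%s <- %s", m$target, m$source)

cat(render_sas(mapping), "\n")  # VSORRES = SYSBP;
cat(render_r(mapping), "\n")    # VSORRES <- SYSBP
```

Real generators handle far more than one-to-one assignments, but keeping the metadata layer free of language-specific syntax is what makes dual SAS/R output possible without duplicating the specification work.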
Final takeaways
In line with the premise that “one solution does not fit all,” CROs can maximize the value of metadata within clinical trial delivery by leveraging the metadata already embedded in study artifacts. If you can define a way to extract as much metadata as possible from the documents you already use, and can transform that metadata into real deliverables, you stand to gain significant value.
This metadata-driven approach is sensitive to the fact that CROs must accommodate a multitude of sponsor standards and delivery requirements, without sacrificing the benefits of automation in an ecosystem rich in interdependencies between regulatory authorities, industry consortia, sponsors, CROs, and other third-party technology vendors.
1. US FDA (October 4, 2024). Electronic Common Technical Document (eCTD).
2. PHUSE White Paper (October 2, 2024). Best Practices in Data Standards Implementation Governance.
Interested in learning more?
Watch Manish Deole and Sebastià Barceló’s on-demand webinar, “Expediting Regulatory Submissions through Automation.”
Manish Deole
Principal Statistical Programmer
Manish Deole is Principal Statistical Programmer at Cytel. He has over 18 years of experience in SAS programming. He is an expert in clinical trial data (both Safety and Efficacy), CDISC SDTM and ADaM, Pinnacle21 reports, and define.xml. He has worked on Phase I to IV studies in Oncology, Rheumatology, CNS, Endocrinology, and Vaccine therapeutic areas. He has expertise in handling complex Analysis Datasets, Tables, Figures, and Listings, along with regulatory submissions (FDA and EMEA). Manish has been working with Cytel for over 11 years and likes to spend quality time with his family.
Read full employee bio
Sebastià Barceló
Associate Director, Statistical Programming
Sebastià Barceló is Associate Director, Statistical Programming, at Cytel in Geneva. He has more than 10 years of experience in the field of clinical research in the areas of data management, biostatistics, and statistical programming with different roles in CROs in Spain and Switzerland. Sebastià currently manages a team working on automation initiatives and tool development using multiple programming languages.
Read full employee bio