
Advancing Clinical Data Standards: Guidance, Regulations, and Key Standards Developments

This last year has been marked by new standards, industry templates and initiatives, and regulatory guidance — an intense year for Clinical Data Standards Development.

It can certainly be challenging for sponsors to keep up with the evolving landscape: Standards Development Organizations (SDOs) such as CDISC advanced clinical data standards; regulatory agencies like the FDA added specific requirements; and industry initiatives such as PHUSE worked to clarify and address different stakeholders’ expectations through the development of white papers.

Here I provide a quick summary of what was released in the year we have just left behind.

 

Regulatory guidance

While the industry awaits the EMA’s long-anticipated clinical data standards submission requirements (see the current status of their pilot project here), the FDA celebrated the 10th anniversary of its Study Data Conformance Guide last April. If you missed it, check out my previous post “New FDA Data Submission Requirements and Substantial Changes,” which discussed a decade of guidance improvements and the latest requirements. Since then, the FDA has released two new versions of its Clinical Data Standards Guidance. These updates do not introduce major new requirements other than confirming that SDTM and ADaM Medical Devices IG for both CDER and CBER FDA divisions “align with FDA current business needs.”

It’s also worthwhile to mention the recent release of the FDA’s Protocol Deviations for Clinical Investigations of Drugs, Biological Products, and Devices. The guidance provides various recommendations around the management of protocol deviations, including some specific recommendations for SDTM mapping, such as including a variable in the DV domain that provides the sponsor’s determination of whether the protocol deviation was important. See also my friend Eanna Kiely’s LinkedIn post.  

 

CDISC standards

Throughout 2024, CDISC continued to advance initiatives aimed at full digitalization of clinical trial study data, for example with the release of the third version of the CDISC USDM (Unified Study Definitions Model). It also continued to release new standards, update existing ones, and advance cross-industry initiatives such as CDISC CORE.

Notable examples of new releases in 2024 include the first version of the CDISC Analysis Results Standard, aimed at providing a framework for linking analysis results directly to the data and metadata that support them. Several presentations and workshops were held at various events over the past two years, first to promote the idea and then to gather feedback from the audience, prior to its final release last April.

Later in the year, similar to the eCRF Portal, an eTFL Portal was also released. This resource includes examples of the most common analysis table layouts, complete with table shells, input ADaM datasets, define.xml, and the JSON version of the analysis results along with metadata. While not a regulatory requirement, this standard adds an important missing piece in our industry’s “dream” end-to-end goal.

It is also worth mentioning the first version of the CDISC Tobacco Implementation Guide (TIG), developed in collaboration with the FDA’s Center for Tobacco Products (CTP). This guide aims to assist tobacco companies in implementing CDISC foundational standards, such as CDASH, SDTM, and ADaM. If you work at a tobacco company, also check the recently launched TIG eSubmission Pilot.

Other highlights include updates to existing standards and new resources added to the CDISC Library (see the full list of resources here). Regular updates to CDISC Biomedical Concepts also became a key focus of the CDISC Library.

For a look ahead, check out the CDISC Standards Timeline for 2025.

 

Industry initiatives

PHUSE continued its work through dedicated Working Groups (WGs) focused on Optimizing the Use of Data Standards. In addition to maintaining a peer-reviewed QA for SDTM ADaM Implementation, these WGs released two White Papers (WPs) summarizing data standards challenges, both current and anticipated with new trial designs, and how companies implement and govern data standards.

Both WPs are based on the outcomes from two surveys conducted during the 2022 and 2023 PHUSE Computational Science Symposium (CSS), with the second one in particular proposing some potential governance best practices across the industry aiming to establish some implementation consistency across companies. It’s worthwhile to mention that these WPs share that about 50% of respondents “affirmed” they do not have a dedicated data standards team or were unaware if one existed in their company. Among those with data standards teams in place, there was significant variation in structure, ranging from cross-functional groups to teams focused on specific functions. Only a few centralized teams addressed topics beyond foundational CDISC standards, such as SAPs and TFL generation. The survey also highlighted that many organizations, more than 50%, rely on Excel or internally developed tools for managing standards, as commercial metadata repositories (MDRs) are often seen as complex and require long-term investment. These challenges remain consistent across pharma, biotech, and CROs.

Applying CDISC standards to non-interventional studies and real-world data continues to be a challenge due to diverse data sources and scenarios, such as the use of external control arms. This was clear when I had the opportunity to host a CDISC ADaM training at one company in Europe specializing in Observational Studies.

The release of the CDISC Considerations for SDTM Implementation in Observational Studies and Real-World Data v1.0 in February addressed some of the SDTM mapping issues. The previously released PHUSE Data Standards for Non-Interventional Studies covers additional ADaM-related topics, and the FDA Data Standards for Drug and Biological Product Submissions Containing Real-World Data guidance (December 2023) provided some initial recommendations for submitting data packages that include real-world data.

With the release of the first version of the CDISC Dataset-JSON in 2023, the long-awaited alternative to SAS XPT as a standard for data submission took a significant step forward. Following that release, PHUSE, in collaboration with the FDA and CDISC, completed a pilot project in 2024, with a full report made available last June. Last December, CDISC released an updated version (1.1) of the standard. Testers reported some issues, for example when importing from or exporting to SAS, including known differences between software such as SAS and R in numeric precision and date representation, as well as limitations that would require updates to analytics tools used by the FDA (e.g., SAS JMP). Details of the changes implemented or items rejected from the public review can be found here.
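The numeric-precision differences mentioned above are not tied to any one tool; they follow from how decimal values are stored as binary floating point. The sketch below is illustrative only. It uses Python's standard json module and a simplified, hypothetical record layout (not the actual Dataset-JSON schema) to show why a round-trip check between tools is usually done with a tolerance rather than exact equality.

```python
import json
import math

# Hypothetical, simplified records; this is NOT the Dataset-JSON schema.
records = [
    {"USUBJID": "S-001", "AVAL": 0.1 + 0.2},   # stored as 0.30000000000000004
    {"USUBJID": "S-002", "AVAL": 1.0 / 3.0},
]

def round_trip(rows):
    """Serialize rows to JSON text and parse them back."""
    return json.loads(json.dumps(rows))

def values_match(a, b, rel_tol=1e-12):
    """Compare two numeric values with a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)

restored = round_trip(records)

# Within Python, the JSON round trip preserves every float exactly.
exact_round_trip = all(r["AVAL"] == s["AVAL"] for r, s in zip(records, restored))

# A value re-derived elsewhere (here simply typed as 0.3) differs in the last bits,
# so exact equality fails while a tolerance-based comparison succeeds.
naive_equal = restored[0]["AVAL"] == 0.3
tolerant_equal = values_match(restored[0]["AVAL"], 0.3)
```

The tolerance of 1e-12 is an arbitrary choice for this sketch; a real comparison would pick a threshold suited to the data and the tools involved.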

In 2024, open source continued to be a leitmotiv in our industry, with tools and initiatives aimed at regulating the use of open-source solutions in clinical data submission (see also Cytel blog “The Journey into Open Source So Far”). Also, check the progress of industry initiatives such as CDISC COSA.

Furthermore, the Using R to Submit Research to the FDA initiative completed the first part of its final Pilot 4. The three previous pilots focused on demonstrating that SDTM and ADaM packages could be created and submitted to the FDA using R, including packages containing Shiny applications for reviewers to use. This pilot explored the use of novel technologies such as Linux containers and WebAssembly to bundle a Shiny application into a self-contained package. This makes transferring and executing the application smoother and allows agency reviewers to run and evaluate the software without complex setups.

 

New releases and updates from 2024

See below a complete list of new releases and updates that occurred in 2024 with links to individual resources. Do not hesitate to point out any missing items!

 

February

 

March

 

April

 

June

 

September

 

October

 

December

 

Interested in learning more?

Download Angelo Tinazzi’s new ebook, “The Good Data Submission Doctor on Data Submission and Data Integration to the FDA”:

The Journey into Open Source … So Far!

Written by Sebastià Barceló, Malte Stein, and Angelo Tinazzi

Open source has been a leitmotif in our industry for many years now, but its adoption poses a number of challenges. At Cytel, our journey into open source began a couple of years ago. Since then, we have focused on building a dedicated Statistical Computing Environment (SCE), defining new processes, and developing new tools to support these processes. Additionally, we contributed to industry initiatives such as the R package {admiral}.

This year, PHUSE-EU will feature a dedicated stream, Open-Source Technology, where presenters will share their experience with open-source technology adoption. In this spirit of collaboration, we will be contributing with two presentations, both addressing critical aspects:

  • The co-existence of R and SAS in the same SCE
  • The risk assessment of R packages

 

Integrating RStudio POSIT and SAS in the same environment

Our new SCE integrates RStudio POSIT and SAS Grid across both Windows and Linux servers. The integration was designed to create a unified and efficient environment for data analytics, leveraging both SAS and POSIT’s capabilities.

The integration was complex and presented several obstacles and surprises along the way. For instance, we encountered compatibility issues, particularly around data access and permissions. To address these, we implemented a dual-protocol drive, enabling real-time data sharing across platforms, and adopted Git as our version control system, which allows us to maintain and publish content in Connect in a more robust and secure way.

Additional challenges in managing this SCE include balancing security with usability for internal and external access to POSIT Connect and optimizing R package management.

Figure 1 illustrates the final infrastructure.

 

 

R packages risk assessment

Installing and using R packages in the SCE requires assessing the risks associated with using these packages. These packages are typically accessed through CRAN, the primary source for R packages developed by various organizations and individuals. Risk assessment is especially critical in industries like pharmaceuticals, where strong compliance requirements (e.g., GxP) necessitate that packages are well maintained, documented, and, above all, reliable.

A key aspect of the risk assessment is the collection of package metadata, enabling us to classify and assess the reliability of every package we may want to make available in our SCE.

At Cytel, we applied a comprehensive assessment approach by extracting metadata from R packages. We began by evaluating various techniques, such as APIs and web scraping, and compared our approach with the R riskmetric package. This comparison highlighted limitations in conventional methods, which often only address the latest package version. As a result, we enhanced our metadata extraction process.
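To make the idea of package metadata concrete, here is a minimal sketch of reading the fields of an R package DESCRIPTION file, which uses a simple "Field: value" layout with indented continuation lines. This is not the process described above: the sample content, the PackageMetadata fields, and the function names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical DESCRIPTION content for an imaginary package, for illustration only.
SAMPLE_DESCRIPTION = """\
Package: examplepkg
Version: 1.2.0
License: MIT + file LICENSE
Maintainer: Jane Doe <jane@example.org>
Imports: dplyr,
    rlang
BugReports: https://example.org/examplepkg/issues
"""

def parse_description(text):
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            fields[current] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
    return fields

@dataclass
class PackageMetadata:
    """Hypothetical subset of metadata that a risk review might look at."""
    name: str
    version: str
    license: str
    has_bug_tracker: bool
    n_imports: int

def to_metadata(fields):
    """Convert parsed fields into the simplified metadata record."""
    imports = [p.strip() for p in fields.get("Imports", "").split(",") if p.strip()]
    return PackageMetadata(
        name=fields.get("Package", ""),
        version=fields.get("Version", ""),
        license=fields.get("License", ""),
        has_bug_tracker="BugReports" in fields,
        n_imports=len(imports),
    )

meta = to_metadata(parse_description(SAMPLE_DESCRIPTION))
```

Because the parser works on plain text, the same function could in principle be applied to the DESCRIPTION file of any package version, not only the latest one, provided that file is available.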

 

Interested in learning more?

If you are attending the PHUSE-EU in Strasbourg from November 10–13, do not miss Sebastià and Malte’s poster and presentation, where the co-existence of R and SAS and our approach to extracting metadata from R packages will be discussed in more detail:

 

“Bridging Platforms: Integrating RStudio POSIT and SAS Grid in the Same Environment”

Cytel presenters: Sebastià Barceló and Malte Stein

Tuesday, November 12, at 5:30 p.m. (Poster Session – PP28)

 

“Unveiling R Package Risk Assessment: A Comparative Analysis of Metadata Extraction”

Cytel presenters: Malte Stein and Sebastià Barceló

Wednesday, November 13, at 1:30 p.m. (Open-Source Technology Stream – OS14)

 

Angelo Tinazzi will moderate the Scripts, Macros and Automation stream, which will also cover some open-source experiences from other organizations.

 

Cytel will be at Booth #6! We hope to see you there!

Career Perspectives: A Conversation with Guillaume Hervé

In this latest edition of our Career Perspectives series, we had the privilege of interviewing Guillaume Hervé, Director Statistical Programming in PBS. Guillaume shares his journey in statistical programming, highlighting his extensive experience and pivotal roles. He discusses Cytel’s collaborative culture, innovative project management approaches, and the importance of mentorship. Additionally, Guillaume offers insights into the skills essential for success in the field and advice for aspiring statistical programmers.

Can you give us a little background on your career and your professional journey so far?

After completing my master’s degree in biostatistics and multiple internships as a biostatistician, I started my career as a statistical programmer 14 years ago at Novartis in Rueil-Malmaison (near Paris). I was quickly promoted to lead programmer, a position that allowed me to express my full potential as both a programmer and a team lead. During those 8 years, I gained a solid foundation of knowledge and experience in the pharmaceutical industry, especially within biometrics and clinical trial management.

In 2018, Cytel opened their new office in Basel, which is where my journey with Cytel began. I now had the opportunity to evolve in a new environment — the world of CROs. Cytel was expanding, which opened the door for me to consolidate and strengthen my experience as a team leader and provided me with the opportunity to take on the role of operational manager, and later line manager. I currently supervise a team of 20+ programmers across various regions, including Europe and APAC.

What is your role at Cytel?

I’m Director of Statistical Programming for Cytel’s Project-Based Analytical Solutions in Europe. My current role involves line management responsibilities, oversight of projects’ scope management, and development/expansion of the programming group.

Scope management mainly involves ensuring optimal utilization of our programmers across projects, controlling the quality of deliverables, overseeing the financial health of projects, and monitoring the correct implementation of programming processes. I am also actively involved in recruiting and onboarding new team members, establishing company processes, developing standard tools, and supporting department initiatives.

An illustration of such an initiative is the internship program in the programming department I developed in 2021. During the past 3 years, sustainable partnerships with 3 different universities have been built, and each year, for 6 months, we welcome students aiming to discover the role of statistical programmer in the pharmaceutical industry. This program often concludes with the conversion of the internship into a permanent contract, which shows how successful it really is.

What motivated your transition from biostatistics to statistical programming? How has your background in biostatistics influenced your approach to statistical programming?

While I have a background as a biostatistician, I have always enjoyed programming. When I first started working as a statistical programmer, I realized my expertise in biostatistics was an incredible asset, especially for programming complex statistical models. I could fully understand these models and their results, detect potential issues, and easily discuss biostatistics topics such as the management of missing data with biostatisticians. Sometimes, I could even challenge them. To me, being a statistical programmer is the perfect combination of everything I like, and it allows me to play a central role in the analysis of clinical trials.

How have your managers or colleagues at Cytel supported your professional growth since you joined the company? From your perspective, what specific aspects of Cytel’s culture or environment contribute to making it an exceptional place to work?

I have been fortunate to receive close mentorship from my managers since I began my journey at Cytel. It empowered my continuous professional growth. My current manager Nicolas Rouillé (Senior Director Statistical Programming) always looks for opportunities to get me more involved in my role at Cytel. His trust and willingness to share his experience across various fields gave me the confidence to succeed in any challenge I might face. In turn, I strive to apply the same principles with my direct reports, to strengthen the team and the organization as a whole.

At Cytel, we foster a strong team spirit and have numerous experts across all functions. I’m always grateful to work in an environment where, every day, people demonstrate enthusiasm, courage, collaboration, and commitment to achieving a common goal — delivering high-quality results to clients and actively contributing to the improvement of patient care.

Could you discuss Cytel’s integrated project management approach, which aims to synchronize delivery among data managers, biostatisticians, and programmers? How has this approach benefited our clients?

Cytel provides end-to-end biometric solutions, including data management, programming, and biostatistics services. One example of the automation of cross-functional delivery is the implementation of the standard data library and CDASH during the eCRF design/development, and the generation of SDTM template programs. When eCRFs comply with CDASH standards, the corresponding SDTM mapping can be automated. The main benefit is that it enables us to increase our compliance with industry standards and improve the efficiency from data collection to reporting. CDISC compliance for analysis datasets is a key requirement from health authorities at the submission stage, which is why this automation benefits our clients directly.

Another cross-functional automation we developed at Cytel involves a tool that generates template output programs from standard mock shells and metadata. This collaboration between the biostatistics and programming teams has resulted in the production of high-quality deliverables.

Could you provide an example or project that illustrates how we deliver added value for our clients?

Recently, a client asked us to handle health authority questions for one of their Phase III oncology studies. We were contracted for biostatistics and programming services on very short timelines (what we call a rescue study). The scope wasn’t straightforward either, as we had to produce six complex efficacy ADaMs including multiple imputation rules and around 70 unique efficacy outputs presenting different statistical models.

We were able to successfully deliver a high-quality package to the client, on time, and received only minimal comments. Following this, the client informed us that they received a positive CHMP opinion for this submission. They expressed their gratitude for our collaboration and support during the submission process.

What strategies do you employ to ensure the quality and accuracy of deliverables, particularly when working on projects with tight timelines or complex data sets?

My team is composed of individuals with different seniority and experience levels, from junior programmer to associate director. When a complex project with tight timelines arises, my priority is optimal resource assignment based on availability as well as individual experience and knowledge. Sometimes a switch of resources across projects will lead to the best team setup.

When working on the project, we pay a lot of attention to writing specifications and performing programming and biostatistical review of ADaM datasets with a focus on the computational methods of complex derivations. We perform advanced quality controls or cross-checks against other outputs to ensure the accuracy of the results. Any findings related to data, such as missing data, data issues, or specific study data scenarios that can impact study results, are shared with the client before proceeding with the delivery. It’s crucial to be proactive in these cases.

Lastly, the strong collaboration across biometric line functions is essential to delivering quality to clients, especially when timelines are short.

What combination of knowledge, skills, and technical competencies is essential for individuals to succeed as statistical programmers at Cytel? What qualities do you look for when hiring new members for your team?

Obviously, technical skills are incredibly important. We pay a lot of attention to the candidate’s proficiency in statistical programming languages and their experience in clinical data and industry standards. For senior roles, we also dive into their experience as team lead, which can include several topics of interest like resource assignments, quality controls, budget awareness and management, and communication with internal or external stakeholders.

In addition, we also assess the motivation of the candidate and their appetite to learn. This can easily counterbalance a potential lack of technical skills or experience. As hiring manager, I’m also very focused on interpersonal skills and the mindset of the candidate. Skills such as self-organization, proactivity, multi-tasking, and/or strong adaptability are ones I look for.

What advice would you give to aspiring statistical programmers or individuals aiming for roles within the field?

I would advise you to first familiarize yourself with clinical trial fundamentals such as the different phases of clinical trials, study designs (e.g., randomized controlled trials, observational studies), and endpoint definitions. Understanding the clinical trial process is crucial for effective programming. Additionally, studying the regulatory framework surrounding clinical trials, including Good Clinical Practice (GCP) and ICH guidelines, is essential. This knowledge is key for compliance and data integrity.

Then, it’s important to learn a relevant programming language such as SAS or R and gain a solid understanding of biostatistics and the statistical methods commonly used in clinical trials, such as survival analysis, mixed models, and meta-analysis. Acquiring in-depth knowledge of programming standards used in the pharmaceutical industry, such as CDISC standards, would also be a plus.

However, do not forget to develop your soft skills. Good communication skills, team spirit, collaboration, and problem-solving skills are vital in programming roles.

My last piece of advice to candidates is to look for internships or entry-level positions that provide exposure to clinical data analysis or programming. Real-world experience is invaluable.

Lastly, what are your main interests outside of work?

I like spending time with my family. I have two young kids, a nine-year-old and a six-year-old. My wife and I like to visit new places with them, especially European cities. We also like to hike, and since the Basel area is at the intersection of three countries — France, Germany, and Switzerland — we have plenty of good spots to enjoy the fresh air.

I also like spending time in my garden, I play football with my former Novartis colleagues, and regularly go to the gym. I’m turning 40, so staying in shape is becoming a serious objective!

Thank you, Guillaume, for sharing your experience with us!

Career Perspectives: A Conversation with Ludivine De Marans

In this latest edition of our Career Perspectives series, we had the privilege of interviewing Ludivine De Marans, Statistical Programmer, based in our Geneva office. Ludivine shares her journey from statistician in the insurance industry to statistical programmer at Cytel; what is unique about the pharmaceutical sector from the perspective of a statistical programmer; and what key skills and qualities are important for those interested in working in the field.

 

Can you give us a little background on your career and your professional journey so far? What inspired you to pursue a career as a statistician/statistical programmer?

I’ve always had a passion for mathematics, and that’s where it all began. After completing my MSc in Mathematics, Computer Science, and Statistics, I started my career as a Statistician at an insurance company in France. Although my title was “Statistician,” the role combined statistical analysis and programming, whereas at Cytel, and in the pharmaceutical sector generally, these roles tend to be split up. After five years of working within that field, I started working for Cytel as a Statistical Programmer.

 

What prompted your decision to transition into the pharmaceutical industry, and what attracted you to Cytel specifically?

I was looking to relocate from France to Geneva and came across an opportunity at Cytel. Although I wasn’t familiar with the pharmaceutical industry, I had five years of experience with SAS, the same software commonly used in pharma, so I decided to apply.

The shift to the pharmaceutical industry intrigued me because of the meaningful nature of the work — you’re directly involved in developing new therapies that help patients globally. Cytel offered me the chance to work in a new sector and country. It was challenging at first, but it has worked out well.

 

Having transitioned from a role as a statistician in the insurance industry to a statistical programmer in the pharmaceutical industry, what differences have you observed?

There were several differences I didn’t expect. First, the pharmaceutical industry is highly standardized, including all the processes for statistical analysis and programming. In my previous role, I was responsible for both the programming and quality control of my work. Here, we follow a “double programming” method, where another programmer replicates your work to compare data and results. Then, a biostatistician reviews it.

Another key difference is how statistical programmers are viewed. In insurance, statisticians are more of a support function, responding to internal corporate requests. There are fewer colleagues within the company, too. At Cytel, statisticians and statistical programmers are core services for clients, and we work directly with them. It’s a different experience providing a core service to clients, compared to offering internal support within the company. This means there’s more pressure, with fixed deadlines, compared to the more flexible internal timelines I was used to.

 

What has been the biggest challenge you’ve faced since moving to the pharmaceutical industry?

The biggest challenge was learning about the industry itself — its terms, acronyms, and standards, which were all new to me. I wasn’t aware of how standardized and highly regulated the pharmaceutical sector is. Delivering data in a specific, industry-approved format was also a new experience.

In addition, moving from a national company in France to a global organization where English is the main language was challenging. The industry terminology was the most difficult part, and even after more than five years, I’m still learning.

 

Have any colleagues or mentors at Cytel been particularly helpful in helping you adjust to your role?

Yes, my colleagues were incredibly helpful, especially in getting me familiar with the acronyms and industry standards. My managers were always approachable, and I could ask questions whenever needed. I did a lot of self-studying, but knowing I had a supportive team made a big difference.

 

In your opinion, what key skills and qualities are important for statistical programmers? Do these vary between industries, especially in terms of soft and hard skills?

I don’t think the core skills vary much between industries. Of course, technical skills like SAS programming are essential, but there are other important qualities as well. You need to be organized, a logical thinker, and rigorous. These are skills I was able to carry over from my previous job to Cytel, and they’ve been incredibly useful here.

 

Can you describe a project you’ve worked on that you’re especially proud of, and explain why?

One project that stands out involved a clinical trial with numerous outputs to produce and tight deadlines. Thanks to an excellent team and strong collaboration, we managed to deliver everything on time. A few weeks later, we learned that the trial results were positive, and that this molecule could significantly improve many lives. Knowing that my work contributed to something so meaningful made me feel incredibly proud.

 

Do you primarily work from Cytel’s Geneva office or remotely? What’s the balance, and what factors influenced this choice?

I mostly work from the Geneva office, which was especially important when I first started. Being in the office allowed me to meet colleagues and ask questions directly, which helped me adjust. Working from home can sometimes make it harder to connect with people and get immediate answers. I live close to the office, so commuting isn’t an issue, and being in the office helps me maintain a boundary between work and personal life. That said, I do work from home occasionally.

 

Lastly, what are some of your main interests and hobbies outside of work?

I have two small children, so most of my time outside of work is spent with my family, which is what helps me relax the most.

 

Thank you, Ludivine, for sharing your experience with us!

Patient Journey-Centric Study Designs in Clinical Trials

Contract Research Organizations (CROs) play a crucial role in the execution and management of clinical trials. As intermediaries between sponsors and research sites, CROs have a unique opportunity to champion patient journey-centric study designs. By prioritizing patient experience, CROs can enhance trial efficiency, improve data quality, and foster greater patient engagement and retention. Here, we share some key points from our perspective on integrating patient journey-centric study designs into clinical trials.


Visualizing ADaM: A Practical Guide Through Examples

An important component of clinical trial data submission to the FDA is CDISC’s Analysis Data Model, or ADaM, which defines dataset and metadata standards. However, implementation of ADaM is not always straightforward, leading to inaccuracies and inconsistencies.

At this year’s CDISC EU+TMF Interchange in Berlin, I had the opportunity to present “Visualizing ADaM: A Practical Guide Through Examples,” co-authored with Angelo Tinazzi, where I shared how visualizing ADaM provides a guided approach that can address these issues and streamline the process.

Here, we share some of the key takeaways.

 

Why a visual approach to ADaM?

Over the years, the CDISC and CDISC ADaM teams have supplemented the Implementation Guide (IG) with additional documents containing handy ADaM use cases, demonstrating how ADaM can be used to support the most common statistical methods or specific settings such as medical device studies, while maintaining good traceability. Additionally, several CDISC Therapeutic Area User Guides (TAUGs) provide specific analysis examples addressing various ADaM requirements in those particular settings.

Typically, the ADaM datasets development process begins with study documents such as the Statistical Analysis Plan (SAP), accompanied by applicable CDISC ADaM guidance. By leveraging gathered knowledge and support from the company’s data governance structure, which includes tailored templates, guidance, and subject matter experts (SMEs), the statistical programmer initiates the design of ADaM datasets needed to support the analytical outputs outlined in the SAP. Despite continuous efforts to support team members with regular updates, we frequently encounter incorrect or inconsistent implementation of the standard, across multiple clients and therapeutic areas.

To enhance the implementation of ADaM by statistical programmers at Cytel, we have developed visual shells based on our standard SAP table shells. These visual shells incorporate annotations to illustrate the ADaM dataset and variables to be utilized, variables to group or categorize, as well as any filters and additional conditions. These visual shells are accompanied by sample ADaM datasets and corresponding standard specifications tailored for Cytel automation tools.

Furthermore, our development efforts extend to slide sets designed to train programmers through practical examples, ensuring a comprehensive understanding of ADaM’s application.

 

A simple example: A demographics table

The following example uses the standard ADSL dataset, with its standard variables either copied from SDTM or derived in ADaM. The table shell is annotated with the variables to be selected and the rationale for each. For example, because this demographics table is restricted to the safety population, ADSL must contain a variable holding the actual treatment received (here we choose TRT01AN) and a variable to select the applicable subjects (SAFFL). Furthermore, because our standard tools work well with numeric variables, we plan to include SEXN and AGEGR1N alongside the character variables SEX, AGEGR1, etc.

Figure 1: ADSL Dataset

 

ADaM Class: ADSL

Because the table is on the Safety Population subset (SAFFL=Y), the Actual Treatment Variable should be used as column group (e.g., TRT01AN)

Although numeric versions of variables such as SEX and RACE are only permissible (not required), it is good practice to add them (e.g., SEXN and RACEN) to facilitate tool automation and to obtain the desired, non-alphabetic sort order in the outputs.

In addition to AGE (copied from SDTM), which is analyzed with continuous descriptive statistics, age must also be analyzed by category, so the standard ADSL variable AGEGR1 and its numeric version AGEGR1N are added to ADSL.
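
To make these annotations concrete, here is a minimal Python sketch of how such a demographics summary might be computed from ADSL. It is an illustration only, not the Cytel tooling: the records, the numeric codings (SEXN 1=M, 2=F; AGEGR1N 1="<65", 2=">=65"), and the function name `demographics` are assumptions made for the example.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical ADSL extract; all values and numeric codings are invented.
adsl = [
    {"USUBJID": "001", "SAFFL": "Y", "TRT01AN": 1, "AGE": 34,
     "SEX": "F", "SEXN": 2, "AGEGR1": "<65", "AGEGR1N": 1},
    {"USUBJID": "002", "SAFFL": "Y", "TRT01AN": 1, "AGE": 70,
     "SEX": "M", "SEXN": 1, "AGEGR1": ">=65", "AGEGR1N": 2},
    {"USUBJID": "003", "SAFFL": "Y", "TRT01AN": 2, "AGE": 58,
     "SEX": "M", "SEXN": 1, "AGEGR1": "<65", "AGEGR1N": 1},
    {"USUBJID": "004", "SAFFL": "N", "TRT01AN": 2, "AGE": 61,
     "SEX": "F", "SEXN": 2, "AGEGR1": "<65", "AGEGR1N": 1},
]


def demographics(adsl):
    """Summarize the safety population by actual treatment (TRT01AN)."""
    by_trt = defaultdict(list)
    for row in adsl:
        if row["SAFFL"] == "Y":  # population filter
            by_trt[row["TRT01AN"]].append(row)

    summary = {}
    for trt in sorted(by_trt):
        rows = by_trt[trt]
        # Key each category by (numeric code, label) so that sorting
        # follows the numeric variable rather than alphabetic order.
        sex = Counter((r["SEXN"], r["SEX"]) for r in rows)
        agegr = Counter((r["AGEGR1N"], r["AGEGR1"]) for r in rows)
        summary[trt] = {
            "N": len(rows),
            "AGE_MEAN": mean(r["AGE"] for r in rows),
            "SEX": [(label, n) for (_, label), n in sorted(sex.items())],
            "AGEGR1": [(label, n) for (_, label), n in sorted(agegr.items())],
        }
    return summary


demo = demographics(adsl)
```

Sorting on the numeric companion variable is what puts "M" before "F" here, which plain alphabetic sorting on SEX would not do.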

 

Treatment-emergent adverse events table

For the analysis of treatment-emergent adverse events, we discussed two types of outputs.

In the first output, the requirement is to analyze the occurrence of the treatment-emergent adverse events and their incidence using a hierarchical medical dictionary, MedDRA, through which we summarize the occurrences by system organ class and preferred term.

 

Figure 2: ADAE Dataset

ADaM Class: OCCDS / Sub-Class: ADVERSE EVENT

The Safety Population subset (SAFFL=Y) applies, and the Actual Treatment Variable (e.g., TRT01AN) should be used as column group.

ADVERSE EVENT is a sub-class of the OCCDS class; as such, the variables AEBODSYS and AEDECOD become “Required.”

For this type of analysis, only treatment-emergent adverse events (TRTEMFL=Y) should be used.
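
As a rough illustration of these annotations, the sketch below counts subjects with at least one treatment-emergent event by system organ class and preferred term. The ADAE records, the denominators in `big_n` (which would normally come from ADSL), and the function name `teae_incidence` are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical ADAE extract; the events are invented for illustration.
adae = [
    {"USUBJID": "001", "SAFFL": "Y", "TRT01AN": 1, "TRTEMFL": "Y",
     "AEBODSYS": "Nervous system disorders", "AEDECOD": "Headache"},
    {"USUBJID": "001", "SAFFL": "Y", "TRT01AN": 1, "TRTEMFL": "Y",
     "AEBODSYS": "Nervous system disorders", "AEDECOD": "Headache"},
    {"USUBJID": "001", "SAFFL": "Y", "TRT01AN": 1, "TRTEMFL": "",
     "AEBODSYS": "Gastrointestinal disorders", "AEDECOD": "Nausea"},
    {"USUBJID": "002", "SAFFL": "Y", "TRT01AN": 1, "TRTEMFL": "Y",
     "AEBODSYS": "Nervous system disorders", "AEDECOD": "Dizziness"},
    {"USUBJID": "003", "SAFFL": "Y", "TRT01AN": 2, "TRTEMFL": "Y",
     "AEBODSYS": "Gastrointestinal disorders", "AEDECOD": "Nausea"},
]

# Assumed safety-population size per treatment arm (normally from ADSL).
big_n = {1: 2, 2: 1}


def teae_incidence(adae, big_n):
    """Return {(trt, SOC, PT or None): (n subjects, percent)}.

    A key with PT=None holds the system-organ-class total.
    """
    subjects = defaultdict(set)
    for row in adae:
        if row["SAFFL"] == "Y" and row["TRTEMFL"] == "Y":
            trt, soc, pt = row["TRT01AN"], row["AEBODSYS"], row["AEDECOD"]
            # Sets count each subject once per level, however many
            # events the subject had.
            subjects[(trt, soc, None)].add(row["USUBJID"])
            subjects[(trt, soc, pt)].add(row["USUBJID"])
    return {
        key: (len(ids), 100 * len(ids) / big_n[key[0]])
        for key, ids in subjects.items()
    }


teae = teae_incidence(adae, big_n)
```

Subject 001 has two Headache records but is counted once, and the non-emergent Nausea record is excluded by the TRTEMFL filter.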

In the second output, our objective is to provide an overview of the types of adverse events that occurred. This includes determining the number of subjects who experienced at least one adverse event, the number who experienced at least one serious adverse event, identifying the most severe adverse event, and quantifying the subjects who experienced adverse events leading to either treatment or study discontinuation, among other metrics.

We could have used the same ADAE dataset created for the previous output and applied the selections and calculations in the analytical output program. However, to improve traceability and reproducibility (quality control), and to make the ADaM dataset as analysis-ready as possible, we also have the option of creating another ADaM dataset, ADAESUM, derived from ADAE and using a BDS structure. The annotated output shows both versions, with ADAESUM (BDS) and with ADAE (OCCDS).

Again, our example provides detailed explanations and an extract of the ADAESUM dataset.

 

Figure 3: ADAESUM Dataset

Option 1 – ADaM Class: OCCDS / Sub-Class: ADVERSE EVENT

Filters on specific variables, e.g., TRTEMFL, AESER, AEACN.

 

Option 2 – ADaM Class: BDS

The Safety Population subset (SAFFL=Y) applies, and the Actual Treatment Variable (e.g., TRT01AN) should be used as column group.

Each condition needed for the summary output can be represented by a specific PARAMN / PARAMCD / PARAM, with AVAL (AVALC) indicating whether the subject satisfied the condition or, for severity, containing the maximum severity observed among all of the subject’s AEs. PARAMN is also used to display the conditions in the order of the planned table shell.
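
The BDS option can be sketched as follows: one ADAESUM record per subject and condition, derived from ADAE. This is a simplified illustration under stated assumptions. The PARAMCD values, the severity ranking, the use of AESEV, and the AEACN value "DRUG WITHDRAWN" as the discontinuation criterion are choices made for the example, not the specification from the presentation.

```python
# Hypothetical severity ranking and parameter list (assumptions).
SEVERITY = {"MILD": 1, "MODERATE": 2, "SEVERE": 3}
RANK_TO_LABEL = {rank: label for label, rank in SEVERITY.items()}
PARAMS = [
    (1, "ANYTEAE", "Any TEAE"),
    (2, "ANYSAE", "Any serious TEAE"),
    (3, "DISCTRT", "Any TEAE leading to treatment discontinuation"),
    (4, "MAXSEV", "Maximum TEAE severity"),
]

# Invented input records.
adsl = [
    {"USUBJID": "001", "SAFFL": "Y", "TRT01AN": 1},
    {"USUBJID": "002", "SAFFL": "Y", "TRT01AN": 2},
]
adae = [
    {"USUBJID": "001", "TRTEMFL": "Y", "AESER": "N",
     "AESEV": "MILD", "AEACN": "DOSE NOT CHANGED"},
    {"USUBJID": "001", "TRTEMFL": "Y", "AESER": "Y",
     "AESEV": "SEVERE", "AEACN": "DRUG WITHDRAWN"},
]


def derive_adaesum(adsl, adae):
    """Build one BDS-style record per safety subject and parameter."""
    rows = []
    for subj in adsl:
        if subj["SAFFL"] != "Y":
            continue
        teaes = [ae for ae in adae
                 if ae["USUBJID"] == subj["USUBJID"] and ae["TRTEMFL"] == "Y"]
        flags = {
            "ANYTEAE": bool(teaes),
            "ANYSAE": any(ae["AESER"] == "Y" for ae in teaes),
            "DISCTRT": any(ae["AEACN"] == "DRUG WITHDRAWN" for ae in teaes),
        }
        # None when the subject had no treatment-emergent event.
        max_sev = max((SEVERITY[ae["AESEV"]] for ae in teaes), default=None)
        for paramn, paramcd, param in PARAMS:
            if paramcd == "MAXSEV":
                aval, avalc = max_sev, RANK_TO_LABEL.get(max_sev, "")
            else:
                aval = 1 if flags[paramcd] else 0
                avalc = "Y" if flags[paramcd] else "N"
            rows.append({
                "USUBJID": subj["USUBJID"], "TRT01AN": subj["TRT01AN"],
                "PARAMN": paramn, "PARAMCD": paramcd, "PARAM": param,
                "AVAL": aval, "AVALC": avalc,
            })
    return rows


adaesum = derive_adaesum(adsl, adae)
lookup = {(r["USUBJID"], r["PARAMCD"]): r for r in adaesum}
```

With the dataset in this shape, the overview table reduces to counting AVALC="Y" per PARAMCD and treatment, with rows ordered by PARAMN.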

 

Change from baseline with phantom baseline visit

In this last example, I presented a table summarizing the change from baseline at each visit for vital sign parameters. For each visit, we present summary statistics for the actual observed value and the change from baseline. The peculiarity of this example is the baseline, which the SAP defines as the average of the observed values at the screening and day-0 visits.

 

Figure 4: ADVS Dataset

ADaM Class: BDS

Safety Population is used (SAFFL=Y), but results are presented without any split/group by treatment (we do recommend keeping TRT01AN in ADVS).

The Analysis Visit is derived in ADVS (AVISITN / AVISIT). It might be derived from SDTM VISITNUM / VISIT by changing the wording to fulfill SAP requirements or by applying visit windowing. As shown in the example dataset, the baseline visit (AVISITN=0 / AVISIT=Baseline) is a derived record / visit.

AVAL and CHG are used for the observed value and the change from baseline, respectively. CHG is calculated relative to the AVISIT=Baseline record.

Not all records / visits are used in the analysis (ANL01FL=Null), but they are kept in ADVS for traceability.

 

From the above ADVS dataset:

  • Lines 3 and 9 show the derived baseline visit. The DTYPE variable identifies the record as derived and gives the derivation method (AVERAGE). ABLFL is set to “Y.”
  • Lines 1, 2, 7, and 8 will not be used in the analysis (ANL01FL=Null) because they occurred before the phantom derived “baseline” visit, but they are kept in the ADaM dataset for traceability, e.g., to show the records from which the baseline visit was derived.
  • Lines 5 and 11 will not be used in the analysis (ANL01FL=Null) because they were unscheduled visits. However, the records are kept in the ADaM dataset for traceability, so that the reviewer is aware of which information was not used in the analysis. Eventually, if AVISIT/AVISITN are derived using some windowing.
  • All post-baseline records have the change from baseline (CHG) derived, including the unscheduled visits.
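
The logic described in these points can be sketched in a few lines of Python for a single subject and parameter. The input values, the visit names, the parameter code, and the function name `derive_advs` are assumptions for illustration; a real derivation would also handle AVISITN, multiple parameters, and any windowing rules.

```python
from statistics import mean

# Visits that feed the derived ("phantom") baseline, per the SAP rule
# described above. The visit labels themselves are assumptions.
PRE_BASELINE = ("Screening", "Day 0")

# Invented records for one subject and one parameter, in time order.
vs_records = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "VISIT": "Screening", "AVAL": 120},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "VISIT": "Day 0", "AVAL": 124},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "VISIT": "Week 1", "AVAL": 118},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "VISIT": "Unscheduled", "AVAL": 130},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "VISIT": "Week 2", "AVAL": 116},
]


def derive_advs(records):
    """Add a derived baseline record, CHG, and ANL01FL (None = null)."""
    pre = [r for r in records if r["VISIT"] in PRE_BASELINE]
    post = [r for r in records if r["VISIT"] not in PRE_BASELINE]
    base = mean(r["AVAL"] for r in pre)
    ids = {"USUBJID": pre[0]["USUBJID"], "PARAMCD": pre[0]["PARAMCD"]}

    out = []
    # Source records behind the baseline: kept for traceability only.
    for r in pre:
        out.append({**ids, "AVISIT": r["VISIT"], "AVAL": r["AVAL"],
                    "BASE": base, "CHG": None, "ABLFL": None,
                    "DTYPE": None, "ANL01FL": None})
    # Derived baseline record, flagged by DTYPE and ABLFL.
    out.append({**ids, "AVISIT": "Baseline", "AVAL": base, "BASE": base,
                "CHG": None, "ABLFL": "Y", "DTYPE": "AVERAGE",
                "ANL01FL": "Y"})
    # Post-baseline records: CHG is always derived, but unscheduled
    # visits are not flagged for analysis.
    for r in post:
        out.append({**ids, "AVISIT": r["VISIT"], "AVAL": r["AVAL"],
                    "BASE": base, "CHG": r["AVAL"] - base, "ABLFL": None,
                    "DTYPE": None,
                    "ANL01FL": None if r["VISIT"] == "Unscheduled" else "Y"})
    return out


advs = derive_advs(vs_records)
```

For this subject the output has six records, mirroring lines 1 to 6 of the example dataset: two source records, the derived baseline, and three post-baseline records, one of them unscheduled.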

 

Key takeaways

Visualizing ADaM guides the choice of ADaM structure: it helps the programmer select the proper dataset structure, check the need for specific variables (e.g., population flags, treatment variables), and check that the data required for the analysis are present (e.g., collected/derived parameters, hierarchical variables).

The visual approach is highly beneficial to ADaM newcomers. It streamlines ADaM specifications, programming, and the production of standard outputs, leaving more time to focus on non-standard outputs, which are usually more challenging.

Internally, this is another step to improve the Cytel automation tools suite (Lighthouse, ALPS, PRISM) and to move toward a more efficient process.

“Sharing is caring” — I feel this motto well captures my feeling when presenting at conferences. It is always a great experience: sharing what we implemented or how we overcame common challenges allows a good discussion with the attendees.

 

Interested in learning more?

Download Angelo Tinazzi’s new ebook, “The Good Data Submission Doctor on Data Submission and Data Integration to the FDA”:

The Role of Key Opinion Leaders in Rare Disease Clinical Trials

Written by Angela Vinken and Patti Arsenault

Key Opinion Leaders (KOLs), i.e. trusted, well-respected experts, are crucial in clinical research, especially in rare diseases. They have the expertise, network, and experience to add great scientific value to clinical research and can influence and shape the trajectory of clinical research within their professional communities.

The challenge, however, is how to use KOLs wisely. They are often clinicians first and researchers second, and their time is in high demand. Here, we discuss how KOLs can shape clinical trials, best practices for working with KOLs, and key considerations and potential challenges.

Read more »

The TOGETHER Trial Journey: Interview with Ofir Harari

The award-winning TOGETHER Trial was designed with the vision of ensuring that COVID-19 therapies are both effective and accessible to the majority of people, especially in low- and middle-income countries. Members of the TOGETHER Trial, led by Principal Researcher Dr. Edward Mills (Cytel & McMaster), studied existing interventions as possible treatments for COVID-19. The TOGETHER Trial recently won the Society for Clinical Trials David Sackett Trial of the Year Award for 2021.

I interviewed Ofir Harari, Senior Research Principal (Statistics) at Cytel, who passionately worked on the TOGETHER Trial from its inception. Ofir has been working in the field of statistics and data analysis since 2007. His experience includes design and analysis of randomized and cluster-randomized clinical trials, Bayesian adaptive designs, statistical emulation, geospatial analysis, and network meta-analysis. At Cytel, Ofir leads projects in the area of real-world analytics. Prior to joining Cytel, Ofir was a postdoctoral fellow at the University of Toronto and Simon Fraser University. Ofir’s interest and expertise lie in the intersection of statistical methodology and software development.

Read more »

Standards and Open Source Hand-in-Hand: Leveraging Automation to Expedite Drug Market Request Review Process

How do you envision the future of data submission?

Last week, I had the privilege of presenting the topic “Standards and Open-Source Hand-in-Hand: Leveraging Automation to Expedite Drug Market Request Review Process” at PharmaSUG-China in Nanjing. It was an honor to be invited as a keynote speaker to this event.

Read more »

The Evolution of Open-Source Initiatives and New Standards Development for the Data Submission of the Future

In the first part of this post, I discussed the ongoing revolution, or maybe I should say evolution, we are living through with open-source initiatives and new standards development.

A good example to start with is the R-pilot initiative by the R Consortium, which has already … Read more »