risk.assessr: R Package Validation for Regulatory Submission in Pharmaceutical Development
May 7, 2026
In pharmaceutical development, the reliability of statistical software is not a luxury; it is a regulatory requirement. For organizations using R in regulated environments, this means validation must be rigorous. Tools like risk.assessr help users build a practical, data-driven process to meet that requirement.
Validation in R
In pharmaceutical development, validation typically refers to system validation. A system validation should cover all of the following elements:
- Accuracy
- Reproducibility
- Traceability
When assessing the accuracy of R packages, the R Validation Hub differentiates R packages by the following types:
- Base and recommended (core) packages: developed by the R Foundation and shipped with the basic installation; they represent the highest tier of reliability.
- Contributed open-source packages: developed by anyone in the community; they may vary significantly in accuracy and robustness.
Validation using risk.assessr
Recognizing the need for a structured approach to R package validation, we developed the open-source tool risk.assessr. The package takes a risk-based approach: it evaluates the potential risks linked to each R package rather than treating all packages alike.
The assessment considers:
- A package’s complexity and structure
- Unit test coverage
- Traceability
- Documentation quality
- License
- Popularity
- Package activity and maintenance
By extracting these risk-based metrics, risk.assessr allows users to make informed decisions about whether a package is suitable for use in regulated environments or in exploratory analysis.
Key metrics for package validation
Validation metrics are gathered by risk.assessr through specific functions that retrieve desired data. Table 1 lays out some of the key metrics and risk.assessr functions:
Table 1: Key Metrics
Risk analysis using risk.assessr
The power of risk.assessr lies in its risk analysis capabilities, which employ rule-based criteria. These risk criteria can be used to enforce stricter standards, accommodate internal tooling priorities, or meet compliance requirements. Users define threshold values for high, medium, and low risk across the metrics mentioned above or for metrics that they define themselves. These thresholds are stored in inst/config/risk-definition.json, allowing for centralized, version-controlled governance of validation standards.
The get_risk_analysis() function applies these rules to calculate risk ratings, turning raw metrics into results that are easy to read and act on. This approach recognizes that validation requirements vary by organization and use case: acceptable risk for an exploratory analysis differs from the risk tolerance for a regulatory submission.
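The rule-based step can be illustrated with a short sketch. It is written in Python purely as a language-neutral illustration and is not risk.assessr's implementation. The metric names, threshold values, JSON layout, and the `classify` and `analyse` helpers are all hypothetical, and the actual schema of inst/config/risk-definition.json may differ.

```python
import json

# Hypothetical threshold definitions. The real risk-definition.json
# schema used by risk.assessr may look quite different.
RISK_DEFINITION = json.loads("""
{
  "code_coverage": {"direction": "higher_is_better", "low": 0.80, "medium": 0.60},
  "open_issues":   {"direction": "lower_is_better",  "low": 10,   "medium": 50}
}
""")


def classify(metric, value, rules=RISK_DEFINITION):
    """Return 'low', 'medium', or 'high' risk for a single metric value."""
    rule = rules[metric]
    if rule["direction"] == "higher_is_better":
        if value >= rule["low"]:
            return "low"
        if value >= rule["medium"]:
            return "medium"
        return "high"
    # Otherwise smaller values are safer (lower_is_better).
    if value <= rule["low"]:
        return "low"
    if value <= rule["medium"]:
        return "medium"
    return "high"


def analyse(metrics, rules=RISK_DEFINITION):
    """Apply the threshold rules to every metric in a dict of raw values."""
    return {name: classify(name, value, rules) for name, value in metrics.items()}


ratings = analyse({"code_coverage": 0.72, "open_issues": 4})
print(ratings)
```

Keeping the thresholds in a data file rather than in code is what makes them easy to review and version: tightening the standard for a submission environment becomes a change to the configuration, not to the analysis logic.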
Risk analysis: Reporting
risk.assessr generates two complementary reports that serve different audiences and purposes.
The generate_html_report() function produces a detailed report for developers and validation teams. It translates the three-level risk ratings into a three-color visualization: red for high risk, yellow for medium risk, and green for low risk. This visual approach makes the risk assessment immediately apparent and facilitates technical discussions about package suitability.
For validation team sign-off of R packages, write_summary_report() generates a concise one-page summary that gives one of three actionable recommendations: Approved, Rejected, or Remediation Needed. This report, typically generated by a Validation GitHub Action, provides a structured framework for validation teams to apply critical thinking and make final decisions about including a package in submission or other environments.
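How per-metric ratings roll up into a single recommendation is not specified here, so the following Python sketch shows only one possible policy. Every name and rule in it is an assumption rather than risk.assessr's logic: any high-risk metric leads to Rejected, any medium-risk metric to Remediation Needed, and otherwise the package is Approved. The color mapping is the red, yellow, and green scheme used in the HTML report.

```python
# Color mapping as described for the HTML report:
# red = high risk, yellow = medium risk, green = low risk.
RISK_COLORS = {"high": "red", "medium": "yellow", "low": "green"}


def recommend(ratings):
    """Hypothetical roll-up policy; not risk.assessr's actual logic."""
    levels = set(ratings.values())
    if "high" in levels:
        return "Rejected"
    if "medium" in levels:
        return "Remediation Needed"
    return "Approved"


# Example per-metric ratings (illustrative values only).
ratings = {"code_coverage": "medium", "documentation": "low", "license": "low"}

for metric, level in ratings.items():
    print(f"{metric}: {level} ({RISK_COLORS[level]})")
print("Recommendation:", recommend(ratings))
```

Whatever policy is used, the recommendation is a starting point: the validation team still reviews the evidence and makes the final decision.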
Final takeaways
risk.assessr can be a critical component of an easy-to-use, reliable, and detailed validation workflow. Such workflows allow organizations to confidently create validated R environments for submission work, exploratory work, or both. They also help maintain audit trails and compliance documentation.
Edward Gillian
Principal Statistical Programmer
Edward Gillian is Principal Statistical Programmer at Cytel. He has 10 years' experience working with R, seven of them in pharma, and won the Cytel FSP Leadership Award Q3-2025 for his work with risk.assessr. Edward has a PhD in English linguistics and lives in Gorzów Wielkopolski, Poland.
