Trustworthy AI in Action: Predicting Stroke Risk Transparently with Claims-Based Machine Learning
In recent years, deep learning and large neural networks have garnered most of the attention in the machine learning (ML) community. Their ability to model complex, high-dimensional data is indeed impressive. But in healthcare — where decisions can have serious consequences and interpretability is paramount — simpler, transparent models like logistic regression still have an important role to play.
Not every problem requires a black box. When it comes to predicting disease risk using structured data, such as insurance claims, traditional models can offer both accuracy and insight.
Claims databases: An untapped resource for disease risk prediction
Claims databases are an increasingly valuable source of real-world data (RWD). Unlike clinical trial data, which is highly controlled but limited in scale and scope, administrative claims datasets cover millions of lives over multiple years, reflecting real patient behavior and care patterns.
These databases include information on diagnoses, procedures, prescriptions, and demographics — elements that, while lacking granular clinical detail, can still reveal important patterns in disease progression and risk. The scale of these datasets allows for robust statistical modeling, even for rare outcomes.
The case for explainable machine learning in claims-based risk prediction
When working with claims data, models like logistic regression, Lasso, or Ridge regression are not just sufficient — they are often ideal. These models:
- Produce coefficients that quantify the relationship between features and outcomes.
- Allow for transparent understanding of why a prediction was made.
- Are easier to validate and communicate to clinicians, payers, and regulators.
In contrast, deep learning models often deliver slightly higher accuracy at the cost of interpretability — a trade-off that may not be acceptable in regulated healthcare environments.
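To make the point about coefficients concrete, here is a minimal sketch of how a fitted logistic regression can be read. The intercept, coefficient values, and feature names below are hypothetical and chosen only for illustration; they are not taken from any study.

```python
import math

# Hypothetical log-odds coefficients for illustration only (not from any study).
INTERCEPT = -5.0
COEFFICIENTS = {
    "age_per_10_years": 0.50,
    "hypertension": 0.40,
    "diabetes": 0.30,
}


def odds_ratios(coefs):
    """Exponentiate each log-odds coefficient to obtain an odds ratio."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


def predicted_risk(intercept, coefs, features):
    """Apply the logistic function to the linear predictor."""
    linear = intercept + sum(
        beta * features.get(name, 0.0) for name, beta in coefs.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical patient: 70 years old, hypertensive, no diabetes.
patient = {"age_per_10_years": 7.0, "hypertension": 1.0, "diabetes": 0.0}

for name, ratio in odds_ratios(COEFFICIENTS).items():
    print(f"{name}: odds ratio {ratio:.2f}")
print(f"predicted risk: {predicted_risk(INTERCEPT, COEFFICIENTS, patient):.3f}")
```

Each odds ratio states how the odds of the outcome change per unit increase in one feature with the others held fixed, which is the kind of statement a clinician or regulator can check directly.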
A real-world example: Predicting stroke risk with claims data
In a recent study, Cytel used data from over 2.5 million insured individuals to predict the risk of stroke hospitalization. Using only claims-based features such as age, medication use, comorbidities (e.g., diabetes, hypertension), and health service utilization, we compared the performance of several models, including:
- Logistic Regression
- Regularized linear models (Lasso and Ridge)
- XGBoost (a state-of-the-art ML algorithm)
The results? All models achieved similar predictive performance, with area under the ROC curve (AUC) values around 0.81. Logistic regression — simple, explainable, and well-established — performed on par with XGBoost, demonstrating that advanced complexity wasn’t necessary to achieve meaningful predictive power.
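For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen patient who had the event receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it directly from that definition; the labels and scores are made-up toy values, not study data, and the pairwise loop is meant for clarity rather than for datasets with millions of rows.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: four patients, two with the event (label 1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(f"AUC: {roc_auc(toy_labels, toy_scores):.2f}")
```

Because the metric depends only on how patients are ranked, two models with very different internals can reach the same AUC if they order patients similarly.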
Transparency enables trust and action
What sets models like logistic regression apart is their explainability. Stakeholders can see precisely how risk factors like atrial fibrillation, hypercholesterolemia, or age contribute to predicted stroke risk. This level of clarity is essential not only for clinicians making decisions, but also for data governance, compliance, and patient communication.
In a time when “black box” AI models are under increasing scrutiny, explainable models offer a pragmatic path forward — especially when paired with large-scale real-world datasets like claims data.
Keep it simple, keep it transparent
Healthcare doesn’t just need powerful algorithms — it needs trustworthy ones. As our study shows, standard machine learning models remain highly relevant, especially when applied to well-structured real-world data. Claims databases, in particular, offer a rich foundation for developing these models and making preventive healthcare smarter, earlier, and more accessible.
FDA OCE Project Frontrunner: Accelerating First-Line Oncology Drug Development
The U.S. Food and Drug Administration’s Oncology Center of Excellence (OCE) launched Project Frontrunner to shift the paradigm in oncology drug development. Traditionally, novel oncology drugs gain approval for use in patients with later-stage disease who have exhausted other treatment options. Project Frontrunner challenges this model by encouraging sponsors to pursue initial drug approvals in the earliest feasible disease setting, particularly first-line or treatment-naïve populations.
The conventional late-line strategy for oncology drug development offers fewer regulatory hurdles and facilitates faster enrollment. However, it delays access to potentially life-extending or curative therapies for patients with early-stage disease. Moreover, the biology of tumors in heavily pretreated patients often differs significantly from earlier stages, limiting generalizability. Project Frontrunner seeks to reverse this trend, thereby aligning trial design with patient-centric outcomes.
Here, I discuss the key elements of Project Frontrunner, the statistical complexities of first-line trial design, and the potential impact on sponsors.
Key elements of Project Frontrunner
- First-line indication targeting: Encourages drug developers to pursue marketing applications based on trials in treatment-naïve populations, not just refractory or relapsed disease settings.
- Regulatory support and early engagement: The FDA offers early scientific engagement with sponsors through Type B and Type C meetings, helping optimize development plans for first-line indications.
- Use of randomized controlled trials (RCTs): Promotes the use of RCTs in early-stage disease rather than single-arm studies in late-stage patients, aiming for more robust and generalizable evidence.
- Expedited programs compatibility: Supports use of breakthrough therapy designation, priority review, and accelerated approval, even when targeting earlier lines of therapy.
Practical implications for trialists
- Trial design complexity: Sponsors must design larger, more rigorous trials, often needing comparator arms, which may increase cost and duration but improve scientific robustness.
- Patient recruitment considerations: Recruiting treatment-naïve patients can be more competitive and ethically challenging, requiring careful protocol development and site coordination.
- Strategic endpoint selection: Trialists must select endpoints that reflect long-term clinical benefit (e.g., progression-free survival, overall survival), rather than short-term surrogate markers typically used in late-line settings.
Statistical complexities in first-line trial design
Designing oncology trials for first-line indications — as encouraged by Project Frontrunner — brings increased statistical and methodological complexity compared to traditional late-line trials. The rigor demanded by earlier-stage settings requires careful planning to ensure validity, power, and regulatory acceptability.
Randomized comparators and control integrity
Trials typically require active control arms rather than historical controls. Selecting an appropriate standard-of-care comparator and maintaining blinding (where feasible) becomes essential to minimize bias and strengthen inference.
Longer time horizons for endpoints
In first-line disease, progression-free survival (PFS) and overall survival (OS) require longer follow-up, increasing risk of loss to follow-up and requiring more robust methods for censoring and handling missing data.
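As an illustration of how censoring enters a time-to-event analysis, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. The follow-up times and event flags are invented for the example; a real trial analysis would use a validated statistical package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events: 1 = event observed (e.g., progression or death), 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Censored patients leave the risk set without changing the estimate.
        at_risk -= n_removed
    return curve


# Hypothetical follow-up in months; 0 marks a patient lost to follow-up.
months = [1, 2, 3, 4, 5]
observed = [1, 0, 1, 0, 1]
for t, s in kaplan_meier(months, observed):
    print(f"month {t}: estimated survival {s:.3f}")
```

Censored patients contribute to the risk set up to their last observed time, which is why heavy loss to follow-up shrinks the effective sample late in the curve and widens the uncertainty around long-term estimates.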
Multiplicity adjustments and hierarchical testing
With multiple endpoints — such as PFS, OS, objective response rate (ORR), and quality of life — multiplicity becomes a critical issue. Sponsors may need hierarchical testing strategies or gatekeeping procedures to control Type I error.
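One simple hierarchical strategy is fixed-sequence testing: endpoints are tested in a pre-specified order, each at the full significance level, and testing stops at the first endpoint that fails. The sketch below assumes a hypothetical endpoint order and hypothetical p-values; it is one of several possible gatekeeping approaches, not a recommendation for any particular trial.

```python
def fixed_sequence_test(ordered_p_values, alpha=0.05):
    """Test endpoints in their pre-specified order at the full alpha.

    ordered_p_values: list of (endpoint_name, p_value) in testing order.
    Once one endpoint fails, no later endpoint can be declared significant.
    """
    results = {}
    gate_open = True
    for endpoint, p_value in ordered_p_values:
        rejected = gate_open and p_value <= alpha
        results[endpoint] = rejected
        if not rejected:
            gate_open = False
    return results


# Hypothetical hierarchy and p-values for illustration only.
hierarchy = [("PFS", 0.001), ("OS", 0.030), ("ORR", 0.200), ("QoL", 0.010)]
for endpoint, significant in fixed_sequence_test(hierarchy).items():
    print(f"{endpoint}: {'significant' if significant else 'not significant'}")
```

In this toy run the quality-of-life endpoint has a small p-value but still cannot be claimed, because the endpoint ahead of it in the hierarchy failed. That is the price of controlling Type I error without splitting alpha, and it is why the ordering must be fixed before unblinding.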
Interim analysis and adaptive design considerations
Sponsors may wish to incorporate group-sequential designs or adaptive features (e.g., sample size re-estimation), but these add statistical complexity and must be pre-specified with strong rationale to be acceptable to regulators.
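To give a sense of what pre-specification involves in a group-sequential design, the sketch below evaluates the Lan-DeMets O'Brien-Fleming-type alpha spending function, which determines how much of the total Type I error may be used up by each interim look. The look schedule is hypothetical, and the function returns cumulative alpha spent rather than the stopping boundaries themselves, which require numerical integration in dedicated software.

```python
from statistics import NormalDist


def obf_alpha_spent(information_fraction, total_alpha=0.05):
    """Cumulative two-sided alpha spent at information fraction t (0 < t <= 1).

    Lan-DeMets O'Brien-Fleming-type spending function:
        alpha(t) = 2 * (1 - Phi(z_{1 - alpha/2} / sqrt(t)))
    """
    if not 0.0 < information_fraction <= 1.0:
        raise ValueError("information fraction must be in (0, 1]")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - total_alpha / 2.0)
    return 2.0 * (1.0 - std_normal.cdf(z / information_fraction ** 0.5))


# Hypothetical schedule: two interim looks and a final analysis.
for t in (0.33, 0.67, 1.0):
    print(f"information fraction {t:.2f}: cumulative alpha spent {obf_alpha_spent(t):.5f}")
```

This spending function releases very little alpha at early looks and reserves most of it for the final analysis, one reason conservative early boundaries of this type are a common choice in confirmatory trials.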
Subgroup analyses and biomarker stratification
Treatment-naïve populations may be heterogeneous. Stratification by biomarkers or disease subtype may be necessary, but raises statistical power concerns and increases the risk of false discovery if not pre-specified and adjusted.
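The false-discovery concern can be quantified with simple arithmetic. Assuming independent tests each run at the same nominal level (a simplification, since real subgroups usually overlap or correlate), the chance of at least one false positive grows quickly with the number of subgroups examined:

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical numbers of unadjusted subgroup analyses.
for k in (1, 5, 10, 20):
    print(
        f"{k:>2} subgroups: family-wise error {familywise_error_rate(k):.3f}, "
        f"Bonferroni per-test level {bonferroni_threshold(k):.4f}"
    )
```

The Bonferroni column also shows the cost of adjustment: each additional subgroup lowers the per-test threshold, which is the power concern raised above.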
Likely impact on sponsors
Project Frontrunner presents both opportunities and challenges for drug developers aiming to target earlier lines of oncology treatment. Below are key advantages and disadvantages for sponsors engaging with this program:
Advantages
- Market leadership and differentiation: Gaining approval for a first-line indication can position a therapy as the standard of care, offering strategic advantage over drugs only approved for late-line use.
- Extended commercial exclusivity: Earlier approval typically translates into longer duration of market exclusivity, enhancing revenue potential before generics or biosimilars enter the market.
- Clinical value and branding: Drugs proven effective in first-line settings may be perceived as more effective and versatile, strengthening the sponsor’s brand and clinical reputation across stakeholders, including physicians and payers.
Disadvantages
- Higher development costs and risk: Trials in earlier-stage patients typically require larger sample sizes, randomized designs, and longer follow-up, increasing overall trial costs and investment risk.
- Increased regulatory scrutiny: Early-line trials are subject to higher evidentiary standards, with greater emphasis on demonstrating long-term clinical benefit (e.g., overall survival), making approval more difficult.
- Competitive recruitment environment: Enrolling treatment-naïve patients is often slower and more competitive, as these patients may have multiple treatment options and may be hesitant to join experimental arms.
Final thoughts
Project Frontrunner represents a bold step by the FDA to reshape oncology drug development. While it demands more rigorous trial designs and greater investment from sponsors, it aligns closely with patient-centric goals: bringing promising therapies to those who need them most, earlier in their disease journey. For sponsors willing to embrace these challenges, the program offers a chance to lead in an increasingly competitive oncology landscape.
James Matcham, VP Strategic Consulting, and Pranav Yajnik, Senior Consultant, will be hosting a Cytel webinar on August 20, 2025, where they will provide an overview of Project Frontrunner and its implications for oncology drug development. They will also explore, using a case study, how innovative trial design strategies can lead to faster, more robust pathways to market for oncology therapies.
Blending Power and Flexibility: How AI-Generated R Code is Reshaping Clinical Trial Design
In today’s fast-evolving clinical research landscape, designing robust and efficient trials is more critical than ever. As statistical designs grow in sophistication, biostatisticians are increasingly relying on both commercial platforms and open-source tools to meet unique modeling needs. But this hybrid approach also comes with challenges, particularly for those new to advanced simulation software or lacking programming experience.
At Cytel, we’ve been exploring how artificial intelligence (AI) can help bridge this gap. At the 2025 Joint Statistical Meetings (JSM), we will present on our latest innovation: AI-powered R code generation for clinical trial design, a feature embedded in our East Horizon™ platform. This assistant, called RCACTS (R Coding Assistant for Clinical Trial Simulation), represents a significant step forward in making custom trial design faster, more accessible, and more reliable.
Why talk about this now? The open-source imperative
While commercial clinical trial design software offers rapid design development through validated and user-friendly workflows, it doesn’t always address the full complexity of real-world problems. Trial statisticians often face challenges in areas such as oncology, rare diseases, and adaptive designs that require tailored statistical tests, unique outcome generation models, or alternative randomization techniques.
This is where open-source tools like R become invaluable. R allows statisticians to write custom code to simulate complex trial designs, perform Bayesian analyses, or integrate evolving regulatory guidance. Over the years, a vibrant ecosystem of R packages has emerged, offering a high degree of transparency, flexibility, and academic rigor.
Yet this flexibility comes with trade-offs: code development can be time-consuming, error-prone, and requires significant programming expertise. As a result, many biostatisticians find themselves switching between validated commercial workflows and custom R functions, leading to a process that is often fragmented and inefficient.
Recognizing this, Cytel’s East Horizon platform has introduced R integration points, enabling users to inject custom code directly into validated simulation workflows. This integration delivers the best of both worlds: the speed and structure of commercial software with the creativity and control of open-source.
Enter AI: Speed, simplicity, and smarter coding
Our next logical question was: can AI make this process even easier?
The answer, increasingly, is yes. With recent advances in generative AI, particularly large language models (LLMs), it’s now possible to assist in the generation of R code for simulation-based design tasks. At Cytel, we’ve harnessed OpenAI’s GPT-4o via API, securely deployed within Microsoft Azure, to create RCACTS, a coding assistant purpose-built for biostatisticians.
Unlike generic AI tools that produce standalone R scripts, RCACTS generates R code specifically tailored for the East Horizon simulation engine. It ensures that the generated functions:
- Match expected input/output structures,
- Include pre-defined parameters as shown in our internal statistical package CyneRgy,
- Are immediately ready for integration and testing within a live trial design workflow.
With RCACTS, users can simply describe what they want in plain English and receive functioning R code that can be integrated into East Horizon.
Who benefits? Everyone from newcomers to experts
One of the major advantages of this AI-enhanced workflow is lowering the barrier to entry. For a new user unfamiliar with Cytel’s R integration or syntax requirements, writing compatible code from scratch can be daunting. RCACTS significantly reduces the learning curve by providing validated function templates, sensible defaults, and clear parameterization, all supported by generative AI.
At the same time, experienced statisticians benefit by spending less time on repetitive coding tasks, debugging, or remembering function signatures. This allows them to focus on higher-level design questions, such as: What analysis method is most robust? How sensitive is the design to different outcome distributions? What dropout patterns pose the greatest risk?
Our assistant supports a wide range of trial design elements:
- Simulating patient responses: Binary, Continuous, Time-to-event, and Repeated-measure endpoints.
- Analyzing simulated data: Statistical analysis for these endpoints.
- Randomization: Flexible randomization of patients across treatment groups.
- Enrollment and dropout modeling: Custom mechanisms for realistic patient enrollment and dropout scenarios.
- Treatment selection: Supporting multi-arm multi-stage (MAMS) trial designs.
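To illustrate the kind of building blocks listed above, here is a small, generic sketch of permuted-block randomization followed by simulation of a binary endpoint. It is written in Python purely for illustration and is not the East Horizon or CyneRgy interface; the function names, arm labels, block size, and response rates are all assumptions made for this example.

```python
import random


def permuted_block_randomization(n_patients, arms=("control", "treatment"),
                                 block_size=4, seed=0):
    """Assign patients to arms in shuffled blocks so allocation stays balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_patients:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]


def simulate_binary_responses(assignments, response_rates, seed=0):
    """Draw a 0/1 response for each patient from the rate of their assigned arm."""
    rng = random.Random(seed)
    return [1 if rng.random() < response_rates[arm] else 0 for arm in assignments]


# Hypothetical design: 40 patients, 30% vs 50% true response rates.
arms_assigned = permuted_block_randomization(40)
responses = simulate_binary_responses(
    arms_assigned, {"control": 0.30, "treatment": 0.50}
)
for arm in ("control", "treatment"):
    arm_responses = [r for a, r in zip(arms_assigned, responses) if a == arm]
    print(f"{arm}: n={len(arm_responses)}, responders={sum(arm_responses)}")
```

Fixing the seed makes a simulated trial reproducible, which matters when generated code has to be reviewed and re-run during validation.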
Balancing innovation with responsibility
Of course, as with any AI solution, there are caveats. AI-generated code must be carefully reviewed for correctness, appropriateness, and regulatory readiness. RCACTS includes built-in testing functionality to flag structural or syntactic errors, but statistical validation remains the user’s responsibility. Also note that all data interactions adhere to Azure OpenAI’s stringent data protection policies to ensure security and compliance.
There’s also a broader concern: will over-reliance on AI limit the creativity and deep statistical thinking that define our profession? At Cytel, we view AI not as a replacement for expertise, but as a tool to amplify it. Our goal is to give statisticians more time and mental space to explore, iterate, and innovate rather than reduce them to prompt engineers.
Looking ahead
The future of clinical trial design lies in intelligent integration: combining the strengths of validated commercial tools, flexible open-source frameworks, and AI-powered coding assistance. With East Horizon and RCACTS, we believe we’re building the blueprint for this future, with a platform that supports both scientific rigor and operational speed.
As the field continues to evolve, biostatisticians will need tools that not only keep up with complexity but also support creativity, collaboration, and efficiency. AI-generated R code, embedded within a powerful simulation engine, is one such tool and is already transforming how we approach design flexibility in clinical trials.
Catch us at JSM 2025 to learn more about how AI is transforming the future of clinical trial design within Cytel.
Career Perspectives: A Conversation with Camila Pazos
In this edition of our Career Perspectives series, we had the pleasure of speaking with Camila Pazos, Senior Director, Business Development at Cytel. Across a career spanning oncology research and business development, Camila’s drive has remained the same: to help bring scientific advances closer to the people who need them most.
In this interview, Camila discusses her career trajectory, industry trends, and how she expertly bridges scientific expertise with strategic thinking.
Can you give us a little background on your career so far?
My curiosity for science sparked early. I was that kid who never stopped asking questions. I vividly remember being 12, bursting into the kitchen during breakfast, absolutely thrilled to share that the human genome had just been fully decoded. Even at that age, it felt monumental.
I always knew I wanted to work in healthcare ― it felt like a calling. For a while, I considered going into medicine. But after going through a very thorough career orientation program, I realized something crucial: I didn’t just want to diagnose problems, I wanted to discover solutions. That insight led me to research, where I quickly felt at home.
Reproductive medicine and oncology have always been my two main interests. So, when I had the chance to pursue a PhD in gynecological cancer research after graduating as a molecular biologist and geneticist, it felt like destiny. During my PhD, I realized that while I loved scientific discovery, I was equally passionate about ensuring those discoveries reached people.
What truly motivated me was the desire to see those discoveries make a real-world impact. I began to realize that even the most groundbreaking research would remain abstract unless it could be effectively translated, packaged, and delivered to the patients who needed it. That’s what ultimately led me to transition from academia into the pharmaceutical industry.
Stepping into market access gave me a whole new perspective. It was the perfect intersection of science, strategy, and communication. I found myself negotiating with payers, developing launch plans, and articulating the value of therapies ― all while staying rooted in evidence and outcomes. In many ways, I was already operating at the edge of commercial strategy.
Over time, I naturally transitioned into more commercial roles, which allowed me to bridge scientific depth with strategic business thinking. Today, I serve as Senior Director of Business Development at Cytel, working across trial design, statistical programming, RWE, HEOR, and market access. This shift significantly expanded my impact. Instead of focusing on a single product or pipeline, I could now support multiple pharmaceutical companies across their R&D and commercialization strategies. It’s been incredibly fulfilling to operate at this intersection where science meets strategic execution.
What do you like best about your role, and about working at Cytel?
What energizes me most about my role is its inherently interdisciplinary nature. Every day, I collaborate with biostatisticians, clinicians, RWE experts, health economists, and market access strategists to solve some of the most complex challenges in drug development. That diversity of thought and expertise keeps me constantly learning and growing. I truly thrive in that intellectually rich, collaborative environment.
What sets Cytel apart — and what drew me to the company in the first place — is its unwavering commitment to scientific rigor. The depth, integrity, and thought leadership our teams bring to every project is inspiring. Cytel’s mission to accelerate innovation and improve decision-making in healthcare aligns closely with my own values. I’ve always believed in bringing science closer to the patient, and here, I get to do that in a meaningful and scalable way.
I also love the unique vantage point this role gives me. Working with multiple pharmaceutical companies across different therapeutic areas gives me a broad view of industry trends and emerging shifts. It’s fascinating to get that big-picture perspective and be able to contribute to shaping strategies that drive both innovation and access.
You were recently promoted to Senior Director of Business Development. Congratulations! What has this new role added to your perspective on Cytel’s mission and your own career goals?
Thank you! Through this role, I have gained a deeper appreciation for Cytel’s mission to deliver quantitative insights throughout drug development and commercialization, whether through advanced analytics, trial design, or market access. I’m now more involved in shaping how we position these offerings globally, aligning our commercial strategy and innovation priorities.
To me, this function marks a meaningful step toward leadership and strategic influence, one that supports my own growth and gives me the opportunity to mentor colleagues and contribute to shaping Cytel’s future direction.
You’re known for bridging scientific expertise with strategic thinking. How do you approach aligning technical depth with commercial impact?
I start by ensuring I truly understand the scientific problem. Is it a statistical challenge or a real-world data nuance? From there, I contextualize it in terms of commercial strategy by asking myself the following questions: “How will this data shape payer decisions? What will clinicians need to see?”
This helps me build frameworks that speak to both scientific rigor and business outcomes. Additionally, I cultivate strong scientific partnerships with internal experts, so commercial proposals are not just technically sound but tightly aligned with market needs.
What, according to you, are some of the current pain points of the industry? Are there any trends in the clinical trial industry you’ve noticed coming up?
One of the biggest challenges the industry faces today is the growing complexity and scrutiny of evidence generation. Demonstrating efficacy is no longer sufficient. You also need to show real-world effectiveness, cost-effectiveness, comparative value, and alignment with patient-centered outcomes ― all in a way that satisfies regulators, payers, and patients alike.
Integrating real-world evidence, adaptive designs, and payer-relevant data into a unified evidence strategy is becoming essential but also incredibly demanding. In Europe, the introduction of the EU Joint Clinical Assessment (EU JCA) is increasing pressure for earlier harmonization of clinical and economic evidence in development. This presents logistical and strategic challenges due to still-varying national HTA requirements.
In clinical trial design specifically, there’s a real push to develop more adaptive and simulation-based designs that enable earlier go/no-go decisions and optimize resource use. These designs can significantly reduce time and cost, but only when executed with strong statistical and operational planning.
Another area of increasing importance is the role of Data Monitoring Committees (DMCs). As trials become more complex and high-stakes, DMCs must remain fully independent to ensure scientific integrity and patient safety. Maintaining that independence while integrating their insights into real-time development decisions is a delicate balance, and a growing operational challenge for sponsors.
We’re also seeing a sharp increase in the demand for health equity and diversity in clinical evidence. Trials must reflect real-world populations across socioeconomic, racial, and geographic dimensions — not only as a moral imperative, but because it impacts access, reimbursement, and long-term outcomes.
In terms of emerging trends, I’ve noticed:
- AI and machine learning are increasingly being explored for trial optimization, evidence synthesis, and predictive analytics, though regulatory frameworks are still evolving.
- The growing emphasis on early alignment between regulatory and payer needs, particularly with the rise of conditional approvals and accelerated pathways.
- And finally, the move toward a more patient-centered model, which is reshaping everything from endpoint selection to economic modeling and value demonstration.
You’ve worked across RWE, HEOR, market access, and now clinical trial design, statistical programming, Data Monitoring Committees and biostatistics. How do you stay on top of these trends?
Staying up to date in this industry requires a mindset of continuous learning and intellectual curiosity. I regularly read peer-reviewed journals, follow regulatory updates, and attend key conferences and webinars covering the latest developments in HEOR & RWE, biostatistics, clinical trial innovation, and evidence synthesis.
At Cytel, I benefit greatly from our internal training programs and knowledge-sharing platforms. But most importantly, I get to collaborate closely with brilliant experts across the organization who constantly expand my understanding and challenge my thinking.
I also try to approach every conversation and every project with a humble attitude. No one can be an expert in everything. I’ve learned that asking questions, listening deeply, and remaining open to new perspectives are some of the most valuable tools we have to stay sharp and relevant.
Ultimately, having worked across different domains gives me a cross-functional lens, from early development to market access, but I never stop learning from the people around me.
You’ve studied and worked in Argentina, Chile, the United States, and Spain. How has this international experience shaped your approach to business and leadership?
Living and working across Argentina, Chile, the U.S., and Spain has had a profound impact on how I approach both business and professional relationships. It taught me to navigate diverse cultural perspectives, adapt to different communication styles, and understand the intricacies of various healthcare systems.
This international experience has made me more flexible, empathetic, and context-aware. I’ve learned to listen actively, respect local nuances, and tailor my communication and strategies to align with varying decision-making processes and organizational dynamics. Whether I’m engaging with a global pharma client or collaborating with an internal team, I aim to foster inclusive, respectful relationships where diverse perspectives are genuinely valued.
Ultimately, understanding the subtleties of different markets and work cultures allows me to build stronger, more effective collaborations. It’s not just about speaking the same language — it’s about understanding the context behind the words. That’s what truly drives meaningful connections and successful outcomes in a global industry.
As a remote employee, how do you maintain a healthy work-life balance? What strategies work for you, and do you feel supported by Cytel in this regard?
Remote work offers incredible flexibility, especially as a parent of two young children with their own routines and needs. But it also comes with real challenges, particularly when working across global time zones. I’ll admit that I sometimes struggle to set strict boundaries. It’s a skill I’ve had to consciously develop. I want to be available for both clients and my family, which can make it hard to truly disconnect.
One of the most difficult aspects of remote work is not being physically close to your colleagues. You miss those impromptu hallway conversations, quick brainstorms, or the organic moments of connection that happen in an office. So, we’ve found ways to stay meaningfully connected through structured check-ins, virtual coffee chats, and internal knowledge-sharing sessions that help maintain a strong sense of collaboration and cohesion, even from a distance.
I truly value the benefits of working from home and wouldn’t go back to a full-time office setup. The flexibility it provides — and the time saved from commuting — has had a huge impact on how I manage my time and energy. I’ve happily exchanged commuting hours for morning workouts, and being EU-based, I can use the first part of the day for deep focus work. This quiet window allows me to clear through incoming requests from the U.S. overnight, and handle Asia-Pacific communications efficiently. I also try to build in a midday break to recharge, before shifting into U.S. overlap hours, which are typically filled with client meetings and team calls.
I genuinely believe that remote work, when well-structured and supported as it is at Cytel, enables people to be more focused, effective, and productive.
What are your main interests outside of work?
I’m passionate about yoga and meditation! They help me stay grounded and cultivate a daily practice of gratitude, which is essential for mindfulness and balance. I also enjoy painting, a passion inspired by my grandmother, who had a background in fine arts and introduced me to painting on canvas with oils and acrylics, as well as drawing, ceramics, and craftsmanship in general.
Spending quality time with my kids is a huge part of my life outside of work. Whether it’s a day at the beach building sandcastles and swimming during the summer or enjoying cozy indoor playtime when it’s cold outside, those moments recharge me and remind me of what matters most.
Finally, what’s one piece of career advice you wish you had received earlier?
It’s okay to pivot in your career. Your PhD or technical training doesn’t confine you to one path. The skills you develop — analytical thinking, problem-solving, and curiosity — are transferable. Always embrace exploration and be proactive about learning new things. Don’t underestimate the power of soft skills like communication, collaboration, and leadership. They often open more doors than technical skills alone do.
Also, if you’re a woman in science or an immigrant, be prepared to face additional challenges. You’ll likely need to consistently prove your expertise in environments that may have unconscious biases. Adapting to new cultures and building networks from scratch can be tough. But remember, these challenges build resilience and help you develop a strong professional voice. Use those experiences to advocate for yourself and others, and to create more inclusive spaces in your field.
Thank you, Camila, for sharing your experience!

Leveraging Mobile and Wearable Technology for Outcomes Research in Depression
As mobile and wearable technologies become increasingly integrated into daily life, their applications have expanded far beyond convenience and lifestyle. In the field of outcomes research — particularly within mental health — these technologies are opening new frontiers for understanding and monitoring clinical endpoints. A notable case is depression, where continuous digital monitoring can provide rich insights into both the course of illness and treatment impact.
This post draws on our findings from a recent systematic review and poster presentation to examine how mobile and wearable tools are currently deployed in depression monitoring and how this aligns with broader outcomes research goals.
Digital monitoring as a tool for mental health outcomes
Over the past five to six years, rates of depression have risen markedly among both youth and adults worldwide, underscoring the need for scalable and effective monitoring strategies. In parallel, smartphones and wearables have become ubiquitous and can capture passive, longitudinal health data. For outcomes research, these tools can supply real-time behavioral and physiological markers relevant to depression.
To map the current landscape, we conducted a comprehensive literature review focused on how smartphones and wearables are used to monitor depression in research contexts. This synthesis aimed to highlight prevailing methods, feature usage, and the extent to which demographic variability is accounted for — critical considerations in health outcomes analysis.
Key findings from the literature
We reviewed 140 studies and identified 22 that met our inclusion criteria. The following themes emerged:
Study characteristics
- Recency: Most studies were published in 2024, reflecting the field’s rapid acceleration.
- Geography: The U.S. and Pakistan emerged as leading contributors.
- Sample size: Studies included an average of 465 participants, suggesting moderately powered observational designs.
Demographic reporting
- Gender and age: Captured in 20 of the 22 studies.
- Ethnicity: Reported in just 9 studies.
- Education and marital status: Only 4 studies reported these variables — yet both are key social determinants of health and influence depression outcomes.
Monitoring technologies and features
- Smartphones were used in 20 of the 22 studies, highlighting their dominance.
- Key features monitored included:
  - Mood tracking: 20 studies
  - Movement (accelerometer data): 10 studies
  - Heart rate variability (HRV): 5 studies
  - Word usage tracking: 4 studies
  - Sleep patterns: 2 studies
Clinical assessment tools
Self-reported clinical scales were commonly used as outcome anchors:
- PHQ-9 (Patient Health Questionnaire-9): 6 studies
- GAD-7 (Generalized Anxiety Disorder-7): 7 studies
(See our original poster for a visual breakdown of these features and tools.)
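To make the counts above easier to compare, the short sketch below converts them into shares of the 22 included studies, plus the share of the 140 reviewed studies that met the inclusion criteria. The counts come directly from this post; the names `INCLUDED_STUDIES`, `REVIEWED_STUDIES`, `reported_counts`, and `share_of` are illustrative choices made for this example and are not part of the original analysis.

```python
# Counts are copied from the findings listed above.
# All variable and function names are hypothetical, chosen for this sketch.
REVIEWED_STUDIES = 140
INCLUDED_STUDIES = 22

reported_counts = {
    "Gender and age": 20,
    "Ethnicity": 9,
    "Education and marital status": 4,
    "Smartphone use": 20,
    "Mood tracking": 20,
    "Movement (accelerometer data)": 10,
    "Heart rate variability (HRV)": 5,
    "Word usage tracking": 4,
    "Sleep patterns": 2,
    "PHQ-9": 6,
    "GAD-7": 7,
}


def share_of(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Share of reviewed studies that met the inclusion criteria.
inclusion_rate = share_of(INCLUDED_STUDIES, REVIEWED_STUDIES)

# Share of included studies reporting each variable, feature, or scale.
shares = {name: share_of(n, INCLUDED_STUDIES) for name, n in reported_counts.items()}

print(f"Included after screening: {inclusion_rate}% of reviewed studies")
for name, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {pct}% of included studies")
```

Expressed this way, the gap is easier to see: gender and age appear in roughly nine out of ten included studies, while education and marital status appear in fewer than one in five.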
Implications for outcomes research
From an outcomes research perspective, these technologies offer compelling advantages:
- Continuous and passive monitoring: Enables longitudinal capture of clinically relevant endpoints like mood, behavior, and sleep — reducing bias from intermittent self-reporting.
- Scalability and reach: Mobile-based data collection can extend to underserved and geographically dispersed populations, improving study generalizability.
- Early signal detection: Passive data streams can flag deterioration or improvement earlier than clinical visits alone, offering potential for timely interventions.
However, a consistent limitation observed in the literature is the underreporting of demographic variables — especially education and marital status. This omission constrains subgroup analysis and limits insights into how different populations experience depression and respond to interventions. In outcomes research, such data are essential for contextualizing and stratifying results across socioeconomic or cultural dimensions.
The path forward
As wearable and mobile sensors become more refined, their integration into real-world data frameworks will likely become standard practice in outcomes research. But to truly capitalize on this potential, researchers must enhance demographic reporting and examine interactions between digital phenotypes and traditional health indicators across diverse populations.
These tools not only offer more granular tracking of mental health status — they also help researchers and health systems better understand the dynamics of treatment effectiveness, burden of illness, and quality of life over time.
Interested in learning more?
This blog post summarizes findings from the poster presentation, “Exploring Mobile and Wearable Technology for Early Depression Detection and Monitoring,” presented by Lyuboslav Ivanov and Manuel Cossio at Cytel and Universitat de Barcelona.