Digital Health for Pandemic Preparedness: A Critical Review of Innovations, Equity Gaps, and Governance in Infectious Disease Surveillance

Author(s):
Abdolreza Babamahmoodi 1,*, Farhang Babamahmoodi 1, Majid Marjani 2
1Department of Infectious Diseases, Antimicrobial Resistance Research Center, Mazandaran University of Medical Sciences, Sari, Iran
2National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran

International Journal of Infection: Vol. 12, issue 2; e169286
Published online: Jul 31, 2026
Article type: Review Article
Received: Jun 07, 2026
Accepted: Jul 07, 2026
How to Cite: Babamahmoodi A, Babamahmoodi F, Marjani M. Digital Health for Pandemic Preparedness: A Critical Review of Innovations, Equity Gaps, and Governance in Infectious Disease Surveillance. Int J Infect. 2025;12(2):e169286. doi: https://doi.org/10.5812/iji-169286

Abstract

Context:

Recent infectious disease emergencies, including SARS-CoV-2, mpox (2022 - 2024), highly pathogenic avian influenza A(H5N1) in US dairy cattle (2024 - 2025), and the Marburg virus in Rwanda (2024), have strained traditional surveillance systems. Digital health has been proposed as a paradigm shift; however, most reviews have focused on diagnostic accuracy while neglecting equity and governance.

Evidence Acquisition:

We conducted a structured thematic narrative review of peer-reviewed and grey literature published from January 2010 to March 2025, reported in accordance with PRISMA 2020 and PRISMA-ScR. PubMed, Scopus, and Web of Science were searched from February 5 to 18, 2025. Search keywords combined three concept blocks: (1) digital health technologies (artificial intelligence, machine learning, mHealth, eHealth, telemedicine, electronic health records, wearables, wastewater surveillance, genomic surveillance, natural language processing); (2) infectious diseases (outbreak, pandemic, epidemic, COVID-19, Ebola, Zika, mpox, influenza, antimicrobial resistance, Marburg); and (3) public health functions (surveillance, detection, forecasting, contact tracing, governance, equity). Full Boolean syntax is provided in Supplementary File 1. Records were deduplicated in EndNote X20 and screened by 2 blinded reviewers in Rayyan (Cohen κ = 0.81). Empirical studies were appraised using the Mixed Methods Appraisal Tool 2018, and reviews were appraised using Joanna Briggs Institute checklists. The synthesis followed Braun and Clarke's reflexive thematic analysis and was interpreted through sociotechnical systems and data justice lenses.

Results:

Of 1498 records, 142 studies met the inclusion criteria; 48% (69/142) were rated as high quality. The evidence base was heavily skewed regionally (high-income countries [HICs], 41%; sub-Saharan Africa or South Asia, 19%). No technology was universally superior. Artificial intelligence (AI) studies were concentrated in HICs (18/32, 56%), whereas short-message-service (SMS) mHealth studies were concentrated in low- and middle-income countries (LMICs; 15/25, 60%). Recurring barriers included algorithmic bias, fragmented interoperability, donor-dependent sustainability, and slow benefit-sharing of pathogen data with LMICs.

Conclusions:

Across 142 included studies the synthesis supports a measured conclusion: Digital health is necessary but not sufficient for pandemic preparedness. The patterns most consistently observed in the evidence — regional skew, recurring governance failures, donor-dependent sustainability, and design choices that determine equity outcomes — together indicate that equity, sustainability, and accountability are constituents of effectiveness rather than obstacles to it. The binding constraint is increasingly political and institutional will rather than analytic uncertainty.

1. Context

Infectious diseases account for a substantial share of global mortality, with low- and middle-income countries (LMICs) bearing a disproportionate burden (1, 2, 3). According to the Global Burden of Disease Study, communicable diseases were responsible for approximately 13.7 million deaths globally in 2019, concentrated in sub-Saharan Africa and South Asia (1). The World Health Organization (WHO) has identified pandemic preparedness as a defining challenge of the next decade, and recent emergencies have made it clear that conventional surveillance, based on case-by-case clinical reporting through hierarchical channels, is too slow for modern outbreak control (4, 5, 6). The accelerating tempo of these emergencies, including SARS-CoV-2, mpox clade IIb (2022 - 2024) and clade Ib (2024), highly pathogenic avian influenza A(H5N1) crossing into US dairy cattle (2024 - 2025), and Marburg virus in Rwanda (2024), has exposed the structural fragility of such systems (4, 5). Reporting delays of 1 to 3 weeks remain documented even in well-resourced systems (6, 7). These delays reflect social, infrastructural, and governance constraints that no single digital tool can overcome.
Digital health, defined by WHO as the systematic application of digital technologies to improve health outcomes, is therefore proposed as a paradigm shift for pandemic preparedness. The field encompasses artificial intelligence (AI) and machine learning (ML), mobile and short-message-service (SMS)-based tools (mHealth), electronic health records (EHRs), genomic and wastewater surveillance, telemedicine, wearables, and natural language processing (NLP) of social media (8, 9). However, the past decade, from the misfire of Google Flu Trends to the documented function creep of national contact-tracing apps, has demonstrated that technical capability and public health benefit are not synonymous.
Most existing reviews focus on diagnostic or predictive accuracy and rarely treat equity and governance as first-order analytic categories. This review addresses that gap with a tripartite focus: 1) how well digital tools perform under real outbreak conditions; 2) which populations and states benefit and which are made more vulnerable; and 3) what governance arrangements, from algorithmic impact assessment to the 2025 WHO Pandemic Agreement and the 2024 amended International Health Regulations (IHR), are needed to align innovation with public health values. We interpret the evidence through a sociotechnical systems lens (10) and a data justice framework (11), treating digital interventions as configurations of code, infrastructure, institutions, and people, rather than as neutral instruments.

1.1. What This Review Adds

This synthesis offers 4 contributions distinct from existing reviews of digital health and pandemic preparedness. First, the methodological contribution is that we apply joint PRISMA 2020 and PRISMA-ScR reporting to a thematic narrative review, with full database-specific search syntax (section 2.2), transparent reflexivity (section 2.4), and per-study quality weighting (Supplementary File 4). Second, the substantive contribution is that we present, to our knowledge, the first cross-tabulation of the digital health evidence base by technology and region (Figure 1D), showing that AI was predominantly studied in HICs (18/32, 56%) and SMS-based mHealth was predominantly studied in LMICs (15/25, 60%). This finding pertains to the political economy of the field as much as to the technologies themselves. Third, the theoretical contribution is that we integrate sociotechnical systems and data justice frameworks in a single synthesis, providing an analytic structure that other reviews can adopt. Fourth, the practical contribution is a feasibility-graded recommendations table (Table 1), which operationalizes the synthesis for policy use rather than leaving it at the level of general principles.

Table 1. Tiered Policy and Practice Recommendations for Digital Health in Pandemic Preparedness, with HIC/LMIC Implementation Feasibility

| Level | Concrete Actions | Implementation Feasibility (HIC/LMIC) |
|---|---|---|
| Individual/community | Plain-language consent and grievance channels; opt-out without penalizing care; community oversight boards; user testing with women, older adults, and disability groups | Generally feasible in both contexts; the principal cost is staff time |
| Institutional (facilities, NGOs) | Predeployment algorithmic impact assessments; staff training in digital epidemiology; data-quality audits; offline-first usability as a procurement criterion | Feasible in HICs; in LMICs, requires donor or pooled-procurement support |
| National public health agencies | Stepwise interoperability roadmap (eg, HL7 FHIR for priority notifiable diseases first, not a blanket mandate); sustained financing line items in core budgets; sandbox programs for AI tools before clinical deployment | Highly feasible in HICs; in LMICs, blanket FHIR mandates are not currently feasible; phased adoption tied to capacity-building grants is more realistic |
| Regional and global | Operationalize the WHO Pandemic Agreement (2025) and amended IHR (2024) provisions on rapid pathogen-data sharing with binding benefit-sharing; harmonize risk-tiered AI obligations along EU AI Act lines | Politically demanding; treaty implementation, not text, will be the binding constraint |

Figure 1. Evidence map: A, distribution by technology domain; B, by geographic setting; C, by study design; D, cross-tabulation of technology × region. Panel D quantifies the cross-distribution: AI/ML studies are concentrated in HIC laboratories (18/32, 56%), whereas mHealth studies are concentrated in LMIC contexts (15/25, 60%). HIC = high-income countries; LMIC = low- and middle-income countries; Multi-region = comparative or multi-regional studies.

2. Evidence Acquisition

2.1. Study Design and Reporting Standards

This study is a structured thematic narrative review positioned explicitly between systematic and narrative traditions (12). It incorporates reproducible elements, including a registered Boolean search, dual screening, formal quality appraisal, and PRISMA-style flow reporting; however, the synthesis is interpretive and theoretically driven. Reporting follows the PRISMA 2020 statement (13) and the PRISMA Extension for Scoping Reviews (14), adapted for thematic synthesis. The protocol was deposited in our institutional repository before screening. We do not claim pooled effect estimates or systematic generalizability across all digital health interventions; rather, we provide a transparent, quality-weighted synthesis of recurring patterns.

2.2. Sources and Search Strategy

PubMed, Scopus, and Web of Science Core Collection were searched from February 5 to 18, 2025. Three concept blocks were combined: digital health technologies AND infectious disease terms AND public health surveillance/governance/equity terms. Filters were January 1, 2010, to March 31, 2025, and document types were Article and Review. Grey literature was searched purposively across WHO, the European Centre for Disease Prevention and Control, the US Centers for Disease Control and Prevention, GISAID, Africa CDC, and the Global Digital Health Partnership. The full database-specific Boolean syntax is provided in Supplementary File 1.
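The three-block structure described above can be sketched in a few lines of Python. This is an illustration only: the term lists are abbreviated from the concept blocks named in the abstract, the helper names `quote` and `build_query` are hypothetical, and the database-specific syntax actually used is the one in Supplementary File 1, which may differ in field tags and truncation.

```python
# Illustrative sketch: term lists are abbreviated from the abstract and the
# output is generic Boolean syntax, not the authors' database-specific strings.
CONCEPT_BLOCKS = {
    "digital_health": [
        "artificial intelligence", "machine learning", "mHealth",
        "telemedicine", "wastewater surveillance",
    ],
    "infectious_disease": ["outbreak", "pandemic", "COVID-19", "Ebola", "mpox"],
    "public_health_function": [
        "surveillance", "forecasting", "contact tracing", "governance", "equity",
    ],
}


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(blocks):
    """OR the terms inside each block, then AND the blocks together."""
    groups = [
        "(" + " OR ".join(quote(term) for term in terms) + ")"
        for terms in blocks.values()
    ]
    return " AND ".join(groups)


query = build_query(CONCEPT_BLOCKS)
print(query)
```

Combining terms with OR inside a block and AND across blocks means a record must match at least one term from every block, which is why adding synonyms within a block widens the search while adding a block narrows it.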

2.3. Eligibility Criteria, Screening, and Software

Studies were eligible if they reported the design, implementation, evaluation, or governance of a digital tool applied to a human infectious disease; were peer-reviewed empirical research, systematic or scoping reviews, or formal policy analyses; and were published between January 1, 2010, and March 31, 2025. We excluded purely technical performance papers without a public health context, opinion pieces without empirical grounding, conference abstracts without full text, and studies on noninfectious conditions. The English-language restriction was applied only at full-text screening; non-English records were logged separately, and 39 potentially relevant non-English full texts were excluded. Study selection used three software platforms in a transparent pipeline: EndNote X20 (Clarivate) for deduplication of records imported from databases; Rayyan (rayyan.ai) for blinded title-and-abstract and full-text screening, with decisions masked until the unblinding step; and NVivo 14 (Lumivero) for thematic coding of included studies. Inter-rater agreement on a 20% calibration sample was Cohen κ = 0.81 (substantial agreement). The PRISMA-style flow of study selection is shown in Figure 2.
Figure 2. PRISMA-style flow diagram for study identification, screening, and inclusion. From 1498 records identified, 142 studies were included in the final synthesis.
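For readers less familiar with the agreement statistic reported in section 2.3, the following minimal sketch shows how Cohen κ is computed from a 2 × 2 table of two reviewers' include/exclude decisions. The counts in the usage line are hypothetical and are not the review's calibration data, which are not reported in the text; the function name `cohen_kappa` is likewise an assumption.

```python
def cohen_kappa(both_include, a_only, b_only, both_exclude):
    """Cohen's kappa for two raters making binary include/exclude decisions.

    both_include / both_exclude: records on which the raters agreed.
    a_only / b_only: records included by only rater A or only rater B.
    """
    n = both_include + a_only + b_only + both_exclude
    observed = (both_include + both_exclude) / n
    # Marginal inclusion rates for each rater.
    a_rate = (both_include + a_only) / n
    b_rate = (both_include + b_only) / n
    # Agreement expected by chance if the raters decided independently.
    expected = a_rate * b_rate + (1 - a_rate) * (1 - b_rate)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (not the review's data).
kappa = cohen_kappa(both_include=40, a_only=5, b_only=5, both_exclude=50)
print(round(kappa, 3))
```

Because κ subtracts chance agreement, it is lower than raw percent agreement: the hypothetical raters above agree on 90 of 100 records, yet κ is noticeably below 0.90.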

2.4. Quality Appraisal, Thematic Synthesis, and Reflexivity

Empirical studies were appraised using the Mixed Methods Appraisal Tool 2018; reviews and policy analyses were appraised using Joanna Briggs Institute checklists appropriate to the study design. Quality appraisal influenced the synthesis through three mechanisms specified in the protocol: (1) themes whose evidentiary support came primarily from studies scoring ≥ 80% on the relevant tool were designated *high-confidence* and led the narrative; (2) themes that relied substantially on studies scoring < 60% were retained but flagged in the Results as *exploratory*, with the superscript convention (low) used at the claim level where applicable; and (3) at consensus meetings, disagreements about theme definition were resolved by giving greater weight to higher-scored studies, with the audit trail recorded in the protocol. Synthesis followed Braun and Clarke's 6-phase reflexive thematic analysis (15), consisting of familiarization, initial coding, theme generation, theme review, theme definition, and reporting, and was interpreted through a sociotechnical systems perspective (10) and a data justice framework foregrounding invisibility, disengagement, and antidiscrimination (11).
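The first two weighting rules can be expressed as a small decision function. This is a sketch under stated assumptions: the text does not quantify "primarily" or "substantially", so both are treated here as a share above 50% (the hypothetical parameters `primary_share` and `substantial_share`), and the label "unflagged" for themes meeting neither rule is our placeholder, not a category named in the protocol.

```python
HIGH_THRESHOLD = 80.0  # score (%) at or above which a study counts as high quality
LOW_THRESHOLD = 60.0   # score (%) below which a study counts as low quality


def classify_theme(scores, primary_share=0.5, substantial_share=0.5):
    """Label a theme from the appraisal scores (%) of its supporting studies.

    primary_share and substantial_share are assumptions; the source does not
    define "primarily" or "substantially" numerically.
    """
    if not scores:
        raise ValueError("a theme needs at least one supporting study")
    n = len(scores)
    high_share = sum(score >= HIGH_THRESHOLD for score in scores) / n
    low_share = sum(score < LOW_THRESHOLD for score in scores) / n
    if high_share > primary_share:
        return "high-confidence"
    if low_share > substantial_share:
        return "exploratory"
    return "unflagged"  # placeholder label, not from the source


print(classify_theme([85, 90, 70]))      # mostly >= 80%
print(classify_theme([50, 55, 85]))      # mostly < 60%
print(classify_theme([65, 70, 85, 50]))  # neither rule met
```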
Expert consensus was achieved through a structured three-stage process: independent coding by two researchers; reconciliation meetings facilitated by a third senior author at which coding disagreements were resolved with reference to the per-study MMAT/JBI scores and the protocol decision rules; and a final review by all co-authors. Disagreements that persisted beyond two reconciliation rounds were escalated to the full author team for a structured majority decision; the audit trail of consensus decisions is included in Supplementary File 1. The author team, composed of researchers based primarily in Iran with training in epidemiology, health informatics, public health policy, and bioethics, acknowledges 3 positionalities that may have shaped the synthesis: limited proficiency in non-Roman-alphabet languages, policy-advisory exposure, and the absence of a low-income-country-resident author.

3. Results

The Results section is organized in five subsections. We first describe the search yield and characteristics of the included evidence base, then present technology-specific findings (Table 2), and then evaluate recent outbreak case studies with critical appraisal of attribution claims (Table 3). We then turn to two cross-cutting findings, the regional structure of the evidence base and recurring governance failures, which we critically appraise in the light of the sociotechnical and data justice framings introduced in section 1.

Table 2. Comparative Profile of Digital Health Technology Domains in Infectious Disease Surveillance and Response

| Domain | Reported Strengths | Reported Limitations | Equity/Governance Risk |
|---|---|---|---|
| Syndromic surveillance (eg, HealthMap, ProMED) | Early signals from nonclinical data; cross-border coverage; low marginal cost | Low specificity; vulnerable to media bias; limited diagnostic confirmation | Visibility skewed toward English-language and high-internet regions |
| Artificial intelligence/machine learning | Outbreak forecasting; image-based diagnosis; rapid scale where data are available | Black-box opacity; brittleness under distribution shift; high compute cost | Underrepresentation of LMIC populations in training data; uneven audit access |
| mHealth and SMS platforms | High reach on basic phones; effective during the Ebola response (mHero, Liberia) | Limited information density; literacy and language barriers; donor dependency | Equitable when designed with offline modes and local languages |
| Electronic health records | Longitudinal clinical data; supports cohort and pharmacovigilance studies | Uneven interoperability; coding heterogeneity; data-quality gaps | Vendor lock-in and breach risk under data concentration |
| Genomic and wastewater surveillance | High-resolution transmission mapping; species agnostic; early variant detection | Resource-intensive sequencing; uneven global capacity; data-sharing latency | Unresolved benefit-sharing for LMIC contributors (Nagoya; PIP Framework) |
| Telemedicine and teletriage | Care continuity during lockdowns or in remote settings; reduces in-facility transmission | Bandwidth dependent; reimbursement and licensure barriers | Deepens divides where broadband and devices are unequally distributed |
| Wearables and passive sensing | Continuous physiological data; presymptomatic signals reported | Limited clinical validation; samples skewed toward wealthy users | Commercial data may not reach public health systems; consent is unclear |
| Social media and natural language processing | Real-time sentiment and rumor tracking; vaccine-confidence signals | Misinformation amplification; bot manipulation; restricted platform APIs | Surveillance of speech raises civil-liberty concerns |

Table 3. Recent Outbreak Case Studies With Critical-Appraisal Notes and Evidence Grades

| Outbreak/Case | Digital Tool Reported | Critical Appraisal Note | Evidence Grade |
|---|---|---|---|
| COVID-19 (South Korea, 2020) | Integrated contact-tracing system using mobile phone, card-transaction, and CCTV data (27) | Subnational R(t) reduction was reported; the privacy cost was substantial and is not generalizable to settings without similar legal infrastructure | Moderate |
| COVID-19 (India, 2020 - 2022) | Aarogya Setu mobile contact-tracing app (31) | Approximately 150 million downloads; mandatory employment-linked use blurred consent; civil society analyses report function creep | Low to moderate |
| Mpox global emergencies (2022 - 2024) | GISAID-based genomic sharing; Nextstrain phylogenetics (30) | Rapid clade IIb characterization; sequencing capacity was heavily concentrated in high-income laboratories; benefit-sharing remains unresolved | Moderate |
| H5N1 in US dairy cattle (2024 - 2025) | Wastewater surveillance and USDA/CDC genomic dashboards (32) | First detection through wastewater; cross-sector One Health linkage was reactive rather than systematic | Moderate |
| Marburg-Rwanda (2024) | Lightweight digital case management and contact tracing | Outbreak declared over within approximately 75 days; the principal drivers were national leadership, post-COVID infrastructure, and Sabin investigational vaccines. Attribution to digital tools alone overstates the evidence | Low to moderate |
| West Africa Ebola (2014 - 2015) | mHero SMS platform connecting approximately 5000 frontline health workers (28, 29) | Communication-delay reductions of approximately 40% were reported; attribution should be considered partial; sustainability declined after donor exit | Moderate |

3.1. Search Yield and Study Characteristics

Of 1498 records identified (PubMed, 612; Scopus, 488; Web of Science, 357; grey literature, 41), 1162 unique records advanced to title-and-abstract screening, 338 advanced to full-text screening, and 142 were included (Figure 2). Forty-eight percent (69/142) of included studies met the high-quality threshold (≥ 80% on the relevant tool). Randomized and quasi-experimental studies and systematic reviews scored highest, whereas modeling studies and narrative commentaries scored lowest. The evidence base was heavily skewed regionally: 41% (58/142) of studies originated in HICs (North America and Western Europe), 24% (34/142) in East Asia, 19% (27/142) in sub-Saharan Africa or South Asia, and 16% (23/142) were multiregional studies (Figure 1).
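The counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the figures stated above; the stage-to-stage differences are derived by subtraction and are not itemized in the text, so the reasons behind each difference (eg, how many removals were duplicates) are not asserted here. Variable names are our own.

```python
# Records identified per source, as reported in the text.
identified = {
    "PubMed": 612,
    "Scopus": 488,
    "Web of Science": 357,
    "grey literature": 41,
}
total_identified = sum(identified.values())

# Records remaining at each later stage, as reported in the text.
screened, full_text, included = 1162, 338, 142

# Derived by subtraction; the text does not itemize these differences.
removed_before_screening = total_identified - screened
excluded_at_title_abstract = screened - full_text
excluded_at_full_text = full_text - included

# Regional distribution of included studies, as reported in the text.
regions = {
    "HICs (North America and Western Europe)": 58,
    "East Asia": 34,
    "sub-Saharan Africa or South Asia": 27,
    "multiregional": 23,
}
shares = {name: round(100 * n / included) for name, n in regions.items()}

print(total_identified, removed_before_screening,
      excluded_at_title_abstract, excluded_at_full_text)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The source totals sum to the 1498 records reported, and the four regional counts sum to the 142 included studies.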

3.2. Performance and Uses by Domain

Across the 142 included studies, 8 technology domains emerged recurrently (Table 2). We critically appraise each below rather than presenting performance metrics uncritically. Syndromic platforms such as HealthMap and ProMED-mail provided early signals at low marginal cost but were vulnerable to media bias (16, 17). AI tools showed strong performance for forecasting and diagnostic support in data-rich settings but were brittle under distribution shift, as illustrated by the post hoc deterioration of Google Flu Trends (18) and COVID-era performance decay of clinical AI (19). mHealth platforms, particularly SMS-based tools usable on basic phones, showed the most favorable equity profile among studies meeting quality thresholds: of the 15 LMIC mHealth studies, 7 supported a positive equity claim, whereas 8 were inconclusive or negative (20, 21). Critically, this 7/15 figure is not a majority of all LMIC mHealth studies, and we caution against generalizing from the high-quality subset. Genomic and wastewater surveillance matured rapidly during COVID-19 and have since been extended to mpox, H5N1, and antimicrobial resistance (22, 29). Telemedicine and wearables offered care continuity and presymptomatic signals, respectively, although clinical validation outside research cohorts remains limited (23). Social media analytics enabled rapid sentiment tracking but also amplified misinformation (24, 25).

3.3. Recent Outbreak Case Studies: Critical Appraisal

Five recent emergencies were selected for in-depth critical appraisal because they collectively exemplify the contemporary digital pandemic-preparedness landscape across regions, pathogens, and governance regimes. Where digital tools were claimed to have driven outbreak control, we examine competing causal explanations and assign an explicit evidence grade (Table 3). South Korea's integrated tracing system was associated with subnational R(t) reduction at substantial privacy cost (27); in our reading, that cost was not adequately weighed against the modest public health benefit, particularly given the system's reliance on legal infrastructure that does not exist in most jurisdictions. India's Aarogya Setu reached approximately 150 million downloads, but mandatory employment-linked use blurred consent, and civil society analyses reported function creep (31). We read this case as illustrating how scale, voluntariness, and governance interact: an app that becomes effectively mandatory through employment requirements is no longer voluntary in any meaningful sense. The 2022 - 2024 mpox emergencies were managed primarily through GISAID-based genomic sharing and Nextstrain phylogenetics, with rapid clade IIb characterization (30). Sequencing capacity remained heavily concentrated in high-income laboratories, leaving Nagoya Protocol obligations and Pandemic Agreement Article 12 commitments unresolved at the moment of operational use. The 2024 - 2025 H5N1 incursion was first detected through wastewater surveillance and USDA/CDC genomic dashboards, but cross-sector One Health linkage was reactive rather than systematic (32). The 2024 Marburg outbreak in Rwanda was declared over within approximately 75 days, with a case fatality of approximately 23%, below historical Marburg averages. The principal drivers of containment were national leadership, post-COVID-19 rapid-response infrastructure, and timely deployment of Sabin investigational vaccines; attribution to the digital component alone overstates the evidence.

3.4. Equity Gaps and the Structure of the Evidence Base

Beyond technology-specific findings, the cross-tabulation of evidence by technology and region (Figure 1D) is itself a finding. AI studies were concentrated in HIC laboratories (18/32, 56%), whereas mHealth studies were concentrated in LMIC contexts (15/25, 60%). This pattern, a high-tech/low-tech research dichotomy, is consistent with funding flows and research capacity at least as much as with the intrinsic suitability of the tools. We interpret this pattern cautiously: the cross-tabulation cannot, by itself, determine causality, and alternative explanations include genuine technical constraints (eg, AI's data and compute requirements) and publication bias toward HIC AI studies. Our reading emphasizes the structural barriers, drawing on the qualitative evidence reviewed in section 3.5; we present this as an *interpretive hypothesis* rather than a directly evidenced finding. Three other equity-relevant findings were recurrent: documented algorithmic bias in clinical AI deployed in HICs, including cost-as-proxy bias underestimating Black patients' care needs and dermatology-AI underperformance on darker skin; the persistent digital divide in lower-resource regions; and abandonment of donor-funded mobile applications after the acute phase, as exemplified by the post-Ebola West Africa cohort, in which fewer than 20 of 58 apps remained operational 2 years later (29).
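One way to see why the cross-tabulated proportions warrant caution is to attach an uncertainty interval to them. The sketch below computes a 95% Wilson score interval for each proportion. It is an illustration of small-sample uncertainty only and is not part of the review's analysis; the function name `wilson_interval` and the choice of the Wilson method are our assumptions.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the text: AI studies in HICs, mHealth studies in LMICs.
ai_low, ai_high = wilson_interval(18, 32)
mh_low, mh_high = wilson_interval(15, 25)

print(f"AI in HICs:      18/32 = {18 / 32:.0%}, interval {ai_low:.2f} to {ai_high:.2f}")
print(f"mHealth in LMICs: 15/25 = {15 / 25:.0%}, interval {mh_low:.2f} to {mh_high:.2f}")
```

With denominators of 32 and 25, both intervals are wide and include 50%, which is consistent with treating the high-tech/low-tech pattern as an interpretive hypothesis rather than a settled result.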

3.5. Governance Gaps

Six recurrent governance failures were documented across the case literature: 1) function creep and coercive consent, exemplified by Aarogya Setu; 2) algorithmic bias that was unaudited in routine deployment; 3) donor-driven pilotitis, with sustainability failures after donor exit (29); 4) pathogen-data sovereignty failures, in which rapid sharing has not been matched by binding benefit-sharing for source countries (32); 5) vendor opacity that limits independent audit; and 6) surveillance asymmetry that systematically underrepresents low-internet and non-English regions. Read through Heeks and Shekhar's data justice framework (11), these are not isolated technical failures but distributional consequences of who designs, owns, and audits the data infrastructure.

4. Conclusions

4.1. Three Propositions

The synthesis supports three propositions. First, digital health is necessary but not sufficient for pandemic preparedness; technological capability does not translate into public health benefit without aligned governance, financing, and trust. Second, the high-tech/low-tech dichotomy may itself be partly a product of the unequal evidence base shown in Figure 1. The cross-tabulation cannot, by itself, determine causality; alternative explanations include genuine technical constraints and publication bias toward HIC AI studies. We present the political-economy reading as an interpretive hypothesis rather than a directly evidenced finding. Third, equity outcomes are predictable from early design choices, including language support, offline modes, and consent architecture, and are very difficult to retrofit once a system is deployed at scale.

4.2. Recurring Tensions

Several recommendations involve concrete trade-offs that warrant explicit recognition. Privacy versus speed: Granular tracing data can speed outbreak response but are more difficult to limit to public health uses. Centralization versus sovereignty: GISAID-based sharing accelerates response but concentrates risk and reward, leaving benefit-sharing unresolved. Transparency versus intellectual property: Vendors of clinical-grade AI resist disclosing model internals, yet independent auditing is impossible without disclosure. Mandates versus feasibility: Universal HL7 FHIR adoption is technically attractive but not currently feasible in many LMICs. Donor speed versus domestic durability: Donor financing accelerates deployment but creates the sustainability decline documented in section 3.4. We do not propose that these trade-offs have correct resolutions; rather, we propose that explicit deliberation about them is the missing institutional layer.

4.3. Tiered Policy and Practice Implications

Translating findings into action requires recommendations across multiple system levels. Table 1 organizes actions into individual/community, institutional, national, and regional/global tiers and pairs each with a feasibility assessment for HIC and LMIC contexts. The 2025 WHO Pandemic Agreement and amended IHR (26) open a meaningful but narrow window: Implementation, rather than text, will determine whether they advance equitable digital preparedness or entrench existing asymmetries. The most consequential single action, in our reading, is sustained domestic financing of national digital public health infrastructure, because almost every other recommendation depends on it.

4.4. Strengths and Limitations of the Review

4.4.1. Strengths

This review has four principal strengths. First, it applies joint PRISMA 2020 and PRISMA-ScR reporting standards to a thematic narrative review, an unusual combination of rigor and interpretive flexibility. Second, dual blinded screening with substantial interrater agreement (Cohen κ = 0.81) and formal Mixed Methods Appraisal Tool/Joanna Briggs Institute quality appraisal raise the methodological floor above that of a conventional narrative review. Third, the integrated sociotechnical systems and data justice analytic frame is rare in this literature and may provide a template for other syntheses. Fourth, the cross-tabulation of evidence by technology and region (Figure 1D) is, to our knowledge, the first such quantitative mapping in this field and surfaces a structural feature—the political economy of where AI versus mHealth is studied—that prior reviews have described only qualitatively.

4.4.2. Limitations

The review also has substantial limitations that readers should consider. As a thematic narrative review, it is interpretive rather than aggregative; we did not pool effect sizes because outcomes were heterogeneous across technologies, outbreaks, and metrics. Despite grey-literature inclusion, implementation failures are systematically underreported, and the field’s effectiveness is therefore probably overestimated. Specific tool examples will require updating within 2 to 3 years. Heterogeneity across pathogens limits pooled inference. Most importantly, English-only full-text screening introduces systematic bias: 39 potentially relevant non-English full texts were excluded (12 Chinese, 9 Spanish, 7 Portuguese, 6 French, and 5 other languages). Readers should interpret the regional pattern in Figure 1 with caution because it is partly an artifact of the search strategy itself. Reviewer reflexivity is reported in section 2.4; positionalities most exposed to interpretation have been flagged. Single-domain reviews of wearables, telemedicine, and wastewater surveillance, and reviews led by LMIC-resident authors with multilingual coverage, are priority next steps.

4.5. Conclusions

Across 142 included studies, three findings stand out: 48% of the evidence base meets high-quality thresholds; the evidence is regionally skewed (41% from HICs and 19% from sub-Saharan Africa or South Asia), with AI predominantly studied in HICs and SMS-mHealth in LMICs; and six recurring governance failures (function creep, unaudited algorithmic bias, donor-driven pilotitis, pathogen-data sovereignty failures, vendor opacity, and surveillance asymmetry) appear consistently across cases. On the strength of these findings, this synthesis supports a measured conclusion: Digital health is indispensable to pandemic preparedness, but its benefits depend on governance, financing, and design choices made early in the system lifecycle. No single tool is decisive on its own. Equity, sustainability, and accountability are not obstacles to performance; over realistic time horizons, they are constituents of it. The most useful next steps are practical: structured implementation of the 2025 Pandemic Agreement and 2024 amended IHR, sustained domestic financing of national digital public health infrastructure, phased rather than blanket interoperability mandates, mandatory algorithmic impact assessment for high-risk health AI, and binding benefit-sharing mechanisms for pathogen data originating in LMICs. A substantial body of evidence now points toward what is needed; the binding constraint is increasingly political and institutional will rather than analytic uncertainty.

Footnotes

References

