Digital Therapeutic Care and Decision Support Interventions for People With Low Back Pain: Systematic Review

Background: Low back pain (LBP) is the leading cause of worldwide years lost because of disability, with a tremendous economic burden for health care systems. Digital therapeutic care (DTC) programs provide a scalable, universally accessible, and low-cost approach to the multidisciplinary treatment of LBP. Moreover, novel decision support interventions such as personalized feedback messages, push notifications, and data-driven activity recommendations amplify DTC by guiding the user through the program while aiming to increase overall engagement and sustainable behavior change.
Objective: This systematic review aims to synthesize recent scientific literature on the impact of DTC apps for people with LBP and outline the implementation of add-on decision support interventions, including their effect on user retention and attrition rates.
Methods: We searched bibliographic databases, including MEDLINE, Cochrane Library, Web of Science, and the Physiotherapy Evidence Database, from March 1, 2016, to October 15, 2020, in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines and conducted this review based on related previously published systematic reviews. Besides randomized controlled trials (RCTs), we also included study designs with the evidence level of at least a retrospective comparative study. This enables the consideration of real-world user-generated data and provides information regarding the adoption and effectiveness of DTC apps in a real-life setting. For the appraisal of the risk of bias, we used the Risk of Bias 2 Tool and the Risk of Bias in Non-Randomized Studies of Interventions Tool for the RCTs and nonrandomized trials, respectively. The included studies were narratively synthesized regarding primary and secondary outcome measures, DTC components, applied decision support interventions, user retention, and attrition rates.
Results: We retrieved 1388 citations, of which 12 studies are included in this review. Of the 12 studies, 6 (50%) were RCTs and 6 (50%) were nonrandomized trials. In all included studies, lower pain levels and increased functionality compared with baseline values were observed in the DTC intervention group. A between-group comparison revealed significant improvements in pain and functionality levels in 67% (4/6) of the RCTs. The study population was mostly homogeneous, with predominantly female, young to middle-aged participants of normal to moderate weight. The methodological quality assessment revealed moderate to high risks of biases, especially in the nonrandomized trials.
Conclusions: This systematic review demonstrates the benefits of DTC for people with LBP. There is also evidence that decision support interventions benefit overall engagement with the app and increase participants’ ability to self-manage their recovery process. Finally, including retrospective evaluation studies of real-world user-generated data in future systematic reviews of digital health intervention trials can reveal new insights into the benefits, challenges, and real-life adoption of DTC programs.


Background
Low back pain (LBP) is the leading cause of worldwide years lost because of disability, with a global point prevalence of 9.4% and a reported lifetime prevalence of up to 84% [1,2]. Moreover, LBP is responsible for most absences from work as well as productivity losses, which ultimately results in a tremendous societal and economic burden [3]. Current clinical guidelines recommend a multimodal treatment approach for people with nonspecific, nonacute LBP, including remaining physically active, exercising and receiving educational therapy, and using psychosocial interventions [4,5].
Digital therapeutic care (DTC) programs provide a scalable, universally accessible, and low-cost approach to deliver these key components of a multimodal treatment. Using smartphone or browser-based apps, people with LBP can proactively self-manage their recovery process through remote physical and mindfulness exercises and in-depth explanatory educational material. Initial research investigating a DTC app to self-manage LBP has shown an overall positive effect on pain levels and functional disability [6]. In this virtually unsupervised approach, motivational factors, coping behavior, and self-management abilities play a critical role in patient literacy and empowerment with regard to adherence to the treatment program [7]. Thus, novel add-on personalized decision support interventions provide the possibility of guiding the user through the program and achieving sustainable behavior change through, for instance, tailored feedback messages, push notifications, and data-driven activity recommendations [8]. However, the benefits of a DTC program with add-on decision support interventions remain unclear and require further investigation [9].
Moreover, low user retention and high attrition rates are unresolved challenges, with reported nonengagement levels of up to 70% [10,11]. In this regard, user retention describes the adherence to, and overall response rate of, the DTC program [12]. This involves the sustained use of individual treatment modules. Engagement in the program can be measured, for instance, by the number of completed exercises or the time spent on the educational material [11]. In contrast, the attrition rate focuses on the dropout of participants and, thus, their discontinuation of the DTC program [13]. In the treatment of people with LBP, both user retention and attrition rate play a critical role in understanding the causal dependencies with regard to the long-term impact of digital therapeutic interventions.
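To make the distinction concrete, the following minimal sketch computes an attrition rate and one possible engagement measure for a small cohort. The `Participant` record, its field names, and the example values are hypothetical; the reviewed studies operationalize retention and attrition in differing ways.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical per-user record; field names are illustrative only."""
    completed_exercises: int  # engagement proxy: exercises finished in the program
    dropped_out: bool         # True if the user discontinued the program


def attrition_rate(cohort: list[Participant]) -> float:
    """Share of participants who discontinued the program."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    return sum(p.dropped_out for p in cohort) / len(cohort)


def mean_completed_exercises(cohort: list[Participant]) -> float:
    """Average number of completed exercises per participant."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    return sum(p.completed_exercises for p in cohort) / len(cohort)


# Made-up example cohort of four users.
cohort = [
    Participant(completed_exercises=12, dropped_out=False),
    Participant(completed_exercises=0, dropped_out=True),
    Participant(completed_exercises=5, dropped_out=False),
    Participant(completed_exercises=1, dropped_out=True),
]
print(attrition_rate(cohort))            # 0.5
print(mean_completed_exercises(cohort))  # 4.5
```

Keeping the two measures separate mirrors the text: a program can show a low attrition rate while engagement per remaining user is still modest.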
Previous systematic reviews focused on investigating the impact of DTC apps or decision support interventions in a controlled clinical trial-based environment, which determines the efficacy of the intervention under near-ideal conditions [14]. In contrast, the intervention's effectiveness provides information on health-related outcomes in a real-world setting from people using the app either on their own initiative or after receiving a physician's prescription. Evidence regarding the difference in outcomes between a controlled trial setting and real-world use is lacking because previous systematic reviews included only randomized controlled trials (RCTs), which represent the gold standard [9,14,15]. However, in future data-driven research on digital health interventions, retrospective evaluations could generate new insights into the effectiveness and engagement of DTC programs. In fact, the quickly evolving regulatory environment in favor of digital ecosystems advocates research platforms and databases to facilitate the evaluation of real-life user data. For instance, Germany's newly introduced Digital Healthcare Act allows the reimbursement of the cost of digital health apps by the statutory health insurance providers once the app is listed in the Digital Health Applications directory [16,17]. For this purpose, manufacturers are obliged to provide scientific evidence in the form of at least retrospective comparative studies proving that their digital health app yields positive health care effects [16]. This approach directly enables the consideration of real-life user-generated data and provides information regarding the adoption and effectiveness of the digital health app in a real-world setting.

Related Work
Various systematic reviews have elaborated on the impact of digital therapeutic interventions for people with LBP [9,15]. Nicholl et al [9] performed a comprehensive review with the most substantial overlap to our research question investigating digital support interventions for the self-management of LBP. Their work is part of the European Union (EU)-funded selfBACK project, which aims to develop an app that provides tailored, algorithm-based digital decision support interventions for the self-management of LBP [18]. The authors identified 6 completed RCTs but could not conclude under what circumstances which type of digital support intervention was effective for people with LBP. Because of the variability of study interventions and the homogeneous participant cohorts, which consisted predominantly of White, well-educated, and middle-aged women, it became clear that further studies are necessary to evaluate the benefits of digital support interventions for broader populations.
In a more recent review, Hewitt et al [15] investigated the impact of digital health interventions in a broader context of musculoskeletal conditions. In their review, the authors included 19 studies, of which 9 reported statistically significant reductions in musculoskeletal pain and 10 reported statistically significant improvements in functional disability. However, because of the consideration of predominantly stand-alone interventions and missing relatedness to LBP specifically, a recent systematic literature review dedicated to a holistic DTC program is, to the best of our knowledge, currently lacking.
It is worth mentioning that 2 systematic reviews have investigated apps that aim to support people with LBP with self-management, monitoring, or decision support interventions and are available on the iTunes and Google Play stores [19,20].
The first review, from 2017, found 61 smartphone apps, whereas the more recent one, from 2020, identified 74 apps available to download for smartphone users. The high and still increasing number of smartphone apps also underlines the need for an updated review from a scientific, clinical trial-based perspective.

Objective
The aim of this review is to evaluate recently published clinical evidence regarding the efficacy and effectiveness of digital therapeutic interventions for people with LBP. Moreover, we seek to synthesize the characteristics and components of the respective digital therapeutic programs, the type of delivery and interactivity with the user, and the extent of the deployed decision support interventions. Thereby, we aim to extract overall retention and attrition rates of the therapeutic care apps and summarize how current decision support interventions contribute to overall engagement levels and possibly influence health-related outcome measures.

Study Design
Following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement, we performed a systematic literature review to identify and analyze recent scientific evidence regarding digital therapeutic and decision support interventions for people with LBP [21]. Notably, this systematic review was not preregistered in an international prospective registry such as PROSPERO. The new field of DTC apps and decision support interventions for LBP has rapidly emerged in scientific research in recent years. New nomenclature has arisen from ongoing software implementations and the increase in the number of innovative digital therapy features. These developments required an explorative approach to defining the inclusion and exclusion criteria for a thorough systematic review to ensure that all relevant studies could be included. Therefore, we chose a snowballing search method and, subsequently, extended our ongoing search to a systematic review. Nonetheless, being aware of potential biases that may result from the lack of a prospective preregistration, we have presented our findings using a narrative approach, with the primary goal of summarizing recent technological improvements and implications in the field of digital therapy for LBP.

Search Strategy
We searched the bibliographic databases (1) MEDLINE through PubMed, (2) Cochrane Central Register of Controlled Trials in the Cochrane Library, (3) Web of Science Core Collection, and (4) the Physiotherapy Evidence Database and included English- and German-language literature published in peer-reviewed journals. In addition, we screened the reference lists and tracked the citations of all included studies for eligibility. This review's search concept is based on 2 main pillars: (1) LBP and (2) digital therapeutic and decision support interventions. These search terms were extended with specific terminology and synonyms using Boolean operators and the respective Medical Subject Headings and are aligned with the updated method guideline for systematic reviews provided by the Cochrane Back and Neck group [22]. The detailed search queries for the corresponding databases are presented in Multimedia Appendix 1.
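The two-pillar structure can be sketched as a Boolean query in which synonyms within a pillar are joined with OR and the pillars are joined with AND. The terms below are illustrative placeholders and not the queries actually used, which are given in Multimedia Appendix 1; the helper names are likewise hypothetical.

```python
def or_block(terms):
    """Join synonyms of one pillar with OR; quote multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*pillars):
    """Combine the pillars with AND so that every pillar must match."""
    return " AND ".join(or_block(p) for p in pillars)


# Placeholder terms (assumptions, not the review's actual search terms).
lbp_terms = ["low back pain", "lumbago"]
digital_terms = ["mobile application", "digital therapeutic", "decision support"]

query = build_query(lbp_terms, digital_terms)
print(query)
# ("low back pain" OR lumbago) AND ("mobile application" OR "digital therapeutic" OR "decision support")
```

In practice each database has its own field tags and subject-heading syntax, so a real query would need database-specific adaptation.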
The final search was conducted on October 15, 2020. All collected studies were saved in a reference management software program, and duplicates were removed. In the first iteration, the titles and abstracts of the remaining studies were screened by 2 reviewers (DL and AMW) independently. Any disagreement led to the inclusion of the study for full-text screening. Subsequently, full-text screening was also conducted by 2 independent reviewers (DL and AMW). This time, the studies on which the reviewers disagreed were assessed for eligibility by a third reviewer (SW), and the disagreement was resolved through discussion.
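The two-stage screening logic described above can be summarized in a short sketch. It simplifies the full-text stage by letting the third reviewer's vote decide, whereas the review resolved such cases through discussion; the function names are hypothetical.

```python
from typing import Optional


def passes_title_abstract(vote_a: bool, vote_b: bool) -> bool:
    """Stage 1: a study advances if either reviewer votes to include it,
    so any disagreement leads to full-text screening."""
    return vote_a or vote_b


def passes_full_text(vote_a: bool, vote_b: bool,
                     third_vote: Optional[bool] = None) -> bool:
    """Stage 2: agreement decides directly; on disagreement the third
    reviewer's vote is used (a simplification of 'resolved through discussion')."""
    if vote_a == vote_b:
        return vote_a
    if third_vote is None:
        raise ValueError("reviewers disagree; a third reviewer's vote is required")
    return third_vote


print(passes_title_abstract(True, False))               # True
print(passes_full_text(True, False, third_vote=False))  # False
```

The asymmetry is deliberate: the first stage errs toward inclusion, while the second stage requires an explicit resolution.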

Inclusion Criteria
We have summarized our inclusion and exclusion criteria in Textbox 1. In brief, we included all publications with the primary aim of investigating the efficacy or effectiveness of a multidisciplinary DTC program with respect to health-related outcomes for people with LBP. Furthermore, our presearch and small pilot review of related work showed that prior systematic reviews had evaluated our research questions or comparable ones before 2016. Our systematic review therefore complements the benchmark work of Nicholl et al [9], who adequately covered the time frame until March 2016; accordingly, we included studies published from March 1, 2016, to October 15, 2020. Our approach is underpinned by our focus on the significant technological improvements in the field of decision support interventions as a new feature in DTC apps that have become available in recent years. Because of these emergent advancements and the changing terminology, the continuation of, and comparison with, the work of Nicholl et al [9] are not within the scope of this review.
Textbox 1. Inclusion and exclusion criteria according to the population or patient problem, intervention, control, and outcomes (PICO) concept.

Inclusion criteria
• Population: People aged >16 years with low back pain.
• Intervention: Any interactive digital and internet-based (health) app that provides digital treatment therapy through an electronic device, that is, computer, tablet, or smartphone. Digital treatment includes access to a digital exercise program, including exercise instructions (eg, video-guided). Moreover, the app contains at least one intervention that addresses the biopsychosocial factors of low back pain, for example, through digital educational material or a digital psychological intervention in the form of cognitive behavioral therapy, or enables self-management, for example, through digital decision support interventions.
• Control: Treatment as usual or any other nondigital form of therapy regarding exercises and educational material for people with low back pain or older versions of the investigated digital therapeutic app or baseline measures.
• Outcomes: Any health-related primary outcome measure that is related to pain or functional disability. Secondary outcomes might include psychological factors (eg, depression), physical activity, medication use, health care resource use, health care costs, or digital therapy program adherence and retention rates.
• Study design: Randomized and nonrandomized controlled trials (including pilot randomized controlled trials); observational analytical studies, either prospective or retrospective; or intraindividual single-arm comparison studies.

Exclusion criteria
• Patient problem: Unspecified chronic pain or other musculoskeletal disorder conditions, for example, neck or knee pain.
• Intervention: Digital health apps using a fully automated text-based health care chatbot; smartphone-based standing posture, sitting posture, or range-of-motion recording or human activity recognition; self-referral decision support interventions; smartphone use only for a 6-minute walking test; internet interventions that include only a reminder or pain monitoring or reporting systems; stand-alone digital cognitive functional therapy; exercise therapy through DVD, CD, or a console, for example, Nintendo Wii; or other website-based interventions.
In our review, we also included study designs with a lower level of scientific evidence than RCTs, down to retrospective comparative studies, for the following reasons. First, because of Germany's newly introduced Digital Healthcare Act, German manufacturers of digital health apps are obliged to provide scientific evidence in the form of at least retrospective comparative studies proving that their app yields positive health care effects [16,17]. Therefore, we adopted this selection criterion of scientific evidence for this review to elaborate on the feasibility of a framework that considers real-world evidence for regulatory decisions.
Second, although RCTs remain the gold standard for providing the highest clinical evidence, the optimal control conditions in digital health intervention trials require further investigation [23]. Choosing treatment as usual as the control group in prospective RCTs might lead to a so-called "app-physician competition bias" [23]. The physicians' awareness of the controlled study design, for example, when competing against a digital therapeutic app for LBP, may cause them to update their knowledge regarding the newest guidelines and treatment recommendations. Thus, considering divergent control groups and retrospective cohort study designs might be useful for digital therapeutic apps; such designs should be evaluated with particular regard to the number of associated biases and confounders.

Data Synthesis
Data of all included studies were extracted by 2 independent reviewers who were randomly selected from a pool of 5 reviewers (DL, AW, TS, SW, and AMW) for each included study regarding the following outcomes: characteristics of included studies, characteristics of the participants, characteristics and components of the digital therapeutic interventions as well as retention rates, and data related to primary and secondary outcome measures. Because of the heterogeneity of the included studies, it was not feasible to conduct a meta-analysis. Despite the apparent similarity of most of the included studies in this review, we decided not to conduct a statistical meta-analysis because it could further compound possible biases regarding meaningful clinical recommendations and is therefore not justified. We have included a broad range of different DTC apps to narratively describe the progress made in this rapidly growing field of digital health. Our primary goal of following a narrative approach in the data synthesis is to provide information to researchers, manufacturers, and decision-makers on the status of scientific research in DTC. Thus, we focused on creating an overview of recent technological improvements, for example, decision support interventions that accompany digital therapy for people with LBP. Because of this focus, we did not extensively narrow the inclusion and exclusion criteria concerning the study design, that is, the time frame of follow-up measures, the comparator group, or the outcome measurements, including different tools and scales. Moreover, combining only a subgroup of our review's included studies into a meta-analysis would potentially have led to misleading conclusions, especially because we have only included studies published from March 1, 2016, to October 15, 2020.

Quality Appraisal
For the assessment of the methodological quality of the included studies, we used 2 separate tools to adequately elaborate on the RCTs as well as the observational studies [24]. We chose the Risk of Bias 2 (RoB 2) Tool for assessing "risk of bias in randomized trials" [25], which is based on an earlier version of the Cochrane Collaboration tool for assessing the risk of bias in randomized trials [26], and the Risk of Bias in Non-Randomized Studies of Interventions (ROBINS-I) Tool for assessing the risk of bias in observational studies [27]. Quality assessment was performed independently for each included study by 2 reviewers who were randomly selected from a pool of 5 reviewers (DL, AW, TS, SW, and AMW) for each included study. Studies on which the reviewers disagreed were assessed by a third reviewer (TS or SW) independently and resolved through discussion.
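The random selection of 2 reviewers per study from the pool of 5 can be illustrated as follows. The source does not state how the randomization was carried out, so this is only one possible way to do it; the study identifiers and the seed are hypothetical.

```python
import random

# Reviewer initials as listed in the text.
REVIEWERS = ["DL", "AW", "TS", "SW", "AMW"]


def assign_reviewer_pairs(study_ids, reviewers=REVIEWERS, seed=None):
    """Draw 2 distinct reviewers at random for each study.

    A fixed seed (an assumption, not part of the review) makes the
    assignment reproducible.
    """
    rng = random.Random(seed)
    return {study_id: tuple(rng.sample(reviewers, 2)) for study_id in study_ids}


# Hypothetical study identifiers.
assignments = assign_reviewer_pairs(["study-01", "study-02", "study-03"], seed=42)
for study_id, pair in assignments.items():
    print(study_id, pair)
```

Sampling without replacement guarantees that the two reviewers of a study are different people, which is what independent double assessment requires.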

Search Results
We retrieved 1388 citations in total, and after removing 359 duplicates, we screened 1029 publications that were potentially eligible for inclusion in this review. Of the 1029 studies, 96 remained after title and abstract screening for full-text assessment. In the end, of these 96 studies, we included 12 in this systematic review. No additional publications were identified by screening the reference list or Google Scholar's Cited by option of included studies. The iterative steps of our literature search and the reasons for excluding several studies are shown in a PRISMA flow diagram (Figure 1).
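The reported counts can be checked with simple arithmetic. The sketch below uses only the numbers stated above; the two exclusion counts are derived from them and are not reported explicitly in the text.

```python
# Counts reported in the text.
retrieved = 1388
duplicates = 359
full_text_assessed = 96
included = 12

# Derived quantities.
screened = retrieved - duplicates                        # records screened by title and abstract
excluded_title_abstract = screened - full_text_assessed  # excluded at the first stage
excluded_full_text = full_text_assessed - included       # excluded at the second stage

print(screened)                 # 1029
print(excluded_title_abstract)  # 933
print(excluded_full_text)       # 84
```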

Study Population
The detailed characteristics of the study participants are listed in Table 1. Overall, the reviewed studies included 10,275 participants. The variation in the total number of study participants was significant, ranging from 41 participants in an RCT [33] to 6468 in a retrospective cohort study [34]. In most of the studies, the number of participants ranged from 93 to 180 [29-33,35,38]. Of the 12 studies, 1 (8%) [33] included only people who were aged between 30 and 55 years by addressing only office workers, whereas the other 11 (92%) included participants aged ≥18 years up to the age of 65 years, 80 years, or without an upper-bound specification. In 50% (6/12) of the studies [28,30-34], the mean age in the intervention group was between 40 and 43 years. The highest reported mean age was 51.11 years [29], and the lowest was 33.9 years [35]. Regarding the sex of the participants, women were overrepresented in 67% (8/12) of the studies, peaking at 72.9%, 67%, and 65% in 38% (3/8) [28,31,33] of these studies. Of the 12 studies, 1 (8%) [34] reported no significant difference in the female-to-male ratio, 1 (8%) [32] did not report any information, and 1 (8%) [ [28] in the intervention group, including 17% (2/12) [31,32] of the studies with participants with BMI <25 kg/m² (people of normal weight). The ethnicity and comorbidities of participants were not reported in any of the included studies.

Risk-of-Bias Assessment
The results of the risk-of-bias assessment of the included studies are presented in Tables 2 and 3. We used the RoB 2 Tool for the included RCTs (6/12, 50%; Table 2) and the ROBINS-I Tool for the nonrandomized studies (6/12, 50%; Table 3). In the RoB 2 analysis, the studies were assessed using predefined signaling questions and were accordingly categorized using standardized wording, that is, low risk, some concerns, or high risk of bias. Similarly, in the nonrandomized trials, the risk of bias was judged to be low, moderate, serious, or critical. The overall risk of bias in a study was determined based on the highest level of risk in at least one domain, that is, the study was judged to be at high risk of bias when at least one domain was considered high. The RoB 2 Tool encompasses 5 domains, whereas the ROBINS-I tool encompasses 7. Of the 6 RCTs, 6 (100%) were appraised as having low risk or some concerns regarding potential biases, predominantly regarding bias due to deviations from intended intervention and outcome measurement. Notably, of the 6 RCTs, 1 (17%) achieved double-blinding of participants and assessors by providing a placebo version of the same app, which included only advice about general nutrition as a control. In the 6 nonrandomized trials, the overall methodological quality was low and associated with a greater risk of bias: 1 (17%) provided sound to moderate evidence for a nonrandomized trial, whereas 5 (83%) exhibited a serious risk of bias and thus have some important problems across domains. The major biases occur because of confounding in the selection of participants and because of missing outcome data.
In detail, these include different durations of the observational period between groups; missing or significantly different demographic compositions between groups; a retrospective recall of preintervention outcome measures, for example, pain level; predefined inclusion criteria that consider only users who have already completed a certain number of exercises in the first 2 weeks after registration; or the inclusion of only users of the pro version of an app that costs €9.99 (US $11.56) per month. Bias due to missing data arose when incomplete data were provided, either because of a high attrition rate or because of a fragmentary analysis of an app's user database.
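The rule that the overall judgement equals the highest level of risk found in any domain can be expressed compactly. The sketch below implements only that worst-domain rule as described in this review; the official tools contain additional guidance, and the example domain judgements are invented for illustration.

```python
# Ordered from lowest to highest risk, using the wording given in the text.
ROB2_LEVELS = ["low", "some concerns", "high"]
ROBINS_I_LEVELS = ["low", "moderate", "serious", "critical"]


def overall_risk(domain_judgements, levels):
    """Return the worst (highest-risk) judgement across all domains.

    An unknown judgement raises ValueError via list.index.
    """
    if not domain_judgements:
        raise ValueError("at least one domain judgement is required")
    return max(domain_judgements, key=levels.index)


# Invented judgements: 5 domains for RoB 2, 7 domains for ROBINS-I.
rct_domains = ["low", "some concerns", "low", "low", "low"]
cohort_domains = ["moderate", "serious", "low", "low", "moderate", "low", "low"]

print(overall_risk(rct_domains, ROB2_LEVELS))         # some concerns
print(overall_risk(cohort_domains, ROBINS_I_LEVELS))  # serious
```

Because a single domain determines the result, one serious problem (for example, missing outcome data) is enough to place a nonrandomized study in the serious category.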

Digital Therapeutic Key Components
We have summarized all investigated DTC apps, including their key components, recommended timing and use frequency, and implemented decision support interventions, in Table 4. The DTC apps involved multiple key components that address the clinical guideline-based recommended multimodal treatment for people with LBP. In all included studies, participants had access to in-app exercise therapy either in the form of videos [28-32,34-39] or picture-based instructions [33]. As another key component, educational material was provided in 92% (11/12) of the apps and involved back pain-specific reading material and papers or rehabilitation plans. The third key component comprised psychosocial interventions that address stress and individual behavioral traits that could influence LBP, that is, in the form of cognitive behavioral therapy, personal health and behavioral coaching, or mindfulness and relaxation techniques in 58% (7/12) of the studies [28,30,31,34-37]. The timing and frequency at which the user was required to engage with the app varied between studies, described in detail in Table 4. All DTC programs were fully digital, that is, they were either smartphone-based or browser-based.
c Defined as completing at least one exercise session or reading 1 educational paper in weeks 9-12.
d RE: reported elsewhere; see Toelle et al [31] and Huber et al [35].
e Involves studies including the Kaia app; all information on the type of therapeutic components and applied interventions was extracted as described within the respective publication.
includes users who signed in after that date.
j A day was classified as an active day when the user logged into the app and completed at least one module.

Personalized Decision Support Interventions
In all included studies, different kinds of decision support interventions were deployed to guide and accompany the user through the DTC program and to increase engagement with the app. Basic reminders in the form of push notifications were implemented most often [29-33,37,38], followed by a health coach chat function for motivational and reinforcing purposes [28,31,34-37], peer-group support through interactive discussion feeds or forums [28,30,34], a points-based rewards system [30,38], feedback messages after achieving improvements [37], and a tailored self-management plan that prompted suggestions on personalized activity goals and education sessions [39]. The applied decision support interventions encompassed a broad spectrum of behavior change techniques, including reminders, peer support, motivational messages, goal setting, coping methods, and gamification.

Impact of DTC Apps
The impact of DTC apps and add-on decision support interventions was evaluated by considering the primary outcomes of pain and functional disability. In the included studies, the level of pain was measured using the Visual Analog Scale (VAS), the Numeric Rating Scale (NRS), and the Modified von Korff (MvK) Pain Scale. The level of functional disability was measured using the Modified von Korff Disability Scale, the Roland Morris Disability Questionnaire (RMDQ), the Oswestry Disability Index (ODI), and the Modified Oswestry Disability Index. In 33% (4/12) [29,30,32,33] of the studies, both pain and functional disability were measured. In 58% (7/12) [28,31,34-38] of the studies, only pain levels were reported using the Numeric Rating Scale or Visual Analog Scale, and in 8% (1/12) [39] of the studies, only the functional outcome was measured using the Roland Morris Disability Questionnaire. Overall, in all included studies, there was a positive care effect in the DTC intervention group compared with baseline values, that is, lower pain levels and increased functionality. Between-group comparisons, which were available in 67% (8/12) of the studies, revealed no significant difference in pain levels in 2 RCTs [31,32]. It should be noted that in some studies [21,24,34,39], participants had ongoing access to treatment as usual in addition to the DTC app, which was not described in detail. The results of the primary outcome measures and the respective treatment groups are presented in Table 5.
Primary outcome | Control group | Intervention group | Study
NRS ↑ (↑) | Treatment as usual with consideration of the national guideline for the treatment of nonspecific back pain | General practitioner (GP)-centered LBP treatment: (1) electronic case report form; (2) treatment algorithm for guideline-based clinical decision-making of GPs; (3) teleconsultation between GPs and pain specialists for patients at risk for chronic back pain; (4) Kaia app | Priebe et al [28]
VAS ↑ (↑); ODI ↑ (↑) | Nonspecific usual care rehabilitation treatment | Patients with LBP who underwent lumbar spinal surgery were provided with a mobile phone-based eHealth program app as part of their rehabilitation program | Hou et al [29]
MvK (pain) ↑ (↑); MvK (disability) ↑ (↑); ODI ↑ (↑) | A total of 3 digital education articles from the digital care program; treatment as usual | Hinge Health Digital Care Program, including a new tablet, 2 Bluetooth wearable motion sensors, and one-on-one remote health coaching; treatment as usual | Shebib et al [30]
NRS ↑ (↔) | A total of 6 individual physiotherapy sessions over 6 weeks and high-quality web-based education, including motivating messages | |
| | Retrospective analysis of user data | Priebe et al [37]
RMDQ ↑ | No control group | Provided with the selfBACK app; treatment as usual | Sandal et al [39]
a Main result of the intervention group after the last measurement in the study.
Regarding adverse health events, in 75% (9/12) of the studies, no evidence of harm was reported after the implementation of DTC. Participants in a study [33] reported temporary discomfort; in another study [31], a patient was diagnosed with a lumbar disk herniation, which was declared an incidental finding. In the remaining study [29], 9 patients reported mostly mild, self-limited joint and back pain; of note, the patients underwent spinal surgery before starting the DTC. We have presented the results of additional secondary outcome measures, the time and frequency of measurements, and the mode of administration of surveys in Multimedia Appendix 2 [28-39].

Principal Findings
This systematic review investigated the efficacy and effectiveness of DTC and add-on decision support interventions for people with LBP. Our analysis shows that all included studies observed positive health effects in the intervention group compared with baseline measures. In 67% (4/6) of the RCTs, between-group analysis indicated superior primary outcomes of the DTC program. Moreover, different DTC apps proved to have potentially significant benefits for particular cohorts. In a study [29], patients who had undergone spinal surgery shortly before starting the DTC and did not live close to the clinic received the DTC app as part of a remote rehabilitation program. Another study [33] explicitly targeted office workers aged 30-55 years to investigate the benefits of a digital care app on quality of life and functionality at work. In another study [28], the researchers aimed to prevent the development of chronic LBP by stratifying patients classified as high risk based on the STarT Back questionnaire through a general practitioner and, thus, providing them with a DTC app as early as possible to prevent a worsening condition [37,40]. Overall, no evidence of harm was reported, except for mild pain and a presumably incidental finding.
Notably, these results must be interpreted with caution when considering that 33% (4/12) of the studies did not include a control group, and in 63% (5/8) of the studies that included a control group, the control group did not receive the treatment recommended by current clinical guidelines. Most trials included small to medium sample sizes, which applies to 67% (8/12) [29-33,35,38,39] of the studies with <200 participants and 42% (5/12) [29,31-33,39] of the studies with <85 participants in the intervention group. Overall, the study population was mostly homogeneous, with predominantly female, young to middle-aged participants of normal to moderate weight, limiting the transferability of the studies' outcomes to other patient cohorts. The lack of long-term follow-up is another limitation in 83% (10/12) [28,30-35,37-39] of the studies. Moreover, overall user engagement and retention rates were reported to be medium to high, although we could not verify this in some cases because of unclear reporting. For instance, some studies reported their overall retention rates based on the mean number of days on which the participant completed at least one module or based on completing at least one therapy module in the last 3 weeks of the study, neither of which reflects sustained engagement or the use of distinct key therapeutic components [34,37]. This adds to the difficulty of objectively measuring the actual number of completed therapeutic modules. To circumvent these challenges as well as self-reporting biases, some DTC apps take advantage of wearable motion sensors or use an analytics platform to track interaction with the app [34,39].
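To illustrate why the two retention definitions mentioned above are hard to compare, the following minimal Python sketch computes both on the same usage log. The log format, the function names, the 12-week study window, and the two example users are hypothetical and are not taken from any of the included studies; the sketch only shows that a user who stops early and a user who completes a single late module can be scored very differently depending on the metric chosen.

```python
from datetime import date, timedelta

def mean_active_days(log):
    """Mean number of distinct days with at least one completed module per user."""
    return sum(len(set(days)) for days in log.values()) / len(log)

def retained_in_final_weeks(log, study_end, weeks=3):
    """Share of users with at least one completed module in the last `weeks` weeks."""
    cutoff = study_end - timedelta(weeks=weeks)
    retained = sum(1 for days in log.values() if any(d > cutoff for d in days))
    return retained / len(log)

# Hypothetical 12-week study window (assumption, for illustration only).
start = date(2020, 1, 1)
end = start + timedelta(weeks=12)

# Hypothetical usage log: user id -> dates with at least one completed module.
log = {
    # Active on each of the first 30 days, then drops out.
    "early_heavy": [start + timedelta(days=i) for i in range(30)],
    # Completes a single module 2 days before the study ends.
    "late_single": [end - timedelta(days=2)],
}

print(mean_active_days(log))              # metric 1: mean active days per user
print(retained_in_final_weeks(log, end))  # metric 2: share active in final 3 weeks
```

Under the first metric, the early dropout contributes most of the reported engagement; under the second, only the user with a single late module counts as retained. Neither number alone describes sustained use, which is one reason standardized engagement reporting would aid comparison.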
Add-on decision support interventions accompanied DTC to enhance digital treatment by increasing user engagement and self-management capabilities in all investigated apps. Nonetheless, most of the studies implemented rather basic rule-based decision support interventions, such as alert reminders or similar motivational push notifications. A more advanced data-driven recommendation system based on machine learning was reported in a single study [38]. Research on data-driven support interventions has already demonstrated higher retention rates and increased user satisfaction in the self-management of LBP [41,42]. Therefore, implementing more complex decision support interventions is essential for achieving sustainable behavior change and high user engagement over a longer period, especially in a noncontrolled real-life environment.
In this review, it was not feasible to appraise the direct impact of either the single DTC key components, for example, exercise, educational material, or psychosocial content, or the decision support interventions, for example, peer support, on the primary health outcomes. Consequently, it remains unclear to what extent DTC needs to be prescribed, in terms of duration and number of exercise or education modules, to achieve a marginal positive health effect for individual patient cohorts. In this regard, the effectiveness of DTC apps in the distinct subgroups of patients with LBP stratified according to acute, subacute, or chronic pain levels remains unclear and requires further subgroup-specific research. Despite overall positive findings, our assessment of the methodological quality revealed that the risk of bias in the included studies was moderate to high, especially in the nonrandomized trials.

Correlation Among Retention, Attrition, and Health Outcome
A major unresolved research question concerns the correlation between engagement levels in a DTC program and improvements in health-related outcomes. The studies in which this effect was examined more closely reported positive as well as negative findings. A positive correlation between higher user retention and a significantly better health outcome was found in 25% (3/12) [29,34,35] of the studies. In contrast, another 25% (3/12) [28,31,36] of the studies concluded that there was no correlation between app use frequency and improved pain level or functional disability. The underlying rationale for participants to stay with the program or choose to discontinue is as yet unknown and could be multidimensional. For instance, whether a participant experiences sudden or early improvement in pain levels can be a driving factor for the decision either to quit or to continue in order to reinforce the positive outcome [28]. Nonetheless, these contrary and counterintuitive findings should be analyzed in future trials by monitoring primary outcome levels more frequently and collecting valuable feedback from participants. This demand is also associated with the ongoing need for consistent reporting of user retention and attrition rates. The use of standardized metrics for subjective and objective use of DTC apps is necessary to gain more insights and enhance the comparability of studies [11].
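As an illustration of how such a relationship between engagement and outcome could be quantified in a uniform way, the following Python sketch computes a Spearman rank correlation between the number of completed modules and the change in pain score per participant. The choice of Spearman correlation is our assumption for this sketch and is not the method reported by the included studies; the function names and the data are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical per-participant data (not from any included study).
modules_completed = [3, 10, 25, 40, 12]
pain_reduction = [0, 1, 3, 2, 2]  # baseline score minus follow-up score

print(spearman(modules_completed, pain_reduction))
```

A rank-based measure is used here because module counts are typically skewed and pain scales are ordinal; reporting one agreed measure alongside the engagement definition would make findings across studies easier to compare.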
Another interesting observation in this review is the divergence of attrition rates when comparing RCTs and retrospective evaluation studies, which specifically consider people who have downloaded the DTC app on their own initiative. The lowest attrition rates were observed in 2 RCTs [32,33] and a prospective trial [39]. In contrast, the studies [35-37] with the highest attrition rates were based on real-world evidence and retrospective user-generated data analysis. One apparent reason for low attrition might have been the user's awareness of being part of a trial or the participant's compensation for the RCT, which involved vouchers, money, or free access to the app after the conclusion of the study. In contrast, participants who self-reportedly downloaded the app and eventually also paid for it on a monthly basis tended to quit the program earlier.
Despite the fact that this observation was not adjusted based on the varying number of participants or the duration of the intervention, it shows that retrospective studies based on real-world evidence possibly provide insights into the real-life adoption and use of DTC apps [12]. In fact, in future data-driven research on digital health interventions, the analysis of homogeneous and structured data related to engagement and self-reported outcome measures could further advocate retrospective cohort evaluation studies. Data obtained from users who have downloaded a DTC app either on their own initiative or after receiving a physician's prescription could be provided to research platforms and databases and, thus, facilitate the evaluation of real-life adoption and effectiveness. These benefits and the quickly evolving regulatory environment in favor of digital ecosystems in the EU, such as the EU-funded Smart4Health project, underline the relevance and timeliness of this review's approach [43].

Rising Uptrend of DTC App-Based Clinical Trials
During our search process, we found additional studies investigating the benefits of various internet interventions or apps to support digital treatment of LBP. For instance, we identified a study involving an app that enables continuous pain monitoring for people with LBP [44], a study investigating the use of a website to support people in their self-management of LBP [45], a study that aimed to examine the benefits of a DTC app on depressive disorder in patients with LBP [46], and a publication that describes 2 case studies in which a virtual reality system delivers functional rehabilitation exercises to people with LBP [47]. Moreover, we found several other research projects investigating their app-based therapeutic programs at an early stage of their development in the form of proof-of-concept, qualitative acceptability studies or research protocols [18,48-52]. This underlines our observation with regard to the exponential rise of clinical trials concerning DTC and decision support apps in the past 5 years.

Limitations
This systematic review has some limitations. First, we only considered English- and German-language literature, which might have led to the exclusion of other potentially eligible studies. Moreover, we only included LBP-related studies and excluded those investigating DTC apps for other similar health conditions, for instance, neck pain, shoulder pain, or musculoskeletal pain in general. Another limitation is the validity of this review with regard to the level of evidence. We are aware that systematic reviews that include only RCTs provide the highest level of evidence; however, considering studies based on real-world user data as well turned out to be a feasible approach, which we consider inevitable for future systematic reviews of digital health app trials.
Furthermore, although most of the included studies in this review reported overall positive health effects, we are cautious about deriving any clinical implications based on our findings. Because of the explorative approach that involved waiving study preregistration, not including traditional search terms such as eHealth and mHealth, and the fact that we focused on studies published from March 1, 2016, to October 15, 2020, we cannot exclude a variety of biases that may have occurred. Therefore, we have refrained from providing essential clinical recommendations for regulatory decisions and do not recommend copying this search strategy, which supported the specific objective of this review exclusively. The aim of this paper is to evaluate recently published clinical evidence regarding the efficacy and effectiveness of digital therapeutic interventions for people with LBP. However, DTC apps, including the broad range of implemented decision support interventions, experience continual improvements with new features and amendments concerning both front-end and back-end of an app. These advancements require ongoing clinical trial-based evaluations regarding their impact on health outcomes, user retention, and attrition rates, especially in this new field of digital therapy. Further research is needed to clarify whether DTC apps are so unique that they need to be evaluated individually or clinical implications can be made based on an overarching systematic review.

Conclusions
This systematic review demonstrates the benefits of DTC for people with LBP with regard to both primary outcomes of pain and functional disability. There is also evidence that decision support interventions benefit overall engagement with the app and increase participants' ability to self-manage their recovery process. However, because of mostly homogeneous study populations and the unclear correlation between user retention and improvements in primary outcomes, no general conclusion can be drawn on either the optimal intervention duration or the required number of exercise modules for individual cohorts. Finally, including retrospective evaluation studies of real-world user-generated data in future systematic reviews of digital health app trials can reveal new insights into the benefits, challenges, and real-life adoption of DTC programs.