Accuracy of Heart Rate Measurement by the Fitbit Charge 2 During Wheelchair Activities in People With Spinal Cord Injury: Instrument Validation Study

Background: Heart rate (HR) is an important and commonly measured physiological parameter in wearables. HR is often measured at the wrist with the photoplethysmography (PPG) technique, which determines HR based on blood volume changes and is therefore influenced by blood pressure. In individuals with spinal cord injury (SCI), blood pressure control is often altered, which could influence the accuracy of HR measured by the PPG technique.
Objective: The objective of this study is to investigate the accuracy of HR measured with the PPG technique by a Fitbit Charge 2 (Fitbit Inc) in wheelchair users with SCI, how activity intensity affects this accuracy, and whether this accuracy is affected by lesion level.
Methods: The HR of participants with (38/48, 79%) and without (10/48, 21%) SCI was measured during 11 wheelchair activities and a 30-minute strength exercise block. In addition, a 5-minute seated rest period was measured in people with SCI. HR was measured with a Fitbit Charge 2 and compared with the HR measured by a Polar H7 HR monitor used as a reference device. Participants were grouped into 4 groups: the no SCI group and, based on lesion level, the <T5, T1-T5, and >T1 (cervical) groups. Mean absolute percentage error (MAPE) and concordance correlation coefficient were determined for each group for each activity type, that is, rest, wheelchair activities, and strength exercise.
Results: With an overall MAPE of 12.99% across all lesion levels, the accuracy did not meet the standard acceptable error of -10% to +10%, with a moderate agreement (concordance correlation coefficient=0.577). The HR accuracy of the Fitbit Charge 2 seems to be reduced in those with a cervical lesion level in all activities (MAPE for no SCI=8.09%; MAPE for >T1=20.43%). The accuracy of the Fitbit Charge 2 decreased with increasing intensity in all lesion groups (MAPE at rest=6.5%, during wheelchair activities=12.97%, and during strength exercise=14.2%).
Conclusions: HR measured with the PPG technique showed lower accuracy in people with SCI than in those without SCI. The error was just above the acceptable level in people with paraplegia, whereas accuracy was worse in people with tetraplegia. The accuracy seemed to worsen with increasing intensities. Therefore, high-intensity HR data, especially in people with cervical lesions, should be used with caution.


Background
Spinal cord injury (SCI) is a result of a partial or complete disruption of the neuropathways in the spinal cord, causing loss of motor and sensory function and a disturbed autonomic nervous system (ANS). Wheelchair users with SCI have one of the lowest daily activity levels compared with other groups with chronic physical conditions [1], negatively affecting their daily activity energy expenditure. In addition, their resting energy expenditure is often decreased because of multiple factors, with a reduced fat-free mass as a major contributor [2][3][4][5]. Together with the reduced activity energy expenditure, this leads to a lower total daily energy expenditure. As a consequence, approximately 68% of the people with SCI are overweight or obese, associated with increased risks of cardiovascular disease and mortality [6,7]. Therefore, maintaining or achieving an active lifestyle is even more crucial in people with SCI than in the able-bodied population. There are several tools that can help to stimulate or maintain an active lifestyle. Currently, activity trackers are a popular way to get insight on and monitor one's personal activity level. Activity trackers include many features, such as estimations of activity levels, exercise intensity or daily energy expenditure, often based on recorded movement via accelerometry and heart rate (HR).
HR is one of the most important and often used physiological parameters, as it is directly related to oxygen consumption and energy expenditure. The delivery of oxygen-rich blood required in the circulation system is controlled by the ANS by modulating both the HR and stroke volume [8,9]. For this reason, HR is used to monitor exercise intensity or as a derivative to estimate, for example, maximal oxygen uptake (VO2max) or energy expenditure [10]. Over the last 4 decades, HR during exercise has mainly been measured using HR monitors that make use of a chest belt, transmitter, and receiver. Owing to the rapid development of sensor technology in recent decades, it is now possible to record and track HR in an even less invasive and easier way. One of the most popular and commonly used methods to determine HR in daily life is photoplethysmography (PPG), a simple and low-cost technique that can be integrated in a wrist-worn activity tracker [11,12].
PPG is a technique in which blood volume changes are detected in the microvascular bed of tissue by infrared light reflected from the tissue, such as the ear lobe, finger, or wrist [11]. The change in blood volume after a heartbeat is proportional to the reflected light, allowing pulse wave detection in the wrist, which can be used as a derivative to determine HR [13]. HR recording with this technique, however, is more susceptible to motion artifacts caused by hand-arm movements and blood flow dynamics, which can lead to a lower accuracy [14,15]. Studies in able-bodied individuals have shown acceptable validity and accuracy, with a mean absolute percentage error (MAPE) of <10%, in HR recordings during sleep or across a 24-hour period in a free-living environment [16,17]. However, when tested during activities of higher intensities or dynamic situations, the accuracy dropped (MAPE>10%) [18][19][20]. Owing to the developments in HR recording with activity trackers, they are being included in clinical settings for medical purposes, such as mobile health monitoring, noninvasive medical surveillance, or even detecting first signs of health issues [21][22][23]. As information gathered by activity trackers is more often used for clinical and health purposes, the importance of accurate data is growing. However, as measurement techniques rely on physiological properties and responses, measurement outcomes can differ if physiological responses are altered, for instance, because of medical conditions. Therefore, it is important to investigate the accuracy of HR measurement within different populations, such as in people with SCI, as their physiological responses can be severely altered [24].

Objectives
The accuracy of HR determined by PPG depends on blood pressure changes, which are, among other things, influenced by HR variability [25]. Both the blood pressure of the upper limbs and HR are regulated by the ANS, of which the sympathetic outflow occurs between the first thoracic (T1) spinal cord segment and the fifth thoracic (T5) spinal segment. After an SCI, neural signal transmission is partially or fully lost at and below the lesion level. In case of an SCI at or above the T5 spinal cord segment, neural signaling and, therefore, the balance between the parasympathetic and sympathetic systems are often altered. Sympathetic hypoactivity usually occurs, resulting in possible low HR, low resting blood pressure, disturbed vascular regulation, and altered responses in these systems during rest or during physical activities [24]. Owing to the changes in HR response and blood pressure control, the accuracy of HR determined by PPG could be affected when a lesion occurs at or above T5. Because of possible impaired or altered vascular regulation, artifact-reducing algorithms may not apply and might subsequently compromise HR accuracy. The ANS is even more affected in cervical lesions, as the imbalance between the parasympathetic and sympathetic systems increases with lesion level [26]. Therefore, the aim of this study is to evaluate whether Fitbit Charge 2 can accurately record HR in wheelchair users with SCIs and to investigate how lesion level affects accuracy. In addition, the effect of intensity on accuracy is determined during wheelchair activities and strength exercise, as a higher intensity is expected during strength exercise compared with wheelchair activities and during wheelchair activities compared with rest. It is hypothesized that the HR accuracy of the Fitbit Charge 2 is lower in people with lesions at or above T5 because of the possible affected ANS, compared with people with lesions below T5 or without SCI. 
A further reduction in accuracy is expected in people with a cervical lesion compared with those with a lower lesion level or without SCI, because of an enlarged imbalance between the parasympathetic and sympathetic systems. Furthermore, the accuracy is expected to decrease with increasing intensities.

Study Design
Data on body composition and energy expenditure in people with SCI were collected in a larger cross-sectional study. All participants were invited for a one-time visit to the Amsterdam Nutritional Assessment Center laboratory of the Amsterdam University of Applied Sciences. HR of the participants was recorded during rest, wheelchair activities, and a 30-minute strength exercise block with both the Fitbit Charge 2 and Polar H7 HR monitor. All participants provided signed informed consent before participating. The study was approved by the medical ethical committee of Slotervaart Ziekenhuis-Reade (METc nr. P1805).

Participants
Overall, 48 participants were recruited to participate in this study, 38 (79%) with SCI and 10 (21%) without SCI. Recruitment took place through advertisements via the Dutch SCI patient association, social media, rehabilitation center Reade in Amsterdam, and the social network of the involved researchers. Participants were included if the following inclusion criteria were met: age between 18 and 75 years; chronic SCI (time since injury >1 year), not ventilator-dependent; and wheelchair-dependent for longer distances. Exclusion criteria were as follows: presence of a pacemaker, severe edema, progressive illness, pressure ulcers, metabolic diseases, severe comorbidities, psychiatric disorders, pregnancy, and insufficient understanding of the Dutch language to understand the study. Participants without SCI were selected based on the same inclusion and exclusion criteria, except for the SCI-related criteria. Personal and lesion characteristics were obtained through a questionnaire and interview. A conservative sample size target was chosen and set on ≥40 samples of each device for each group for each activity based on the method comparison guideline [27].
The participants were divided into 4 groups to test the influence of lesion level on PPG accuracy: a group without SCI and, based on lesion level, the cervical (>T1), high-thoracic (T1-T5), and midthoracic and lower (<T5) groups. Heart and upper-body blood vessels are sympathetically innervated from segments T1-T5 and interact with the parasympathetic system to provide a balanced regulation of the cardiovascular system. In people with an SCI at T5 and above, sympathetic innervation is likely to be affected to a certain extent, which causes altered HR response and blood pressure regulation, possibly affecting PPG recordings compared with lower lesions. Lesions at T5 and above were further divided into 2 subgroups, lesions above T1 and lesions between T1 and T5, with a larger imbalance in the ANS and thus a more severe cardiovascular dysfunction expected in the first subgroup [28]. In people with an SCI above T1, arm function might be impaired and the sympathetic innervation of the heart and upper-body vessels is more severely impaired than in lower lesions, which could lead to a lower HR accuracy in those with a cervical lesion [29].
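The grouping rule described above can be illustrated with a minimal sketch. It assumes that lesion level is recorded as a region letter plus segment number (eg, "C6" or "T3") and that T1 and T5 themselves fall in the T1-T5 group; the function name, the input format, and this boundary handling are assumptions made for illustration and are not taken from the study's analysis code.

```python
import re


def lesion_group(level):
    """Map a neurological level such as 'C6' or 'T3' to a study group.

    `level` is None for participants without SCI. The string format and
    the inclusive T1-T5 boundaries are assumptions of this sketch.
    """
    if level is None:
        return "no SCI"
    match = re.fullmatch(r"([CTLS])(\d{1,2})", level.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized lesion level: {level!r}")
    region, segment = match.group(1), int(match.group(2))
    if region == "C":
        return ">T1"    # cervical: lesion above T1
    if region == "T" and segment <= 5:
        return "T1-T5"  # high thoracic
    return "<T5"        # midthoracic and lower (T6 and below, lumbar, sacral)
```

For example, `lesion_group("T5")` returns the high-thoracic label, whereas `lesion_group("T6")` falls in the midthoracic and lower group.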

Fitbit Charge 2
The Fitbit Charge 2 (2017 version, firmware version 22.55.2, Fitbit Inc) is a commercially available activity tracker with multiple sensors, such as a 3-axis accelerometer, an altimeter, and a PPG sensor to record HR. The Fitbit Charge 2, which uses PurePulse HR technology to continuously read changes in blood volume at the wrist, served as the investigational device. An algorithm converts these data into continuous HR data. The smartwatch was positioned tightly, according to the Fitbit instructions, on the wrist on which the participant would normally wear a watch, usually the nondominant side. Intraday data collection was requested and approved by Fitbit for research purposes, allowing us to obtain, through an application programming interface, the data at the highest possible sampling rate for the time period in which all activities were performed. Output frequency of the HR data varied between 0.2 Hz and 0.06 Hz (ie, one sample every 5 to approximately 17 seconds). Data collected by the Fitbit were transferred through Bluetooth Low Energy to the Fitbit App and downloaded.

Polar H7 HR Monitor
The Polar H7 chest strap HR monitor (Bluetooth Low Energy version, Polar Electro) was used as a reference device to measure HR; it is an accurate (intraclass correlation coefficient=0.98) alternative for a 3-lead electrocardiography (ECG), which is considered as the gold standard for measuring HR [30]. The strap was moistened to improve conduction between the skin and the sensor before it was secured tightly around the chest. HR recording was connected with a Cortex Metamax 3B (Cortex Biophysik GmbH) portable indirect calorimetry system, used in the larger study, which collects data at each full breathing cycle. Therefore, the output frequency of the Polar H7 HR data was determined by the breathing frequency of the participants during the protocol. The HR output given after each breathing cycle was the average HR measured over the entire breathing cycle.

Measurement Protocol
After ensuring that all sensors were positioned correctly, the measurement protocol started with a 5-minute seated rest, followed by wheelchair activities consisting of 11 different wheelchair tasks, each executed for 1 minute, namely, wheelchair propulsion on a low-resistance surface at (1) slow, (2) normal, and (3) high speed; (4) handcycling on an armcrank ergometer; (5) rummaging in a bag while being pushed; (6) setting the table; (7) doing dishes; (8) typing on a laptop; (9) maneuvering the wheelchair; (10) wheelchair basketball; and (11) transfer from wheelchair to chair and back. No 5-minute seated rest data were available for the participants without SCI, as this was added to the measurement protocol after finishing the measurements of the participants without SCI. All tasks were performed for 1 minute, as this represents real-life situations better than longer steady-state situations. All tasks were timed, logged, and recorded using a camera. Between each task, a rest period allowed the HR to recover close to the resting level to ensure variability in measured HR between tasks. If the participant was not able to perform a wheelchair activity independently because of their impairment, the task was not executed. After the activities were completed, a 30-minute upper-body strength exercise was performed. Exercises and resistances were chosen based on the participants' preferences and physical capabilities. All strength exercises were performed with sets of 8-12 repetitions, and each set was repeated 3 times in total. After each set, there was a rest period that lasted between 90 and 120 seconds before the next set was started.
The strength exercise block was not executed if the participant was not able to perform strength exercises because of an upper-body injury or impairment.

Missing Data and Synchronization
On the basis of expert evaluation, all data of 4 of the 48 (8%) individuals were excluded. Of the 4 individuals, data for 2 (50%) individuals were excluded because of poor Polar H7 HR monitor connection throughout the whole measurement, data for 1 (25%) were excluded owing to battery failure of the Polar H7 HR monitor, and data for 1 (25%) were excluded because of the loss of Fitbit Charge 2 data. In total, the HR data of 92% (44/48) of participants were analyzed. In addition, approximately 0.6% of the data were excluded from 13% (6/48) of participants because of invalid samples (temporary loss of Polar H7 HR monitor connection). In total, 21,732 valid HR samples from both devices were used for analysis. The data of the 2 devices with different sampling rates were synchronized by pairing each HR sample from the reference device (ie, Polar H7 HR monitor) with the sample from the investigational device (ie, Fitbit Charge 2) that was closest in time. Subsequently, data were labeled with one of the three activity categories: rest, wheelchair activities (including resting time between the activities and before the strength exercises started), and strength exercises (including resting time between the exercises) based on logbook data and video recordings.
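The closest-in-time synchronization described above can be sketched as follows. This is an illustration only, assuming each device yields time-sorted (time in seconds, HR in bpm) pairs; the function name and data layout are hypothetical, and the analysis in this study was performed in R rather than with this code.

```python
from bisect import bisect_left


def pair_nearest(ref, test):
    """Pair each reference sample with the test sample closest in time.

    ref, test: lists of (time_s, hr_bpm) tuples sorted by time
    (assumed layout). Returns a list of (time_s, ref_hr, test_hr).
    """
    if not test:
        return []
    test_times = [t for t, _ in test]
    pairs = []
    for t, hr_ref in ref:
        i = bisect_left(test_times, t)
        # Only the neighbors just before and just after t can be nearest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(test)]
        j = min(candidates, key=lambda k: abs(test_times[k] - t))
        pairs.append((t, hr_ref, test[j][1]))
    return pairs
```

Because the reference device drives the pairing, one investigational sample may be matched to several reference samples when the reference device reports more often, which is consistent with the lower and variable output frequency of the Fitbit Charge 2.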

Statistical Analyses
All statistical analyses were performed in R (version 3.6.1; R Foundation for Statistical Computing) using R Studio (version 1.2.1335). To assess error, the mean difference between the Polar H7 HR monitor and Fitbit Charge 2 HR samples was calculated, resulting in the mean error. In addition, the mean absolute error (MAE) and the MAPE were evaluated. As stated by the American National Standards Institute, the accuracy of HR monitors should be within -10% to +10% of the input rate or -5 to +5 beats per minute (bpm), whichever is greater [31].
In alignment with these standards, we considered a MAPE of -10% to +10% as an acceptable error rate. Following Nelson and Allen [17], outliers were not removed to evaluate the accuracy of consumer use conditions. Bland-Altman plots with 95% limits of agreement (LoA) were produced using the BlandAltmanLeh R package [32]. The Bland-Altman plots and LoA are the suggested methods for analyzing the agreement between 2 measurement devices [33][34][35][36]. These plots were inspected to assess systematic biases over the entire HR range and to assess the magnitude of such biases and whether Fitbit Charge 2 overestimated or underestimated HR compared with the Polar H7 HR monitor. Finally, in line with previous wearable validation studies [17,33], Lin concordance correlation coefficients (CCCs) [37] were calculated using the DescTools R package [38]. These correlation coefficients provide information on the association and strength of the linear relationships between the reference device and investigational device. According to Nelson and Allen [17], the strength of agreement can be interpreted based on the following: CCC<0.5 indicates a weak association, CCC between 0.5 and 0.7 indicates a moderate association, and CCC>0.7 relates to a strong association.
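The error and agreement statistics named above can be written out in a short sketch. It is not the code used in this study, which relied on the BlandAltmanLeh and DescTools R packages; the function names, the sign convention (investigational minus reference), the use of 1.96 SDs for the 95% LoA, and the 1/n variance estimates in Lin CCC are assumptions of this example.

```python
import math
from statistics import mean


def agreement_metrics(ref, test):
    """Error and agreement statistics for paired HR samples in bpm.

    ref: reference device values; test: investigational device values.
    Differences are taken as test - ref (assumed sign convention).
    """
    n = len(ref)
    diffs = [t - r for r, t in zip(ref, test)]
    me = mean(diffs)                                   # mean error (bias)
    mae = mean(abs(d) for d in diffs)                  # mean absolute error
    mape = 100 * mean(abs(d) / r for d, r in zip(diffs, ref))
    # Bland-Altman 95% limits of agreement: bias +/- 1.96 SD of differences
    sd = math.sqrt(sum((d - me) ** 2 for d in diffs) / (n - 1))
    loa = (me - 1.96 * sd, me + 1.96 * sd)
    # Lin concordance correlation coefficient with 1/n moment estimates
    mr, mt = mean(ref), mean(test)
    var_r = sum((r - mr) ** 2 for r in ref) / n
    var_t = sum((t - mt) ** 2 for t in test) / n
    cov = sum((r - mr) * (t - mt) for r, t in zip(ref, test)) / n
    ccc = 2 * cov / (var_r + var_t + (mr - mt) ** 2)
    return {"me": me, "mae": mae, "mape": mape, "loa": loa, "ccc": ccc}


def within_ansi_tolerance(ref_hr, test_hr):
    """Per-sample check: error within 10% of input rate or 5 bpm, whichever is greater."""
    return abs(test_hr - ref_hr) <= max(0.10 * ref_hr, 5.0)
```

Unlike the Pearson correlation, the CCC denominator includes the squared difference between the device means, so a constant offset between devices lowers the coefficient even when the two series are perfectly linearly related.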

Results
Descriptives
Table 1 shows the demographic characteristics of the 77% (34/44) wheelchair users with SCI and 23% (10/44) participants without SCI included in the analyses. Table 2 shows the descriptive statistics for the 21,732 HR samples measured by the Polar H7 HR monitor and the Fitbit Charge 2. These samples were taken during rest (1168 HR samples over a 5-minute period), wheelchair activities (12,016 HR samples), and strength exercises (8548 HR samples). In addition, the distributions in the HR samples are displayed visually in the violin plots shown in Figure 1. The violin plot displays the mirrored density plot in addition to the box plot, which displays summary statistics, such as the median and IQR. As shown in Table 2, the range of the HR samples from Polar H7 was wider than that of the HR estimates produced by the Fitbit Charge 2. The differences in the range of HRs became more pronounced when the lesion was above T5. However, further investigation showed that the range produced by the Polar H7 and Fitbit Charge 2 was quite similar for people with SCI above T1.

Mean Absolute Error
Overall, the Fitbit Charge 2 had a MAPE of 12.99% for people with SCI (Table 3), which exceeds the standard acceptable error of -10% to +10%. The MAPE values of people with a lesion below T5 and between T1 and T5 were comparable, at 11.16% and 10.16%, respectively, but for people with a lesion above T1, the MAPE was considerably higher (20.43%    Table 4.

Principal Findings
This is, to our knowledge, the first study to assess the HR accuracy of Fitbit Charge 2 in people with SCI, or more specifically, to assess the effects of lesion level on PPG-based HR accuracy. With an overall MAPE of 12.99% for the Fitbit Charge 2, the standard acceptable error of -10% to +10% was not met, and the outcomes were worse than in earlier research in able-bodied populations [17,20]. As the intensity of the activity increased, the HR accuracy of Fitbit Charge 2 worsened, which is in line with previous research [18][19][20]. Moreover, there seems to be a clear effect of lesion level, as the highest lesion group (>T1) showed drastically lower accuracy of Fitbit HR recordings at all intensities, compared with lower lesion level groups. This could possibly be attributed to a more severely affected sympathetic innervation. Compared with previous research in able-bodied individuals, our findings showed poorer outcomes for both MAPE and agreement rate during wheelchair activities and strength exercises. Previous research on the accuracy of HR measurements of the Fitbit Charge 2 that included similar activities (seated rest, activities of daily living, strength exercises) showed a MAPE range of 5.93% to 9.88% in able-bodied individuals [17,20]. A similar range was found in this study in people without SCI (7.82%-8.39%). In all people with SCI, the MAPE range varied between 6.5% and 14.2%. During seated rest, our findings showed a stronger association (CCC=0.791) between the Fitbit Charge 2 and Polar H7 HR monitor compared with a moderate association in previous research (CCC=0.561) [17]; however, agreement and error in all other activities showed poorer results and worsened as intensity increased in people with SCI. The reduced accuracy with increasing intensities is in line with the literature [17-20].
It could be argued that performing activities in a wheelchair could influence the agreement of HR recording in wrist-worn wearables in general as the CCC values in this study tend to be lower, even in people without SCI. To perform certain activities in a wheelchair, the wrist is often repetitively pressed and bumped against the rim of the wheel during propulsion, which could continuously affect the PPG connection as the pressure between the sensor and skin fluctuates [39]. This could, at least in part, explain the overall poorer accuracy of the Fitbit Charge 2 during wheelchair activities in people with and without SCI in this study compared with previous findings in able-bodied individuals. However, this would not explain the drastically decreased HR accuracy of the Fitbit Charge 2 in the higher lesion level (>T1) group. Therefore, it is very likely that a more severely imbalanced ANS negatively affects the accuracy [26].
It is remarkable that the T1-T5 group showed no clear difference from the <T5 group, as the sympathetic pathway is affected at lesion levels above T6 and an imbalance between the sympathetic and parasympathetic systems, which control HR and blood pressure, is most likely present [24]. As there is a major difference between Polar H7 and Fitbit Charge 2 in the technique used to measure HR, it seems likely that this difference causes the drop in accuracy and agreement of the Fitbit Charge 2 HR recording. Because Fitbit Charge 2 HR recording is based on blood pressure differences, and the blood vessels in the upper body are under autonomic control from segments T1 to T4, differences were expected in the T1-T5 group as well as in the >T1 group compared with the <T5 group. However, it appears that as long as there is some innervation left and not all sympathetic innervation of the blood vessels is affected, HR accuracy measured by PPG is only slightly reduced. The accuracy only seems to drop at lesion levels above T1, as there is possibly no sympathetic innervation left of the blood vessels in the lower parts of the upper limbs [40]. In addition, people with tetraplegia are more likely to show lower blood pressure than people with paraplegia or able-bodied individuals because of reduced sympathetic activity [41]. Therefore, hypotension is a common phenomenon among people with tetraplegia, which could possibly influence the accuracy of PPG-based HR recordings, as the signal deviates from the regularly expected signal [42,43].
The severity of reduced sympathetic innervation is not necessarily related to neurological lesion completeness, which is often expressed using the American Spinal Injury Association Impairment Scale score. This scale is based on the presence of motor or sensory function, where a complete injury is defined as the absence of both motor and sensory function below the lesion, and an incomplete lesion is defined as partial preservation of motor or sensory function below the lesion [44]. However, research has shown that this classification does not necessarily include autonomic function, because sympathetic activity has been detected in athletes with complete cervical SCI lesions [45]. Although lesion level clearly influences the ANS and, therefore, Fitbit Charge 2 HR accuracy, the effect of completeness of the lesion on motor, sensory, and autonomic function remains unknown. Therefore, future studies should test autonomic function separately from the neurological lesion classification in people with SCI to gain better insight on the effect of autonomic function on HR accuracy based on PPG signals.

Strengths and Limitations
Strengths of this study were the relatively large sample of people with SCI, the fairly even distribution among the lesion level groups (which were based on physiological differences described in the literature), and the direct comparison between people with and without SCI [24,26,40]. Analyses were performed, when possible, according to the methodological approaches suggested by Nelson and Allen [17], van Lier et al [34], and Sartor et al [33]. Activities and exercises mimicked real-life situations, which increased the ecological validity. Participants with SCI performed the tasks in their own wheelchair, at their own speed in relatively short time bouts, representing real-life situations better than prolonged steady-state activities. A suitable wheelchair was provided to the participants without SCI. Outcomes were analyzed as a whole and divided by lesion group and by rest, wheelchair activities, and strength exercises to gain insight on the effects of both intensity and lesion level on the accuracy.
However, there are some limitations to the design and analysis. The reference device used, a Polar H7 HR monitor, is not considered a gold standard. A 3-lead ECG HR monitor device would have served better as a reference device. However, the Polar H7 HR monitor shows a high correlation with a 3-lead ECG (intraclass correlation coefficient=0.98) and is therefore a good alternative [30]. In addition, HR outcomes from both devices were provided without raw signals (raw ECG signals and interbeat intervals). Ideally, one would obtain all raw information, as algorithms to convert raw signals into the reported HR are often confidential and unknown. Firmware versions were, therefore, reported to take into account any undisclosed changes in such algorithms and to allow for the replication of results. HR was collected at the highest possible sample rate for Fitbit Charge 2, as intraday time series access was provided by Fitbit for research purposes. As measurements were performed within a larger study on energy expenditure in people with SCI, the Polar H7 was connected to an indirect calorimetry device during measurements. The output provided by this device was given on a breath-by-breath basis, meaning the HR sample rate for the Polar H7 varied per minute and was determined by the breathing rate of the participant, which eventually provided a lower HR sample rate than preferred. The number of data points available for analysis of each activity decreased as the lesion level increased, as several participants were not able to perform certain wheelchair activities or strength exercises because of the severity of their impairment, present injuries, or risks. In addition, no information was collected on the environmental conditions or skin information that could possibly affect the PPG signal [33]. However, because all measurements were performed at the same location within the same rooms, temperature and light were similarly regulated during all the measurements. 
Unfortunately, no blood pressure data were collected during the measurement to strengthen our findings. Therefore, it is advisable to combine HR recordings together with continuous blood pressure data in future research to confirm our findings.

Practical Implementations
HR data obtained with the PPG technique during activities in people with SCI could be inaccurate, especially during high intensities and in people with a high lesion level (>T1). Therefore, it is advised to avoid using PPG-based HR measurements for medical purposes in people with SCI with a cervical lesion level (>T1). However, despite a possible discrepancy in HR recordings, outcomes can still be of value in situations where the consequences of inaccurate HR data are low, for example, to get a global impression of energy expenditure and exercise intensity during physical activities in daily life.

Conclusions
The overall accuracy of the Fitbit Charge 2 HR measurements in people with SCI did not reach the standard acceptable error of -10% to +10%. With increasing intensity, the HR accuracy of the Fitbit Charge 2 was further reduced in people with SCI compared with its HR accuracy in able-bodied individuals. In addition, HR accuracy is related to lesion level, where a high SCI lesion (>T1) negatively affects HR accuracy. Accuracy seems to worsen more in high lesion levels with increasing intensities. A clear reduction in accuracy was found in the lesion group >T1 during wheelchair activities and strength exercises. This suggests that PPG-based HR accuracy is affected in people with SCI, as blood pressure responses during activity are possibly altered because of an affected ANS. Therefore, PPG-based HR measurements during activities should be taken with caution in people with SCI, especially in those with cervical SCI lesions.