Study record · meta-analysis · 2019
Accuracy of Wristband Fitbit Models in Assessing Sleep: Systematic Review and Meta-Analysis
Haghayegh S, Khoshnevis S, Smolensky MH, Diller KR, and Castriotta RJ
Journal of Medical Internet Research, 21(11), e16273 · 2019
Why this study matters to CircaTest
The most-cited meta-analysis of Fitbit sleep accuracy. CircaTest uses its 81-91% sleep/wake accuracy range as the editorial baseline for any Fitbit claim, particularly because no peer-reviewed validation specific to the Charge 6 has been published. The very low specificity (10-52%) of the early, non-sleep-staging models means that between about half and nine-tenths of PSG-scored wake epochs were classified as sleep, and this is the source of the well-known 'Fitbits overestimate sleep' criticism.
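The link between low specificity and overestimated sleep can be made concrete with a small calculation. The sketch below is illustrative only: the night (8 hours in bed, 30-second epochs, 840 sleep and 120 wake epochs by PSG) is hypothetical, and the sensitivity (0.95) and specificity (0.30) are values picked from inside the ranges reported for non-sleep-staging models, not figures for any particular device. The function names are ours, not the study's.

```python
def expected_device_sleep_minutes(psg_sleep_epochs, psg_wake_epochs,
                                  sensitivity, specificity, epoch_seconds=30):
    """Expected minutes the device scores as sleep, given PSG epoch counts."""
    # Sleep epochs the device correctly scores as sleep.
    sleep_scored_as_sleep = sensitivity * psg_sleep_epochs
    # Wake epochs the device misses and scores as sleep.
    wake_scored_as_sleep = (1 - specificity) * psg_wake_epochs
    return (sleep_scored_as_sleep + wake_scored_as_sleep) * epoch_seconds / 60


def expected_accuracy(psg_sleep_epochs, psg_wake_epochs, sensitivity, specificity):
    """Expected share of all epochs on which the device agrees with PSG."""
    correct = sensitivity * psg_sleep_epochs + specificity * psg_wake_epochs
    return correct / (psg_sleep_epochs + psg_wake_epochs)


# Hypothetical night (assumption, not study data).
psg_sleep, psg_wake = 840, 120
psg_tst_min = psg_sleep * 30 / 60
device_tst_min = expected_device_sleep_minutes(psg_sleep, psg_wake, 0.95, 0.30)
accuracy = expected_accuracy(psg_sleep, psg_wake, 0.95, 0.30)

print(f"PSG total sleep time:    {psg_tst_min:.0f} min")
print(f"Device total sleep time: {device_tst_min:.0f} min")
print(f"Overall accuracy:        {accuracy:.2f}")
```

Under these assumptions the device reports roughly 21 minutes more sleep than PSG while still agreeing with PSG on about 87% of epochs. Because most epochs in a night are sleep, overall accuracy can look high even when most wake epochs are missed.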
Abstract
BACKGROUND: Wearable sleep monitors are of high interest to consumers and researchers because of their ability to provide estimation of sleep patterns in free-living conditions in a cost-efficient way.
OBJECTIVE: We conducted a systematic review of publications reporting on the performance of wristband Fitbit models in assessing sleep parameters and stages.
METHODS: In adherence with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement, we comprehensively searched the Cumulative Index to Nursing and Allied Health Literature (CINAHL), Cochrane, Embase, MEDLINE, PubMed, PsycINFO, and Web of Science databases using the keyword Fitbit.
RESULTS: 22 articles qualified for systematic review, with 8 providing quantitative data for meta-analysis. In reference to polysomnography (PSG), nonsleep-staging Fitbit models tended to overestimate total sleep time (range 7-67 min) and sleep efficiency (2-15%), and underestimate wake after sleep onset (range 6-44 min). Nonsleep-staging Fitbit models correctly identified sleep epochs with accuracy values between 0.81 and 0.91, sensitivity values between 0.87 and 0.99, and specificity values between 0.10 and 0.52. Recent-generation sleep-staging Fitbit models showed no significant difference in measured WASO, TST, and SE versus PSG, and higher sensitivity (0.95-0.96) and specificity (0.58-0.69) than nonsleep-staging models.
CONCLUSIONS: Sleep-staging Fitbit models showed promising performance, especially in differentiating wake from sleep. However, although these models are a convenient and economical means for consumers to obtain gross estimates of sleep parameters and time spent in sleep stages, they are of limited specificity and are not a substitute for PSG.
Source: PubMed · Licensed under CC-BY 4.0
Population
Age
varied across included studies
Reference standard
Polysomnography (PSG)
Systematic review of 22 individual Fitbit validation studies, 8 of which provided quantitative data for the meta-analysis.
Devices and metrics
Multiple Fitbit wristband models (Charge, Charge HR, Charge 2, Alta, Alta HR, Inspire, Versa, Ionic)
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Accuracy | 91% | — | Upper bound of sleep epoch detection accuracy across non-sleep-staging Fitbit models. |
| Accuracy | 81% | — | Lower bound of sleep epoch detection accuracy across non-sleep-staging Fitbit models. |
| Sensitivity | 99% | — | Upper bound sensitivity to sleep, non-staging models. |
| Sensitivity | 87% | — | Lower bound sensitivity to sleep, non-staging models. |
| Specificity | 52% | — | Upper bound specificity for wake, non-staging models. |
| Specificity | 10% | — | Lower bound specificity for wake, non-staging models. This is the well-known Fitbit weakness in detecting wakefulness. |
| Sensitivity | 96% | — | Upper bound sensitivity for sleep-staging models. |
| Sensitivity | 95% | — | Lower bound sensitivity for sleep-staging models. |
| Specificity | 69% | — | Upper bound specificity for sleep-staging models. |
| Specificity | 58% | — | Lower bound specificity for sleep-staging models. |
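For readers unfamiliar with how these three metrics are derived, the following is a minimal sketch of an epoch-by-epoch comparison against PSG, with sleep treated as the positive class (so sensitivity is about detecting sleep and specificity is about detecting wake, as in the table notes). The function name and the ten-epoch label sequences are made up for illustration and are not taken from the study or from any included dataset.

```python
def epoch_metrics(psg_labels, device_labels):
    """Epoch-by-epoch agreement; labels are 1 for sleep, 0 for wake.

    Assumes both sequences cover the same epochs and that PSG scored
    at least one sleep epoch and one wake epoch.
    """
    if len(psg_labels) != len(device_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(psg_labels, device_labels))
    tp = sum(1 for p, d in pairs if p == 1 and d == 1)  # sleep scored as sleep
    tn = sum(1 for p, d in pairs if p == 0 and d == 0)  # wake scored as wake
    fp = sum(1 for p, d in pairs if p == 0 and d == 1)  # wake scored as sleep
    fn = sum(1 for p, d in pairs if p == 1 and d == 0)  # sleep scored as wake
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of PSG sleep epochs detected
        "specificity": tn / (tn + fp),  # share of PSG wake epochs detected
    }


# Made-up ten-epoch example (assumption, not study data).
psg_example = [1, 1, 1, 1, 1, 1, 0, 0, 1, 1]
device_example = [1, 1, 1, 1, 1, 1, 1, 0, 1, 0]
metrics = epoch_metrics(psg_example, device_example)
print(metrics)
```

In this toy example the device misses one of the two wake epochs, so specificity is 0.5 even though accuracy is 0.8 and sensitivity is 0.875.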
Cite this study
Haghayegh S, Khoshnevis S, Smolensky MH, Diller KR, and Castriotta RJ (2019). Accuracy of Wristband Fitbit Models in Assessing Sleep: Systematic Review and Meta-Analysis. Journal of Medical Internet Research, 21(11), e16273. https://doi.org/10.2196/16273
Added to the CircaTest meta-analysis on 2026-04-06.