Study record · meta-analysis · 2025
Performance of consumer wrist-worn sleep tracking devices compared to polysomnography: a meta-analysis
Lee YJ, Lee JY, Cho JH, and Choi JH
Journal of Clinical Sleep Medicine · 2025
Why this study matters to CircaTest
The most comprehensive recent meta-analysis of consumer wrist-worn sleep trackers against polysomnography (PSG): 24 studies, 798 participants, and 12 brands, including Fitbit, WHOOP, Garmin, Apple Watch, Empatica E4, and Xiaomi Mi Band 5. The headline finding is that, pooled across the entire device set, consumer wrist trackers underestimate total sleep time by about 17 minutes (95% CI -26 to -7) and sleep efficiency by about 4.7 percentage points (95% CI -7.1 to -2.3); both differences are statistically significant. This is the strongest published quantitative answer to the question 'how wrong are consumer trackers on average' for the wrist-worn category. Important limit: the figures are pooled across brands, and no per-device breakdown has been extracted into this record.
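As a rough sanity check on the "statistically significant" reading, the pooled estimates and their 95% confidence intervals can be converted back into approximate standard errors and z statistics. The sketch below assumes the intervals are symmetric, normal-based (Wald-type) intervals; the paper may have used a different interval method, in which case these numbers are only approximate. The function name `summarize` and the variable names are illustrative and do not come from the paper.

```python
from math import erfc, sqrt

# Two-sided 95% quantile of the standard normal distribution.
Z_95 = 1.959964


def summarize(estimate, ci_low, ci_high):
    """Back out SE, z, and a two-sided p-value from a 95% CI.

    Assumption: the interval is symmetric and normal-based
    (estimate +/- 1.96 * SE). This is a hypothetical helper,
    not the method reported by Lee et al.
    """
    se = (ci_high - ci_low) / (2 * Z_95)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # two-sided normal p-value
    return se, z, p


# Pooled figures as reported in this record.
tst = summarize(-16.85, -26.33, -7.38)  # total sleep time, minutes
eff = summarize(-4.69, -7.08, -2.30)    # sleep efficiency, percentage points

for label, (se, z, p) in (("TST", tst), ("Sleep efficiency", eff)):
    print(f"{label}: SE={se:.2f}, z={z:.2f}, p={p:.5f}")
```

Both intervals exclude zero, so under the normal-interval assumption both p-values fall well below 0.05, which agrees with the significance reported in the record.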
Abstract
STUDY OBJECTIVES: The use of sleep tracking devices is increasing as people become more aware of the importance of sleep and interested in monitoring their patterns.…
Source: PUBMED · Excerpt for fair-use commentary; full abstract via the source link
Population
Sample size
n = 798
Age
varies across included studies
Reference standard
PSG (polysomnography)
Meta-analysis pooling 24 individual studies covering 798 participants total. Devices included Fitbit, Jawbone, myCadian, WHOOP, Garmin, Basis B1, Zulu, Huami Arc, Empatica E4, Fatigue Science Readiband, Apple Watch, and Xiaomi Mi Band 5. Searched up to March 2024.
Devices and metrics
Apple Watch (multiple generations across included studies)
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Bias (minutes) | -16.85 min | -26.33 to -7.38 min | Pooled mean difference for total sleep time across ALL devices in the meta-analysis (not Apple Watch specific). Statistically significant underestimation. Per-device breakdown is in the full paper, not the abstract. |
Fitbit (multiple models across included studies)
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Sleep efficiency bias (percentage points) | -4.69 | -7.08 to -2.30 | Pooled mean difference for sleep efficiency across ALL devices in the meta-analysis (not Fitbit specific). Statistically significant underestimation. |
WHOOP strap (multiple generations across included studies)
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Bias (minutes) | see source | — | Lee et al. pooled WHOOP among 12 brands without per-device breakdown in the abstract. Consult the full paper for the per-device breakdown. |
Garmin (multiple generations across included studies)
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Bias (minutes) | see source | — | Lee et al. pooled Garmin among 12 brands without per-device breakdown in the abstract. Consult the full paper for the per-device breakdown. |
Xiaomi Mi Band 5
| Metric | Value | 95% CI | Note |
|---|---|---|---|
| Bias (minutes) | see source | — | Lee et al. included the Mi Band 5 (an earlier generation than the Smart Band 9 CircaTest covers) but did not break out per-device figures in the abstract. |
Cite this study
Lee YJ, Lee JY, Cho JH, and Choi JH (2025). Performance of consumer wrist-worn sleep tracking devices compared to polysomnography: a meta-analysis. Journal of Clinical Sleep Medicine. https://doi.org/10.5664/jcsm.11460
Added to the CircaTest meta-analysis on 2026-04-06.