AI-assembled. Errors are possible. Verify critical claims against the linked primary source.

Study record · comparative · 2025

A performance validation of six commercial wrist-worn wearable sleep-tracking devices for sleep stage scoring compared to polysomnography

Schyvens AM, Peters B, Van Oost NC, Aerts JM, Masci F, Neven A, et al.

Sleep Advances, 6(2), zpaf021 · 2025

Why this study matters to CircaTest

Single most editorially important study in the CircaTest corpus. Six commercial wearables were tested against PSG under a uniform protocol, so the kappa values are directly comparable in a way that those from most validation studies are not. Drives the head-to-head accuracy figures across CircaTest's comparison content. Limitations: the study tested previous-generation models (Apple Watch Series 8 not Series 10, Fitbit Charge 5 not Charge 6, original ScanWatch not ScanWatch 2), so the results are indirect evidence for current models, not a direct validation of them.

Abstract

STUDY OBJECTIVES: The aim of this study is to assess the performance of six different consumer wearable sleep-tracking devices, namely the Fitbit Charge 5, Fitbit Sense, Withings Scanwatch, Garmin Vivosmart 4, Whoop 4.0, and the Apple Watch Series 8, for detecting sleep parameters compared to the gold standard, polysomnography (PSG).

METHODS: Sixty-two adults (52 males and 10 females, mean age ± SD = 46.0 ± 12.6 years) spent a single night in the sleep laboratory with PSG while simultaneously using two to four wearable devices.

RESULTS: The results indicate that most wearables displayed significant differences with PSG for total sleep time, sleep efficiency, wake after sleep onset, and light sleep (LS). Nevertheless, all wearables demonstrated a higher percentage of correctly identified epochs for deep sleep and rapid eye movement sleep compared to wake (W) and LS. All devices detected >90% of sleep epochs (ie, sensitivity), but showed lower specificity (29.39%-52.15%). The Cohen's kappa coefficients of the wearable devices ranged from 0.21 to 0.53, indicating fair to moderate agreement with PSG.

CONCLUSIONS: Our results indicate that all devices can benefit from further improvement for multistate categorization. However, the devices with higher Cohen's kappa coefficients, such as the Fitbit Sense (κ = 0.42), Fitbit Charge 5 (κ = 0.41), and Apple Watch Series 8 (κ = 0.53), could be effectively used to track prolonged and significant changes in sleep architecture.

Source: PUBMED · Licensed under CC-BY 4.0
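
The sensitivity and specificity figures in the abstract come from epoch-by-epoch comparison against PSG, with sleep treated as the positive class. A minimal sketch of that calculation follows; the function name, the "sleep"/"wake" labels, and the toy night are illustrative assumptions, not the study's code or data.

```python
def sleep_wake_sensitivity_specificity(psg, device):
    """Return (sensitivity, specificity) with sleep as the positive class.

    psg and device are equal-length sequences of "sleep"/"wake" labels,
    one per epoch. PSG is treated as the ground truth.
    """
    if len(psg) != len(device):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(psg, device))
    tp = sum(p == "sleep" and d == "sleep" for p, d in pairs)  # sleep scored as sleep
    fn = sum(p == "sleep" and d == "wake" for p, d in pairs)   # sleep scored as wake
    tn = sum(p == "wake" and d == "wake" for p, d in pairs)    # wake scored as wake
    fp = sum(p == "wake" and d == "sleep" for p, d in pairs)   # wake scored as sleep

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("PSG must contain at least one sleep and one wake epoch")

    sensitivity = tp / (tp + fn)  # share of PSG sleep epochs the device caught
    specificity = tn / (tn + fp)  # share of PSG wake epochs the device caught
    return sensitivity, specificity


# Hypothetical toy night: PSG scores 8 sleep epochs and 4 wake epochs.
psg_labels = ["sleep"] * 8 + ["wake"] * 4
device_labels = ["sleep"] * 8 + ["sleep", "sleep", "wake", "wake"]
sens, spec = sleep_wake_sensitivity_specificity(psg_labels, device_labels)
```

High sensitivity paired with low specificity, as reported here, means a device rarely misses sleep but often scores wake epochs as sleep.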

Population

Sample size

n = 62

Age

46.0 ± 12.6 years

Reference standard

PSG

62 adults (52 male, 10 female), mean age 46.0 ± 12.6 years, single in-laboratory night with simultaneous PSG and 2-4 wearables.

Devices and metrics

Apple Watch Series 8

Metric | Value | 95% CI | Note
Cohen's kappa | κ = 0.53 | - | Highest of six devices tested per the published abstract. Tested on Series 8, not Series 10. Stage-specific percentages are reported in the full paper but are not extracted into this record; consult the source link for per-stage breakdowns.

Fitbit Charge 5 (also Fitbit Sense)

Metric | Value | 95% CI | Note
Cohen's kappa | κ = 0.41 | - | Charge 5 kappa per the published abstract. The same study also reports Fitbit Sense at κ = 0.42. No Charge 6 specific PSG validation has been published.

Withings ScanWatch

Metric | Value | 95% CI | Note
Cohen's kappa | see source | - | The published abstract states Cohen's kappa across all six devices ranged from 0.21 to 0.53 but explicitly names only Sense (0.42), Charge 5 (0.41), and Apple Watch S8 (0.53). The per-device kappa for the Withings ScanWatch is not stated in the abstract; consult the full paper for the breakdown.

Whoop 4.0

Metric | Value | 95% CI | Note
Cohen's kappa | see source | - | The published abstract reports the kappa range across all six devices (0.21-0.53) but does not state the per-device kappa for Whoop 4.0. Consult the full paper for the breakdown.
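
Cohen's kappa, the metric recorded for each device above, corrects raw epoch-by-epoch agreement for the agreement expected by chance. The sketch below is illustrative only: the function names, the stage labels (W, LS, DS, REM), and the toy hypnograms are assumptions, not the study's code or data. The qualitative bands follow the commonly used Landis and Koch convention, which is consistent with the abstract's "fair to moderate" wording but is not named there.

```python
from collections import Counter


def cohens_kappa(reference, test):
    """Cohen's kappa between two equal-length sequences of stage labels."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(reference)
    # Observed agreement: fraction of epochs where both scorings match.
    observed = sum(r == t for r, t in zip(reference, test)) / n

    # Chance agreement: sum over stages of the product of marginal proportions.
    ref_counts = Counter(reference)
    test_counts = Counter(test)
    expected = sum(ref_counts[s] * test_counts[s] for s in ref_counts) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def landis_koch_band(kappa):
    """Map a kappa value to its Landis and Koch descriptive band."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ten-epoch hypnograms (not study data).
psg_stages = ["W", "W", "LS", "LS", "LS", "LS", "DS", "DS", "REM", "REM"]
device_stages = ["LS", "W", "LS", "LS", "LS", "DS", "DS", "DS", "REM", "LS"]
toy_kappa = cohens_kappa(psg_stages, device_stages)
toy_band = landis_koch_band(toy_kappa)
```

Under these bands, κ = 0.53 (Apple Watch Series 8), κ = 0.42 (Fitbit Sense), and κ = 0.41 (Fitbit Charge 5) all fall in "moderate", while the low end of the reported range, κ = 0.21, falls in "fair".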

Cite this study

Schyvens AM, Peters B, Van Oost NC, Aerts JM, Masci F, Neven A, et al. (2025). A performance validation of six commercial wrist-worn wearable sleep-tracking devices for sleep stage scoring compared to polysomnography. Sleep Advances, 6(2), zpaf021. https://doi.org/10.1093/sleepadvances/zpaf021

Added to the CircaTest meta-analysis on 2026-04-06.