Review · 6 min read

Apple Watch Series 10: Sleep Tracking Data and Specifications

Apple Watch Series 10 sleep tracking review based on Schyvens et al. (2025) showing kappa 0.53, the best of 6 devices tested. FDA-cleared apnea notification. Full specs.

Summary of Findings · 2019-2025

Evidence summary: Apple Watch Series 10

3 outcomes have measured evidence in the CircaTest corpus, drawn from 6 peer-reviewed studies totaling 929 participants. Each card below answers one buyer question and shows the most representative finding. The full per-study breakdown is in the Sources panel below.

Outcome 01 of 03

Low

Sleep stage classification

How well does the device tell deep, light, REM, and wake apart?

Agreement

MODERATE

Best evidence

κ = 0.53

From Schyvens et al., 2025 (n = 62) · Indirect

Highest of six devices tested per the published abstract. Tested on Series 8, not Series 10. Stage-specific percentages are reported in the full paper but are not extracted into this record; consult the source link for per-stage breakdowns.

Outcome 02 of 03

Low

Sleep vs wake detection

How well does the device know whether you're asleep or awake?

Agreement

ALMOST PERFECT

Best evidence

90%

From Walch et al., 2019 (n = 0) · Indirect

Sleep-wake classification accuracy from a research-built classifier on raw Apple Watch data.

+ 6 additional sources contribute to this outcome · see Sources panel below
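
A headline accuracy figure can hide a weak wake-detection rate when most epochs are sleep. The sketch below takes the three sleep-wake figures listed for Walch et al., 2019 in the Sources panel (90% accuracy, 93% sensitivity, 59.6% specificity) and back-calculates the share of sleep epochs they imply. It assumes all three figures describe the same classifier operating point on the same epochs, which this record does not confirm; the function name `implied_sleep_fraction` is hypothetical.

```python
def implied_sleep_fraction(accuracy: float, sensitivity: float, specificity: float) -> float:
    """Share of epochs that were sleep, implied by three summary metrics.

    Uses accuracy = sensitivity * p + specificity * (1 - p), solved for p,
    where sensitivity is the hit rate on sleep epochs and specificity is
    the hit rate on wake epochs.
    """
    if sensitivity == specificity:
        raise ValueError("sleep share is not identifiable when sensitivity equals specificity")
    return (accuracy - specificity) / (sensitivity - specificity)


# Figures as listed for Walch et al., 2019 (assumed to share one operating point).
p_sleep = implied_sleep_fraction(accuracy=0.90, sensitivity=0.93, specificity=0.596)
print(f"Implied share of sleep epochs: {p_sleep:.1%}")  # 91.0%
```

Under that assumption roughly nine in ten epochs were sleep, so overall accuracy is dominated by the sleep class and says little about how well wake is detected.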

Outcome 03 of 03

Moderate

Total sleep time accuracy

How accurately does the device measure how long you slept?

Agreement

MODERATE

Best evidence

-16.85 min

95% CI -26.33 to -7.38 min

From Lee et al., 2025 (n = 798) · Indirect

Pooled mean difference for total sleep time across ALL devices in the meta-analysis (not Apple Watch specific). Statistically significant underestimation. Per-device breakdown is in the full paper, not the abstract.

+ 3 additional sources contribute to this outcome · see Sources panel below
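
The "statistically significant" reading of the pooled estimate follows from the confidence interval excluding zero. As a rough check, the sketch below back-calculates a standard error and z-score from the reported mean difference and 95% CI. It assumes a symmetric normal-approximation interval; the meta-analysis may have used a different method, so the derived values are approximate. The function name `summarize_interval` is hypothetical.

```python
import math


def summarize_interval(mean_diff: float, ci_low: float, ci_high: float,
                       z_crit: float = 1.959964):
    """Back-calculate SE, z, and a two-sided p-value from a 95% CI.

    Assumes the interval is mean_diff +/- z_crit * SE (normal approximation).
    """
    se = (ci_high - ci_low) / (2 * z_crit)
    z = mean_diff / se
    # Two-sided tail probability of a standard normal at |z|.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    excludes_zero = ci_low > 0 or ci_high < 0
    return se, z, p_two_sided, excludes_zero


# Pooled total sleep time difference from Lee et al., 2025 (minutes).
se, z, p, excludes_zero = summarize_interval(-16.85, -26.33, -7.38)
print(f"SE ~ {se:.2f} min, z ~ {z:.2f}, CI excludes zero: {excludes_zero}")
```

The negative sign indicates that devices reported less sleep than polysomnography on average.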

Each card answers one buyer question. The bold AGREEMENT label maps the underlying statistic to a normalized rubric (Landis & Koch 1977 cutoffs for kappa, standard percent thresholds for accuracy and per-stage agreement) so cards from different devices can be compared at a glance. The GRADE certainty rating is computed across all contributing studies for that outcome, not just the representative one shown.
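
To make the kappa half of that rubric concrete, the sketch below maps a Cohen's kappa value to the Landis & Koch (1977) bands. The function name `landis_koch_label` is hypothetical and this is not CircaTest's implementation. The percent thresholds used for accuracy cards are not stated on this page, so they are left out.

```python
def landis_koch_label(kappa: float) -> str:
    """Map Cohen's kappa to the Landis & Koch (1977) agreement label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "POOR"
    # Upper edge of each band, checked in ascending order.
    bands = [
        (0.20, "SLIGHT"),
        (0.40, "FAIR"),
        (0.60, "MODERATE"),
        (0.80, "SUBSTANTIAL"),
        (1.00, "ALMOST PERFECT"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "ALMOST PERFECT"  # unreachable after the range check; kept as a safe default


print(landis_koch_label(0.53))  # MODERATE
```

With this mapping, the κ = 0.53 reported for the Series 8 falls in the 0.41 to 0.60 band, which matches the MODERATE label on the sleep stage card.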

Forest plot · Bias (minutes)

Bias (minutes) for Apple Watch Series 10

Each dot is one peer-reviewed study. Dot size is proportional to the square root of the study's sample size. Horizontal lines show 95% confidence intervals where the source paper reported them. Studies marked in rust tested an earlier generation of this device.

[Forest plot: total sleep time bias vs polysomnography, in minutes; axis ticks at -30, -15, 0, +15, +30. Entries: Choe & Kang, 2025 (n = 0), plotted three times; Lee et al., 2025 (n = 798).]

Audit · sources & method

6 studies · 929 participants · 2026-04-06

Every quantitative claim above traces back to one of the studies listed here. Click any study identifier to verify against the primary source. CircaTest does not own or modify any of these studies; we link out so you can audit the original.

  1. Apple watch accuracy in monitoring health metrics: a systematic review and meta-analysis

    Choe JP, Kang M · Physiological Measurement · 2025

    n = 0 · adult, varies across included studies · clinical · vs ECG

    Tested in this study as: Apple Watch (multiple series; meta-analysis covers the device family)

    Reported metrics for Apple Watch Series 10:
    • Bias (minutes): -0.12 · Mean bias for heart rate in beats per minute (NOT minutes; the metric kind is recorded as bias_minutes for schema consistency but the unit is bpm). Limits of agreement -11.06 to +10.81 bpm. None of the HR subgroups exceeded the 10% MAPE validity threshold.
    • Bias (minutes): 0.3 · Mean bias for energy expenditure in kcal/min (NOT minutes). Limits of agreement -2.09 to +2.69. ALL EE subgroups exceeded the 10% MAPE validity threshold, meaning the Apple Watch is NOT considered validated for energy expenditure measurement.
    • Bias (minutes): -1.83 · Mean bias for step count in steps/min (NOT minutes). Limits of agreement -9.08 to +5.41. Some subgroups exceeded the 10% MAPE threshold, indicating variability.

    CircaTest note: Most comprehensive published meta-analysis of Apple Watch accuracy: 56 studies, 270 effect sizes. Editorially load-bearing because it gives a definitive answer to two common reader questions: (1) Is Apple Watch heart rate accurate? Yes (mean bias -0.12 bpm, none of the subgroups exceed the 10% MAPE threshold). (2) Is Apple Watch energy expenditure accurate? No (every subgroup exceeds the 10% MAPE threshold). Important limitation for CircaTest's editorial focus: this meta-analysis covers HR, energy expenditure, and step counts, NOT sleep stage classification. For Apple Watch sleep accuracy, see Schyvens 2025 and Walch 2019.

  2. Detection of sleep apnea using only inertial measurement unit signals from Apple Watch: a pilot study with machine learning approach

    Hayano J et al. · Sleep & Breathing · 2025

    n = 61 · clinical · vs polysomnography

    Tested in this study as: Apple Watch (IMU signals only; specific Apple Watch generation not stated in abstract)

    Reported metrics for Apple Watch Series 10:
    • Accuracy: see source · Hayano et al. did not specify which Apple Watch generation was used. CircaTest does not extrapolate the AUC 0.831 figure to Series 10 specifically; consult the full paper for the device generation tested.

    CircaTest note: Important because it validates Apple Watch IMU-only sleep apnea detection, which is methodologically distinct from Apple's own sleep apnea notifications feature (which uses combined sensors). Hayano et al. demonstrate that even ACCELEROMETER-ONLY data from the Apple Watch can detect apnea/hypopnea events at AUC 0.831 in a held-out test set. CircaTest cites this when discussing the underlying feasibility of consumer-wearable apnea screening. Caveat: Random Forest models are not the same as Apple's production algorithm; the AUC figure is for the research classifier, not for what an end user sees on a Series 10 or 11.

  3. Performance of consumer wrist-worn sleep tracking devices compared to polysomnography: a meta-analysis

    Lee YJ et al. · Journal of Clinical Sleep Medicine · 2025

    n = 798 · varies across included studies · clinical · vs polysomnography

    Tested in this study as: Apple Watch (multiple generations across included studies)

    Reported metrics for Apple Watch Series 10:
    • Bias (minutes): -16.85 min · Pooled mean difference for total sleep time across ALL devices in the meta-analysis (not Apple Watch specific). Statistically significant underestimation. Per-device breakdown is in the full paper, not the abstract.

    CircaTest note: The most comprehensive recent meta-analysis of consumer wrist-worn sleep trackers vs PSG: 24 studies, 798 patients, 12 different brands including Fitbit, WHOOP, Garmin, Apple Watch, Empatica E4, and Xiaomi Mi Band 5. Headline finding is that across the entire device set, consumer wrist trackers UNDERESTIMATE total sleep time by ~17 minutes (95% CI -26 to -7) and UNDERESTIMATE sleep efficiency by ~4.7 percentage points, both statistically significant. This is the strongest published quantitative answer to the question 'how wrong are consumer trackers on average' across the wrist-worn category. Important limit: pooled across brands, no per-device breakdown extracted into this record.

  4. A performance validation of six commercial wrist-worn wearable sleep-tracking devices for sleep stage scoring compared to polysomnography

    Schyvens AM et al. · Sleep Advances 6(2):zpaf021 · 2025

    n = 62 · 46.0 ± 12.6 years · healthy · vs polysomnography

    Tested in this study as: Apple Watch Series 8

    Reported metrics for Apple Watch Series 10:
    • Cohen's kappa: κ = 0.53 · Highest of six devices tested per the published abstract. Tested on Series 8, not Series 10. Stage-specific percentages are reported in the full paper but are not extracted into this record; consult the source link for per-stage breakdowns.

    CircaTest note: Single most editorially important study in the CircaTest corpus. Six commercial wearables tested against PSG in a uniform protocol means the kappa values are directly comparable in a way most validation studies are not. Drives the head-to-head accuracy figures across CircaTest's comparison content. Limitations: tested previous-generation models (Series 8 not 10, Charge 5 not 6, original ScanWatch not 2), so the results are indirect evidence for current models, not direct measurements of them.

  5. Detecting sleep using heart rate and motion data from multisensor consumer-grade wearables, relative to wrist actigraphy and polysomnography

    Roberts DM et al. · Sleep 43(7):zsaa045 · 2020

    n = 8 · healthy · vs polysomnography

    Tested in this study as: Apple Watch (Series 4-era; multisensor configuration)

    Reported metrics for Apple Watch Series 10:
    • Sensitivity: see source · The published abstract reports classifier sensitivity ranging from 0.883 to 0.977 ACROSS all multisensor wearable devices in the study. The abstract does not break sensitivity down per device. Consult the full paper for the per-device assignment.
    • Specificity: see source · The published abstract reports classifier specificity ranging from 0.407 to 0.821 across all multisensor wearable devices. Per-device breakdown not stated in the abstract.

    CircaTest note: Important because it directly compares Apple Watch and Oura Ring against wrist actigraphy and PSG using identical methodology and machine-learning-built classifiers. The published abstract reports aggregated ranges across the device set (sensitivity 0.883-0.977, specificity 0.407-0.821, d' 1.827-2.347) but does not break these down per device, so this CircaTest record stores them as range-only with null per-device values. Anyone needing per-device numbers should consult the full paper at the PMC link.

  6. Sleep stage prediction with raw acceleration and photoplethysmography heart rate data derived from a consumer wearable device

    Walch O et al. · Sleep 42(12):zsz180 · 2019

    n = 0 · adult (specific range not stated in PubMed abstract) · healthy · vs polysomnography

    Tested in this study as: Apple Watch (Series 2/3 era; raw sensor data via custom app)

    Reported metrics for Apple Watch Series 10:
    • Accuracy: 90% · Sleep-wake classification accuracy from a research-built classifier on raw Apple Watch data.
    • Sensitivity: 93% · Sleep epoch sensitivity.
    • Specificity: 59.6% · Wake epoch specificity.
    • Accuracy: 72% · Wake / NREM / REM 3-stage classification with all features.

    CircaTest note: Important because it demonstrates what an open, peer-reviewed Apple Watch sleep classifier can achieve from raw sensor data, independent of Apple's proprietary algorithm. The accompanying PhysioNet dataset is the most-used open dataset for wearable sleep classification research. Useful counterpoint to Apple's closed-algorithm white papers.


Sources retrieved from PubMed, Europe PMC, and publisher pages. Abstracts shown on individual study records are reproduced under public-domain or fair-use license per their source. Identifiers above link to the original primary source. CircaTest is the curatorial layer; we do not modify the underlying studies.

Data Sources and Methodology

This review compiles data from peer-reviewed validation studies, manufacturer specifications, and aggregated user reports. No first-person testing was conducted. All sleep accuracy figures reference polysomnography (PSG) comparison studies.

The most rigorous independent accuracy data comes from Schyvens et al. (2025), published in Sleep Advances, which tested the Apple Watch Series 8 against PSG. Apple's own validation white paper (October 2025) provides additional per-stage figures. Hardware specifications are from Apple's published product page. Pricing was verified as of March 31, 2026.

Sleep Tracking Accuracy

Spec                      Apple Watch S8 (Schyvens 2025)
Cohen's kappa vs PSG      0.53 (best of 6)
Wake accuracy             52%
Light sleep               83%
Deep sleep                51%
REM sleep                 69%
Apple white paper, Deep   62%
Apple white paper, REM    81%

The most rigorous independent test is Schyvens et al. (2025), which tested the Apple Watch Series 8 against PSG and found a Cohen's kappa of 0.53, the highest among the six devices tested. Stage-specific accuracy from that study: Wake 52%, Light sleep 83%, Deep sleep 51%, REM 69%. Apple's own validation white paper (October 2025) reports higher per-stage figures: Deep sleep 62%, REM 81%.

There is no single "overall percentage" accuracy figure for the Apple Watch; the Cohen's kappa statistic is the appropriate summary. A kappa of 0.53 represents moderate agreement with clinical measurement, and was the best result among the six consumer devices in that study. Note that Schyvens tested the Series 8, not the Series 10. Independent peer-reviewed validation of the Series 10 specifically has not been published at the time of writing.
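
To illustrate why kappa is preferred over a raw percentage, the sketch below computes both from a four-stage confusion matrix. The epoch counts are invented for illustration and are NOT data from Schyvens et al.; they were chosen only so the result lands near 0.53. Function names are hypothetical.

```python
def observed_agreement(confusion):
    """Share of epochs where device and PSG assign the same stage."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total


def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: PSG, columns: device)."""
    total = sum(sum(row) for row in confusion)
    p_observed = observed_agreement(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, given each rater's stage frequencies.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# HYPOTHETICAL counts per 1,000 epochs; stage order: wake, light, deep, REM.
example = [
    [50, 40, 0, 10],
    [30, 400, 40, 30],
    [0, 80, 110, 10],
    [10, 50, 0, 140],
]
print(f"raw agreement: {observed_agreement(example):.2f}")  # 0.70
print(f"Cohen's kappa: {cohens_kappa(example):.2f}")        # 0.53
```

In this made-up example, 70% of epochs match, yet kappa is only about 0.53 because a large share of that agreement would be expected by chance when light sleep dominates the night.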

The device detects light sleep, deep sleep, REM sleep, and naps. The Apple Watch does not provide a consolidated sleep score. Instead, it presents sleep stage data directly. All sleep staging data is available without any subscription. The device also includes a smart alarm feature that can wake users during lighter sleep phases.

Sensor Suite

The Apple Watch Series 10 has the most extensive sensor array among the sleep trackers reviewed:

  • PPG (photoplethysmography): Optical heart rate sensor on the wrist for continuous heart rate monitoring
  • Accelerometer: Motion detection for sleep-wake classification
  • Temperature sensor: Wrist temperature monitoring for cycle tracking and overnight trends
  • SpO2 (blood oxygen): Blood oxygen measurement via red and infrared LEDs
  • ECG (electrocardiogram): Single-lead ECG for heart rhythm analysis
  • Altimeter: Barometric altimeter (primarily used for daytime activity)
  • GPS: Satellite positioning (primarily used for daytime activity)

The device does not include EDA (electrodermal activity) sensors. With seven sensor types, the Apple Watch has the broadest hardware sensor array of any sleep tracker reviewed, though not all sensors are actively used during sleep tracking.

Health Features

The Apple Watch Series 10 tracks:

  • Heart rate variability (HRV): Overnight and on-demand measurement
  • Resting heart rate: Continuous monitoring with sleep-specific reporting
  • Respiratory rate: Estimated from overnight sensor data
  • Blood oxygen: Overnight SpO2 monitoring with trend data
  • Body temperature: Wrist temperature deviation from baseline
  • Menstrual cycle prediction: Temperature-based cycle tracking
  • Sleep apnea notification: Available through watchOS, FDA-authorized (De Novo, September 2024)

The Apple Watch has FDA De Novo authorization specifically for sleep apnea notification (September 2024). This is a screening/notification tool, not a diagnostic device. Any results should be discussed with a healthcare provider. The device does not diagnose sleep apnea. Independent research by Hayano et al., 2025 demonstrated that per-epoch respiratory event detection is feasible from Apple Watch IMU (accelerometer) signals alone using a Random Forest classifier, reaching AUC 0.831 on a held-out test set of 20 subjects, though that research model is distinct from Apple's production notification algorithm.

The watch does not provide a readiness score, strain tracking, recovery score, or dedicated stress metric as standalone features in the way some competitors structure these.

Hardware and Form Factor

Spec               Apple Watch S10
Weight             36g (46mm alu)
Battery            18-36 hours
Water Resistance   50m (WR50)
Material           Aluminum/titanium
Display            Always-on OLED
Charge Time        75 min

The Apple Watch Series 10 weighs 36 grams (46mm aluminum). Key specifications:

  • Battery life: Approximately 18-36 hours
  • Charge time: 75 minutes to full
  • Water resistance: 50 meters (WR50)
  • Material: Aluminum or titanium options
  • Display: Always-on OLED display

The sub-two-day battery life is the shortest among devices in this category by a significant margin. Most competing devices offer 4-35 days between charges. This means the Apple Watch typically requires daily or near-daily charging, which creates a gap in 24-hour wear data. Users commonly report charging during morning routines to maintain overnight tracking coverage.
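
The size of that gap can be estimated from the listed battery life and charge time. The sketch below is a simplified model that assumes the watch is worn until empty, fully recharged off-wrist in 75 minutes, and worn again immediately; real charging habits (partial top-ups, charging while showering) will differ. The function name `off_wrist_fraction` is hypothetical.

```python
def off_wrist_fraction(battery_hours: float, charge_minutes: float) -> float:
    """Share of each wear-plus-charge cycle spent on the charger."""
    charge_hours = charge_minutes / 60
    return charge_hours / (battery_hours + charge_hours)


# Listed range for the Series 10: 18-36 hours of battery, 75 minutes to full.
for battery_hours in (18, 36):
    share = off_wrist_fraction(battery_hours, charge_minutes=75)
    print(f"{battery_hours} h battery: about {share:.1%} of time on the charger")
```

Under these assumptions the watch is off the wrist for roughly 3% to 6% of the time, so whether overnight data is affected depends mainly on when the charging happens.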

The always-on display provides at-a-glance access to time and complications but does emit light during sleep. Sleep Focus mode dims the display but does not fully disable it.

Cost of Ownership

The Apple Watch Series 10 uses a one-time purchase model:

  • Hardware: $399 (aluminum), higher for titanium
  • Subscription: None required. No subscription tier for health features.

All sleep tracking features, including sleep staging and sleep apnea notification, are included with the hardware purchase.

Two-year cost: $399 (no additional fees)
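
The two-year figure is hardware price plus 24 months of subscription fees, which are zero here. A minimal sketch of that arithmetic follows; the second device and its prices are purely hypothetical and are included only to show how a subscription changes the total.

```python
def total_cost(hardware_usd: float, monthly_subscription_usd: float = 0.0,
               months: int = 24) -> float:
    """Hardware price plus subscription fees over the ownership period."""
    return hardware_usd + monthly_subscription_usd * months


apple_watch = total_cost(399)  # no subscription required
# HYPOTHETICAL comparison device, not a real product or price.
subscription_device = total_cost(299, monthly_subscription_usd=5.99)

print(f"Apple Watch Series 10, two years: ${apple_watch:.2f}")
print(f"Hypothetical subscription device, two years: ${subscription_device:.2f}")
```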

Pricing verified as of March 31, 2026.

Compatibility

The Apple Watch has a strict platform requirement:

  • iOS: Supported (required)
  • Android: Not supported
  • Apple Health: Native integration
  • Google Fit: Not supported
  • Strava: Syncs

This is an iOS-only device. An iPhone is required for setup and ongoing use. Android users cannot use the Apple Watch. Within the Apple ecosystem, health data syncs natively to Apple Health with no third-party app required.

Limitations

Several data points are worth noting:

  • Battery life: At 18-36 hours, the Apple Watch has by far the shortest battery life among sleep trackers. Daily charging is effectively required, creating potential gaps in continuous wear data.
  • No sleep score: Unlike most competitors, Apple does not aggregate sleep data into a single score. Users see raw stage data rather than a summarized metric.
  • No readiness or recovery framework: The watch does not provide a daily readiness, strain, or recovery score that contextualizes sleep within an activity framework.
  • iOS only: Complete platform lock-in. No Android support.
  • Validation gap: The Schyvens et al. (2025) study tested the Series 8, not the Series 10. Apple's own white paper provides newer per-stage figures but is not independent peer review.
  • Watch form factor for sleep: At 36 grams with an always-on display, the watch is substantially larger and heavier than ring alternatives (2.3-6 grams) for overnight wear.
  • 50-meter water resistance: Lower than ring competitors (100 meters), though sufficient for everyday use including swimming.
  • Stage-specific accuracy varies: Light sleep detection (83%) is strong; wake (52%) and deep sleep (51%) detection are weaker.

Who the Data Profile Suggests This Fits

Based on the specifications and published data, the Apple Watch Series 10's data profile aligns with:

  • iPhone users who want sleep tracking integrated into a device they already wear for other purposes (notifications, fitness, apps)
  • Users who value having the broadest sensor array available, including ECG and FDA-authorized sleep apnea notification
  • Users who prefer viewing raw sleep stage data rather than a single aggregated score
  • Users who are comfortable with daily charging as part of their routine

The device's data profile is less aligned with users who prioritize multi-day battery life for uninterrupted tracking, Android users (not compatible), or users who want a dedicated, minimal sleep tracker with no display.

Products Mentioned

Apple Watch Series 10 · $399

Most sensors of any sleep tracker. Cohen's kappa 0.53 vs PSG (Series 8, Schyvens 2025), best of 6 devices tested. FDA-authorized sleep apnea notification. 18-36 hour battery. iOS only.

Not medical advice. This article is for informational purposes only and is not a substitute for professional medical advice, diagnosis, or treatment. Consumer device FDA clearances are for screening, not diagnosis. If you have health concerns, consult a qualified healthcare provider.