The science

No black box.
Just published methods.

NOOP turns your strap's raw heart rate, HRV and motion into three daily scores using open, peer-reviewed sport science. Every number is computed on your device, carries an honest confidence tier, and shows nothing at all rather than fake a value. These are independent approximations of published methods. They are not WHOOP's private algorithms, and NOOP is not a medical device.

12 cited methods · On‑device maths · Deterministic and testable · No fabricated numbers

Before the maths, the protocol

How NOOP reads a strap you own

A WHOOP strap does not speak any standard health profile. To read the band you bought, NOOP talks to it directly over its own private Bluetooth Low Energy protocol, locally, never over WHOOP's servers. That protocol was understood through open community reverse-engineering and re-verified on real hardware. This is interoperability with hardware you own, not impersonation: NOOP reads what your device already records and keeps it on your machine.

The boundary, stated plainly

It is a companion, not a clone.

  • One direction, on your device. Bytes flow from the strap to your phone or Mac and stop there. NOOP has no network client anywhere in its data path.
  • Your hardware, your data. It reads the values your own strap already measured and stores them locally. Nothing is replicated or sent to any company's cloud, and no cloud service is circumvented.
  • Not affiliated with WHOOP or Oura. The names identify the hardware NOOP interoperates with, nothing more. It is not a medical device, and every value is an approximate estimate.

The reverse engineering

A hidden service, a checksummed frame

Alongside the two standard Bluetooth services every strap advertises, there sits a hidden, vendor-specific GATT service that carries the real physiological data. That data only flows after a quiet bonding step. Two community projects mapped it first; NOOP ports their findings and re-verifies the sensor scales and field offsets against a real strap before trusting a single byte.

The hidden GATT service

The custom service exposes four channels: a command write, a command-response notify, an event notify, and the big fragmented data notify that carries the biometric payloads.

Custom service (WHOOP 4.0)
  61080001-8d6d-82b8-614a-1c8cb0f8dcc6
  • CMD write    → command frames
  • CMD notify   ← responses
  • EVENT notify ← wrist on/off, tap, battery
  • DATA notify  ← fragmented biometric frames

Two standard services act as a sanity check. The standard Heart Rate Measurement characteristic (service 180D, characteristic 2A37) streams heart rate and R-R intervals at about 1 Hz without bonding, which makes it NOOP's reliable HR floor while the custom channels supply everything else.

The bond, with no PIN

The custom notify channels stay silent until the link is bonded. The discovery was that a single confirmed write is enough to trigger the operating system's just-works bonding. No PIN, no pairing screen.

// one benign confirmed write bonds the link
GET_BATTERY_LEVEL  → with-response write
// then, once: HELLO, SET_CLOCK, GET_CLOCK,
// stop the raw flood, read the data range

After bonding, NOOP runs a faithful handshake exactly once, sets the strap clock to UTC, then asks for the on-device history. A wrong-length clock write is acknowledged but never latched, a real bug found and fixed here.

The 0xAA frame envelope

Every custom-channel message is a length-prefixed, double-checksummed frame. A cheap header check guards the length so the reassembler can trust it; a full payload check guards the contents.

0xAA | len u16 | crc8 | type | seq | cmd
     | payload… | crc32 LE
crc8  guards the 2 length bytes
crc32 (zlib) guards type+seq+cmd+payload

Bluetooth delivers each frame in MTU-sized fragments, so a reassembler accumulates bytes, finds the 0xAA start, reads the declared length, and only emits a frame once every byte is present and both checksums verify. A bad-checksum frame is dropped, never decoded.
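The reassembly loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NOOP's actual decoder: the source does not give the CRC8 polynomial, the byte order of the length field, or exactly which bytes the length counts, so `crc8`, the little-endian `len`, and "length covers the body plus the CRC32 trailer" are all placeholders chosen to make the sketch self-consistent. `build_frame` and `Reassembler` are hypothetical names.

```python
import struct
import zlib

SOF = 0xAA
HEADER_LEN = 4  # 0xAA + u16 length + crc8


def crc8(data: bytes, poly: int = 0x07) -> int:
    """Plain MSB-first CRC-8. ASSUMPTION: polynomial and init are placeholders."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc


def build_frame(body: bytes) -> bytes:
    """Helper for the demo: wrap type+seq+cmd+payload in the assumed envelope."""
    length = struct.pack("<H", len(body) + 4)  # ASSUMPTION: counts body + crc32
    trailer = struct.pack("<I", zlib.crc32(body))  # crc32 stored little-endian
    return bytes([SOF]) + length + bytes([crc8(length)]) + body + trailer


class Reassembler:
    """Accumulates BLE fragments and returns only fully verified frame bodies."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def feed(self, fragment: bytes) -> list[bytes]:
        self.buf += fragment
        frames = []
        while True:
            start = self.buf.find(SOF)
            if start < 0:            # no start byte anywhere: discard the noise
                self.buf.clear()
                return frames
            del self.buf[:start]     # align the buffer on the start byte
            if len(self.buf) < HEADER_LEN:
                return frames        # wait for the rest of the header
            (length,) = struct.unpack_from("<H", self.buf, 1)
            if crc8(bytes(self.buf[1:3])) != self.buf[3] or length < 4:
                del self.buf[:1]     # false start byte: resynchronise
                continue
            total = HEADER_LEN + length
            if len(self.buf) < total:
                return frames        # wait for more fragments
            body = bytes(self.buf[HEADER_LEN:total - 4])
            (expected,) = struct.unpack_from("<I", self.buf, total - 4)
            if zlib.crc32(body) == expected:
                frames.append(body)  # both checksums verified
            del self.buf[:total]     # a bad-checksum frame is dropped here
```

Feeding a frame in 5-byte fragments returns nothing until the last fragment arrives, then returns the body once. Checking the header before trusting the length is what keeps a corrupted length from stalling the buffer while it waits for bytes that will never come.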

Credit where it is due. The WHOOP 4.0 service, the 0xAA CRC8/CRC32 envelope and the stream layouts come from the community project my-whoop; the WHOOP 5.0 service, the CRC16-Modbus header check and the static client hello come from the community project goose. Where a constant is a direct transcription, the source says so, and sensor scales and offsets are additionally re-verified on real hardware.

Two generations, one decoder

Harvard and puffin

WHOOP 4.0 (codename "Harvard") and WHOOP 5.0 (codename "puffin") differ in just a handful of places: their service UUIDs, their header checksum, where the inner record begins, and how a session starts. Every generation difference is funnelled through a single switch, so everything downstream of the decode is generation-agnostic.

Aspect | WHOOP 4.0 · Harvard | WHOOP 5.0 · puffin
GATT service | 61080001… | fd4b0001…
Header check | CRC8 over the length bytes | CRC16-Modbus over the header
Inner record starts at | byte 4 | byte 8
Payload check | CRC32 (zlib) | CRC32 (zlib), unchanged
Session start | confirmed-write bond, then hello | static CLIENT_HELLO frame

The 5.0 header check

WHOOP 5.0 swapped the cheap length-byte CRC8 for a CRC16-Modbus over the first six bytes of the frame, and moved the inner record from byte 4 to byte 8.

0xAA | 0x01 | len u16 | hdr u16
     | crc16-modbus over frame[0..6]
     | type | seq | cmd | data… | crc32 LE
poly 0xA001, init 0xFFFF, reflected

The payload CRC32 is identical to 4.0. The whole 4-versus-5 difference is one branch on the device family.
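For readers who want to check the header maths, here is a minimal Python sketch of CRC16-Modbus with the parameters quoted above (polynomial 0xA001, init 0xFFFF, reflected). The CRC itself is the standard algorithm; `header_ok_v5` is a hypothetical helper, and storing the checksum little-endian at bytes 6 and 7 is an assumption the source does not confirm.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """Standard CRC-16/MODBUS: reflected, poly 0xA001, init 0xFFFF, no final XOR."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def header_ok_v5(frame: bytes) -> bool:
    """Check a 5.0-style header: CRC over frame[0..6], stored in the next 2 bytes.

    ASSUMPTION: the stored CRC is little-endian at offsets 6 and 7.
    """
    if len(frame) < 8 or frame[0] != 0xAA:
        return False
    (stored,) = struct.unpack_from("<H", frame, 6)
    return crc16_modbus(frame[:6]) == stored
```

The well-known check value for this CRC is 0x4B37 over the ASCII string "123456789", which makes a convenient fixed test vector.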

Hardware-verified, not assumed

The 5.0 path is confirmed on a real strap. It needs an encrypted link first, so the session bonds before anything else, then mirrors the 4.0 flow on the new transport.

  • A full historical offload runs, the trim cursor walks forward, and every record decodes CRC-valid.
  • Live and historical heart rate matched the bonding-free 2A37 ground truth at 96 of 96 overlapping timestamps.
  • A WHOOP 4 and a WHOOP 5 on the same wearer agreed with a correlation of 0.96 across roughly 28,000 samples.

Key the decode on the version byte

Within one generation a strap may run different firmware with a different record layout. NOOP never assumes one layout transfers to another.

  • Field offsets are read off real captured frames and cross-checked physiologically.
  • Unmapped regions are kept raw and labelled, never filled with an invented field.
  • Destructive commands (reboot, firmware load, trim) are excluded by design, so the app can never brick a device.

One direction, on your device

From Bluetooth bytes to your screen

The whole pipeline lives on your phone or Mac. Bytes are decoded, checksum-verified and reassembled, written to a local SQLite store, turned into scores by pure on-device analytics, and drawn. Nothing is ever sent off-device, because there is nowhere to send it.

📲 BLE bytes: raw fragments off the strap's private GATT service, over Bluetooth
🧮 Decode: reassemble fragments, verify CRC8/CRC32 or CRC16-Modbus, parse the frame
💾 On-device SQLite: typed 1 Hz rows in a local store, GRDB on Mac and iOS, Room on Android
📊 Pure local analytics: deterministic, database-free maths cleans HRV, stages sleep, scores the day
Your screen: Charge, Effort and Rest, drawn on a device you own

The strap holds about 14 days of history on-device, and NOOP re-offloads it locally about every 15 minutes while connected, scoring each night the strap dumped. A chunk is only forgotten by the strap once NOOP has the data durably stored and has confirmed the acknowledgement, so an interrupted offload resumes exactly where it stopped. The expensive raw sensor flood is switched off on connect to spare Bluetooth airtime and strap battery. No step in this chain touches a server.

Three questions, three methods

Charge, Effort and Rest, each on a 0 to 100 scale, each traceable to its source.

How recovered are you?

Charge

Your overnight autonomic recovery, read from heart-rate variability and resting heart rate against your own rolling baseline.

Method. RMSSD and SDNN per the Task Force (1996) HRV standard, cleaned with the Malik (1989) local-median rule, scored against a two-pass personal baseline.

How hard did your heart work?

Effort

Your day's cardiovascular load. Easy days sit low, an all-out day approaches 100, and that stays genuinely rare.

Method. Heart-rate reserve (Karvonen 1957), accumulated as a training impulse with Edwards (1993) 5-zone or Banister (1991) exponential TRIMP, on a log scale. HRmax via Tanaka (2001).

How restorative was your sleep?

Rest

A composite over each staged night: how long, how efficient, and how settled your heart was through it.

Method. Sleep/wake from Cole-Kripke (1992) actigraphy, refined with HRV, heart rate and respiration into 4 stages. Honestly ~65 to 73% epoch agreement (Walch 2019); deep sleep is the weakest estimate.

Each score is a pure, deterministic function of your strap's raw streams. Same inputs, same outputs, every time, which is exactly what lets the maths be unit tested against fixed vectors. The display names changed (Recovery to Charge, Strain to Effort, Sleep Performance to Rest) but the stored data keys did not, so years of history keep working.

Inside a score

Every number, opened up.

Every number on this screen was decoded from your own strap, checksum-verified, and scored on your device. Tap any score and it unfolds into the exact terms that built it: each driver, its weight, and how far it sat from your personal baseline. Nothing is hidden behind a brand. If a term is missing, you see that too, and the remaining weights renormalise in front of you.

0 to 100, one shared scale
3 tiers of confidence
0 black boxes
A NOOP score broken down into its driver terms, each with its weight and its distance from the personal baseline.

Score one, in full

Charge, term by term

Charge is a robust z-score plus logistic composite, led by your heart-rate variability measured against your own rolling baseline. Higher HRV versus baseline means more Charge. These are our weights, openly documented, not WHOOP's private model.

Driver | Weight | Direction
HRV vs baseline | 0.55 | higher gives more Charge (dominant)
Resting HR vs baseline | 0.20 | lower gives more
Rest quality (last night's sleep) | 0.15 | higher gives more
Respiration vs baseline | 0.05 | lower gives more
Skin-temp deviation | 0.05 | further from baseline gives less

The robust z-score

Every driver is standardised against your personal baseline using a robust z, not a raw mean and standard deviation.

z = (value − mean) / (1.253 · spread)

The 1.253 converts an EWMA mean-absolute-deviation into an approximate Gaussian sigma, since the expected absolute deviation of a normal is about sigma divided by 1.253. For lower-is-better drivers (resting HR, respiration) the z is simply inverted.

The logistic squash

The weighted-mean z is squashed onto 0 to 100 with a logistic curve.

score = 100 / (1 + e^(−1.6 · (z + 0.20)))

The slope of 1.6 puts roughly plus or minus two z across the full red-to-green band. The offset anchors a z of zero to about 58 percent, matching the published population-average recovery. Missing terms drop out and the weights renormalise, so a thin day still reads honestly.

Bands: red below 34, yellow 34 to 67, green at or above 67. Resting HR is the lowest sustained floor of the night, taken as the minimum of five-minute non-overlapping bin means so a single dropped beat cannot define it.
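The Charge arithmetic above is small enough to sketch end to end. The weights, the 1.253 factor, the logistic constants and the band cut-offs are taken from this page; the function names, the dictionary keys and the choice to use only complete five-minute bins are illustrative assumptions, not the StrandAnalytics API.

```python
import math

WEIGHTS = {"hrv": 0.55, "rhr": 0.20, "rest": 0.15, "resp": 0.05, "skin_temp": 0.05}
LOWER_IS_BETTER = {"rhr", "resp"}


def oriented_z(driver, value, mean, spread):
    """Robust z, signed so that positive always means 'more Charge'."""
    if spread <= 0:
        return None                      # no usable baseline spread yet
    z = (value - mean) / (1.253 * spread)
    if driver in LOWER_IS_BETTER:
        return -z                        # lower resting HR / respiration is better
    if driver == "skin_temp":
        return -abs(z)                   # any deviation from baseline counts against
    return z


def charge(z_by_driver):
    """Weighted-mean z over the drivers present, squashed onto 0 to 100."""
    present = {k: z for k, z in z_by_driver.items() if z is not None}
    if not present:
        return None                      # show nothing rather than a fake number
    total_weight = sum(WEIGHTS[k] for k in present)   # renormalise the weights
    z = sum(WEIGHTS[k] * v for k, v in present.items()) / total_weight
    return 100 / (1 + math.exp(-1.6 * (z + 0.20)))


def band(score):
    return "red" if score < 34 else "yellow" if score < 67 else "green"


def resting_hr(hr_1hz, bin_seconds=300):
    """Minimum of non-overlapping five-minute bin means over 1 Hz samples.

    ASSUMPTION: a trailing partial bin is ignored.
    """
    stop = len(hr_1hz) - bin_seconds + 1
    means = [sum(hr_1hz[i:i + bin_seconds]) / bin_seconds
             for i in range(0, stop, bin_seconds)]
    return min(means) if means else None
```

With every driver exactly at baseline, `charge` returns about 57.9, the "z of zero" anchor described above. Passing only `{"hrv": 1.0}` shows the renormalisation: the single remaining weight becomes the whole score.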

Score two, in full

Effort, and the log scale that keeps a 100 rare

Effort turns every second of heart rate into a training impulse, weights time in harder zones more heavily, then compresses it logarithmically so easy days sit low and an all-out day approaches 100. It is an independent implementation of published exercise physiology, not a reproduction of WHOOP's Day Strain.

1. Heart-rate reserve

Karvonen (1957). Intensity is read as a share of your usable range, not raw bpm.

HRR = HRmax − RHR
%HRR = (HR − RHR) / HRR × 100

HRmax comes from the observed 99.5th percentile once there are 600-plus samples, floored by Tanaka (2001): HRmax = 208 minus 0.7 times age.
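As a rough sketch of this step, assuming a nearest-rank percentile and reading "600-plus" as at least 600 samples (neither detail is specified on this page, and the function names are illustrative):

```python
import math


def tanaka_hrmax(age):
    """Tanaka (2001) age-predicted maximum heart rate."""
    return 208 - 0.7 * age


def estimate_hrmax(hr_samples, age):
    """Observed 99.5th percentile, never below the Tanaka floor."""
    floor = tanaka_hrmax(age)
    if len(hr_samples) < 600:
        return floor                      # too little data: age prediction only
    ordered = sorted(hr_samples)
    rank = math.ceil(0.995 * len(ordered))        # ASSUMPTION: nearest-rank method
    return max(floor, ordered[rank - 1])


def pct_hrr(hr, rhr, hrmax):
    """Karvonen: intensity as a percentage of heart-rate reserve."""
    reserve = hrmax - rhr
    if reserve <= 0:
        raise ValueError("HRmax must exceed resting HR")
    return (hr - rhr) / reserve * 100
```

For a 40-year-old with a resting HR of 60 and no observed data, HRmax is 180, so a heart rate of 120 sits at 50 %HRR. Taking a high percentile instead of the single highest sample keeps one spurious spike from redefining the whole range.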

2. The training impulse

Each sample contributes a weighted dose of effort, by one of two published methods.

Edwards (1993): zone weight 1–5 × duration
  at the 50/60/70/80/90 %HRR cut-offs
Banister (1991): dur × x · 0.64 · e^(b·x)
  x = %HRR/100, b = 1.92 (men) / 1.67 (women)

3. The logarithmic ceiling

The accumulated impulse is compressed onto 0 to 100.

Effort = 100 · ln(TRIMP + 1) / ln(D),  D = 7201

D equals 7201 by design. The Edwards daily ceiling is the top zone weight of 5 held for a full 24 hours: 5 × 1440 minutes = 7200. Adding the 1 inside the logarithm gives ln(7201) / ln(7201) = 1, which maps that ceiling to exactly 100. The old 0 to 21 scale used the identical denominator, so the rungs never moved. A 100 today is as rare as a 21.0 once was.

A long walk with little cardio still counts: when cardio impulse is low but step and active-energy load is high, Effort is raised to a movement-derived floor. Imported WHOOP Day Strain (0 to 21) is rescaled by 100 over 21 on import, a lossless round trip, so everything on the Effort axis shares one scale.
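The Edwards path and the logarithmic ceiling can be sketched as follows. The cut-offs, the denominator D and the 100/21 rescale come from this page; treating time below 50 %HRR as weight 0, accumulating in minutes from 1 Hz samples, and the function names are assumptions made for illustration. The movement-derived floor is not specified here, so it is left out.

```python
import math

D = 7201                         # 5 (top zone weight) x 1440 minutes, plus 1
CUTOFFS = (50, 60, 70, 80, 90)   # %HRR boundaries for zone weights 1..5


def edwards_weight(pct_hrr):
    """Zone weight 0..5. ASSUMPTION: below 50 %HRR contributes nothing."""
    return sum(pct_hrr >= cutoff for cutoff in CUTOFFS)


def edwards_trimp(pct_hrr_per_second):
    """Sum of zone weight x duration, with duration in minutes."""
    return sum(edwards_weight(p) for p in pct_hrr_per_second) / 60


def effort(trimp):
    """Compress the accumulated impulse onto 0 to 100."""
    return 100 * math.log(trimp + 1) / math.log(D)


def strain_to_effort(strain):
    """Rescale an imported 0 to 21 Day Strain onto the Effort axis."""
    return strain * 100 / 21


def effort_to_strain(value):
    return value * 21 / 100
```

An hour held in the 70 to 80 %HRR zone gives a TRIMP of 180 and an Effort of roughly 58.5, while 24 hours in the top zone gives exactly 100. That is the log curve at work: the first hour buys more than half the scale, and the rest has to be earned.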

74 EFFORT

One all-out day should feel like one.

The log curve is the honest part. Hundreds of impulse units separate a gentle day from a brutal one, but a single number stays legible, and a 100 stays earned.

Score three, in full

Rest is four things at once

Rest is a 0 to 100 composite over each staged night, not a bare efficiency number. It blends four components, weighted toward the one that matters most: did you actually get enough sleep.

Component | Weight | What it measures
Duration vs your personal need | 0.50 | how long you slept against your own need (8 h default, refined by your recent average)
Efficiency (asleep / in‑bed) | 0.20 | how efficiently you slept once down
Restorative share ((deep + REM) / asleep) | 0.20 | how much of the night was restorative
Consistency | 0.10 | how regular your sleep and wake timing is

Rest consumes whatever stages each device can provide (motion on the 4.0, PPG and motion on the 5 and MG as it unlocks). The blend is similar in spirit to a sleep performance percentage, but it is our own, and it feeds straight back into Charge as the rest-quality driver.
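The blend itself is a plain weighted sum, sketched below with the four weights from the table. How each component is normalised to a 0 to 1 score is not spelled out on this page, so the sketch takes component scores as inputs; capping duration at 100 percent of need is an assumption, as are the function names.

```python
REST_WEIGHTS = {
    "duration": 0.50,
    "efficiency": 0.20,
    "restorative": 0.20,
    "consistency": 0.10,
}


def duration_component(asleep_min, need_min=480.0):
    """Sleep obtained against personal need (8 h default).

    ASSUMPTION: sleeping longer than the need does not score above 1.0.
    """
    return min(asleep_min / need_min, 1.0)


def efficiency_component(asleep_min, in_bed_min):
    """Share of time in bed actually spent asleep."""
    return asleep_min / in_bed_min


def rest_score(components):
    """Weighted blend of four component scores, each already in 0..1."""
    missing = set(REST_WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in REST_WEIGHTS:
        if not 0.0 <= components[name] <= 1.0:
            raise ValueError(f"{name} must be within 0..1")
    return 100 * sum(w * components[name] for name, w in REST_WEIGHTS.items())
```

A six-hour night against an eight-hour need (0.75), with 0.9 efficiency, a restorative score of 0.8 and perfect consistency, comes out at 81.5. Because duration carries half the weight, a short night costs more than any other single shortfall.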

Honest by default

Solid, Building, Calibrating

Every score carries one of three confidence tiers, so a sparse day reads truthfully instead of faking a number. When NOOP cannot compute a score honestly, it shows nothing at all.

Full inputs present

Solid

A trusted personal baseline (14-plus valid nights) and complete raw streams behind the number. This is the score at full confidence.

Enough to show, but thin

Building

Usable but with higher uncertainty: a provisional baseline (roughly 4 to 13 nights), or a 5/MG day backed mostly by PPG-derived heart rate. Shown, but flagged.

Still learning your body

Calibrating

The baseline is not usable yet (fewer than 4 valid nights), or there is no in-bed data for Rest, or no heart-rate window for Effort. The scorer returns nothing rather than a fabricated value.

Charge is the strictest, because HRV is its dominant driver and it needs several nights to learn your personal baseline first. Until then it stays in Calibrating, which is more honest than guessing.
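The tier rule for a baseline-driven score reduces to a couple of comparisons. A minimal sketch, using the night counts quoted above; the function name and the single `streams_complete` flag are simplifications for illustration, and the Rest and Effort conditions (no in-bed data, no heart-rate window) are not modelled.

```python
def confidence_tier(valid_nights, streams_complete=True):
    """Map baseline depth and stream completeness to a confidence tier."""
    if valid_nights < 4:
        return "Calibrating"          # baseline not usable: no score is shown
    if valid_nights >= 14 and streams_complete:
        return "Solid"                # trusted baseline and complete raw streams
    return "Building"                 # provisional baseline or thin inputs


def displayed_score(score, valid_nights, streams_complete=True):
    """Return (score, tier), with the score withheld while Calibrating."""
    tier = confidence_tier(valid_nights, streams_complete)
    return (None if tier == "Calibrating" else score), tier
```

Withholding the value in the same place the tier is decided means no caller can display a Calibrating number by accident.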


Underneath every score

The HRV cleaning pipeline

HRV leads Charge, so a single bad beat cannot be allowed to swing it. Before any RMSSD, SDNN or pNN50 is computed to the Task Force (1996) definitions, the raw R-R intervals pass through a deterministic three-stage clean.

  1. Range filter. Drop any interval outside 300 to 2000 ms, that is roughly 200 down to 30 bpm.
  2. Ectopic rejection (Malik 1989). Drop any beat more than 20 percent from a local median over a centred 5-beat window, removing physiologically impossible jumps.
  3. Sufficiency gate. Require at least 20 clean intervals before returning a trusted reading. Below that, the result is reported empty rather than shaky.

The app reports the sample counts before and after cleaning, so you can see the quality of every reading. Honest substitution: the reference pipeline used a Kubios-style classifier that is not available on-device, so NOOP uses the simpler, fully deterministic Malik rule. It does not model missed or extra-beat insertion the way Kubios does, and we say so.
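The three stages translate almost line for line into code. The thresholds below (300 to 2000 ms, 20 percent, 5-beat window, 20 intervals) are the ones listed above; shrinking the window at the edges of the recording, including the centre beat in its own median, and using the sample standard deviation for SDNN are assumptions of this sketch, and the function names are illustrative.

```python
import math
from statistics import median, stdev


def clean_rr(rr_ms):
    """Range filter, then Malik-style local-median ectopic rejection."""
    in_range = [rr for rr in rr_ms if 300 <= rr <= 2000]
    clean = []
    for i, rr in enumerate(in_range):
        # Centred 5-beat window; ASSUMPTION: it simply shrinks at the edges.
        window = in_range[max(0, i - 2):i + 3]
        local = median(window)
        if abs(rr - local) <= 0.20 * local:
            clean.append(rr)
    return clean


def hrv_summary(rr_ms, min_intervals=20):
    """RMSSD and SDNN in ms, or empty values when too few clean beats remain."""
    clean = clean_rr(rr_ms)
    report = {"n_raw": len(rr_ms), "n_clean": len(clean),
              "rmssd": None, "sdnn": None}
    if len(clean) < min_intervals:
        return report                 # sufficiency gate: empty, not shaky
    diffs = [b - a for a, b in zip(clean, clean[1:])]
    report["rmssd"] = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    report["sdnn"] = stdev(clean)     # ASSUMPTION: sample standard deviation
    return report
```

The report carries the before and after counts alongside the metrics, mirroring what the app shows. Note that successive differences are taken across removed beats, which is a simplification a stricter implementation might handle differently.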

Where the limits are real

Sleep staging, honestly

Without an EEG, four-class staging from a wrist signal has a known ceiling of about 65 to 73 percent epoch agreement (Walch 2019). We do not pretend otherwise. Light versus deep separation is the weakest link, so deep-minute estimates are the least reliable output we produce.

The sleep and wake spine

A gravity-stillness detector builds the night, cross-checked by a citable te Lindert 30-second Cole-Kripke index.

SI = 0.001 · Σ wₐ·Aₐ,  asleep iff SI < 1
weights [106, 54, 58, 76, 230, 74, 67]

A run must exceed 60 minutes and be heart-rate confirmed (mean HR within 1.05 of the day's median) before it counts as sleep.
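The sleep index is a short weighted sum over a sliding window. In this sketch the seven weights are applied to the four epochs before, the current epoch, and the two epochs after, which is the ordering of the original Cole-Kripke formula; the page does not state the window alignment, so treat that, the zero-padding at the edges, and the function names as assumptions. The 60-minute run length and heart-rate confirmation are not modelled.

```python
WEIGHTS = (106, 54, 58, 76, 230, 74, 67)   # epochs t-4 .. t+2
OFFSETS = range(-4, 3)


def sleep_index(activity, t):
    """SI = 0.001 * sum(w * A) over the window around epoch t."""
    total = 0
    for weight, offset in zip(WEIGHTS, OFFSETS):
        j = t + offset
        if 0 <= j < len(activity):         # ASSUMPTION: missing epochs count as 0
            total += weight * activity[j]
    return 0.001 * total


def score_epochs(activity):
    """True where the epoch is scored asleep (SI < 1)."""
    return [sleep_index(activity, t) < 1 for t in range(len(activity))]
```

A single burst of 10 activity counts scores only its own epoch as awake, because the centre weight of 230 dominates the neighbours. That is what lets a brief twitch register without waking the surrounding minutes.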

Four classes, from the body

Per 30-second epoch, against session-relative percentiles:

  • Deep: still body, high parasympathetic tone (RMSSD at or above the 70th percentile), low HR, regular breathing.
  • REM: still body, activated cardiac signal, irregular respiration.
  • Wake: sustained motion with an activated heart.
  • Light: everything else, the honest default.

Physiology re-imposed

The label sequence is median-smoothed over 5 epochs, then sanity-corrected against what sleep actually does:

  • No REM in the first 15 minutes after onset.
  • No deep after the first third of the night, since deep is biased early.
  • Frequency-domain HRV (HF, LF/HF) is omitted on-device, so parasympathetic tone is RMSSD only. We say so plainly.

And the numbers around the edges

Two more figures, each cited, each labelled as the approximation it is.

Calories

A per-second blend of Keytel (2005) heart-rate energy expenditure and a revised Harris-Benedict resting rate, with sex-specific coefficients. Below a resting threshold the BMR rate is used, above it the HR-driven active rate. This is an estimate, not laboratory calorimetry, and it is labelled as such.

Fitness Age

A cardiorespiratory fitness comparison expressed in years, from the Nes (2011) non-exercise VO₂max model behind the HUNT study and NTNU/CERG. It answers one question: how does your fitness compare to a typical person, in years. It is not a biological or clinical age.

Why it needs no tape measure

The headline age comes from a self-consistent inversion: we solve for the age at which a fixed reference person (resting HR 65, activity index 5) would have your estimated VO₂max. The body term cancels, so the number is driven by your resting HR and reconstructed activity alone. Weight and waist only ever unlock the explicit VO₂max figure.

The Nes 2011 model

Men:   VO₂max = 100.27 − 0.296·age + 0.226·PA
              − 0.369·waist − 0.155·RHR   (SEE 5.70)
Women: VO₂max =  74.74 − 0.247·age + 0.198·PA
              − 0.259·waist − 0.114·RHR   (SEE 5.14)

RHR is a rolling 7-day median. PA is a physical-activity index NOOP reconstructs on-device from the training you actually did, mapped onto the same 0 to 7.5 scale the HUNT questionnaire produced, so the input is honest rather than self-reported.
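To see why the body term cancels, set the reference person's VO₂max at age F equal to yours, with the same waist on both sides. The intercept and waist terms drop out, leaving F = age − (PA coefficient / age coefficient) · (PA − 5) + (RHR coefficient / age coefficient) · (RHR − 65). The sketch below uses the coefficients printed above; the function names and the waist unit (centimetres) are assumptions, and this is a reading of the description rather than NOOP's own code.

```python
COEF = {
    "male":   {"c0": 100.27, "age": 0.296, "pa": 0.226, "waist": 0.369, "rhr": 0.155},
    "female": {"c0": 74.74,  "age": 0.247, "pa": 0.198, "waist": 0.259, "rhr": 0.114},
}
REF_RHR = 65    # reference person's resting heart rate
REF_PA = 5      # reference person's activity index


def vo2max(sex, age, pa, waist_cm, rhr):
    """Nes (2011) non-exercise estimate. ASSUMPTION: waist is in centimetres."""
    c = COEF[sex]
    return (c["c0"] - c["age"] * age + c["pa"] * pa
            - c["waist"] * waist_cm - c["rhr"] * rhr)


def fitness_age(sex, age, pa, rhr):
    """Age at which the reference person would match your estimated VO2max."""
    c = COEF[sex]
    return (age
            - (c["pa"] / c["age"]) * (pa - REF_PA)
            + (c["rhr"] / c["age"]) * (rhr - REF_RHR))
```

A 40-year-old man with the reference inputs gets a fitness age of 40; lowering resting HR to 55 moves it to about 34.8. No waist value appears in `fitness_age`, which is why the headline number needs no tape measure.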

Why we show a band, not a point

The model's standard error of estimate is roughly 5 to 6 ml/kg/min on the VO₂max itself (5.70 for men, 5.14 for women). That is large, so the number is a direction of travel over weeks, not a lab measurement. We respect that in the UI:

  • Updated weekly, on Saturday, off 7-day medians.
  • A week needs resting-HR data from at least 4 of its 7 nights, or it reports not ready.
  • Shown inside a plus or minus 5-year band, never as false-precision.

The same maths, in your hand

Open, on every screen

The scoring lives in a small, deterministic, open package. Here is what it looks like once it reaches the app.

The NOOP Today screen with Charge, Effort and Rest rings under a dusk sky.
A score broken down into its driver terms and weights.
A rest and recovery day view.
The NOOP feature menu listing Coach, Workouts, Health, Lab Book, Stress, Breathe and Intervals.

Computed where it is captured.

Every calculation runs on your phone or Mac. Your heart rate, your HRV, your sleep. None of it is uploaded, because there is nowhere to upload it to.

End to end

From strap to score, without a server

The whole pipeline lives on your device. The recompute engine runs about every 15 minutes while connected, scoring each night the strap offloaded. Where a WHOOP export already covers a day, that export still wins.

🔗 Strap over Bluetooth, or a WHOOP CSV, or an Apple Health export
💾 Decoded into 1 Hz streams in a local SQLite store
🧮 The on-device engine cleans HRV, stages sleep, scores Charge, Effort and Rest
Persisted under your own device source, merged under any import

Where we are honest

The fine print, up front.

  • These are approximations. They aim at the same questions WHOOP answers, using open science, so they track in direction but will not match number for number. That is the point.
  • Not WHOOP's algorithms. We do not have WHOOP's proprietary scoring and do not pretend to. NOOP is independent and not affiliated with WHOOP or Oura.
  • Sleep staging has a ceiling. Without an EEG, four-class staging tops out around 65 to 73 percent epoch agreement, and deep-sleep minutes are the weakest output. Frequency-domain HRV is omitted on-device, so parasympathetic tone is RMSSD only.
  • Fitness Age is a comparison, not an age. It is a fitness comparison in years off a model with a large standard error, shown in a plus or minus 5-year band. It says nothing about your biological age, disease or longevity.
  • Calories are an estimate. A Keytel and Harris-Benedict blend, not laboratory calorimetry, and labelled that way wherever it appears.
  • Not a medical device. Nothing here is medical advice or a diagnosis. The illness early-warning is a wellness nudge from your own baselines, not a clinical screen.
  • Honest by default. Every score carries a Solid, Building or Calibrating tier. When NOOP cannot compute a number honestly, it shows nothing rather than fake one.
  • Readable and checkable. The maths lives in a small, deterministic, open package you can read, run and test yourself.

The spine of the project

No account. No cloud. No server to breach.

Every calculation on this page runs where the data is captured, on your own device. There is no account to create, no analytics, no tracking, no ads. Your most personal data, your heart and your sleep, never leaves your phone or Mac. You own it, and you can export it.

0 servers, so there is nothing to breach
0 accounts, trackers or ads
100% computed on your own device
1 direction: bytes flow in, nothing flows out

Open and local by design, so the maths can be read, run and tested, and can outlive any one company.

The quieter case

The most sustainable wearable is the one you already own.

Wearables are typically obsolete in two to three years, glued shut and hard to repair, and replacement is driven by subscriptions and planned obsolescence rather than failure. About 30 percent of fitness trackers and smartwatches are abandoned. When you cancel a subscription, the band you bought can quietly become e-waste.

62 Mt of e-waste generated in 2022, up 82% since 2010, about 7.8 kg per person
82 Mt, the projected total by 2030 on current trends
22.3% formally collected and recycled, about US$62bn of recoverable materials lost
30% of fitness trackers and smartwatches are abandoned (Gartner)

NOOP keeps the strap you already bought useful for years, with no forced upgrade, no account, and no subscription that can brick it. On-device means no data-centre energy is spent per reading. Honest framing: NOOP is not a fix for the e-waste problem, it just refuses to add to it.

E-waste figures: UN / ITU / UNITAR Global E-waste Monitor 2024, ewastemonitor.info.

References

Task Force of the ESC and NASPE (1996). Heart rate variability: standards of measurement, physiological interpretation, and clinical use. Circulation 93(5):1043 to 1065.
Malik M, et al. (1989). Heart rate variability and the artefact-rejection rule used to clean R-R intervals.
Karvonen MJ, Kentala E, Mustala O (1957). The effects of training on heart rate. Ann Med Exp Biol Fenn 35(3):307 to 315.
Edwards S (1993). The Heart Rate Monitor Book. The 5-zone training-impulse summation.
Banister EW (1991). Modeling elite athletic performance. The exponential training-impulse method.
Tanaka H, Monahan KD, Seals DR (2001). Age-predicted maximal heart rate revisited. J Am Coll Cardiol 37(1):153 to 156.
Cole RJ, Kripke DF, et al. (1992). Automatic sleep/wake identification from wrist activity. Sleep 15(5):461 to 469. The actigraphy sleep/wake spine.
te Lindert BHW, Van Someren EJW (2013). Sleep estimates using microelectromechanical systems (MEMS). Sleep 36(5):781 to 789. The 30-second Cole-Kripke index and weights we cross-check against.
Walch O, et al. (2019). Sleep stage prediction from a consumer wearable. Sleep 42(12). The ~65 to 73% epoch-agreement ceiling we quote, and the difference-of-Gaussians HR-variability feature.
Lipponen JA, Tarvainen MP (2019). A robust algorithm for heart rate variability time series artefact correction (the Kubios method). J Med Eng Technol 43(3):173 to 181. The reference cleaner we approximate with Malik on-device, and say so.
Keytel LR, et al. (2005). Prediction of energy expenditure from heart rate. J Sports Sci 23(3):289 to 297. The active-energy term.
Roza AM, Shizgal HM (1984). The Harris-Benedict equation reevaluated. Am J Clin Nutr 40(1):168 to 182. The revised resting-rate term.
Nes BM, et al. (2011). Estimating VO₂peak from a nonexercise prediction model: the HUNT Study, Norway. Med Sci Sports Exerc 43(11):2024 to 2030. The Fitness Age source model and coefficients.
Kurtze N, et al. (2008). Validation of the HUNT1 physical-activity questionnaire that defines the activity index.
JAHA / Ball State (2020), PMC7428991. Independent reproduction of the Nes non-exercise VO₂max model.
CERG / NTNU. The Cardiac Exercise Research Group, the team behind the original Fitness Age work, corroborating the approach.

The full, source-linked write-up lives in the project wiki under Analytics, Architecture and BLE Reverse Engineering, with the maths in the open StrandAnalytics package and the protocol decode in WhoopProtocol. Read it on the repo. Questions or corrections, write to thenoopapp@gmail.com.