bio.rodeo

SelfPAB

NTNU

A transformer encoder self-supervised on 100,000 hours of dual-accelerometer data, used as a frozen feature extractor for human activity recognition.

Released: March 2024

SelfPAB (Self-supervised Pre-training for Accelerometer-Based human activity recognition) addresses a persistent bottleneck in wearable-sensor research: labeled accelerometer data is scarce and expensive to annotate, while unlabeled recordings from population-scale cohorts are abundant. Human activity recognition (HAR) models trained only on small labeled datasets tend to overfit and generalize poorly across sensor placements, populations, and protocols. SelfPAB tackles this by borrowing the pre-train-then-transfer recipe that transformed natural language and speech processing and applying it to motion signals from body-worn accelerometers.

Developed by Aleksej Logacjov and colleagues at the Norwegian University of Science and Technology (NTNU) and published in Applied Intelligence in 2024, SelfPAB is a transformer encoder pre-trained on roughly 100,000 hours of unlabeled dual-accelerometer recordings drawn from the HUNT4 population health study in Norway. The pre-training objective is masked spectrogram reconstruction: signals are converted to time-frequency spectrograms via a short-time Fourier transform (STFT), portions are masked, and the model learns to reconstruct them, yielding general-purpose motion representations without any activity labels.
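The masked spectrogram objective described above can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the SelfPAB paper or code: the 50 Hz sampling rate, the frame length and hop, the 15% mask ratio, masking whole time frames rather than arbitrary time-frequency regions, and the L1 loss. The helper names (`stft_magnitude`, `mask_time_frames`, `masked_reconstruction_loss`) are hypothetical, and a single synthetic channel stands in for the real sensor data.

```python
import numpy as np

def stft_magnitude(signal, frame_len=50, hop=25):
    """Magnitude spectrogram of a 1-D signal, shape (num_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(num_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def mask_time_frames(spec, mask_ratio=0.15, rng=None):
    """Zero a random subset of time frames; return the masked copy and a boolean mask."""
    rng = np.random.default_rng() if rng is None else rng
    num_frames = spec.shape[0]
    num_masked = max(1, int(round(mask_ratio * num_frames)))
    idx = rng.choice(num_frames, size=num_masked, replace=False)
    mask = np.zeros(num_frames, dtype=bool)
    mask[idx] = True
    masked = spec.copy()
    masked[mask] = 0.0
    return masked, mask

def masked_reconstruction_loss(pred, target, mask):
    """Mean absolute error computed only over the masked frames (assumed loss)."""
    return float(np.mean(np.abs(pred[mask] - target[mask])))

# Synthetic single-channel signal: a 2 Hz oscillation plus noise, 10 s at an assumed 50 Hz.
rng = np.random.default_rng(0)
fs = 50
t = np.arange(500) / fs
signal = np.sin(2 * np.pi * 2.0 * t) + 0.1 * rng.standard_normal(t.size)

spec = stft_magnitude(signal)                        # (time frames, frequency bins)
masked_spec, mask = mask_time_frames(spec, rng=rng)  # encoder input during pre-training
# A model that simply echoed its masked input would score this loss on the hidden frames.
loss_identity = masked_reconstruction_loss(masked_spec, spec, mask)
```

In an actual pre-training loop, a transformer encoder would receive `masked_spec` and be optimized so that its output on the masked frames approaches `spec`; no activity labels appear anywhere in this objective.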

Once pre-trained, the encoder is frozen and used as a feature extractor that feeds a lightweight supervised classifier on downstream HAR datasets. This places SelfPAB among the first demonstrations that large-scale self-supervised pre-training—and the data-scaling behavior familiar from language models—transfers effectively to dual-accelerometer human activity recognition.

Key Features

  • Spectrogram-based masked reconstruction: Raw accelerometer channels are transformed into STFT spectrograms and the model is trained to reconstruct masked time-frequency regions, an objective adapted from self-supervised speech and audio models.
  • Population-scale unlabeled pre-training: Pre-training uses about 100,000 hours of unlabeled dual-accelerometer signals from the HUNT4 cohort, far exceeding the size of typical labeled HAR corpora.
  • Frozen feature extractor: The pre-trained transformer is reused without further weight updates, so downstream tasks only train a small classifier on top of fixed embeddings—reducing labeled-data and compute requirements.
  • Consistent downstream gains: SelfPAB improves macro F1 by roughly 7–14% over fully supervised baselines across five HAR benchmarks.
  • Demonstrated data-scaling behavior: Pre-training on increasing amounts of data (10 to 100,000 hours) yields monotonically improving downstream performance, mirroring scaling trends seen in transformer language models.
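For readers less familiar with the metric quoted in the list above, macro F1 is the unweighted mean of per-class F1 scores, so rare activities count as much as common ones. The following is a small pure-Python sketch on invented toy labels; the activity names and predictions are illustrative only and are not drawn from any SelfPAB benchmark.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs at all.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy example (invented): the rare "run" class is never predicted correctly.
y_true = ["walk", "walk", "walk", "walk", "sit", "sit", "run"]
y_pred = ["walk", "walk", "walk", "sit", "sit", "sit", "walk"]

score = macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here the per-class F1 values are 0.75 (walk), 0.8 (sit), and 0.0 (run), so macro F1 is about 0.52 even though accuracy is about 0.71. That gap is why macro F1 is a common choice for class-imbalanced HAR datasets.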

Technical Details

SelfPAB is a transformer encoder operating on STFT spectrograms of dual-accelerometer signals (sensors placed on the lower back and thigh in the HUNT4 protocol). During self-supervised pre-training, the masked reconstruction task forces the network to model temporal and spectral structure across both sensors using approximately 100,000 hours of unlabeled HUNT4 data. The resulting frozen encoder is evaluated as a feature extractor on five downstream HAR datasets—HARTH, HAR70+, PAMAP2, Opportunity, and RealWorld—where a supervised head is trained on its embeddings. Across these benchmarks SelfPAB reports macro-F1 improvements of about 7–14% relative to supervised-from-scratch baselines, with downstream accuracy increasing as the volume of pre-training data grows. Pre-trained weights (upstream_model.ckpt, distributed via Git LFS) and training code are released under the MIT license; the HUNT4 pre-training data is access-controlled and must be requested from the HUNT databank.
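The frozen-encoder evaluation protocol can be sketched as follows. This is not the SelfPAB model or its loading API: `frozen_encoder` is a hypothetical stand-in (a fixed random projection), the data are synthetic, and the supervised head is assumed to be a plain logistic regression trained by gradient descent. The point is only the division of labor: embeddings are computed once with fixed weights, and only the small head is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained encoder (NOT the SelfPAB checkpoint):
# a fixed random projection followed by tanh. Its weights are never updated.
W_enc = rng.standard_normal((26, 8))
W_enc_snapshot = W_enc.copy()

def frozen_encoder(x):
    """Map inputs of shape (n, 26) to embeddings of shape (n, 8)."""
    return np.tanh(x @ W_enc)

# Synthetic two-class inputs standing in for labeled downstream windows.
x = np.concatenate([rng.standard_normal((100, 26)) + 1.0,
                    rng.standard_normal((100, 26)) - 1.0])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Because the encoder is frozen, embeddings can be computed once and cached.
features = frozen_encoder(x)

def log_loss(w, b):
    """Mean binary cross-entropy of the linear head on the cached features."""
    z = features @ w + b
    return float(np.mean(np.logaddexp(0.0, z) - y * z))

# Train only the head: one weight vector and a bias.
w = np.zeros(features.shape[1])
b = 0.0
initial_loss = log_loss(w, b)
learning_rate = 0.3
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
    w -= learning_rate * (features.T @ (p - y)) / len(y)
    b -= learning_rate * float(np.mean(p - y))
final_loss = log_loss(w, b)
train_accuracy = float(np.mean(((features @ w + b) > 0) == (y == 1)))
```

Only `w` and `b` change during training; `W_enc` stays identical to its snapshot. Caching `features` is what keeps the downstream compute cost low compared with end-to-end fine-tuning.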

Applications

SelfPAB is aimed at researchers and practitioners working with body-worn accelerometers for physical-activity monitoring, epidemiology, and digital-health studies, where large labeled datasets are rare but unlabeled recordings are plentiful. By providing a reusable pre-trained encoder, it lets groups bootstrap accurate activity classifiers from modest labeled sets, supporting use cases such as quantifying movement and sedentary behavior in cohort studies, clinical activity assessment, and sleep-and-activity analysis. Because the encoder is frozen, it integrates into existing HAR pipelines as a drop-in feature extractor without demanding large-scale GPU fine-tuning.

Impact

SelfPAB demonstrated that the self-supervised pre-training paradigm—and its characteristic data-scaling benefits—extends from language and audio to dual-accelerometer human activity recognition, an area historically dominated by small supervised models. By releasing pre-trained weights and code, the NTNU team lowered the barrier to building strong HAR systems and seeded follow-up work, including cross-sensor variants (MonoSelfPAB) and long-term spectrogram models (LTA2V) from the same group, as well as broader investigations of scaling laws in wearable activity recognition. Its main limitations are that the pre-training corpus reflects a single population and a specific dual-sensor placement, and that the most valuable asset—the HUNT4 raw data—remains access-restricted, so reproduction depends on the released checkpoints.

Citation

Logacjov, A., et al. (2024). SelfPAB: large-scale pre-training on accelerometer data for human activity recognition. Applied Intelligence.

DOI: 10.1007/s10489-024-05322-3

Citations

Total Citations: 17
Influential: 1
References: 48

GitHub

Stars: 19
Forks: 4
Open Issues: 1
Contributors: 2
Last Push: 1y ago
Language: Python
License: MIT

Openness

bio.rodeo openness: Fully open (usable and reproducible)
Score: 78 (Open)
Usability (can I run it?): 95
Reproducibility (can I retrain it?): 58
Model Openness Framework: Class III (Open Model)

Tags

accelerometry, foundation_model, human_activity_recognition, representation_learning, self_supervised, transformer
