Stevens Institute of Technology
Generalist EEG-to-text foundation model that translates EEG segments into clinically grounded natural-language descriptions via spectro-spatial grounding and state-space reasoning.
NeuroNarrator is a generalist EEG-to-text foundation model that translates electroencephalography (EEG) recordings into segment-level, clinically grounded natural-language descriptions. Introduced in a March 2026 bioRxiv preprint (also on arXiv) by researchers at the Stevens Institute of Technology, it addresses a core challenge in clinical neuroscience: bridging continuous neural dynamics and the discrete, open-vocabulary language clinicians use to describe them.
Most prior EEG models frame interpretation as closed-set classification over a fixed label set. NeuroNarrator instead generates free-text clinical narratives, enabling open-vocabulary description of EEG findings. To make this possible, the authors assemble NeuroCorpus-160K, which they present as the first harmonized large-scale resource of its kind: it pairs more than 160,000 EEG segments with structured, clinically grounded natural-language descriptions, aggregated and standardized from 16 heterogeneous datasets.
The model sits at the intersection of biosignal foundation models (such as NeuroLM) and multimodal large language models, extending the EEG-representation-learning literature toward generative clinical interpretation rather than label prediction.
NeuroNarrator is a multimodal large language model with three main stages. First, a spectro-spatial grounding module uses a contrastive objective to project temporal EEG waveforms and spatial topographic maps into a shared semantic space. Second, a state-space formulation integrates historical temporal-spectral context to capture continuous neural dynamics. Third, the grounded representations condition a language model that generates the clinical narrative. Training relies on NeuroCorpus-160K (more than 160,000 EEG-text pairs harmonized from 16 datasets) with a subject-disjoint split: no subject contributes segments to both the training and evaluation sets, so evaluation reflects performance on unseen individuals. The authors report evaluations across diverse benchmarks and zero-shot transfer tasks, but the abstract does not give specific quantitative scores or parameter counts. No code, weights, or dataset have been released with the preprint.
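A subject-disjoint split of the kind described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure, which the preprint abstract does not spell out: the function name `subject_disjoint_split`, the `(subject_id, segment)` tuple layout, the 20% evaluation fraction, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def subject_disjoint_split(segments, eval_fraction=0.2, seed=0):
    """Split (subject_id, segment) pairs so that no subject lands on both sides.

    Hypothetical helper for illustration; not taken from the NeuroNarrator preprint.
    """
    # Group segments by the subject they were recorded from.
    by_subject = defaultdict(list)
    for subject_id, segment in segments:
        by_subject[subject_id].append(segment)

    # Shuffle subjects (not segments) with a fixed seed for reproducibility.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    # Reserve a fraction of the subjects, at least one, for evaluation.
    n_eval = max(1, round(len(subjects) * eval_fraction))
    eval_subjects = set(subjects[:n_eval])

    train, evaluation = [], []
    for subject_id in subjects:
        target = evaluation if subject_id in eval_subjects else train
        target.extend((subject_id, seg) for seg in by_subject[subject_id])
    return train, evaluation


# Toy data (assumed): 5 subjects with 4 segments each.
segments = [(f"sub-{i % 5}", f"seg-{i}") for i in range(20)]
train, evaluation = subject_disjoint_split(segments)

train_subjects = {subject for subject, _ in train}
eval_subjects = {subject for subject, _ in evaluation}
print(len(train), len(evaluation), train_subjects & eval_subjects)
```

Splitting at the subject level rather than the segment level matters because segments from the same person are highly correlated; a segment-level shuffle would let the model see an individual's EEG during training and then be scored on near-duplicates of it.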
NeuroNarrator targets clinical neurophysiology workflows in which EEG must be interpreted and documented, such as epilepsy monitoring, encephalopathy assessment, and routine EEG review. By producing draft natural-language descriptions of EEG segments, it could assist clinicians with documentation and triage, and serve as a foundation for downstream EEG-language tasks. Its open-vocabulary design could also benefit researchers building generalist biosignal interpretation systems.
NeuroNarrator contributes a large-scale, harmonized EEG-to-text corpus, presented by its authors as the first of its kind, and a generative framework that reframes EEG interpretation from closed-set classification to open-vocabulary clinical narration. Its standardized, subject-disjoint benchmark could help structure the still-nascent area of biosignal foundation models. Because no code, weights, or dataset have been publicly released, the model's reproducibility and clinical utility remain to be demonstrated.