bio.rodeo

© 2026 Pulsatance. All rights reserved.
Biosignals

Large Connectome Model (LCM)

University of North Carolina at Chapel Hill

A 1.2B-parameter fMRI foundation model of brain connectomes that uses brain-environment interaction tokens and multitask learning for behavior and disease prediction.

Released: October 2025
Parameters: 1.2 Billion

The Large Connectome Model (LCM) is a foundation model for functional magnetic resonance imaging (fMRI) that learns general-purpose representations of the human brain connectome — the network of statistical dependencies between activity in different brain regions. Clinical neuroimaging studies are typically constrained by small cohorts, which limits the performance of task-specific deep learning models. LCM addresses this bottleneck by pretraining on a large, demographically diverse pool of healthy and patient scans and then transferring to downstream clinical tasks with limited labeled data.
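To make the notion of a connectome concrete, the sketch below builds a functional connectivity matrix from regional time series using Pearson correlation. This is a common construction assumed here for illustration only; this page does not state which dependency measure or brain atlas LCM uses, and the function names and toy signals are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def functional_connectome(timeseries):
    """timeseries: one signal (list of floats) per brain region.
    Returns a symmetric region-by-region correlation matrix."""
    n = len(timeseries)
    fc = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            fc[i][j] = fc[j][i] = pearson(timeseries[i], timeseries[j])
    return fc

# Toy data (hypothetical): three regions, five time points each.
roi_signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with region 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as region 0 rises
]
fc = functional_connectome(roi_signals)
```

In practice the regions come from a parcellation atlas and the signals from preprocessed scans, so the resulting matrix depends on both choices.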

The model's central idea is to treat the brain not in isolation but in the context of its environment. LCM tokenizes "brain-environment interactions" (BEI) — pairings of connectome features with demographic and environmental variables such as age, sex, and behavioral measures — and learns across many of these interactions simultaneously in a multitask landscape. This framing lets the model exploit weak supervisory signal from metadata that usually accompanies fMRI scans, turning otherwise unlabeled data into useful pretraining targets.
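One way to picture the pairing described above is as a small data structure: each scan contributes one token per metadata variable it carries, and every token shares the same connectome features. The sketch below is a loose illustration under that assumption, not LCM's actual tokenizer, which this page does not describe; `BEIToken`, `build_bei_tokens`, and the metadata keys are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class BEIToken:
    variable: str   # name of the demographic or environmental variable
    target: float   # its value, usable as a pretraining target
    features: list  # connectome features shared by all tokens of one scan

def upper_triangle(fc):
    """Flatten the strictly upper triangle of a symmetric matrix."""
    n = len(fc)
    return [fc[i][j] for i in range(n) for j in range(i + 1, n)]

def build_bei_tokens(fc, metadata):
    """Emit one token per metadata variable that is present (not None)."""
    feats = upper_triangle(fc)
    return [BEIToken(name, float(value), feats)
            for name, value in metadata.items() if value is not None]

# Toy scan (hypothetical values): a 3-region connectome plus metadata,
# with one behavioral measure missing.
fc = [[1.0, 0.2, -0.1],
      [0.2, 1.0, 0.5],
      [-0.1, 0.5, 1.0]]
metadata = {"age": 63, "sex": 1, "fluid_cognition": None}
tokens = build_bei_tokens(fc, metadata)
```

Here the scan yields two tokens (age and sex) and simply skips the missing measure, which is how partially annotated scans can still contribute pretraining targets.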

LCM was developed by Ziquan Wei, Tingting Dan, and Guorong Wu at the University of North Carolina at Chapel Hill (ACMLab) and published at AAAI-26. It is among the largest publicly released brain connectome foundation models, with pretrained weights and code made available for reproducibility.

#Key Features

  • Brain-environment interaction tokens: LCM encodes connectome features jointly with demographic and environmental variables, converting routinely collected metadata into pretraining signal rather than discarding it.
  • Multitask pretraining: The model is pretrained across multiple tokenized brain-environment interactions at once, learning shared representations that generalize across cohorts and clinical questions.
  • Semi-supervised finetuning: Downstream adaptation uses pseudo-labels derived from the pretrained BEI representations, improving performance when labeled clinical data is scarce.
  • Broad clinical coverage: A single pretrained backbone supports sex prediction, behavior recognition, and early diagnosis of autism, Parkinson's disease, Alzheimer's disease, and schizophrenia.
  • Open weights and code: Pretrained LCM weights and the full training pipeline are released, lowering the barrier for neuroimaging groups to fine-tune on their own cohorts.
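The semi-supervised finetuning feature can be illustrated with a generic pseudo-labeling step. The sketch below is not LCM's procedure, which this page does not specify; it assumes embeddings from a pretrained backbone are already available, uses a nearest-centroid rule as a stand-in classifier, and keeps only confident assignments. The function names, the `min_margin` threshold, and the toy embeddings are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def pseudo_label(labeled, unlabeled, min_margin=0.5):
    """labeled: list of (embedding, label); unlabeled: list of embeddings.
    Needs at least two classes. Keeps an unlabeled point only when its
    nearest class centroid is closer than the runner-up by min_margin."""
    by_class = {}
    for emb, lab in labeled:
        by_class.setdefault(lab, []).append(emb)
    centroids = {lab: centroid(embs) for lab, embs in by_class.items()}

    accepted = []
    for emb in unlabeled:
        ranked = sorted((math.dist(emb, c), lab) for lab, c in centroids.items())
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d2 - d1 >= min_margin:
            accepted.append((emb, best))
    return accepted

# Toy embeddings (hypothetical): two labeled classes, three unlabeled scans.
labeled = [([0.0, 0.0], "control"), ([0.2, 0.0], "control"),
           ([1.0, 1.0], "patient"), ([1.2, 1.0], "patient")]
unlabeled = [[0.1, 0.1], [1.0, 0.9], [0.6, 0.5]]
accepted = pseudo_label(labeled, unlabeled)
```

The third unlabeled scan sits midway between the two centroids and is left out, so only confident pseudo-labels would be added to the finetuning set.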

#Technical Details

LCM uses a decoder-only Transformer architecture. The largest variant (LCM-Big) has 32 layers and approximately 1.2 billion parameters, and combines multi-head self-attention over connectome features with cross-attention to the tokenized brain-environment interactions.

The model was pretrained on about 10,000 subjects (the reported figure is 10,036), drawn primarily from the Human Connectome Project Aging (HCPA) and Young Adult (HCPYA) cohorts. It was then evaluated across eight fMRI datasets, including ADNI, PPMI, ABIDE, Taowu, Neurocon, and a schizophrenia cohort. Reported results:

  • Alzheimer's disease diagnosis: 86.30% accuracy, 85.33% F1
  • Parkinson's disease diagnosis: 81.30% accuracy, 84.18% F1
  • Autism diagnosis: 71.46% accuracy, 72.50% F1
  • Sex prediction: up to 100% accuracy on smaller datasets

Pretrained weights are distributed through a Google Drive link in the GitHub repository.
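For readers comparing the accuracy and F1 figures above, the sketch below shows how the two metrics are computed for a binary diagnosis task. It assumes a binary F1 on the positive (patient) class; this page does not say whether the reported F1 scores are binary, macro-averaged, or weighted, and the labels below are toy values, not results from the paper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy over all samples and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1

# Toy labels (hypothetical): 1 = patient, 0 = control.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```

Because F1 ignores true negatives, it can land above or below accuracy depending on class balance, which is why both numbers are worth reading together.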

#Applications

LCM is aimed at neuroimaging and clinical research groups that want strong baselines for connectome analysis without training large models from scratch. Because the pretrained backbone transfers across cohorts, researchers can fine-tune it on small disease-specific datasets for early diagnosis of neurodegenerative and psychiatric conditions, or use it for phenotype and behavior prediction in cognitive neuroscience studies. The brain-environment interaction framing is particularly useful for studies that collect rich demographic and behavioral metadata alongside resting-state or task fMRI.

#Impact

LCM extends the foundation-model paradigm that reshaped protein and genomics research into functional neuroimaging, a field where small sample sizes have long limited deep learning. By demonstrating that large-scale multitask pretraining on connectomes plus environmental context improves transfer to diverse clinical tasks, it offers a reusable backbone for the neuroimaging community and an open reference point for future brain foundation models. Its main limitations are typical of the area: pretraining draws heavily on healthy-adult HCP cohorts, downstream clinical datasets remain modest in size, and connectome construction depends on upstream preprocessing and atlas choices that can affect generalization.

Citation

Large Connectome Model: An fMRI Foundation Model of Brain Connectomes Empowered by Brain-Environment Interaction in Multitask Learning Landscape

Wei, Z., et al. (2026) Large Connectome Model: An fMRI Foundation Model of Brain Connectomes Empowered by Brain-Environment Interaction in Multitask Learning Landscape. Proceedings of the AAAI Conference on Artificial Intelligence.

DOI: 10.1609/aaai.v40i3.37198

GitHub

Stars: 4
Forks: 1
Open Issues: 0
Contributors: 1
Last Push: 10mo ago
Language: Python

Openness

bio.rodeo openness: Closed (low usability and reproducibility)
Score: 26 (Closed)
Usability (can I run it?): 24
Reproducibility (can I retrain it?): 15
Model Openness Framework: Unclassified
Restrictive license on core components

Tags

behavior_prediction, brain_connectome, disease_diagnosis, fmri, foundation_model, multi_task, phenotype_prediction, self_supervised, transformer

Resources

GitHub Repository
Research Paper