TissueNarrator reframes the analysis of spatially resolved transcriptomics as a language modeling problem. Spatial transcriptomics technologies such as MERFISH and CosMx SMI measure gene expression while preserving the physical location of each cell within a tissue section, producing rich maps of cellular state and neighborhood context. Making generative, predictive use of these data, rather than merely descriptive clustering, has remained difficult because cellular profiles, spatial coordinates, and metadata do not naturally fit a single model. TissueNarrator addresses this by encoding each tissue section as a sequence of "spatial sentences": ranked lists of expressed genes paired with spatial coordinates and metadata, rendered as text that a large language model can read and generate.

Developed by Jian Ma's lab at Carnegie Mellon University and released as a bioRxiv preprint in November 2025, the model adapts the open-weight Qwen3-4B-Base language model through parameter-efficient LoRA fine-tuning. Rather than training a bespoke architecture from scratch, TissueNarrator transfers the sequence-modeling capacity of a general pretrained LLM to the structured, spatially aware language of tissue biology.

This positions TissueNarrator within a growing class of single-cell and spatial foundation models that borrow the tokenize-and-generate paradigm of natural language processing. Its distinguishing move is treating spatial context as part of the generated sequence, enabling a single model to simulate realistic profiles, reason about intercellular communication, and predict the consequences of perturbations.

Key Features

Spatial sentences: Tissue sections are serialized into text sequences combining ranked gene lists, spatial coordinates, and metadata, letting a standard LLM operate directly on spatial transcriptomics data.
Generative cell profiles: The model generates realistic cellular expression profiles conditioned on spatial and metadata context rather than only classifying existing cells.
Intercellular interaction prediction: TissueNarrator predicts cell-cell interactions and recovers known ligand-receptor signaling pathways from spatial neighborhoods.
In silico perturbation: It supports computational perturbation experiments, simulating how cellular states shift in response to gene-level changes.
Parameter-efficient adaptation: LoRA fine-tuning of Qwen3-4B-Base keeps training tractable while reusing a general-purpose pretrained language model.
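
To make the "spatial sentence" idea concrete, here is a minimal sketch of how one cell might be serialized into text. The exact token layout used by TissueNarrator is not given here, so the function name `spatial_sentence`, the `x=`/`y=`/`genes:` format, and the `top_k` parameter are all illustrative assumptions rather than the authors' format.

```python
def spatial_sentence(expression, x, y, metadata, top_k=5):
    """Serialize one cell as text: coordinates, metadata, ranked genes.

    Hypothetical format for illustration only; not the authors' encoding.
    """
    # Keep only expressed genes, then rank by descending expression
    # (ties broken alphabetically so the output is deterministic).
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    ranked = sorted(expressed, key=lambda gv: (-gv[1], gv[0]))
    genes = [gene for gene, _ in ranked[:top_k]]

    # Metadata is rendered as key=value pairs in a stable key order.
    meta = [f"{key}={metadata[key]}" for key in sorted(metadata)]

    parts = [f"x={x:.1f}", f"y={y:.1f}"] + meta + ["genes:"] + genes
    return " ".join(parts)


cell = {"Gad1": 12, "Slc17a7": 0, "Snap25": 30, "Aqp4": 3}
print(spatial_sentence(cell, 10.0, 20.5, {"cell_type": "neuron"}))
# x=10.0 y=20.5 cell_type=neuron genes: Snap25 Gad1 Aqp4
```

A tissue section would then be a sequence of such sentences, which is what lets a text-only model read and generate cells in context.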

Technical Details

TissueNarrator is built on Qwen3-4B-Base, a 4-billion-parameter transformer language model, adapted via low-rank adaptation (LoRA) so that only a small set of additional weights is trained. Inputs are constructed as spatial sentences (per-cell ranked gene expression combined with spatial coordinates and metadata), so that autoregressive next-token prediction becomes the mechanism for generating and reasoning about tissue. The authors evaluate the approach across three spatial profiling platforms: MERFISH, Perturb-FISH, and CosMx SMI, spanning tasks of profile generation, intercellular interaction inference, ligand-receptor pathway recovery, and in silico perturbation. A pretrained LoRA checkpoint fine-tuned on a MERFISH mouse-brain dataset is distributed via Google Drive, with optional per-dataset fine-tuning supported. Training and inference in the reported experiments used an NVIDIA GPU with roughly 48 GB of VRAM.
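
As a rough illustration of why LoRA keeps training tractable, the sketch below implements the generic LoRA update for a single weight matrix, y = Wx + (alpha / r) · B(Ax), in plain Python, and counts trainable parameters. The rank, scaling factor, and matrix dimensions are illustrative assumptions; the LoRA hyperparameters and target layers actually used for TissueNarrator are not stated here.

```python
def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Frozen base projection plus a scaled low-rank update.

    W is d_out x d_in (frozen); A is r x d_in and B is d_out x r (trained).
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out, d_in, r):
    """Return (full fine-tuning parameters, LoRA parameters) for one matrix."""
    return d_out * d_in, r * (d_in + d_out)


# With B initialised to zeros, the adapted layer starts out identical to the base.
W = [[1, 0], [0, 1]]
A = [[1, 1]]          # r = 1
B_zero = [[0], [0]]
print(lora_forward(W, A, B_zero, [2, 3], alpha=1, r=1))  # [2.0, 3.0]

# Illustrative sizes only: a 2560 x 2560 projection with an assumed rank of 16.
full, lora = lora_param_counts(2560, 2560, 16)
print(full, lora, lora / full)  # 6553600 81920 0.0125
```

Under these assumed sizes, the adapter trains about 1.25% of the parameters of the matrix it modifies, while the base weights stay frozen.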

Applications

TissueNarrator is aimed at researchers studying tissue organization, cellular neighborhoods, and signaling in spatial transcriptomics data. By generating realistic cellular profiles and predicting intercellular interactions, it can help prioritize candidate ligand-receptor pathways, hypothesize the composition of cellular microenvironments, and screen perturbations computationally before committing to wet-lab Perturb-FISH or similar experiments. Because it adapts a general open-weight LLM with lightweight LoRA training, groups can fine-tune it on their own spatial datasets without the cost of building a model from scratch.
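
For readers new to the notion of a cellular neighborhood, the following sketch gathers the k nearest cells by Euclidean distance and tallies their cell types. This is a generic preprocessing idea, not TissueNarrator's documented procedure: how the model defines neighborhoods is not described here, and the function names, the `k` parameter, and the dictionary layout are assumptions made for illustration.

```python
import math


def k_nearest_cells(cells, index, k=3):
    """Return indices of the k cells closest to cells[index] (Euclidean)."""
    cx, cy = cells[index]["x"], cells[index]["y"]
    distances = [
        (math.hypot(c["x"] - cx, c["y"] - cy), i)
        for i, c in enumerate(cells)
        if i != index
    ]
    distances.sort()  # nearest first; ties broken by cell index
    return [i for _, i in distances[:k]]


def neighborhood_composition(cells, index, k=3):
    """Count cell-type labels among the k nearest neighbours of one cell."""
    counts = {}
    for i in k_nearest_cells(cells, index, k):
        label = cells[i]["cell_type"]
        counts[label] = counts.get(label, 0) + 1
    return counts


cells = [
    {"x": 0.0, "y": 0.0, "cell_type": "neuron"},
    {"x": 1.0, "y": 0.0, "cell_type": "astrocyte"},
    {"x": 0.0, "y": 2.0, "cell_type": "neuron"},
    {"x": 10.0, "y": 10.0, "cell_type": "microglia"},
]
print(k_nearest_cells(cells, 0, k=2))           # [1, 2]
print(neighborhood_composition(cells, 0, k=2))  # {'astrocyte': 1, 'neuron': 1}
```

A summary like this could serve as a simple baseline against which model-generated microenvironment hypotheses are compared.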

Impact

TissueNarrator is an early demonstration that general-purpose large language models can be repurposed as generative engines for spatial transcriptomics through a text-based encoding of tissue. By unifying profile generation, interaction prediction, pathway recovery, and perturbation simulation in one LLM-based framework, it broadens the toolkit for spatial biology beyond descriptive analysis toward generative, hypothesis-driven modeling. As a recent preprint, its benchmarks await peer review and broader independent validation, and practical adoption is constrained by the substantial GPU memory (~48 GB VRAM) required; the open MIT-licensed code and released checkpoint nonetheless lower the barrier for other labs to reproduce and extend the approach.

Citation

TissueNarrator: Generative Modeling of Spatial Transcriptomics with Large Language Models
