End-to-end single-cell multimodal analysis framework using deep parametric inference to integrate RNA and protein data into a unified latent space.
DPI (Deep Parametric Inference) is a comprehensive end-to-end framework for modeling and analyzing single-cell multimodal omics data. Developed by researchers at Xiamen University and published in Briefings in Bioinformatics in January 2023, DPI addresses a central challenge in modern single-cell biology: how to meaningfully integrate data from multiple measurement modalities — typically RNA expression and surface protein abundance — into a representation that captures cellular heterogeneity more faithfully than either modality alone.
The core idea behind DPI is to transform each data modality into a shared parameter space by inferring the distributional parameters (mean and variance) that best describe each measurement type under biologically motivated statistical models. RNA expression is modeled as a negative binomial distribution to account for overdispersion and dropout, while protein abundance data from CITE-seq and REAP-seq assays is modeled as a Poisson distribution reflecting its distinct noise characteristics. These modality-specific parameter estimates are then combined into a unified multimodal latent space using a variational autoencoder (VAE) framework.
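To make the two noise models concrete, the sketch below evaluates the log-probability of a single count under a negative binomial and under a Poisson distribution. The mean and inverse-dispersion parameterization and the function names are illustrative assumptions; they are not taken from the DPI code base.

```python
import math


def nb_log_pmf(x: int, mu: float, theta: float) -> float:
    """Log-probability of count x under a negative binomial distribution.

    Parameterized by mean mu and inverse dispersion theta, so that the
    variance is mu + mu**2 / theta (larger than the mean: overdispersion).
    """
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def poisson_log_pmf(x: int, lam: float) -> float:
    """Log-probability of count x under a Poisson distribution with rate lam."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


# At the same mean, the negative binomial puts far more mass on zero counts,
# which is one reason it is a common choice for sparse RNA count data.
p_zero_nb = math.exp(nb_log_pmf(0, mu=5.0, theta=2.0))
p_zero_poisson = math.exp(poisson_log_pmf(0, lam=5.0))
```

As theta grows, the extra variance term mu**2 / theta vanishes and the negative binomial approaches the Poisson, so the two models differ mainly in how much overdispersion they allow.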
Compared to approaches that concatenate modalities directly or that learn a single joint encoder, DPI's parametric strategy preserves information about the statistical structure of each data type before integration, which the authors argue produces higher-quality cell representations. Benchmarks across multiple CITE-seq datasets demonstrate improved clustering performance relative to Seurat v4 and TotalVI as measured by the Calinski-Harabasz and Silhouette indices.
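For readers unfamiliar with the first of these metrics, the Calinski-Harabasz index is the ratio of between-cluster to within-cluster dispersion, each divided by its degrees of freedom; higher values indicate tighter, better-separated clusters. The following is a minimal NumPy sketch of the textbook formula, not the evaluation code used by the authors.

```python
import numpy as np


def calinski_harabasz(X, labels):
    """Calinski-Harabasz index for points X (n x d) and integer cluster labels."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    overall_mean = X.mean(axis=0)

    between = 0.0  # dispersion of cluster centroids around the overall mean
    within = 0.0   # dispersion of points around their own centroid
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)

    return (between / (k - 1)) / (within / (n - k))
```

In practice, scikit-learn provides `calinski_harabasz_score` and `silhouette_score` in `sklearn.metrics`, and those would normally be preferred over a hand-written version.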
DPI is implemented in Python and consists of three cooperating neural networks. The RNA parameter inference network and the protein parameter inference network each act as a variational autoencoder: each learns to reconstruct its input under the corresponding statistical assumption while inferring the mean and variance of a Gaussian latent distribution, with a standard normal serving as the prior, as is usual for VAEs. A third network, the multimodal parameter inference network, receives the concatenated mean and variance vectors from both modality-specific networks and learns a joint latent representation.
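The data flow through the three networks can be sketched with untrained, single-layer encoders in NumPy. Everything specific here is an assumption made for illustration: the layer sizes, the one-layer encoders, the random weights, and all names. Decoders and the training loop are omitted, so this shows only how the modality-specific means and variances feed the joint encoder.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_encoder(in_dim, latent_dim):
    """Random weights for a one-layer Gaussian encoder (illustrative only)."""
    return {
        "W_mu": rng.normal(scale=0.1, size=(in_dim, latent_dim)),
        "W_logvar": rng.normal(scale=0.1, size=(in_dim, latent_dim)),
    }


def encode(params, x):
    """Map inputs to the mean and variance of a diagonal Gaussian."""
    mu = x @ params["W_mu"]
    var = np.exp(x @ params["W_logvar"])  # exp keeps the variance positive
    return mu, var


def sample(mu, var):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + np.sqrt(var) * rng.standard_normal(mu.shape)


def kl_to_standard_normal(mu, var):
    """Per-cell KL divergence from N(mu, diag(var)) to N(0, I)."""
    return 0.5 * np.sum(var + mu ** 2 - 1.0 - np.log(var), axis=1)


# Hypothetical sizes: 8 cells, 50 genes, 10 proteins.
n_cells, n_genes, n_proteins = 8, 50, 10
rna_dim, prot_dim, joint_dim = 6, 4, 5

rna = rng.random((n_cells, n_genes))         # stand-in for scaled RNA data
protein = rng.random((n_cells, n_proteins))  # stand-in for scaled protein data

rna_enc = make_encoder(n_genes, rna_dim)
prot_enc = make_encoder(n_proteins, prot_dim)
joint_enc = make_encoder(2 * (rna_dim + prot_dim), joint_dim)

# Modality-specific networks infer a mean and variance per cell.
mu_r, var_r = encode(rna_enc, rna)
mu_p, var_p = encode(prot_enc, protein)

# The multimodal network sees the concatenated parameters, not the raw data.
joint_input = np.concatenate([mu_r, var_r, mu_p, var_p], axis=1)
mu_m, var_m = encode(joint_enc, joint_input)
z = sample(mu_m, var_m)  # joint latent representation, one row per cell
```

The KL helper shows the usual VAE regularizer that pulls each inferred Gaussian toward the standard normal prior; in a full model it would be added to a reconstruction loss for each network.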
Preprocessing follows established single-cell conventions. RNA data is filtered to keep cells with at least 200 detected genes and genes detected in at least 3 cells, and cells whose mitochondrial content exceeds 20% are removed. RNA counts are then log-normalized and scaled to the [0, 1] range; protein counts are normalized with a centered log-ratio (CLR) transformation before scaling. The framework was validated on four datasets: the 8,617-cell cord blood mononuclear cell (CBMC) dataset, PBMC5K, PBMC10K, and MALT10K. On these clustering benchmarks, DPI outperformed both Seurat v4 and TotalVI on the Calinski-Harabasz and Silhouette scores and identified 14 distinct cell subtypes in the CBMC dataset. The authors also report that the multimodal parameter space recovered dropout-affected gene expression distributions more accurately than the individual modality reconstructions.
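The filtering and normalization steps described above can be sketched in NumPy as follows. The thresholds come from the text; the details it does not specify are assumptions and are marked as such in the comments: the scale factor of 10,000, per-feature min-max scaling, and a per-cell CLR with a pseudocount of 1. Function names are hypothetical.

```python
import numpy as np


def filter_rna(counts, is_mito, min_genes=200, min_cells=3, max_mito_frac=0.20):
    """Filter a cells x genes count matrix; is_mito is a boolean mask over genes."""
    genes_per_cell = (counts > 0).sum(axis=1)
    totals = counts.sum(axis=1)
    mito_frac = counts[:, is_mito].sum(axis=1) / np.maximum(totals, 1)
    keep_cells = (genes_per_cell >= min_genes) & (mito_frac <= max_mito_frac)
    counts = counts[keep_cells]
    # Gene filter is applied after the cell filter (ordering is an assumption).
    keep_genes = (counts > 0).sum(axis=0) >= min_cells
    return counts[:, keep_genes], keep_cells, keep_genes


def log_normalize(counts, scale_factor=1e4):
    """Library-size normalize each cell, then log1p. scale_factor is an assumption."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.log1p(counts / np.maximum(totals, 1) * scale_factor)


def clr(counts):
    """Centered log-ratio per cell, using a pseudocount of 1 (assumption)."""
    logged = np.log1p(counts)
    return logged - logged.mean(axis=1, keepdims=True)


def min_max_scale(x):
    """Scale each feature (column) to [0, 1]; constant columns map to 0."""
    lo = x.min(axis=0, keepdims=True)
    span = x.max(axis=0, keepdims=True) - lo
    return (x - lo) / np.where(span == 0, 1, span)
```

A typical call chain under these assumptions would be `min_max_scale(log_normalize(filtered_rna))` for RNA and `min_max_scale(clr(protein_counts))` for protein, with the same `keep_cells` mask applied to the protein matrix so both modalities describe the same cells.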
DPI is designed for researchers working with CITE-seq or REAP-seq data who need to move beyond unimodal analysis. It is particularly useful for characterizing cellular heterogeneity in complex tissue samples such as cord blood, peripheral blood, and bone marrow. In addition to integration, DPI provides reference-query functionality, which makes it practical for multi-study analyses where batch effects would otherwise confound direct comparison, and a perturbation prediction module, which serves as a computational tool for hypothesis generation in studies of immune cell activation, differentiation, or disease progression. The authors demonstrate the latter by tracking COVID-19 disease progression in peripheral blood mononuclear cells (PBMCs).
DPI contributes a statistically principled approach to a rapidly growing problem in single-cell biology: the integration of heterogeneous measurement modalities that differ in their noise properties, dynamic range, and information content. By grounding integration in the parametric structure of each modality's distribution, DPI avoids assumptions of homogeneity that can distort joint embeddings. The framework's inclusion of vector field-based trajectory analysis and perturbation modeling extends its utility beyond simple dimensionality reduction into mechanistic hypothesis generation. As single-cell multiomics platforms expand to include ATAC-seq, spatial transcriptomics, and metabolomics alongside RNA and protein, frameworks like DPI that treat statistical heterogeneity explicitly will become increasingly relevant. The code and datasets are freely available on GitHub, lowering the barrier for adoption across research groups without specialized machine learning expertise.
Hu, H., et al. (2023). Modeling and analyzing single-cell multimodal data with deep parametric inference. Briefings in Bioinformatics.
DOI: 10.1093/bib/bbad005