Structured factorization for single-cell gene expression data

In: Journal of the Royal Statistical Society, Series C, 2026.

Citation: Canale, A., Galtarossa, L., Risso, D., Schiavon, L., Toto, G. (2026) Structured factorization for single-cell gene expression data. Journal of the Royal Statistical Society, Series C. doi: 10.1093/jrsssc/qlad068

Abstract: Motivated by the analysis of complex single-cell gene expression data, we propose a Bayesian class of generalized factor models for high-dimensional count data. The developed methodology allows us to incorporate external knowledge, such as biological pathways, into the model’s prior distribution. This approach promotes sparsity in the factor loadings, facilitating their interpretation and that of the corresponding latent factors. We demonstrate the effectiveness of our model on single-cell RNA sequencing data obtained from cord blood mononuclear cells, revealing promising insights into the role of pathways in characterizing gene relationships and extracting valuable information about unobserved cell traits.
