Paper Title
A Guideline for the Statistical Analysis of Compositional Data in Immunology
Paper Authors
Abstract
The study of immune cellular composition has been of great scientific interest in immunology because of the generation of multiple large-scale datasets. From a statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive and all the elements sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and on the generalized linear model with a Dirichlet distribution, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
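To make the definition of compositional data concrete, here is a minimal sketch in plain Python. It rescales positive parts so that they sum to one, then applies two common log-ratio transformations: the centered log-ratio (clr) and the additive log-ratio (alr). The abstract does not say which log-ratio transformations the paper uses, so the choice of clr and alr is an assumption. The function names and the cell counts are hypothetical and only for illustration.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so that they sum to one."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    total = sum(parts)
    return [p / total for p in parts]


def clr(composition):
    """Centered log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(composition):
    """Additive log-ratio: log of each part relative to the last part."""
    reference = composition[-1]
    return [math.log(p / reference) for p in composition[:-1]]


# Hypothetical counts for three immune cell types in one sample (not real data).
counts = [30.0, 50.0, 20.0]
fractions = closure(counts)      # parts now sum to one
clr_values = clr(fractions)      # sums to zero by construction
alr_values = alr(fractions)      # one fewer coordinate than parts
```

The clr coordinates always sum to zero, and the alr coordinates have one fewer dimension than the composition. Both reflect the fact that a composition with D parts carries only D - 1 free quantities, which is one way to see why methods that treat the parts as unconstrained variables are not directly applicable.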