Paper title
Evaluating the relative contribution of data sources in a Bayesian analysis with the application of estimating the size of hard to reach populations
Paper authors
Abstract
When using multiple data sources in an analysis, it is important to understand the influence of each data source on the analysis and the consistency of the data sources with each other and with the model. We suggest using a retrospective value of information framework to address these concerns. Value of information methods can be computationally difficult, so we illustrate computational methods that allow them to be applied even in relatively complicated settings. To illustrate the proposed methods, we focus on an application in estimating the size of hard-to-reach populations. Specifically, we consider estimating the number of injection drug users in Ukraine by combining all available data sources, which span over half a decade and numerous sub-national areas. This application is of interest to public health researchers because this hard-to-reach population plays a large role in the spread of HIV. We apply a Bayesian hierarchical model and evaluate the contribution of each data source in terms of absolute influence, expected influence, and level of surprise. Finally, we apply value of information methods to inform suggestions on future data collection.