Title
Notes on the interpretation of dependence measures
Authors
Abstract
Besides the classical distinction between correlation and dependence, many dependence measures have further pitfalls in their application and interpretation. The aim of this paper is to raise awareness of, and recall, some of these limitations by explicitly discussing Pearson's correlation and the multivariate dependence measures distance correlation, distance multicorrelations and their copula versions. The aspects discussed include types of dependence, bias of empirical measures, and the influence of marginal distributions and dimensions. In general, it is recommended to use a proper dependence measure instead of Pearson's correlation. Moreover, a measure which is distribution-free (at least in some sense) can help to avoid certain systematic errors. Nevertheless, in a truly multivariate setting, only the p-values of the corresponding independence tests always provide values with an unambiguous interpretation.
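As an illustration of the first point (correlation versus dependence), the following minimal sketch compares Pearson's correlation with the empirical distance correlation on a purely quadratic relationship. It is not taken from the paper: the function names `distance_correlation` and `permutation_pvalue` are hypothetical, the estimator is the standard biased (V-statistic) form of the empirical distance correlation, and the permutation test is one common way to obtain a p-value, assumed here for illustration only.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered Euclidean distance matrix of a sample (rows = observations)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Biased (V-statistic) empirical distance correlation, a value in [0, 1]."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    # max(...) guards against tiny negative values caused by rounding
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def permutation_pvalue(x, y, n_perm=199, seed=0):
    """Permutation p-value for independence, using squared distance covariance."""
    rng = np.random.default_rng(seed)
    a = _centered_distances(x)
    b = _centered_distances(y)
    observed = (a * b).mean()
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(len(b))
        # permuting rows and columns jointly corresponds to permuting the y sample
        if (a * b[np.ix_(p, p)]).mean() >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# y is a deterministic function of x, yet it is uncorrelated with x
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2

pearson = float(np.corrcoef(x, y)[0, 1])
dcor = distance_correlation(x, y)
p_value = permutation_pvalue(x, y)

print(f"Pearson correlation:  {pearson:.3f}")
print(f"Distance correlation: {dcor:.3f}")
print(f"Permutation p-value:  {p_value:.3f}")
```

In this example Pearson's correlation is numerically zero although y is fully determined by x, whereas the distance correlation is clearly positive. Note that this V-statistic estimator is nonnegative by construction and therefore biased upward for independent samples, which is one reason the raw value alone can be hard to interpret without the accompanying p-value.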