Paper title
Is distribution-free inference possible for binary regression?
Paper authors
Paper abstract
For a regression problem with a binary label response, we examine the problem of constructing confidence intervals for the label probability conditional on the features. In a setting where we do not have any information about the underlying distribution, we would ideally like to provide confidence intervals that are distribution-free---that is, valid with no assumptions on the distribution of the data. Our results establish an explicit lower bound on the length of any distribution-free confidence interval, and construct a procedure that can approximately achieve this length. In particular, this lower bound is independent of the sample size and holds for all distributions with no point masses, meaning that it is not possible for any distribution-free procedure to be adaptive with respect to any type of special structure in the distribution.
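As a rough sketch of what "distribution-free" validity could mean here, the requirement can be written as a coverage guarantee that must hold for every data distribution. The notation below ($P$, $\pi_P$, $\widehat{C}_n$, $\alpha$) and the marginal form of the guarantee are assumptions made for illustration; the abstract does not fix them.

```latex
% Assumed setup: (X_1,Y_1),\dots,(X_{n+1},Y_{n+1}) are i.i.d. draws from an
% unknown distribution P on \mathcal{X} \times \{0,1\}; \pi_P(x) = P(Y=1 \mid X=x)
% is the label probability; \widehat{C}_n(\cdot) is an interval built from the
% first n points; 1-\alpha is the target coverage level.
\[
  \mathbb{P}_{(X_i,Y_i)\,\overset{\mathrm{iid}}{\sim}\,P}
  \Big\{ \pi_P(X_{n+1}) \in \widehat{C}_n(X_{n+1}) \Big\} \;\ge\; 1-\alpha
  \qquad \text{for every distribution } P .
\]
```

Under this reading, the abstract's lower bound says that the length of $\widehat{C}_n(X_{n+1})$ cannot shrink to zero as $n$ grows whenever $P$ has no point masses, however smooth or structured $\pi_P$ may be.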