Paper Title
Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise
Paper Authors
Paper Abstract
In feature-based dynamic pricing, a seller sets appropriate prices for a sequence of products (described by feature vectors) on the fly by learning from the binary outcomes of previous sales sessions ("Sold" if valuation $\geq$ price, and "Not Sold" otherwise). Existing works assume either a noiseless linear valuation or a precisely known noise distribution, which limits the applicability of those algorithms in practice when these assumptions are hard to verify. In this work, we study two more agnostic models: (a) a "linear policy" problem, where we aim to compete with the best linear pricing policy while making no assumptions on the data, and (b) a "linear noisy valuation" problem, where the random valuation is linear plus an unknown and assumption-free noise. For the former model, we show a $\tilde{\Theta}(d^{\frac13}T^{\frac23})$ minimax regret up to logarithmic factors. For the latter model, we present an algorithm that achieves an $\tilde{O}(T^{\frac34})$ regret, and improve the best-known lower bound from $\Omega(T^{\frac35})$ to $\tilde{\Omega}(T^{\frac23})$. These results demonstrate that no-regret learning is possible for feature-based dynamic pricing under weak assumptions, but also reveal a disappointing fact: the seemingly richer pricing feedback is not significantly more useful than bandit feedback in regret reduction.
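The interaction protocol described in the abstract can be sketched as follows. This is only an illustration of the setting, not the paper's algorithm: every function name is hypothetical, and the uniform noise in `make_rounds` is an assumption made purely to generate data (the paper itself places no assumption on the noise). The sketch shows the binary "Sold"/"Not Sold" feedback, a linear pricing policy, and regret measured against a fixed linear benchmark.

```python
import random


def sale_outcome(valuation, price):
    """Binary feedback: the item is sold iff valuation >= price."""
    return valuation >= price


def linear_price(theta, x):
    """Linear pricing policy: the price is the inner product of theta and x."""
    return sum(t * xi for t, xi in zip(theta, x))


def run_policy(theta, features, valuations):
    """Total revenue of a fixed linear policy (revenue = price if sold, else 0)."""
    revenue = 0.0
    for x, v in zip(features, valuations):
        p = linear_price(theta, x)
        if sale_outcome(v, p):
            revenue += p
    return revenue


def regret_vs_policy(benchmark_theta, learner_prices, features, valuations):
    """Revenue of the benchmark linear policy minus the learner's revenue."""
    learner = sum(
        p for p, v in zip(learner_prices, valuations) if sale_outcome(v, p)
    )
    return run_policy(benchmark_theta, features, valuations) - learner


def make_rounds(theta_star, num_rounds, noise_width, seed=0):
    """Generate rounds with valuation = <theta_star, x> + noise.

    ASSUMPTION: uniform noise on [-noise_width, noise_width] is used only
    for illustration; the setting in the abstract leaves the noise unknown.
    """
    rng = random.Random(seed)
    d = len(theta_star)
    features, valuations = [], []
    for _ in range(num_rounds):
        x = [rng.random() for _ in range(d)]
        noise = rng.uniform(-noise_width, noise_width)
        features.append(x)
        valuations.append(linear_price(theta_star, x) + noise)
    return features, valuations
```

In an actual learning loop the seller would observe only the boolean returned by `sale_outcome`, never the valuation itself; the valuations are kept here solely so that revenue and regret can be computed after the fact.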