Paper Title
Contextual Bandits for Advertising Campaigns: A Diffusion-Model Independent Approach (Extended Version)
Paper Authors
Paper Abstract
Motivated by scenarios of information diffusion and advertising in social media, we study an influence maximization problem in which little is assumed to be known about the diffusion network or about the model that determines how information may propagate. In such a highly uncertain environment, one can focus on multi-round diffusion campaigns, with the objective of maximizing the number of distinct users that are influenced or activated, starting from a known base of a few influential nodes. During a campaign, spread seeds are selected sequentially at consecutive rounds, and feedback is collected in the form of the activated nodes at each round. A round's impact (reward) is then quantified as the number of newly activated nodes. Overall, one must maximize the campaign's total spread, as the sum of the rounds' rewards. In this setting, an explore-exploit approach can be used to learn the key underlying diffusion parameters while running the campaign. We describe and compare two contextual multi-armed bandit methods with upper-confidence bounds on the remaining potential of influencers: one using a generalized linear model and the Good-Turing estimator for remaining potential (GLM-GT-UCB), and another that directly adapts the LinUCB algorithm to our setting (LogNorm-LinUCB). We show that they outperform baseline methods built on state-of-the-art ideas, on both synthetic and real-world data, while exhibiting different and complementary behavior depending on the scenarios in which they are deployed.
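To make the core idea behind the Good-Turing component more concrete, the following is a minimal, illustrative sketch (not the paper's actual implementation): an influencer's remaining potential is estimated by the classic Good-Turing missing-mass ratio, the number of nodes it activated in exactly one past round (hapaxes) divided by the number of rounds it was seeded, and an exploration bonus of the usual sqrt(log t / n) shape is added to obtain an optimistic index. The function names, the bonus constant `c`, and the toy history are assumptions for illustration only; the exact confidence bound in GLM-GT-UCB differs.

```python
import math
from collections import Counter

def good_turing_remaining_potential(activation_sets):
    """Good-Turing estimate of the missing mass for one influencer.

    activation_sets: one set of activated node ids per past round in
    which this influencer was seeded. The estimate is the hapax count
    (nodes seen in exactly one round) divided by the number of rounds.
    """
    n = len(activation_sets)
    if n == 0:
        return 1.0  # no feedback yet: assume full potential (assumption)
    counts = Counter(node for s in activation_sets for node in s)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / n

def ucb_index(activation_sets, t, c=1.0):
    """Optimistic index: Good-Turing estimate plus a generic
    sqrt(log t / n) exploration bonus (illustrative constant)."""
    n = max(len(activation_sets), 1)
    bonus = c * math.sqrt(math.log(max(t, 2)) / n)
    return good_turing_remaining_potential(activation_sets) + bonus

# Toy example: pick the influencer with the largest optimistic index.
history = {
    "influencer_a": [{1, 2, 3}, {2, 3, 4}],  # nodes 1 and 4 seen once
    "influencer_b": [{5, 6}],                # every node seen once
}
best = max(history, key=lambda k: ucb_index(history[k], t=3))
```

Intuitively, an influencer that keeps activating previously unseen nodes has many hapaxes, hence a high estimated remaining potential, while one that mostly re-activates the same nodes is estimated to be nearly exhausted.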