OutfitTransformer: Learning Outfit Representations for Fashion Recommendation

Paper title

OutfitTransformer: Learning Outfit Representations for Fashion Recommendation

Authors

Rohan Sarkar, Navaneeth Bodla, Mariya I. Vasileva, Yen-Liang Lin, Anurag Beniwal, Alan Lu, Gerard Medioni

Abstract

Learning an effective outfit-level representation is critical for predicting the compatibility of items in an outfit, and retrieving complementary items for a partial outfit. We present a framework, OutfitTransformer, that uses the proposed task-specific tokens and leverages the self-attention mechanism to learn effective outfit-level representations encoding the compatibility relationships between all items in the entire outfit for addressing both compatibility prediction and complementary item retrieval tasks. For compatibility prediction, we design an outfit token to capture a global outfit representation and train the framework using a classification loss. For complementary item retrieval, we design a target item token that additionally takes the target item specification (in the form of a category or text description) into consideration. We train our framework using a proposed set-wise outfit ranking loss to generate a target item embedding given an outfit, and a target item specification as inputs. The generated target item embedding is then used to retrieve compatible items that match the rest of the outfit. Additionally, we adopt a pre-training approach and a curriculum learning strategy to improve retrieval performance. Since our framework learns at an outfit-level, it allows us to learn a single embedding capturing higher-order relations among multiple items in the outfit more effectively than pairwise methods. Experiments demonstrate that our approach outperforms state-of-the-art methods on compatibility prediction, fill-in-the-blank, and complementary item retrieval tasks. We further validate the quality of our retrieval results with a user study.
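The retrieval step mentioned in the abstract, using a generated target item embedding to find compatible items, can be illustrated with a minimal nearest-neighbor sketch. Everything below is an assumption made for illustration: the abstract does not state the distance metric, the embedding size, or any function names, so `l2_distance`, `retrieve`, the Euclidean metric, and the toy vectors are hypothetical stand-ins rather than the authors' implementation.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(target_embedding, candidates, k=2):
    """Return the ids of the k candidate items closest to the target embedding.

    `candidates` maps an item id to its embedding vector.
    """
    ranked = sorted(
        candidates,
        key=lambda item_id: l2_distance(target_embedding, candidates[item_id]),
    )
    return ranked[:k]


# Hypothetical target item embedding, standing in for the model's output
# given a partial outfit and a target item specification.
target = [0.9, 0.1, 0.0]

# Hypothetical catalog of candidate item embeddings.
catalog = {
    "sneakers": [1.0, 0.0, 0.0],
    "sandals": [0.8, 0.3, 0.1],
    "scarf": [0.0, 1.0, 0.0],
}

top_items = retrieve(target, catalog, k=2)
print(top_items)
```

In this toy setup the two candidates nearest to the target vector are returned first. A real system would likely replace the linear scan with an approximate nearest-neighbor index over a much larger catalog.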
