Paper title
Unimodal vs. Multimodal Siamese Networks for Outfit Completion
Paper authors
Paper abstract
Online fashion shopping continues to grow in popularity, and the ability to offer customers effective recommendations is becoming increasingly important. In this work, we focus on the Fashion Outfits Challenge, part of the SIGIR 2022 Workshop on eCommerce. The challenge is centered on the Fill in the Blank (FITB) task: predicting the missing item of an outfit, given the incomplete outfit and a list of candidates. In this paper, we apply siamese networks to the task. More specifically, we explore how combining information from multiple modalities (textual and visual) affects the model's performance. We evaluate our model on the test split provided by the challenge organizers and on a test split with gold assignments that we created during the development phase. We find that both using visual data alone and using visual and textual data together yield promising results on the task. We conclude by suggesting directions for further improvement of our method.
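To make the FITB setup concrete, the following is a minimal sketch of candidate selection over item embeddings. It is not the method from the paper: the concatenation-based fusion, the mean cosine-similarity scoring rule, and the names `fuse` and `fill_in_the_blank` are all assumptions introduced for illustration, and the vectors are hand-written toy values rather than outputs of a trained siamese encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(visual, textual):
    """Hypothetical multimodal fusion: concatenate the visual and textual vectors."""
    return list(visual) + list(textual)


def fill_in_the_blank(outfit, candidates):
    """Return the index of the candidate with the highest mean similarity to the outfit items.

    Assumed scoring rule for illustration only; a trained model would
    supply the embeddings and possibly a learned compatibility score.
    """
    if not outfit or not candidates:
        raise ValueError("outfit and candidates must be non-empty")
    scores = [
        sum(cosine(candidate, item) for item in outfit) / len(outfit)
        for candidate in candidates
    ]
    return max(range(len(candidates)), key=scores.__getitem__)


# Toy example: two items in the incomplete outfit, three candidates.
outfit = [fuse([1.0, 0.0], [0.9, 0.1]), fuse([0.8, 0.2], [1.0, 0.0])]
candidates = [
    fuse([0.0, 1.0], [0.0, 1.0]),
    fuse([0.9, 0.1], [0.95, 0.05]),
    fuse([-1.0, 0.0], [0.0, -1.0]),
]
best = fill_in_the_blank(outfit, candidates)
print(best)
```

A unimodal variant under the same assumptions would pass only the visual (or only the textual) vectors instead of calling `fuse`, which is one simple way to compare the two settings discussed in the abstract.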