Predictive and Generative Neural Networks for Object Functionality

Paper Title

Predictive and Generative Neural Networks for Object Functionality

Authors

Hu, Ruizhen; Yan, Zihao; Zhang, Jingwen; van Kaick, Oliver; Shamir, Ariel; Zhang, Hao; Huang, Hui

Abstract

Humans can predict the functionality of an object even without any surroundings, since their knowledge and experience would allow them to "hallucinate" the interaction or usage scenarios involving the object. We develop predictive and generative deep convolutional neural networks to replicate this feat. Specifically, our work focuses on functionalities of man-made 3D objects characterized by human-object or object-object interactions. Our networks are trained on a database of scene contexts, called interaction contexts, each consisting of a central object and one or more surrounding objects, that represent object functionalities. Given a 3D object in isolation, our functional similarity network (fSIM-NET), a variation of the triplet network, is trained to predict the functionality of the object by inferring functionality-revealing interaction contexts. fSIM-NET is complemented by a generative network (iGEN-NET) and a segmentation network (iSEG-NET). iGEN-NET takes a single voxelized 3D object with a functionality label and synthesizes a voxelized surround, i.e., the interaction context which visually demonstrates the corresponding functionality. iSEG-NET further separates the interacting objects into different groups according to their interaction types.
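The abstract describes fSIM-NET as a variation of the triplet network but does not state its loss function. For readers unfamiliar with the idea, the following is a minimal sketch of the standard triplet margin loss, not the authors' formulation. The function names, the 2-D embeddings, and the margin value are all hypothetical and chosen only for illustration: the anchor stands in for an embedded isolated object, the positive for an interaction context showing the same functionality, and the negative for a context showing a different one.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (illustrative; not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`, and grows linearly otherwise.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, assumed only for this example.
anchor = [0.0, 0.0]    # embedded isolated object
positive = [0.0, 1.0]  # context with the same functionality (distance 1)
negative = [3.0, 4.0]  # context with a different functionality (distance 5)

# Well-separated triplet: 1 - 5 + 1 is negative, so the loss clamps to zero.
loss_easy = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the margin: 5 - 1 + 1 gives a positive loss.
loss_hard = triplet_margin_loss(anchor, negative, positive)
```

Minimizing a loss of this shape pulls an object toward contexts that reveal its functionality and pushes it away from the others, which is the general behavior a triplet-style similarity network relies on.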
