Disentangling Identity and Pose for Facial Expression Recognition

Paper title

Disentangling Identity and Pose for Facial Expression Recognition

Authors

Jiang, Jing; Deng, Weihong

Abstract

Facial expression recognition (FER) is a challenging problem because the expression component is always entangled with other irrelevant factors, such as identity and head pose. In this work, we propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representations. We regard the holistic facial representation as a combination of identity, pose, and expression, and encode these three components with separate encoders. For the identity encoder, a well pre-trained face recognition model is used and kept fixed during training, which alleviates the restriction on specific expression training data in previous works and makes the disentanglement practicable on in-the-wild datasets. Meanwhile, the pose and expression encoders are optimized with their corresponding labels. Given the combined identity and pose features, the decoder should generate a neutral face of the input individual; when the expression feature is added, it should reconstruct the input image. By comparing the difference between the synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose. Experimental results verify the effectiveness of our method on both lab-controlled and in-the-wild databases, and we achieve state-of-the-art recognition performance.
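
The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name `IPDFERSketch`, the helper `apply_linear`, the feature sizes, and the random linear maps standing in for trained networks are all hypothetical. In particular, the abstract does not say how the decoder is invoked without the expression feature; zeroing the expression slot is an assumption made here for illustration.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (list of rows); a stand-in for a trained network."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class IPDFERSketch:
    """Toy three-encoder, one-decoder layout (hypothetical, linear, untrained)."""

    def __init__(self, img_dim=16, feat_dim=4, seed=0):
        rng = random.Random(seed)
        # Identity encoder: in the paper this is a pre-trained face recognition
        # model kept fixed during training; here it is just a fixed random map.
        self.enc_id = make_linear(img_dim, feat_dim, rng)
        # Pose and expression encoders: trained with their labels in the paper.
        self.enc_pose = make_linear(img_dim, feat_dim, rng)
        self.enc_expr = make_linear(img_dim, feat_dim, rng)
        # Decoder maps concatenated [identity, pose, expression] to an image.
        self.dec = make_linear(3 * feat_dim, img_dim, rng)
        self.feat_dim = feat_dim

    def forward(self, image):
        f_id = apply_linear(self.enc_id, image)
        f_pose = apply_linear(self.enc_pose, image)
        f_expr = apply_linear(self.enc_expr, image)
        # Assumption: "no expression" is modeled by zeroing the expression slot.
        no_expr = [0.0] * self.feat_dim
        neutral = apply_linear(self.dec, f_id + f_pose + no_expr)
        recon = apply_linear(self.dec, f_id + f_pose + f_expr)
        return neutral, recon, f_expr


model = IPDFERSketch()
image = [i / 16 for i in range(16)]  # flattened toy "image"
neutral, recon, f_expr = model.forward(image)
# Difference between the expressional reconstruction and the neutral synthesis.
diff = [r - n for r, n in zip(recon, neutral)]
```

Because the toy decoder is linear, `diff` depends only on `f_expr`, which mirrors the idea that the gap between the synthesized neutral and expressional images isolates the expression component. A real decoder is nonlinear, so this property would hold only approximately, if at all.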

Paper title