Paper Title
Weakly-supervised High-fidelity Ultrasound Video Synthesis with Feature Decoupling
Paper Authors
Paper Abstract
Ultrasound (US) is widely used for its advantages of real-time imaging, freedom from radiation, and portability. In clinical practice, analysis and diagnosis often rely on US sequences rather than a single image to obtain dynamic anatomical information. This is challenging for novices to learn because practicing with adequate videos from patients is clinically impractical. In this paper, we propose a novel framework to synthesize high-fidelity US videos. Specifically, the synthesized videos are generated by animating source content images based on the motion of given driving videos. Our highlights are three-fold. First, leveraging the advantages of self- and fully-supervised learning, our proposed system is trained in a weakly-supervised manner for keypoint detection. These keypoints then provide vital information for handling complex, highly dynamic motions in US videos. Second, we decouple content and texture learning using dual decoders to effectively reduce the model's learning difficulty. Last, we adopt an adversarial training strategy with GAN losses to further improve the sharpness of the generated videos, narrowing the gap between real and synthesized videos. We validate our method on a large in-house pelvic dataset with highly dynamic motion. Extensive evaluation metrics and a user study demonstrate the effectiveness of our proposed method.