Paper Title
Paying U-Attention to Textures: Multi-Stage Hourglass Vision Transformer for Universal Texture Synthesis
Paper Authors
Paper Abstract
We present a novel U-Attention vision Transformer for universal texture synthesis. We exploit the natural long-range dependencies enabled by the attention mechanism to allow our approach to synthesize diverse textures while preserving their structures in a single inference. We propose a hierarchical hourglass backbone that attends to the global structure and performs patch mapping at varying scales in a coarse-to-fine-to-coarse stream. Completed by skip connection and convolution designs that propagate and fuse information at different scales, our hierarchical U-Attention architecture unifies attention to features from macro structures to micro details, and progressively refines synthesis results at successive stages. Our method achieves stronger 2$\times$ synthesis than previous work on both stochastic and structured textures while generalizing to unseen textures without fine-tuning. Ablation studies demonstrate the effectiveness of each component of our architecture.
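The coarse-to-fine-to-coarse patch attention with skip connections described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the patch sizes `(8, 4, 2, 4, 8)`, the identity query/key/value projections, the averaging used to fuse skip connections, and all function names (`patchify`, `unpatchify`, `self_attention`, `hourglass_attention`) are assumptions made for illustration only, and the convolution designs mentioned in the abstract are omitted.

```python
import numpy as np


def patchify(x, p):
    """Split an (H, W, C) array into non-overlapping p x p patches, one token per patch."""
    H, W, C = x.shape
    x = x.reshape(H // p, p, W // p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape((H // p) * (W // p), p * p * C)


def unpatchify(tokens, p, H, W, C):
    """Inverse of patchify: reassemble patch tokens into an (H, W, C) array."""
    x = tokens.reshape(H // p, W // p, p, p, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(H, W, C)


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections are used, so there are no learned weights.
    """
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ tokens


def hourglass_attention(x, patch_sizes=(8, 4, 2, 4, 8)):
    """Apply patch attention at a sequence of scales (hypothetical schedule).

    Every patch attends to every other patch at each stage, so each stage sees
    the whole image. When a patch size recurs on the way back up, the output
    of the earlier stage at that scale is fused in as a skip connection.
    """
    H, W, C = x.shape
    skips = {}
    out = x
    for p in patch_sizes:
        attended = unpatchify(self_attention(patchify(out, p)), p, H, W, C)
        if p in skips:
            # Skip connection: average with the earlier same-scale output (assumed fusion rule).
            attended = 0.5 * (attended + skips[p])
        else:
            skips[p] = attended
        # Residual-style refinement of the running result.
        out = 0.5 * (out + attended)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    texture = rng.random((16, 16, 3))  # stand-in for an input texture
    result = hourglass_attention(texture)
    print(result.shape)
```

Because each stage only forms convex combinations of existing pixel values, the toy output stays within the input's value range. A real model would add learned projections, upsampling for the 2$\times$ synthesis, and trained convolution layers.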