Paper Title
The First Principles of Deep Learning and Compression
Authors
Abstract
The deep learning revolution incited by the 2012 AlexNet paper has been transformative for the field of computer vision. Many problems that were severely limited under classical solutions are now seeing unprecedented success. The rapid proliferation of deep learning methods has led to a sharp increase in their use in consumer and embedded applications. One consequence of deployment in consumer and embedded applications is the need for lossy multimedia compression, which is required for the efficient storage and transmission of data in these real-world scenarios. As such, there has been increased interest in deep learning solutions for multimedia compression that would allow for higher compression ratios and increased visual quality. The deep learning approach to multimedia compression, so-called Learned Multimedia Compression, involves computing a compressed representation of an image or video using a deep network for both the encoder and the decoder. While these techniques have enjoyed impressive academic success, their industry adoption has been essentially non-existent. Classical compression techniques like JPEG and MPEG are too entrenched in modern computing to be easily replaced. This dissertation takes an orthogonal approach and leverages deep learning to improve the compression fidelity of these classical algorithms. This allows the advances in deep learning to be used for multimedia compression without threatening the ubiquity of the classical methods. The key insight of this work is that methods motivated by first principles, i.e., by the underlying engineering decisions that were made when the compression algorithms were developed, are more effective than general methods. By encoding prior knowledge into the design of the algorithm, the flexibility, performance, and/or accuracy are improved at the cost of generality...