Paper title
Story Beyond the Eye: Glyph Positions Break PDF Text Redaction
Paper authors
Paper abstract
In this work, we find that many current redactions of PDF text are insecure because of non-redacted character positioning information. In particular, subpixel-sized horizontal shifts in redacted and non-redacted characters can be recovered and used to effectively deredact first and last names. Unfortunately, these findings affect even redactions where the text underneath the black box is removed from the PDF. We demonstrate these findings by performing a comprehensive vulnerability assessment of common PDF redaction types. We examine 11 popular PDF redaction tools, including Adobe Acrobat, and find that they leak information about redacted text. We also effectively deredact hundreds of real-world PDF redactions, including those found in OIG investigation reports and FOIA responses. To correct the problem, we have released open-source algorithms to fix trivial redactions and to reduce the amount of information leaked by non-excising redactions (where the text underneath the redaction is copy-pastable). We have also notified the developers of the studied redaction tools, as well as the Office of Inspector General, the Free Law Project, PACER, Adobe, Microsoft, and the US Department of Justice. We are working with several of these groups to prevent our discoveries from being used for malicious purposes.
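To make the leak described in the abstract concrete, the following is a minimal toy sketch, not the authors' algorithm. It assumes a simplified model in which only the total width of the redacted gap is known, whereas the abstract describes using subpixel shifts in both redacted and non-redacted characters. The advance-width table, the candidate names, and the function names `text_width` and `candidates_matching` are all hypothetical and chosen only for illustration.

```python
# Toy illustration: a redaction that preserves the width of the hidden text
# narrows down which candidate strings could have been there.
# ADVANCE holds made-up glyph advance widths in 1/1000 em units
# (an assumption for this sketch, not measurements of a specific font).
ADVANCE = {
    "A": 667, "C": 722, "D": 722, "E": 667, "F": 611,
    "a": 556, "c": 500, "d": 556, "e": 556, "i": 222,
    "k": 500, "l": 222, "n": 556, "o": 556, "r": 333, "v": 500,
}


def text_width(text, font_size, advance=ADVANCE):
    """Return the rendered width of `text` in points under the toy model."""
    return sum(advance[ch] for ch in text) * font_size / 1000.0


def candidates_matching(gap_width, names, font_size, tolerance=0.01):
    """Return the names whose rendered width matches the redacted gap."""
    return [
        name
        for name in names
        if abs(text_width(name, font_size) - gap_width) <= tolerance
    ]


# Hypothetical dictionary of candidate first names.
NAMES = ["Alice", "Carol", "David", "Erin", "Frank"]

if __name__ == "__main__":
    # Pretend the redacted word was "Carol" set at 12 pt.
    gap = text_width("Carol", 12)
    print(candidates_matching(gap, NAMES, 12))
```

In this toy model, width alone does not always identify a single name: two candidates with the same total advance width remain indistinguishable, so any finer positioning detail that survives redaction narrows the set further.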