Generative Adversarial Data Programming

Paper Title

Generative Adversarial Data Programming

Authors

Arghya Pal; Vineeth N Balasubramanian

Abstract

The paucity of large, curated, hand-labeled training datasets is a major bottleneck in deploying machine learning models in computer vision and other fields. Recent work (Data Programming) has shown how distant supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data as well as a curated aggregated label, given a set of weak labeling functions. More interestingly, such labeling functions are often easily generalizable, which allows our framework to be extended to different setups, including self-supervised labeled image generation, zero-shot text-to-labeled-image generation, transfer learning, and multi-task learning.
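
As a rough illustration of what "weak labeling functions" and an "aggregated label" mean, here is a minimal Python sketch. It is not the ADP method: it swaps in a plain majority vote for the aggregation step and involves no generator or adversarial training. The feature names (`mean_intensity`, `edge_density`), the thresholds, and every function name are hypothetical, chosen only for illustration.

```python
from collections import Counter

ABSTAIN = -1  # a labeling function may decline to vote


# Hypothetical weak labeling functions over a toy feature dict.
# Each one encodes a cheap heuristic and returns a class label or ABSTAIN.
def lf_bright(x):
    return 1 if x["mean_intensity"] > 0.6 else ABSTAIN


def lf_dark(x):
    return 0 if x["mean_intensity"] < 0.3 else ABSTAIN


def lf_edges(x):
    return 1 if x["edge_density"] > 0.5 else 0


LABELING_FUNCTIONS = [lf_bright, lf_dark, lf_edges]


def aggregate_label(x, lfs=LABELING_FUNCTIONS):
    """Combine weak votes by simple majority (an assumed stand-in aggregator)."""
    votes = [v for v in (lf(x) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN  # no function voted
    ranked = Counter(votes).most_common()
    # Treat a tie between the two most frequent labels as "no decision".
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


if __name__ == "__main__":
    sample = {"mean_intensity": 0.8, "edge_density": 0.7}
    print(aggregate_label(sample))
```

Each function is cheap to evaluate, which is why labeling a data point costs roughly the same regardless of dataset size. A learned aggregator would replace the majority vote with weights that reflect how reliable each function is.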
