Ethical behavior in humans and machines -- Evaluating training data quality for beneficial machine learning

Paper title

Ethical behavior in humans and machines -- Evaluating training data quality for beneficial machine learning

Author

Hagendorff, Thilo

Abstract

Machine behavior that is based on learning algorithms can be significantly influenced by exposure to data of different qualities. Up to now, those qualities have been measured solely in technical terms, not in ethical ones, despite the significant role of training and annotation data in supervised machine learning. This is the first study to fill this gap by describing new dimensions of data quality for supervised machine learning applications. Based on the rationale that different social and psychological backgrounds of individuals correlate in practice with different modes of human-computer interaction, the paper describes from an ethical perspective how the varying qualities of behavioral data that individuals leave behind while using digital technologies have socially relevant ramifications for the development of machine learning applications. The specific objective of this study is to describe how training data can be selected according to ethical assessments of the behavior it originates from, establishing an innovative filter regime to transition from the big data rationale n = all to a more selective way of processing data for training sets in machine learning. The overarching aim of this research is to promote methods for achieving beneficial machine learning applications that could be widely useful for industry as well as academia.

Paper title