Paper title
TraSE: Towards Tackling Authorial Style from a Cognitive Science Perspective
Paper authors
Paper abstract
Stylistic analysis of text is a key task in research areas ranging from authorship attribution to forensic analysis and personality profiling. Existing approaches to stylistic analysis are plagued by issues such as topic influence, lack of discriminability for a large number of authors, and the requirement for large amounts of diverse data. In this paper, the sources of these issues are identified, along with the necessity for a cognitive perspective on authorial style in addressing them. A novel feature representation, called Trajectory-based Style Estimation (TraSE), is introduced to support this purpose. Authorship attribution experiments with over 27,000 authors and 1.4 million samples in a cross-domain scenario resulted in 90% attribution accuracy, suggesting that the feature representation is immune to such negative influences and is an excellent candidate for stylistic analysis. Finally, a qualitative analysis is performed on TraSE using physical human characteristics, such as age, to validate its claim of capturing cognitive traits.