Paper Title
Human-Like Summaries from Heterogeneous and Time-Windowed Software Development Artefacts
Paper Authors
Paper Abstract
Automatic text summarisation has drawn considerable interest in the area of software engineering. It is challenging to summarise the activities related to a software project, (1) because of the volume and heterogeneity of involved software artefacts, and (2) because it is unclear what information a developer seeks in such a multi-document summary. We present the first framework for summarising multi-document software artefacts containing heterogeneous data within a given time frame. To produce human-like summaries, we employ a range of iterative heuristics to minimise the cosine-similarity between texts and high-dimensional feature vectors. A first study shows that users find the automatically generated summaries the most useful when they are generated using word similarity and based on the eight most relevant software artefacts.
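The abstract mentions iterative heuristics driven by cosine similarity between texts and feature vectors. A minimal sketch of that general idea follows; it is not the authors' implementation. It assumes simple bag-of-words vectors, a greedy selection loop, and an objective of bringing the candidate summary close to a target vector (that is, minimising cosine distance, 1 minus similarity). The function names `bag_of_words`, `cosine_similarity`, and `greedy_summary`, the `max_sentences` parameter, and the sample sentences are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lower-cased whitespace tokens into a sparse vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_summary(sentences, target, max_sentences=3):
    """Assumed heuristic: repeatedly add the sentence that most reduces
    the cosine distance between the summary so far and the target vector."""
    chosen = []
    current = Counter()
    best_distance = 1.0  # distance of an empty summary
    remaining = list(sentences)
    while remaining and len(chosen) < max_sentences:
        scored = [
            (1.0 - cosine_similarity(current + bag_of_words(s), target), s)
            for s in remaining
        ]
        distance, sentence = min(scored, key=lambda pair: pair[0])
        if distance >= best_distance:
            break  # no remaining sentence improves the objective
        chosen.append(sentence)
        current += bag_of_words(sentence)
        best_distance = distance
        remaining.remove(sentence)
    return chosen


# Hypothetical artefact sentences and target vector, for illustration only.
sentences = [
    "fixed login bug in auth module",
    "updated readme typo",
    "merged pull request for auth refactor",
]
target = bag_of_words("auth bug fixed")
summary = greedy_summary(sentences, target)
```

In this toy run the loop stops after one sentence, because adding either remaining sentence would move the summary vector further from the target. A real system would use richer features than raw word counts.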