Title
Concept Identification for Complex Engineering Datasets
Authors
Abstract
Finding meaningful concepts in engineering application datasets that allow for a sensible grouping of designs is helpful in many contexts. It allows different groups of designs with similar properties to be determined and provides useful knowledge for the engineering decision-making process. It also opens a route to further refinement of specific design candidates that exhibit certain characteristic features. In this work, an approach to defining meaningful and consistent concepts in an existing engineering dataset is presented. The designs in the dataset are characterized by a multitude of features, such as design parameters, geometrical properties, or performance values of the design under various boundary conditions. In the proposed approach, the complete feature set is partitioned into several subsets called description spaces. The definition of the concepts respects this partitioning, which leads to several desired properties of the identified concepts. This cannot be achieved with state-of-the-art clustering or concept identification approaches. A novel concept quality measure is proposed, which provides an objective value for a given definition of concepts in a dataset. The usefulness of the measure is demonstrated on a realistic engineering dataset consisting of about 2500 airfoil profiles, for which the performance values (lift and drag) for three different operating conditions were obtained by computational fluid dynamics simulation. A numerical optimization procedure is employed that maximizes the concept quality measure and finds meaningful concepts for different setups of the description spaces, while also incorporating user preference. It is demonstrated how these concepts can be used to select archetypal representatives of the dataset that exhibit the characteristic features of each concept.