Paper Title

A Feature Importance Analysis for Soft-Sensing-Based Predictions in a Chemical Sulphonation Process

Authors

Garcia-Ceja, Enrique; Hugo, Åsmund; Morin, Brice; Hansen, Per-Olav; Martinsen, Espen; Lam, An Ngoc; Haugen, Øystein

Abstract

In this paper, we present the results of a feature importance analysis of a chemical sulphonation process. The task consists of predicting the neutralization number (NT), a metric that characterizes the product quality of active detergents. The prediction is based on a dataset of environmental measurements sampled from an industrial chemical process. We used a soft-sensing approach, that is, predicting a variable of interest from other process variables instead of sensing it directly. Reasons for doing so range from expensive sensor hardware to harsh environments, e.g., inside a chemical reactor. The aim of this study was to explore and detect which variables are the most relevant for predicting product quality, and to what degree of precision. We trained regression models based on linear regression, regression trees, and random forests. A random forest model was used to rank the predictor variables by importance. Then, we trained the models in a forward-selection style, adding one feature at a time, starting with the most important one. Our results show that the 3 most important of the 8 variables are sufficient to achieve satisfactory prediction results. On the other hand, the random forest obtained the best result when trained with all variables.
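
The ranked forward-selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (8 random predictors, 3 of which drive the target), the ranking uses absolute Pearson correlation as a simple stand-in for random forest importance, and the only model is ordinary least squares. All function and variable names (`pearson`, `fit_ols`, `forward_selection`, `ranking`, `errors`) are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coef, rows):
    """Apply intercept-first coefficients to each row of features."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def select(rows, cols):
    """Keep only the given feature columns, in the given order."""
    return [[r[c] for c in cols] for r in rows]


def forward_selection(X_train, y_train, X_test, y_test, ranking):
    """Fit on the top-1, top-2, ... ranked features; return test RMSE per step."""
    errors = []
    for n in range(1, len(ranking) + 1):
        cols = ranking[:n]
        coef = fit_ols(select(X_train, cols), y_train)
        errors.append(rmse(y_test, predict(coef, select(X_test, cols))))
    return errors


# Synthetic data (assumption): features 2, 5 and 0 drive the target.
rng = random.Random(0)
n_samples, n_features = 300, 8
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * r[2] - 2.5 * r[5] + 2.0 * r[0] + rng.gauss(0, 0.3) for r in X]
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# Rank features by |correlation| with the target (stand-in for RF importance).
scores = [abs(pearson([r[j] for r in X_train], y_train)) for j in range(n_features)]
ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)

errors = forward_selection(X_train, y_train, X_test, y_test, ranking)
for n, e in enumerate(errors, 1):
    print(f"top {n} features {ranking[:n]}: test RMSE = {e:.3f}")
```

On data built this way, the error curve should drop sharply over the first three steps and then flatten, which is the kind of pattern that would justify keeping only the top-ranked variables. With a real process dataset, the ranking step would be replaced by importances from a fitted random forest, and each candidate model would be evaluated at every step.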
