Paper Title
Exploiting Expert Knowledge for Assigning Firms to Industries: A Novel Deep Learning Method
Paper Authors
Paper Abstract
Industry assignment, which assigns firms to industries according to a predefined Industry Classification System (ICS), is fundamental to a large number of critical business practices, ranging from operations and strategic decision making by firms to economic analyses by government agencies. Three types of expert knowledge are essential to effective industry assignment: definition-based knowledge (i.e., expert definitions of each industry), structure-based knowledge (i.e., structural relationships among industries as specified in an ICS), and assignment-based knowledge (i.e., prior firm-industry assignments performed by domain experts). Existing industry assignment methods utilize only assignment-based knowledge to learn a model that classifies unassigned firms to industries, and overlook definition-based and structure-based knowledge. Moreover, these methods only consider which industry a firm has been assigned to, but ignore the time-specificity of assignment-based knowledge, i.e., when the assignment occurs. To address the limitations of existing methods, we propose a novel deep learning-based method that not only seamlessly integrates the three types of knowledge for industry assignment but also takes the time-specificity of assignment-based knowledge into account. Methodologically, our method features two innovations: dynamic industry representation and hierarchical assignment. The former represents an industry as a sequence of time-specific vectors by integrating the three types of knowledge through our proposed temporal and spatial aggregation mechanisms. The latter takes industry and firm representations as inputs, computes the probability of assigning a firm to different industries, and assigns the firm to the industry with the highest probability.
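The last step described in the abstract (score a firm against each industry's representation, turn the scores into probabilities, and pick the most probable industry) can be illustrated with a toy sketch. This is not the paper's model: the vectors are hand-picked rather than learned, the recency-weighted average is only a stand-in for the proposed temporal aggregation, the hierarchical part of the assignment is omitted, and all names (`summarize`, `assign`, `decay`) are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def summarize(sequence, decay=0.5):
    """Collapse a sequence of time-specific vectors (oldest first) into one vector.

    Assumption: more recent vectors get exponentially larger weights.
    The abstract does not specify the actual aggregation.
    """
    n = len(sequence)
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    dim = len(sequence[0])
    return [
        sum(w * vec[i] for w, vec in zip(weights, sequence)) / total
        for i in range(dim)
    ]


def assign(firm_vec, industry_sequences):
    """Return the most probable industry and the full probability table."""
    names = list(industry_sequences)
    scores = [dot(firm_vec, summarize(industry_sequences[n])) for n in names]
    probs = softmax(scores)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))


# Hypothetical data: each industry is a sequence of two time-specific vectors.
industries = {
    "software": [[0.9, 0.1], [1.0, 0.0]],
    "retail": [[0.1, 0.9], [0.0, 1.0]],
}
label, probs = assign([0.8, 0.2], industries)
print(label, probs)
```

In this toy setup the firm vector lies closer to the "software" vectors, so that industry receives the higher probability. In the method the abstract describes, the industry vectors would instead come from integrating definition-based, structure-based, and assignment-based knowledge.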