Paper title
A Named Entity Based Approach to Model Recipes
Paper authors
Paper abstract
Traditional cooking recipes follow a structure that can be modelled well if the rules and semantics of the different sections of the recipe text are analyzed and represented accurately. We propose a structure that can accurately represent a recipe, as well as a pipeline to infer the best representation of the recipe in this uniform structure. The ingredients section of a recipe typically lists the required ingredients and their corresponding attributes, such as quantity, temperature, and processing state. This can be modelled by defining these attributes and their values. The physical entities that make up a recipe can be broadly classified into utensils, ingredients, and their combinations, which are related by cooking techniques. The instructions section lists a series of events in which a cooking technique or process is applied to these utensils and ingredients. We model these relationships in the form of tuples. Using a combination of these methods, we model the cooking recipes in the RecipeDB dataset to show the efficacy of our method. The mined information model has several applications, including translating recipes between languages, determining similarity between recipes, generating novel recipes, and estimating the nutritional profile of recipes. To recognize ingredient attributes, we train Named Entity Recognition (NER) models and analyze the inferences with the help of K-Means clustering. Our model achieved an F1 score of 0.95 across all datasets. We use a similar NER tagging model to label cooking techniques (F1 score = 0.88) and utensils (F1 score = 0.90) within the instructions section. Finally, we determine the temporal sequence of relationships between ingredients, utensils, and cooking techniques to model the instruction steps.
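The uniform structure described in the abstract (ingredients with attributes, plus an ordered sequence of event tuples) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the tag labels `TECHNIQUE`, `INGREDIENT`, `UTENSIL` and `O`, and the rule of emitting one event per technique within a step are all assumptions made for illustration, and the tagged tokens stand in for the output of a trained NER model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Ingredient:
    """One entry of the ingredients section with the attributes named in the abstract."""
    name: str
    quantity: Optional[str] = None
    temperature: Optional[str] = None
    processing_state: Optional[str] = None


@dataclass
class Event:
    """A cooking technique applied to some ingredients and utensils."""
    technique: str
    ingredients: Tuple[str, ...] = ()
    utensils: Tuple[str, ...] = ()

    def as_tuple(self):
        # Relationship tuple: (technique, ingredients, utensils).
        return (self.technique, self.ingredients, self.utensils)


@dataclass
class Recipe:
    ingredients: List[Ingredient] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)  # kept in temporal order


def events_from_tags(tagged_steps):
    """Turn NER-tagged instruction steps into an ordered list of events.

    `tagged_steps` is a list of steps; each step is a list of (token, label)
    pairs. The labels are hypothetical. Simplifying assumption: every
    technique in a step shares all ingredients and utensils of that step.
    """
    events = []
    for step in tagged_steps:
        techniques = [tok for tok, lab in step if lab == "TECHNIQUE"]
        ingredients = tuple(tok for tok, lab in step if lab == "INGREDIENT")
        utensils = tuple(tok for tok, lab in step if lab == "UTENSIL")
        for technique in techniques:
            events.append(Event(technique, ingredients, utensils))
    return events


# Hand-written tags standing in for model output.
steps = [
    [("boil", "TECHNIQUE"), ("the", "O"), ("water", "INGREDIENT"),
     ("in", "O"), ("a", "O"), ("pot", "UTENSIL")],
    [("add", "TECHNIQUE"), ("pasta", "INGREDIENT")],
]
recipe = Recipe(
    ingredients=[Ingredient("water", quantity="2 cups"), Ingredient("pasta")],
    events=events_from_tags(steps),
)
for event in recipe.events:
    print(event.as_tuple())
```

Keeping the events in a list preserves the order of the instruction steps, which is one simple way to represent the temporal sequence mentioned at the end of the abstract.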