Paper Title

GreenDB -- A Dataset and Benchmark for Extraction of Sustainability Information of Consumer Goods

Authors

Sebastian Jäger, Alexander Flick, Jessica Adriana Sanchez Garcia, Kaspar von den Driesch, Karl Brendel, Felix Biessmann

Abstract

The production, shipping, usage, and disposal of consumer goods have a substantial impact on greenhouse gas emissions and the depletion of resources. Machine Learning (ML) can help foster sustainable consumption patterns by accounting for sustainability aspects in the product search or recommendations of modern retail platforms. However, the lack of large, high-quality, publicly available product data with trustworthy sustainability information impedes the development of ML technology that can help reach our sustainability goals. Here we present GreenDB, a database that collects products from European online shops on a weekly basis. As a proxy for the products' sustainability, it relies on sustainability labels, which are evaluated by experts. The GreenDB schema extends the well-known schema.org Product definition and can be readily integrated into existing product catalogs. We present initial results demonstrating that ML models trained with our data can reliably (F1 score 96%) predict the sustainability label of products. These contributions can help complement existing e-commerce experiences and ultimately encourage users to adopt more sustainable consumption patterns.
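
To make the idea of extending the schema.org Product definition more concrete, the following is a minimal sketch in Python. It is not the actual GreenDB schema: the fields name, description, brand, and url are a small subset of schema.org Product properties, while the field sustainability_labels, the helper to_record, and all example values are hypothetical and chosen only for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Product:
    """A small subset of schema.org Product properties."""
    name: str
    description: str
    brand: str
    url: str


@dataclass
class SustainableProduct(Product):
    """Product extended with sustainability labels.

    Assumption: the field name below is hypothetical; the real GreenDB
    schema may name and structure this information differently.
    """
    sustainability_labels: List[str] = field(default_factory=list)


def to_record(product: SustainableProduct) -> Dict[str, object]:
    """Flatten the product into a dict that a catalog could store."""
    record: Dict[str, object] = {"@type": "Product"}
    record.update(asdict(product))
    return record


# Example values are invented for illustration only.
example = SustainableProduct(
    name="Organic cotton T-shirt",
    description="Plain T-shirt made from organic cotton.",
    brand="ExampleBrand",
    url="https://shop.example/t-shirt",
    sustainability_labels=["EXAMPLE_LABEL"],
)
print(to_record(example))
```

Because the extension only adds a field on top of the base product, a catalog that already stores the base properties could keep its existing records and treat a missing label list as empty.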
