Paper title

Knowledge transfer across cell lines using Hybrid Gaussian Process models with entity embedding vectors

Authors

Hutter, Clemens; von Stosch, Moritz; Bournazou, Mariano Nicolas Cruz; Butté, Alessandro

Abstract

To date, a large number of experiments are performed to develop a biochemical process. The generated data is used only once, to make development decisions. If we could exploit data from already developed processes to make predictions for a novel process, we could significantly reduce the number of experiments needed. Processes for different products exhibit differences in behaviour; typically only a subset behave similarly. Therefore, effective learning on process data spanning multiple products requires a sensible representation of the product identity. We propose to represent the product identity (a categorical feature) by embedding vectors that serve as input to a Gaussian Process regression model. We demonstrate how the embedding vectors can be learned from process data and show that they capture an interpretable notion of product similarity. Performance is compared to traditional one-hot encoding on a simulated cross-product learning task. All in all, the proposed method could enable significant reductions in wet-lab experiments.
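The idea of feeding a product embedding into a Gaussian Process can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the RBF kernel, the noise level, the toy data, the embedding values, and all function names (`rbf_kernel`, `build_inputs`, `gp_neg_log_marginal_likelihood`) are assumptions made for illustration. The sketch only shows how a per-product vector is appended to the process conditions, and how one-hot encoding appears as the special case in which every pair of products is equally dissimilar.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def build_inputs(process_x, product_ids, embeddings):
    """Append each row's product vector to its process conditions."""
    return np.hstack([process_x, embeddings[product_ids]])

def gp_neg_log_marginal_likelihood(X, y, noise=1e-2):
    """Negative log marginal likelihood of a zero-mean GP with an RBF kernel."""
    n = len(y)
    K = rbf_kernel(X, X) + noise * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # 0.5 * log|K| equals the sum of the log-diagonal of the Cholesky factor.
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2 * np.pi)

# Toy data, made up for illustration: 3 products, 1 process variable.
rng = np.random.default_rng(0)
process_x = rng.uniform(0.0, 1.0, size=(30, 1))
product_ids = np.repeat(np.arange(3), 10)
offsets = np.array([0.0, 0.0, 1.0])  # products 0 and 1 behave alike
y = (np.sin(2 * np.pi * process_x[:, 0]) + offsets[product_ids]
     + 0.05 * rng.standard_normal(30))

# One-hot encoding: every product is equally far from every other product.
one_hot = np.eye(3)
# Hypothetical 2-D embeddings: products 0 and 1 are placed close together.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 1.0]])

X_onehot = build_inputs(process_x, product_ids, one_hot)
X_embed = build_inputs(process_x, product_ids, embeddings)

# Product-to-product similarity implied by each representation.
K_onehot = rbf_kernel(one_hot, one_hot)
K_embed = rbf_kernel(embeddings, embeddings)

nll_onehot = gp_neg_log_marginal_likelihood(X_onehot, y)
nll_embed = gp_neg_log_marginal_likelihood(X_embed, y)
```

In this sketch the embedding values are fixed by hand. To learn them from process data, one would treat the entries of `embeddings` as free parameters and minimize the negative log marginal likelihood with respect to them, together with the kernel hyperparameters, using a gradient-based optimizer.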
