Paper Title
Population-Based Black-Box Optimization for Biological Sequence Design
Authors
Abstract
The use of black-box optimization for the design of new biological sequences is an emerging research area with potentially revolutionary impact. The cost and latency of wet-lab experiments require methods that find good sequences in few experimental rounds of large batches of sequences, a setting that off-the-shelf black-box optimization methods are ill-equipped to handle. We find that the performance of existing methods varies drastically across optimization tasks, posing a significant obstacle to real-world applications. To improve robustness, we propose Population-Based Black-Box Optimization (P3BO), which generates batches of sequences by sampling from an ensemble of methods. The number of sequences sampled from any method is proportional to the quality of the sequences it previously proposed, allowing P3BO to combine the strengths of individual methods while hedging against their innate brittleness. Adapting the hyper-parameters of each method online using evolutionary optimization further improves performance. Through extensive experiments on in-silico optimization tasks, we show that P3BO outperforms any single method in its population, proposing higher-quality sequences as well as more diverse batches. As such, P3BO and Adaptive-P3BO are a crucial step towards deploying ML to real-world sequence design.
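The allocation rule described in the abstract, where each method's share of the next batch is proportional to the quality of what it proposed before, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The names `allocate_batch` and `run_round`, the use of the mean reward as the quality score, the uniform fallback when all scores are zero, and the toy oracle are all assumptions made for illustration. The abstract does not specify how quality is scored or how counts are rounded.

```python
import random


def allocate_batch(scores, batch_size):
    """Split batch_size among methods in proportion to non-negative scores.

    Hypothetical helper: the proportional rule comes from the abstract, but
    the rounding scheme and the uniform fallback are assumptions.
    """
    total = sum(scores.values())
    if total <= 0:
        # No method has earned credit yet, so share the batch evenly.
        shares = {name: 1.0 / len(scores) for name in scores}
    else:
        shares = {name: s / total for name, s in scores.items()}
    raw = {name: shares[name] * batch_size for name in scores}
    counts = {name: int(raw[name]) for name in scores}
    # Largest-remainder rounding so the counts sum exactly to batch_size.
    leftover = batch_size - sum(counts.values())
    by_remainder = sorted(scores, key=lambda n: raw[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def run_round(methods, scores, batch_size, oracle):
    """Run one optimization round and return (batch, updated scores).

    methods maps a name to a function proposing n sequences; oracle maps a
    sequence to a reward. The score update (mean reward of this round's
    proposals) is an illustrative choice, not taken from the paper.
    """
    counts = allocate_batch(scores, batch_size)
    batch, new_scores = [], {}
    for name, propose in methods.items():
        seqs = propose(counts[name])
        rewards = [oracle(s) for s in seqs]
        batch.extend(zip(seqs, rewards))
        # A method that received no budget keeps its previous score.
        new_scores[name] = sum(rewards) / len(rewards) if rewards else scores[name]
    return batch, new_scores


if __name__ == "__main__":
    rng = random.Random(0)

    def oracle(seq):
        # Toy stand-in for a wet-lab measurement: fraction of 'A' residues.
        return seq.count("A") / len(seq)

    def uniform_method(n):
        return ["".join(rng.choice("ACGT") for _ in range(8)) for _ in range(n)]

    def biased_method(n):
        return ["".join(rng.choice("AAAC") for _ in range(8)) for _ in range(n)]

    methods = {"uniform": uniform_method, "biased": biased_method}
    scores = {"uniform": 1.0, "biased": 1.0}
    for round_index in range(3):
        batch, scores = run_round(methods, scores, batch_size=20, oracle=oracle)
        print(round_index, allocate_batch(scores, 20))
```

In this sketch a method whose score drops to zero receives no further budget and can never recover. A practical variant would likely smooth the scores or reserve a minimum share for each method, which is one way to read the abstract's emphasis on hedging against the brittleness of individual methods.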