Paper title
Haplotype-resolved de novo assembly with phased assembly graphs
Paper authors
Paper abstract
Haplotype-resolved de novo assembly is the ultimate solution to the study of sequence variations in a genome. However, existing algorithms either collapse heterozygous alleles into one consensus copy or fail to cleanly separate the haplotypes to produce high-quality phased assemblies. Here we describe hifiasm, a new de novo assembler that takes advantage of long high-fidelity sequence reads to faithfully represent the haplotype information in a phased assembly graph. Unlike other graph-based assemblers that only aim to maintain the contiguity of one haplotype, hifiasm strives to preserve the contiguity of all haplotypes. This feature enables the development of a graph trio binning algorithm that greatly advances over standard trio binning. On three human and five non-human datasets, including California redwood with a $\sim$30-gigabase hexaploid genome, we show that hifiasm frequently delivers better assemblies than existing tools and consistently outperforms others on haplotype-resolved assembly.