Paper title

COGEDAP: A COmprehensive GEnomic Data Analysis Platform

Authors

Akdeniz, Bayram Cevdet; Frei, Oleksandr; Hagen, Espen; Filiz, Tahir Tekin; Karthikeyan, Sandeep; Pasman, Joelle; Jangmo, Andreas; Bergsted, Jacob; Shorter, John R.; Zetterberg, Richard; Meijsen, Joeri; Sonderby, Ida Elken; Buil, Alfonso; Tesli, Martin; Lu, Yi; Sullivan, Patrick; Andreassen, Ole; Hovig, Eivind

Abstract

Collecting and analyzing non-sharable sensitive data in large-scale genomic research consortia is complicated. Installing and running software is time-consuming because of differing operating systems and software dependencies. Easier, more standardized, automated protocols and platforms can help overcome these issues. We have developed one such solution for genomic data analysis using software container technologies. The platform, COGEDAP, consists of different software tools placed into Singularity containers, with corresponding pipelines and instructions on how to perform genome-wide association studies (GWAS) and other genomic data analyses with those tools. Using a provided helper script written in Python, users can obtain auto-generated scripts to conduct the desired analysis both on high-performance computing (HPC) systems and on personal computers. The analyses are then carried out by running these auto-generated scripts with the software containers. The helper script also performs minor re-formatting of the input/output data, so that the end user can work with a unified file format regardless of which genetic software is used for the analysis. COGEDAP is actively used in different countries and projects for genomic data analysis. With this platform, users can run GWAS and other genomic analyses without spending much effort on software installation, data formats, and other technical requirements.
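
The idea of a helper script that emits ready-to-run container commands can be illustrated with a minimal Python sketch. This is not COGEDAP's actual helper script: the function name `build_gwas_script`, the image name `gwas.sif`, the file paths, the choice of PLINK 2 as the tool inside the container, and the SLURM resource values are all assumptions made for illustration. Only the general pattern (generate a shell script that calls `singularity exec` on a container image) follows the abstract.

```python
import shlex


def build_gwas_script(container, bfile, pheno, out, slurm=False):
    """Return the text of a shell script that runs one GWAS inside a container.

    Hypothetical helper: names, paths and resource values are illustrative only.
    """
    lines = ["#!/bin/bash"]
    if slurm:
        # Scheduler directives must precede the first executable command.
        lines += [
            "#SBATCH --job-name=gwas",
            "#SBATCH --cpus-per-task=4",
            "#SBATCH --time=02:00:00",
        ]
    lines.append("set -euo pipefail")
    # Assumes the container image ships a `plink2` executable.
    command = [
        "singularity", "exec", container,
        "plink2",
        "--bfile", bfile,
        "--pheno", pheno,
        "--glm",
        "--out", out,
    ]
    # Quote every argument so paths containing spaces survive the shell.
    lines.append(" ".join(shlex.quote(part) for part in command))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script suitable for submission on an HPC system.
    print(build_gwas_script("gwas.sif", "data/cohort", "data/pheno.txt",
                            "results/run1", slurm=True))
```

With `slurm=False` the same function yields a plain bash script for a personal computer, which mirrors the abstract's point that one generator can target both environments.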
