Paper Title

EXCLUVIS: A MATLAB GUI Software for Comparative Study of Clustering and Visualization of Gene Expression Data

Authors

Sudip Poddar, Anirban Mukhopadhyay

Abstract

Clustering is a popular data mining technique that aims to partition an input space into multiple homogeneous regions. Several clustering algorithms exist in the literature. The performance of a clustering algorithm depends on its input parameters, which can substantially affect the behavior of the algorithm. Cluster validity indices determine the partitioning that best fits the underlying data. In bioinformatics, microarray gene expression technology has made it possible to measure the expression levels of thousands of genes simultaneously. Many genomic studies that aim to analyze the functions of genes rely heavily on clustering techniques, either to group similarly expressed genes into one cluster or to partition tissue samples based on similar gene expression values. In this work, an application package called EXCLUVIS (gene EXpression data CLUstering and VISualization) has been developed using the MATLAB Graphical User Interface (GUI) environment for analyzing the performance of different clustering algorithms on gene expression datasets. In this application package, the user selects a number of parameters, such as internal validity indices, external validity indices, and the number of clusters, from the active windows to evaluate the performance of the clustering algorithms. EXCLUVIS compares the performance of K-means, fuzzy C-means, hierarchical clustering, and multiobjective evolutionary clustering algorithms. Heatmaps and cluster profile plots are used to visualize the results. EXCLUVIS allows users to easily assess the goodness of clustering solutions and provides visual representations of the clustering outcomes.
