Application of the Fisher Linear Discriminant Function to the Study of Inter-Genome Distances Based on COGs Classification

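Funding:

Supported by the National Natural Science Foundation of China (39890070, 19890380, 39993420), the Knowledge Innovation Program of the Chinese Academy of Sciences (KSCX2-2-07, KJCX1-08), and a special grant from the Beijing Municipal Science and Technology Commission.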


The Application of Fisher Linear Discriminant to Distance Between Genomes Which Based on COGs

Fund Project:

This work was supported by grants from The National Natural Sciences Foundation of China (39890070,19890380,39993420), Knowledge Innovation Project of The Chinese Academy of Sciences (KSCX2-2-07 and KSCX1-08) and a Special Grant Science and Technology Com

    Abstract (translated from the Chinese original):

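    Whole-genome information is used to construct phylogenetic trees. Based on the COGs classes, every gene of every genome is described by a 17-dimensional vector giving the degree to which its encoded protein belongs to each of the 17 COGs classes; the vectors corresponding to all genes of one genome form a set. The Fisher linear discriminant function is then used to find an optimal set of weighting factors, and every vector in each set is linearly transformed accordingly, so that after the Fisher linear transformation the importance of each of the 17 COGs classes to genome evolution is reflected more accurately. Finally, the distance between the sets of transformed vectors is used in place of the distance between the genomes. With this method, phylogenetic trees built from 38 genomes and from 43 genomes both support Woese's three-kingdom theory. The method overcomes the difficulty that other whole-genome approaches to phylogenetic tree construction have in comparing genomes of very different sizes, and it reduces the distortion of inter-genome distances caused by lateral gene transfer.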

    Abstract:

    A new method for constructing a phylogenetic tree from whole-genome information is introduced. Each gene of an organism is represented by a 17-dimensional vector, each dimension of which corresponds to one of the 17 COGs (clusters of orthologous groups of proteins) classes. All the vectors of a genome constitute a set. The Fisher linear discriminant is then used to find a set of optimal weights that reflect more accurately the different contributions of the 17 COGs classes to genome evolution; that is, each vector of a genome is linearly mapped under the Fisher criterion. The distance between two genomes is then represented by the distance between the two corresponding sets of mapped vectors. Finally, the resulting distance matrix is used to construct a phylogenetic tree with the PHYLIP software package. Phylogenetic trees of 38 and of 43 genomes constructed by this method both support Woese's "three primary kingdom" theory. The method overcomes a shortcoming of other whole-genome methods, which have difficulty comparing genomes that differ greatly in size. In addition, it diminishes the distortion of inter-genome distances caused by lateral gene transfer.
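The weighting step described in the abstract can be illustrated with a minimal sketch of the textbook two-class Fisher linear discriminant. This is not the authors' implementation: the abstract does not say how the discriminant is set up across many genomes or which set distance is used. The function names `fisher_weights` and `projected_distance`, the ridge term, the use of the difference of projected means as a set distance, and the randomly generated 17-dimensional membership vectors are all assumptions made for illustration.

```python
import numpy as np


def fisher_weights(a, b, ridge=1e-6):
    """Two-class Fisher direction, w proportional to Sw^{-1} (mean_a - mean_b).

    a, b: arrays of shape (n_genes, n_classes), one row per gene.
    ridge: small regularizer (an assumption, not from the paper) that keeps
    the within-class scatter matrix invertible.
    """
    mean_a, mean_b = a.mean(axis=0), b.mean(axis=0)
    centered_a, centered_b = a - mean_a, b - mean_b
    # Within-class scatter: sum of the two per-genome scatter matrices.
    s_w = centered_a.T @ centered_a + centered_b.T @ centered_b
    s_w += ridge * np.eye(s_w.shape[0])
    w = np.linalg.solve(s_w, mean_a - mean_b)
    return w / np.linalg.norm(w)


def projected_distance(a, b, w):
    """Hypothetical set distance: gap between the projected set means."""
    return abs((a @ w).mean() - (b @ w).mean())


# Synthetic stand-ins for two genomes of different sizes; each row is a
# 17-dimensional membership vector whose entries sum to 1.
rng = np.random.default_rng(0)
n_classes = 17
genome_a = rng.dirichlet(np.ones(n_classes), size=200)
genome_b = rng.dirichlet(np.linspace(0.5, 2.0, n_classes), size=300)

w = fisher_weights(genome_a, genome_b)
d = projected_distance(genome_a, genome_b, w)
print("weights:", np.round(w, 3))
print("distance:", d)
```

Because the distance is computed from set means rather than gene counts, the two genomes may contain different numbers of genes. A full pipeline would repeat a calculation of this kind for every pair of genomes to fill a distance matrix before passing it to a tree-building program.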

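Cite this article

Liu Rong, Wang Yuelan, Zhu Xiaopeng, Ling Lunjiang, Han Rushan. Application of the Fisher Linear Discriminant Function to the Study of Inter-Genome Distances Based on COGs Classification [J]. Progress in Biochemistry and Biophysics, 2002, 29(5): 760-765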

History
  • Received: 2002-03-06
  • Last revised: 2002-04-11