The Comparison of Different Normalization Methods in Microarray Data

Fund Project:

This work was supported by grants from the National Natural Science Foundation of China (30570425, 30400552) and the National Key Basic Research Project of China (2003CB715903, 2006CB503806), and in part by Microsoft Research Asia (MSRA).

    Abstract:

    In microarray experiments, the correlation coefficient between the expression levels of two genes plays an important role in inferring how the genes are related. Before normalization, gene expression data often show high correlation coefficients among a large proportion of genes. Some of these high correlations are caused by real changes in gene expression levels. However, most of them are caused by systematic errors. One goal of normalization is to eliminate the spurious high correlations induced by systematic errors while preserving the high correlations that stem from genuine biological causes such as gene interactions. Although a number of studies have compared normalization methods, little work has evaluated the effect of normalization on the correlation coefficients among genes, or asked which method best restores the gene correlation structure. Here, gene expression data were simulated with reference to real-world gene expression data, and the simulated data were used to determine which normalization method best restores the gene correlation structure. In addition, the simulated data and the real-world data were shown to have the same gene correlation structure, so the conclusion drawn from the simulated data can be applied to real data. Among the 5 normalization methods compared here, the loess method is the most appropriate for eliminating spurious correlations.
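The effect described in the abstract can be illustrated with a small sketch. This is not the simulation used in the paper: the gene count, array count, bias model (one additive offset per array), and the helper names `simulate`, `median_center`, and `mean_abs_offdiag_corr` are all assumptions made for illustration. Per-array median centering stands in for a normalization step; it is not the loess method the paper recommends.

```python
import numpy as np


def mean_abs_offdiag_corr(x):
    """Mean absolute Pearson correlation over all distinct gene pairs.

    Rows of x are genes, columns are arrays.
    """
    c = np.corrcoef(x)  # rows are treated as variables by default
    upper = np.triu_indices_from(c, k=1)  # each gene pair counted once
    return float(np.mean(np.abs(c[upper])))


def simulate(n_genes=200, n_arrays=30, bias_sd=2.0, seed=0):
    """Hypothetical data: independent genes plus one additive offset per array."""
    rng = np.random.default_rng(seed)
    # Independent genes, so true correlations are zero apart from sampling noise.
    true_expr = rng.normal(0.0, 1.0, size=(n_genes, n_arrays))
    # Systematic error: a single offset shared by every gene on an array.
    array_bias = rng.normal(0.0, bias_sd, size=n_arrays)
    observed = true_expr + array_bias  # offset broadcasts across genes
    return true_expr, observed


def median_center(x):
    """Stand-in normalization: subtract each array's median (not loess)."""
    return x - np.median(x, axis=0, keepdims=True)


true_expr, observed = simulate()
normalized = median_center(observed)

print("true       :", round(mean_abs_offdiag_corr(true_expr), 3))
print("observed   :", round(mean_abs_offdiag_corr(observed), 3))
print("normalized :", round(mean_abs_offdiag_corr(normalized), 3))
```

Under these assumptions the shared array offsets make almost every gene pair look strongly correlated, and removing the per-array median brings the average correlation back close to that of the bias-free data. Intensity-dependent biases, which loess targets, would not be removed by this simple centering.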

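Cite this article

Tan Xiaojun, Zhang Yongxin, Qian Minping, Zhang Youyi, Deng Minghua. The Comparison of Different Normalization Methods in Microarray Data[J]. Progress in Biochemistry and Biophysics, 2007, 34(6): 625-633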

History
  • Received: 2006-12-12
  • Last revised: 2007-03-02
  • Published online: 2007-05-29
  • Publication date: 2007-06-20