School of Life Science and Bioengineering, Beijing University of Technology
This work was supported by grants from the National Natural Science Foundation of China (11572014) and the Major Research Projects in the Field of Intelligent Manufacturing (01500054631751).
This study used breast cancer samples from The Cancer Genome Atlas (TCGA) database to examine, at the genome-wide level, how gene expression changes as patients progress from normal tissue to stage I disease. The aims were to identify signature genes closely associated with the onset of breast cancer and to build a pattern recognition classification method for breast cancer occurrence, providing theoretical support for prevention and early diagnosis. A screening method combining statistical techniques, including correlation analysis, the t-test, and confidence intervals, yielded 336 signature genes whose expression differed significantly with the onset of breast cancer. Machine learning models built on these genes achieved a classification accuracy above 98%, higher than that reported in previous breast cancer studies. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis identified 8 pathways significantly associated with the signature genes (P<0.05), and GO (Gene Ontology) functional enrichment analysis identified 18 significantly associated functions (P<0.05). A brief functional analysis of some of the genes mapped to the 8 pathways showed that they are closely related at the regulatory level, indicating that the identified signature genes play an important role in the onset of breast cancer. These findings are relevant both to understanding the pathogenesis of breast cancer and to its early diagnosis.
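The abstract names the t-test and confidence intervals as parts of the gene screening step but does not say how they were combined. The following is only a minimal sketch of one plausible combination, not the authors' procedure: a gene is kept when its Welch t statistic between normal and stage I samples is large and the approximate 95% confidence intervals of the two group means do not overlap. The function names, the `t_cutoff` value, the normal-approximation interval, and the toy expression values are all assumptions made for illustration; the correlation step and the classifier are omitted.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se2 = statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)


def mean_ci(x, z=1.96):
    """Approximate 95% CI for the mean (normal approximation, an assumption)."""
    m = statistics.mean(x)
    half = z * statistics.stdev(x) / math.sqrt(len(x))
    return m - half, m + half


def select_genes(normal, tumor, t_cutoff=2.0):
    """Return genes with |t| above the cutoff and non-overlapping group CIs.

    normal, tumor: dict mapping gene name -> list of expression values.
    t_cutoff is a hypothetical threshold, not a value taken from the paper.
    """
    selected = []
    for gene in normal:
        t = welch_t(normal[gene], tumor[gene])
        lo_n, hi_n = mean_ci(normal[gene])
        lo_t, hi_t = mean_ci(tumor[gene])
        disjoint = hi_n < lo_t or hi_t < lo_n
        if abs(t) > t_cutoff and disjoint:
            selected.append(gene)
    return selected


# Toy data with made-up gene names and values, for illustration only.
normal = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1, 1.0],
    "GENE_B": [2.0, 2.1, 1.9, 2.2, 2.0],
}
stage_one = {
    "GENE_A": [3.0, 3.2, 2.9, 3.1, 3.0],
    "GENE_B": [2.1, 1.9, 2.0, 2.2, 2.0],
}

print(select_genes(normal, stage_one))
```

In this toy input only `GENE_A` differs between the two groups, so it is the only gene retained. With real expression data, a multiple-testing correction across genes would normally be needed as well.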
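Wen Jianxin, Wang Xuedong, Li Xiaoqin, Chang Yu. Signature gene screening and pattern recognition for the occurrence of breast cancer [J]. Progress in Biochemistry and Biophysics, 2017, 44(11): 1016-1025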