This work was supported by the Doctor Innovation Grant of Northwestern Polytechnical University.
Homo-dimers, homo-trimers, homo-tetramers and homo-hexamers of proteins were classified from their primary sequences using two methods: the support vector machine (SVM) and the Bayes covariant discriminant. In the jackknife test, the SVM reached total accuracies of 77.36% with the “one-versus-rest” strategy and 93.43% with the “all-versus-all” (one-versus-one) strategy, which are 26.72 and 42.79 percentage points higher, respectively, than the 50.64% total accuracy of the Bayes covariant discriminant method in the same test. These results show that the SVM is a highly effective method for classifying protein homo-oligomers from primary sequences. For multi-class homo-oligomer classification with the same machine learning method (such as the SVM), the “all-versus-all” strategy performs better than the “one-versus-rest” strategy. The results also indicate that the primary sequences of homo-oligomeric proteins contain information about quaternary structure.
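The two multi-class strategies named in the abstract differ in how the four-class problem is split into binary problems. The following is a minimal sketch of that decomposition only, not the authors' implementation: the function names are hypothetical, and the majority-vote rule for combining pairwise classifiers is an assumption, since the abstract does not say how the binary outputs were combined.

```python
from itertools import combinations

# The four classes studied in the abstract.
CLASSES = ["homo-dimer", "homo-trimer", "homo-tetramer", "homo-hexamer"]


def one_versus_rest_tasks(classes):
    """One binary task per class: that class against all the others."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def all_versus_all_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


def majority_vote(pairwise_winners):
    """Combine pairwise decisions by counting wins (assumed rule)."""
    counts = {}
    for winner in pairwise_winners:
        counts[winner] = counts.get(winner, 0) + 1
    # Ties resolve to the label that was seen first.
    return max(counts, key=counts.get)


def percentage_point_gain(accuracy, baseline):
    """Difference between two accuracies given in percent."""
    return round(accuracy - baseline, 2)


ovr = one_versus_rest_tasks(CLASSES)   # 4 binary classifiers
ava = all_versus_all_tasks(CLASSES)    # 6 binary classifiers
print(len(ovr), len(ava))

# Accuracy gains over the Bayes covariant discriminant baseline (50.64%).
print(percentage_point_gain(77.36, 50.64), percentage_point_gain(93.43, 50.64))
```

With four classes, “one-versus-rest” trains four classifiers, each on an imbalanced split, while “all-versus-all” trains six classifiers, each on two classes only.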
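Zhang Shaowu, Pan Quan, Chen Runsheng, Zhang Hongcai. Classification of protein homo-oligomers based on support vector machine [J]. Progress in Biochemistry and Biophysics, 2003, 30(6): 879-883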