This work was supported by grants from the National Natural Science Foundation of China (39960023 and 90103030).
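The distribution of hexa-structures in the main secondary structure sequences of α, β, α/β, and multi-domain proteins is presented. A method for recognizing (classifying) the structural class of a protein from its main secondary structure sequence is proposed. With tri-structures of the main secondary structure sequence as parameters, the Mahalanobis distance method recognizes proteins of these four structural classes with an overall accuracy of 81%. With the frequencies of hexa-structures in the main secondary structure sequence taken as the source of diversity of protein structure, minimization of the increment of diversity recognizes the same four structural classes with an overall accuracy of 83%. An approach to recognizing compact structural domains is also given.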
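The tri-structure approach can be illustrated with a minimal sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the secondary structure sequence is written as a string over a two-letter alphabet (`H` for helix, `E` for strand), a tri-structure is taken to be three consecutive symbols, a small `ridge` term keeps each class covariance matrix invertible, and the training strings are toy data. A query is assigned to the class whose mean frequency vector is nearest in Mahalanobis distance.

```python
from itertools import product

ALPHABET = "HE"  # assumed element alphabet: H = helix, E = strand
TRIPLETS = ["".join(p) for p in product(ALPHABET, repeat=3)]

def triplet_frequencies(seq):
    """Relative frequency of each overlapping triplet in seq."""
    counts = {t: 0 for t in TRIPLETS}
    for i in range(len(seq) - 2):
        t = seq[i:i + 3]
        if t in counts:
            counts[t] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TRIPLETS]

def fit_class(seqs, ridge=1e-3):
    """Mean vector and ridge-regularised covariance matrix of one class."""
    rows = [triplet_frequencies(s) for s in seqs]
    n, dim = len(rows), len(TRIPLETS)
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / max(n - 1, 1)
            + (ridge if a == b else 0.0)  # assumed regularisation, not from the paper
            for b in range(dim)] for a in range(dim)]
    return mean, cov

def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, d)
    return sum(di * yi for di, yi in zip(d, y))

def classify(seq, models):
    """Return the class name whose model is nearest to seq."""
    x = triplet_frequencies(seq)
    return min(models, key=lambda name: mahalanobis_sq(x, *models[name]))

# Toy training data, invented purely for illustration.
training = {
    "alpha": ["HHHHHHHH", "HHHHHEHH", "HHHHHHHE"],
    "beta": ["EEEEEEEE", "EEEHEEEE", "EEEEEEEH"],
}
models = {name: fit_class(seqs) for name, seqs in training.items()}
print(classify("HHHHHHH", models))
```

With real data, each of the four classes would get its own mean and covariance estimated from its training proteins, and the ridge term would matter less as the number of proteins per class grows.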
The distribution of hexa-structures in the secondary structure sequences of different classes of proteins is reported. Based on this distribution, two methods for recognizing the structural class of a protein are proposed. The first uses the Mahalanobis distance, computed from the frequencies of tri-structures in the secondary structure sequence. The second uses a diversity measure, in which the frequencies of hexa-structures are regarded as the source of diversity. Both methods were tested on a set of 1130 proteins from four classes: α, β, α/β, and multi-domain proteins. The success rates of the two methods are about 81% and 83%, respectively. The approach also provides a way to predict the compact structural domains of proteins.
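The diversity-based method can be sketched in the same spirit. The abstract does not give formulas, so this sketch assumes the commonly used definitions: the diversity of a source with counts n_i and total N is D = N ln N - Σ n_i ln n_i, and the increment of diversity between a query X and a class source S is D(X + S) - D(X) - D(S). The function names, the `H`/`E` string encoding, and the toy sequences are hypothetical. A query is assigned to the class with the smallest increment.

```python
from collections import Counter
from math import log

def hexamer_counts(seq, k=6):
    """Counts of overlapping k-element windows (hexa-structures for k = 6)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def diversity(counts):
    """Assumed definition: D = N ln N - sum_i n_i ln n_i."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return total * log(total) - sum(n * log(n) for n in counts.values() if n > 0)

def diversity_increment(query, source):
    """Assumed definition: D(query + source) - D(query) - D(source)."""
    merged = query + source  # Counter addition sums the counts per hexamer
    return diversity(merged) - diversity(query) - diversity(source)

def build_source(seqs, k=6):
    """Pool the hexamer counts of all training sequences of one class."""
    pooled = Counter()
    for s in seqs:
        pooled.update(hexamer_counts(s, k))
    return pooled

def classify(seq, sources, k=6):
    """Return the class whose source gives the smallest increment of diversity."""
    query = hexamer_counts(seq, k)
    return min(sources, key=lambda name: diversity_increment(query, sources[name]))

# Toy sequences, invented purely for illustration.
sources = {
    "alpha": build_source(["HHHHHHHHHH", "HHHHHHEHHHHHHH"]),
    "beta": build_source(["EEEEEEEEEE", "EEEEEEHEEEEEEE"]),
}
print(classify("HHHHHHHH", sources))
```

Under these definitions the increment is zero when the query and the source have identical hexamer proportions and grows as the two distributions diverge, which is why the minimum is used as the decision rule.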
Li Xiaoqin, Luo Liaofu. Methods for recognizing protein structural classes [J]. Progress in Biochemistry and Biophysics, 2002, 29(6): 938-941