Hunan Agricultural University
This work was supported by grants from the Specialized Research Fund for the Doctoral Program of Higher Education (20124320110002), the Natural Science Foundation of Hunan Province, China (14JJ2082), and the Science and Technology Planning Projects of Changsha, China (K1406018-21).
Glycosylation is a major post-translational modification of proteins. Accurate prediction of O-linked glycosylation sites remains a major challenge for machine learning, because no fixed sequence pattern for O-linked glycosylation is yet known. In this paper, based on the Steentoft database, the largest available to date, we first propose a new feature, the position-based chi-square score difference table method (χ2-pos), and combine it with the pseudo position-specific scoring matrix (PsePSSM) and the undirected composition of k-spaced amino acid pairs (Undirected-CKSAAP) to represent protein sequences. Five support vector machine models were then constructed, each with equal proportions of positive and negative samples. Finally, with weighted voting, the prediction accuracy, Matthews correlation coefficient, and area under the ROC curve reached 89.62%, 0.79, and 0.96, respectively, which are superior to previously reported results. The results also suggest that the combination of the three different features, χ2-pos, PsePSSM, and Undirected-CKSAAP, has broad application prospects for predicting protein modification sites such as glycosylation and phosphorylation sites.
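As a rough illustration of three ingredients named in the abstract, the sketch below computes an undirected k-spaced pair composition, combines classifier scores by weighted voting, and evaluates the Matthews correlation coefficient. It is not the authors' implementation. It assumes that "undirected" means the pairs AB and BA are counted together (210 unordered pairs per spacing), and the function names, the `k_max` default, the voting weights, and the 0.5 threshold are all hypothetical. χ2-pos and PsePSSM are omitted because the abstract does not define them.

```python
from itertools import combinations_with_replacement
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical

def undirected_cksaap(window, k_max=2):
    """Frequencies of unordered residue pairs separated by 0..k_max residues.

    Assumption: AB and BA are merged, giving 210 pairs per spacing k.
    """
    pairs = ["".join(p) for p in combinations_with_replacement(AMINO_ACIDS, 2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(window) - k - 1):
            # Sort the two residues so that direction is ignored.
            key = "".join(sorted((window[i], window[i + k + 1])))
            if key in counts:  # skips padding or non-standard residues
                counts[key] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features

def weighted_vote(scores, weights, threshold=0.5):
    """Combine per-model scores in [0, 1] into one score and a 0/1 label."""
    combined = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return combined, int(combined >= threshold)

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

In this sketch, each of the five models would contribute one entry to `scores`, and `weights` could, for example, be set from each model's validation performance; the abstract does not state how the weights were chosen.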
XIANG Yan, CHEN Yuan, TAN Si-Qiao, YUAN Zhe-Ming. Predicting O-glycosylation Sites by Combining Three Different Types of Features[J]. Progress in Biochemistry and Biophysics, 2016, 43(7): 691-698