Yan Xu*, Yingxi Yang, Zu Wang and Yuanhai Shao Pages 1 - 9 ( 9 )
In vivo, one of the most efficient biological mechanisms for expanding the genetic code and regulating cellular physiology is protein post-translational modification (PTM). Because PTM can provide very useful information for both basic research and drug development, identification of PTM sites in proteins has become a very important topic in bioinformatics. Lysine residue in protein can be subjected to many types of PTMs, such as acetylation, succinylation, methylation and propionylation and so on. In order to deal with the huge protein sequences, the present study is devoted to developing computational techniques that can be used to predict the multiple K-type modifications of any uncharacterized protein timely and effectively. In this work, we proposed a method which could deal with the acetylation and succinylation prediction in a multilabel learning. Three feature constructions including sequences and physicochemical properties have been applied. The multilabel learning algorithm RankSVM has been first used in PTMs. In 10-fold cross-validation the predictor with physicochemical properties encoding got accuracy 73.86%, abslute-true 64.70%, respectively. They were better than the other feature constructions. We compared with other multilabel algorithms and the existing predictor iPTM-Lys. The results of our predictor were better than other methods. Meanwhile we also analyzed the acetylation and succinylation peptides which could illustrate the results.
Multi-label learning, K-PTM protein prediction, RankSVMulti-label learning, RankSVM
University of California, Los Angeles, Department of Microbiology, Immunology and Molecular Genetics, University of Science and Technology, Beijing, Department of Information and Computer Science, University of Science and Technology Beijing, Department of Information and Computer Science, Hainan University, School of Economics and Management