


iRSpotH-TNCPseAAC: Identifying Recombination Spots in Human by Using Pseudo Trinucleotide Composition With an Ensemble of Support Vector Machine Classifiers

[Vol. 14, Issue 9]

Author(s):

Zhao-Chun Xu, Wang-Ren Qiu* and Xuan Xiao*. Pages 703-713 (11)

Abstract:


Background: Meiotic recombination is crucial to the formation of human gametes. As a defining event in the formation of human sperm and eggs, it also plays an important role in generating genetic diversity. Recombination does not occur at random across a genome, however: it occurs with higher probability in some genomic regions, the so-called “hotspots”, and with lower probability in the so-called “coldspots”. Research has shown that recombination provides new combinations of genetic variations. Information about coldspots and hotspots can therefore offer useful insights for in-depth study of the genome evolution process and the mechanism of recombination. Currently, recombination regions are determined experimentally, which is tedious, generally requires expensive instruments, and takes a long time. This study therefore develops a computational prediction model to address these problems.

Method: In this paper, a new predictor called ‘iRSpotH-TNCPseAAC’ was developed to identify human recombination coldspots and hotspots. In the new discrete predictive model, a feature vector called ‘pseudo trinucleotide composition’ (PseTNC) is proposed to represent a given DNA segment while retaining as much of its sequence-order information as possible.
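The abstract does not give the PseTNC formula, so the following is only a minimal sketch of the general idea, assuming the usual pseudo-composition layout: 64 trinucleotide frequencies followed by a few sequence-order correlation factors. The function names, the parameters `lam` and `weight`, and the single GC-fraction property in `toy_properties` are illustrative assumptions, not the properties or settings used by iRSpotH-TNCPseAAC.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 64 trinucleotides in a fixed order, so feature positions are stable.
TRINUCLEOTIDES = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]

# Placeholder property table (assumption): one value per trinucleotide,
# here simply its GC fraction. A real model would use its own properties.
toy_properties = {t: [(t.count("G") + t.count("C")) / 3] for t in TRINUCLEOTIDES}


def split_trinucleotides(seq):
    """Return the overlapping trinucleotides of seq, rejecting non-ACGT input."""
    seq = seq.upper()
    if len(seq) < 3 or any(base not in NUCLEOTIDES for base in seq):
        raise ValueError("sequence must contain at least 3 bases from ACGT")
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


def trinucleotide_composition(seq):
    """Return the 64 normalized trinucleotide frequencies of seq."""
    tris = split_trinucleotides(seq)
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for tri in tris:
        counts[tri] += 1
    return [counts[t] / len(tris) for t in TRINUCLEOTIDES]


def correlation_factors(seq, properties, lam):
    """Return lam factors; factor j averages the squared property
    difference between trinucleotides that are j positions apart."""
    tris = split_trinucleotides(seq)
    if lam >= len(tris):
        raise ValueError("lam must be smaller than the number of trinucleotides")
    thetas = []
    for j in range(1, lam + 1):
        diffs = []
        for x, y in zip(tris, tris[j:]):
            px, py = properties[x], properties[y]
            diffs.append(sum((a - b) ** 2 for a, b in zip(px, py)) / len(px))
        thetas.append(sum(diffs) / len(diffs))
    return thetas


def pse_tnc(seq, properties, lam=2, weight=0.5):
    """Return a (64 + lam)-dimensional pseudo trinucleotide composition vector."""
    freqs = trinucleotide_composition(seq)
    thetas = correlation_factors(seq, properties, lam)
    denom = sum(freqs) + weight * sum(thetas)
    # Jointly normalize so the whole vector sums to 1.
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Calling `pse_tnc("ACGTACGTAC", toy_properties, lam=2)` yields a 66-dimensional vector; the first 64 entries capture local composition and the last two carry the longer-range order information that a plain frequency count discards.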

Results: In the rigorous jackknife test, the overall success rate obtained by iRSpotH-TNCPseAAC in identifying human recombination spots is higher than 93%, with a mean success rate of 76.07% across the 18 chromosomes concerned. This suggests that the predictor can become a useful complementary tool in this area. Moreover, the PseTNC method can be used to explore many other DNA-related problems. Finally, an easy-to-use web-server called iRSpotH-TNCPseAAC has been built and is freely accessible at http://www.jci-bioinfo.cn/iRSpotH-TNCPseAAC.

Conclusion: Timely acquisition of information about recombination spots in DNA sequences is important for in-depth study of epigenetic inheritance and for the analysis of human diseases. It will also facilitate drug development. The iRSpotH-TNCPseAAC predictor may become a practical, high-throughput online tool for identifying recombination spots.

Keywords:

Pseudo amino acid composition, support vector machine, web-server, iRSpotH-TNCPseAAC, meiosis, coldspots, hotspots.

Affiliation:

Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen 333403, Gordon Life Science Institute, Boston, MA 02478
