Analyzing updates in Amino Acid Composition and Translation Algorithm towards Predicting Membrane Proteins using Machine Learning Approaches
DOI:
https://doi.org/10.21015/vtcs.v9i1.1004
Abstract
Membrane proteins come in different types that perform different functions. Classifying the protein sequences in a data set is important for understanding cell function, disease prevention, and drug discovery. Initially, traditional methods were used to classify transmembrane proteins. However, advances in technology and new research have enlarged transmembrane protein data sets by thousands of sequences, and at that scale it is almost impossible to obtain accurate results with traditional methods. Computational methods are therefore very useful for membrane protein classification. Several methods, such as Pseudo Amino Acid Composition (PseAAC), can extract many salient features of a protein sequence. In this work, we aim to modify an existing amino acid composition and translation algorithm so that it extracts membrane protein features with better accuracy. To validate our algorithm, we will use Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) classifiers.
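As a point of orientation, plain amino acid composition (the baseline that PseAAC-style methods extend) can be sketched as below. This is a minimal illustration under stated assumptions, not the modified algorithm proposed in this work: the function name `amino_acid_composition`, the residue ordering, and the example sequence are all hypothetical choices made for the sketch.

```python
from collections import Counter

# The 20 standard residues as one-letter codes; this ordering is an
# assumption of the sketch, not something fixed by the article.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the relative frequency of each residue."""
    seq = sequence.upper()
    # Count only standard residues; ambiguous codes such as X or B are skipped.
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues at all: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical example sequence, used only to show the call.
features = amino_acid_composition("MKTAYIAKQR")
```

Each sequence becomes a fixed-length vector regardless of its length, which is what makes it usable as input to classifiers such as SVM or KNN. Composition alone discards residue order, which is the gap that PseAAC-style features are meant to address.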
License
This work is licensed under a Creative Commons Attribution License (CC BY).