Proteomics Data Reveals Alternative Splicing Proteoforms

doi:10.16476/j.pibb.2024.0109

Author: WU Yi-Ying, ZHANG Wei, KONG De-Zhi
Affiliation: Institute of Integrated Traditional Chinese and Western Medicine, Hebei Medical University, Shijiazhuang 050017, China
Fund Project: This work was supported by grants from The National Natural Science Foundation of China (82174004) and the Natural Science Foundation of Hebei Province (H2022206211, H2022206387).

Abstract:

Alternative splicing is an important regulatory mechanism in organisms, influencing the expression of genes involved in processes such as drug metabolism, pathway activation, and apoptosis. Splicing removes introns from precursor mRNA and joins the remaining exons to produce mature mRNA. When different combinations of exons are joined, a single precursor can give rise to multiple mature mRNAs; this is known as alternative splicing. It allows the same gene to produce different transcript variants and protein isoforms, increasing protein diversity and functional complexity. Transcriptomics and proteomics are the two main approaches for identifying alternative splicing events. Transcriptomics identifies alternative splicing by analyzing differences between RNA sequencing data and reference sequences in databases. This approach relies on modern sequencing technologies and on steadily improving splicing identification algorithms, including those for alignment mapping and sequencing data quality control. The other approach is proteomic data analysis, which identifies the corresponding protein products. We consider alternative splicing events more meaningful when they can be detected at the protein level. Alternative splicing proteoforms can be identified using bottom-up proteomics based on mass spectrometry. Because these proteoforms share high sequence similarity, general proteomic data analysis pipelines discriminate poorly between them. To improve proteoform identification and recover information that differentiates isoforms in proteomic data, two data-processing strategies have been developed: the construction of specialized databases and targeted identification algorithms. We believe that this potential protein isoform information may play a crucial role in life science research.
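To make the discrimination problem concrete, the following minimal sketch digests two toy isoforms in silico and separates shared peptides from isoform-specific ones. Everything in it is an illustrative assumption rather than part of the reviewed work: the exon segments are invented sequences, `tryptic_digest` is a hypothetical helper, and the cleavage rule is a simplified trypsin model (cleave after K or R unless followed by P, no missed cleavages). It relies on `re.split` handling zero-width matches, which requires Python 3.7 or later.

```python
import re

def tryptic_digest(sequence, min_length=6):
    """Simplified trypsin rule: cleave after K or R unless the next residue is P.

    No missed cleavages are modeled; fragments shorter than min_length are dropped.
    """
    fragments = re.split(r"(?<=[KR])(?!P)", sequence)
    return {frag for frag in fragments if len(frag) >= min_length}

# Hypothetical exon-encoded segments (toy sequences, not real proteins).
exon_a = "MSTAVLENPGLGRAQ"
exon_b = "LTDEVWKHG"
exon_c = "SLLSGPEYTFK"

isoform_1 = exon_a + exon_b + exon_c  # exon B included
isoform_2 = exon_a + exon_c           # exon B skipped

peptides_1 = tryptic_digest(isoform_1)
peptides_2 = tryptic_digest(isoform_2)

shared = peptides_1 & peptides_2      # cannot discriminate the isoforms
unique_1 = peptides_1 - peptides_2    # evidence for isoform 1 only
unique_2 = peptides_2 - peptides_1    # evidence for isoform 2 only

print("shared:", sorted(shared))
print("unique to isoform 1:", sorted(unique_1))
print("unique to isoform 2:", sorted(unique_2))
```

In this toy case the only peptide that identifies the exon-skipping isoform is the one spanning the exon A/exon C junction. If that single peptide is not detected, or is absent from the search database, the isoform cannot be told apart from its longer counterpart.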
In terms of databases, searching ordinary public databases alone is not enough. To discover as many isoforms as possible, sample-specific databases constructed with the help of RNA sequencing data have been widely used, which increases the probability of detecting proteoforms. The other key strategy is the improvement of protein identification algorithms. Traditional identification algorithms often struggle to distinguish between highly similar or mutually inclusive proteoforms. To address this, several inference algorithms have been developed and combined with existing search engines to better characterize and detect alternative splicing proteoforms. These include peptide grouping (PeptideClassifier, SEPepQuant, GpGrouper), peptide quantitative correlation (PQPQ, PeCorA, COPF, SpliceVista), machine learning (IsoSVM, Re-Fraction, LibSVM), and major splice isoform theory (ASV-ID). Such methods have shown promising results for alternative splicing proteoforms. In practice, the choice of algorithm should depend on the data and research question at hand, and it may be worth comparing several. The performance of these algorithms is also limited by the quality of the input data, so reliable identification requires proper upstream peptide identification and quality control. In general, the detection and differentiation of spliced protein isoforms are still inadequate and require continued attention. This article reviews recent research progress on alternative splicing and its biological functions, as well as the detection of alternative splicing at different levels, and introduces the main methods for identifying alternative splicing proteoforms using bottom-up proteomic data.
Identifying different alternative splicing proteoforms helps build a more complete picture of protein function and is of great significance for discovering related biomarkers and key drug targets.
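The general idea behind the peptide quantitative correlation strategy named above can be sketched as follows: peptides from the same gene are expected to follow a common quantitative pattern across samples, so a peptide that deviates may belong to a different proteoform. This is a loose illustration only and does not reproduce PQPQ, PeCorA, COPF, or SpliceVista. The intensity values, peptide labels, the `discordant_peptides` helper, and the 0.5 threshold are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)

def discordant_peptides(profiles, threshold=0.5):
    """Flag peptides that correlate poorly with the mean of their sibling peptides.

    profiles: dict mapping peptide label -> list of intensities (one per sample).
    threshold: hypothetical cutoff; a real analysis would need a statistical test.
    """
    flagged = []
    for peptide, values in profiles.items():
        others = [v for p, v in profiles.items() if p != peptide]
        mean_others = [sum(col) / len(col) for col in zip(*others)]
        if pearson(values, mean_others) < threshold:
            flagged.append(peptide)
    return flagged

# Hypothetical log-intensities for one gene: 3 control samples, then 3 treated.
profiles = {
    "pep_A": [10.0, 10.2, 9.9, 12.0, 12.1, 11.8],
    "pep_B": [8.0, 8.3, 8.1, 10.1, 10.0, 9.9],
    "pep_C": [9.0, 9.1, 8.8, 11.0, 11.2, 10.9],
    "pep_D": [11.0, 11.1, 10.9, 9.0, 9.2, 8.9],  # moves in the opposite direction
}

flagged = discordant_peptides(profiles)
print("candidate proteoform-specific peptides:", flagged)
```

Here `pep_D` decreases while its sibling peptides increase, so it is the one flagged. As the abstract notes, results of this kind depend heavily on input data quality, so a discordant peptide is a candidate for follow-up rather than proof of a distinct proteoform.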

Key words: alternative splicing; mass spectrometry data analysis; protein identification algorithm; protein sequence database

Get Citation

WU Yi-Ying, ZHANG Wei, KONG De-Zhi. Proteomics Data Reveals Alternative Splicing Proteoforms[J]. Progress in Biochemistry and Biophysics, 2024, 51(12): 3151-3162

History

Received: March 18, 2024
Revised: August 09, 2024
Accepted: July 01, 2024
Online: July 03, 2024
Published: December 20, 2024
