Proteomics Data Reveals Alternative Splicing Proteoforms
Author:
Affiliation:

Institute of Integrated Traditional Chinese and Western Medicine, Hebei Medical University, Shijiazhuang 050017, China

Fund Project:

The National Natural Science Foundation of China

    Abstract:

    Alternative splicing is an important regulatory mechanism in organisms, influencing the expression of genes involved in processes such as drug metabolism, pathway activation, and apoptosis. Splicing removes introns from precursor mRNA and joins the remaining exons to produce mature mRNA. When different combinations of exons are joined, a single precursor can yield multiple mature mRNAs; this is known as alternative splicing. It allows the same gene to produce different transcript variants and protein isoforms, increasing protein diversity and functional complexity. Transcriptomics and proteomics are the two main approaches for identifying alternative splicing events. Transcriptomics identifies alternative splicing by analyzing differences between RNA sequencing data and reference sequences in databases. This approach relies on modern sequencing technologies and on steadily improving splicing identification algorithms, which cover steps such as alignment mapping and sequencing data quality control. The other approach is proteomic data analysis, which identifies the corresponding protein products. We consider alternative splicing events more meaningful when they can be detected at the protein level. Alternative splicing proteoforms can be identified using mass spectrometry-based bottom-up proteomics. However, because these proteoforms share high sequence similarity, general proteomic data analysis pipelines discriminate poorly between them. To improve proteoform identification and recover isoform-distinguishing information from proteomic data, two strategies for improving data processing have been developed, as shown in the figure: the construction of specialized databases and targeted identification algorithms. We believe that this latent protein isoform information may play a crucial role in life science research.
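The discrimination problem described above can be illustrated with a minimal Python sketch. It digests two toy isoforms in silico and separates peptides that are unique to one isoform from those shared by both. Everything here is an assumption made for illustration: the exon sequences and isoform names are invented, the cleavage rule is the common simplified trypsin rule (cut after K or R unless followed by P, no missed cleavages), and the function names are hypothetical. It is not the procedure of any tool named in this abstract.

```python
import re
from collections import defaultdict


def tryptic_digest(sequence, min_length=6):
    """Simplified trypsin rule: cleave after K or R unless the next residue is P.

    No missed cleavages are modeled; peptides shorter than min_length are dropped.
    """
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in peptides if len(p) >= min_length}


def classify_peptides(isoforms, min_length=6):
    """Return (unique, shared): peptides seen in exactly one isoform vs. several."""
    peptide_to_isoforms = defaultdict(set)
    for name, sequence in isoforms.items():
        for peptide in tryptic_digest(sequence, min_length):
            peptide_to_isoforms[peptide].add(name)

    unique = {name: set() for name in isoforms}
    shared = set()
    for peptide, names in peptide_to_isoforms.items():
        if len(names) == 1:
            unique[next(iter(names))].add(peptide)
        else:
            shared.add(peptide)
    return unique, shared


# Hypothetical exon sequences (invented for this example).
EXON1 = "MAGTLLEVSKDFNQ"
EXON2 = "WELTGRSLPEYVK"
EXON3 = "AAHNGIDLRTPEQLFK"

ISOFORMS = {
    "iso_full": EXON1 + EXON2 + EXON3,  # all three exons
    "iso_skip": EXON1 + EXON3,          # exon 2 skipped
}

unique, shared = classify_peptides(ISOFORMS)

if __name__ == "__main__":
    for name, peptides in unique.items():
        print(name, "unique:", sorted(peptides))
    print("shared:", sorted(shared))
```

In this toy case the exon-skipping isoform is distinguishable only by the single peptide that spans the exon 1/exon 3 junction, so it would go undetected if that one peptide were missed by the instrument or absent from the search database.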
In terms of databases, searching ordinary public databases alone is not enough. To discover as many isoforms as possible, sample-specific databases constructed with the aid of RNA sequencing data have been widely used, which increases the probability of detecting proteoforms. The other key strategy is the improvement of protein identification algorithms. Traditional identification algorithms often struggle to distinguish between highly similar or mutually inclusive proteoforms. To address this, several inference algorithms have been developed that are combined with existing search engines to better detect and characterize alternative splicing proteoforms. These include peptide grouping (PeptideClassifier, SEPepQuant, GpGrouper), peptide quantitative correlation (PQPQ, PeCorA, COPF, SpliceVista), machine learning (IsoSVM, Re-Fraction, LibSVM), and major splice isoform theory (ASV-ID). Such methods have shown promising results for alternative splicing proteoforms. In practice, different algorithms should be tried depending on the actual situation. Additionally, the performance of these algorithms is limited by the quality of the input data, so proper upstream peptide identification and quality control are also essential for reliable identification. In general, the detection and differentiation of spliced protein isoforms remain inadequate and require continued attention. This article reviews recent research progress on alternative splicing and its biological functions, as well as the detection of alternative splicing at different levels, and introduces the main methods for identifying alternative splicing proteoforms using bottom-up proteomic data.
Identifying different alternative splicing proteoforms helps us understand the comprehensive functions of proteins and is of great significance for discovering related biomarkers and key drug targets.
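As a rough illustration of the idea behind the peptide quantitative correlation approaches mentioned above, the following Python sketch groups peptides of one gene by how similarly their intensities vary across samples. Peptides whose profiles disagree with the rest may originate from a different proteoform. This is a generic sketch under stated assumptions, not the algorithm of PQPQ, PeCorA, COPF, or SpliceVista: the peptide names, intensity values, greedy grouping rule, and the 0.8 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def group_by_correlation(profiles, threshold=0.8):
    """Greedy grouping: a peptide joins the first group whose seed peptide
    (the group's first member) it correlates with at or above the threshold;
    otherwise it starts a new group."""
    groups = []
    for peptide, profile in profiles.items():
        for group in groups:
            if pearson(profiles[group[0]], profile) >= threshold:
                group.append(peptide)
                break
        else:
            groups.append([peptide])
    return groups


# Hypothetical intensities of four peptides from one gene across five samples.
PROFILES = {
    "pepA": [10.0, 12.0, 15.0, 20.0, 25.0],
    "pepB": [5.0, 6.2, 7.4, 10.1, 12.3],
    "pepC": [30.0, 24.0, 18.0, 12.0, 8.0],
    "pepD": [14.0, 11.0, 9.0, 6.0, 4.0],
}

groups = group_by_correlation(PROFILES)

if __name__ == "__main__":
    for index, group in enumerate(groups, start=1):
        print(f"group {index}: {group}")
```

A greedy, seed-based grouping is used here only to keep the example short; its result depends on peptide order, which a real analysis would avoid by using a proper clustering step and statistical testing.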

Get Citation

wuyiying, kongdezhi, zhangwei. Proteomics Data Reveals Alternative Splicing Proteoforms[J]. Progress in Biochemistry and Biophysics,,():

History
  • Received: March 18, 2024
  • Revised: July 01, 2024
  • Accepted: July 01, 2024