Agricultural Big Data Research Center, College of Information Science and Engineering, Shandong Agricultural University, Tai'an 271000, China
This work was supported by grants from the National Natural Science Foundation of China (32070684, 31571306).
This paper proposes a deep learning model that combines a convolutional neural network with a recurrent neural network to identify circular RNA (circRNA) splicing sites in the human genome from genomic sequence data. First, using the preprocessed nucleotide sequences, we designed 16 models in 8 groups, covering two network depths, eight convolution kernel sizes, and three long short-term memory (LSTM) parameters. Second, we compared average pooling with max pooling in the pooling layer and added GC content to improve the model's predictive ability. Finally, we used the model to predict circRNAs in human seminal plasma that had already been experimentally validated. The model with a 32 × 4 convolution kernel, a depth of 1, and an LSTM parameter of 32 achieved the highest recognition rate: 0.9824 on the training set and an accuracy of 0.95 on the test set, with a correct identification rate of 83% on the experimentally validated data. The model shows good performance in identifying human circRNA splicing sites.
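The abstract does not spell out how sequences are encoded or pooled, so the following is only a minimal sketch of three ingredients it names: turning a nucleotide sequence into a numeric matrix, computing GC content, and contrasting max pooling with average pooling. The one-hot column order (A, C, G, T), the function names `one_hot`, `gc_content`, and `pool`, and the toy sequence are assumptions made for illustration, not details taken from the paper.

```python
# Illustrative sketch only: the encoding scheme, names, and toy data are assumptions.
BASES = "ACGT"  # assumed column order for the one-hot matrix


def one_hot(seq):
    """Encode a nucleotide string as a list of 4-element rows (A, C, G, T).

    Symbols outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def gc_content(seq):
    """Return the fraction of G and C bases in the sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def pool(values, window, mode="max"):
    """Non-overlapping 1-D pooling; a trailing partial window is dropped."""
    if mode == "max":
        reduce = max
    elif mode == "mean":
        reduce = lambda w: sum(w) / len(w)
    else:
        raise ValueError("mode must be 'max' or 'mean'")
    return [
        reduce(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]


seq = "ACGTGGCA"                          # toy sequence, not real data
matrix = one_hot(seq)                     # 8 rows x 4 columns
gc = gc_content(seq)                      # 5 of 8 bases are G or C
g_channel = [row[2] for row in matrix]    # the G column as a 1-D signal
max_pooled = pool(g_channel, 2, "max")
mean_pooled = pool(g_channel, 2, "mean")
```

In a setup like this, max pooling keeps the strongest response in each window while average pooling smooths it, which is the kind of difference the pooling comparison in the abstract would probe. A scalar such as `gc` could be supplied to a model alongside the sequence matrix, though how the authors combined the two is not stated here.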
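Sun Kai, Wei Qinggong, Zang Chaoyu, Sun Ruxuan, Jiang Dan, Sun Xiaoyong. Identification of circular RNA splicing sites based on convolutional neural networks and recurrent neural networks [J]. Progress in Biochemistry and Biophysics, 2021, 48(3): 328-335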