NELDA: Prediction of lncRNA-Disease Associations With Network Embedding

Affiliation:

School of Automation, Key Laboratory of Information Fusion Technology of Ministry of Education, Northwestern Polytechnical University, Xi'an 710072, China

Fund Project:

This work was supported by grants from the National Natural Science Foundation of China (61873202, 62173271).

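    Summary:

    Objective Long non-coding RNAs (lncRNAs) participate in many important biological processes and are closely associated with a wide range of human diseases. Predicting lncRNA-disease associations therefore aids disease diagnosis and treatment and helps explain, at the molecular level, how human diseases arise and progress. Most current prediction methods integrate lncRNA and disease information only shallowly and ignore the deep embedding features in the network topology; in addition, they build the negative training set by randomly sampling lncRNA-disease pairs with no known association, which weakens the robustness of the predictor. Methods This paper proposes NELDA, a network-embedding-based method for predicting potential lncRNA-disease associations. NELDA first uses lncRNA expression profiles, the disease ontology, and known lncRNA-disease associations to construct an lncRNA similarity network, a disease similarity network, and an lncRNA-disease association network. It then trains four deep autoencoders to learn low-dimensional network embedding features of lncRNAs and diseases from the lncRNA/disease similarity networks and from the lncRNA-disease association network. The similarity-network embeddings of an lncRNA and a disease are concatenated and fed to one support vector machine (SVM) classifier, and their association-network embeddings are concatenated and fed to a second SVM classifier. Finally, a weighted fusion strategy combines the outputs of the two SVM classifiers into the final prediction. In addition, a negative-sample selection strategy based on known lncRNA-disease associations and disease semantic similarity builds a set of relatively reliable non-associated lncRNA-disease pairs to improve classifier robustness: a scoring function assigns a score to every lncRNA-disease pair, and the lowest-scoring pairs are taken as non-associated pairs (negative samples). Results Ten-fold cross-validation shows that NELDA predicts lncRNA-disease associations effectively, reaching an AUC of 0.9827, which is 0.0627 and 0.0207 higher than the existing LDASR and LDNFSGB methods, respectively. Both the negative-sample selection strategy and the decision-level weighted fusion strategy improve NELDA's predictive performance. In case studies of stomach cancer and breast cancer, 29 of the 40 predicted lncRNAs (72.5%) have supporting evidence in recent literature and public databases. Conclusion These results indicate that NELDA is an effective method for predicting lncRNA-disease associations and can uncover potential new ones.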
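
The negative-sample selection step can be illustrated with a minimal Python sketch. The abstract does not give the scoring function, so the score used here (the highest semantic similarity between a candidate disease and any disease already known to be associated with the lncRNA) is an assumption, as are the function name `select_negatives` and the toy identifiers and similarity values.

```python
def select_negatives(known_pairs, lncrnas, diseases, disease_sim, n_negatives):
    """Return the n_negatives unlabeled (lncRNA, disease) pairs with the lowest scores.

    Assumed score (not taken from the paper): the highest semantic similarity
    between the candidate disease and any disease already known to be
    associated with the lncRNA. A low score suggests the pair is unlikely
    to be a true association.
    """
    known = set(known_pairs)

    # Map each lncRNA to the set of diseases it is known to be associated with.
    diseases_of = {}
    for lnc, dis in known:
        diseases_of.setdefault(lnc, set()).add(dis)

    scored = []
    for lnc in lncrnas:
        for dis in diseases:
            if (lnc, dis) in known:
                continue  # known associations are positives, never negatives
            score = max(
                (disease_sim[dis][k] for k in diseases_of.get(lnc, ())),
                default=0.0,
            )
            scored.append((score, lnc, dis))

    scored.sort()  # ascending: least similar to known associations first
    return [(lnc, dis) for _, lnc, dis in scored[:n_negatives]]


# Toy data (hypothetical identifiers and similarity values).
lncrnas = ["lnc1", "lnc2"]
diseases = ["d1", "d2", "d3"]
disease_sim = {
    "d1": {"d1": 1.0, "d2": 0.8, "d3": 0.1},
    "d2": {"d1": 0.8, "d2": 1.0, "d3": 0.2},
    "d3": {"d1": 0.1, "d2": 0.2, "d3": 1.0},
}
known_pairs = [("lnc1", "d1"), ("lnc2", "d2")]

negatives = select_negatives(known_pairs, lncrnas, diseases, disease_sim, 2)
print(negatives)
```

In this toy case the two pairs involving `d3` are selected, because `d3` is the disease least similar to anything already linked to either lncRNA. Random sampling, by contrast, could just as easily pick `("lnc1", "d2")`, a pair whose disease closely resembles a known association.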

    Abstract:

    Objective Long non-coding RNAs (lncRNAs) participate in a variety of vital biological processes and are closely related to various human diseases. Predicting lncRNA-disease associations helps to explain the mechanisms of human disease at the molecular level and contributes to the diagnosis and treatment of diseases. Most existing methods for predicting lncRNA-disease associations ignore the deep embedding features hidden in the topological structure of the lncRNA and disease networks. Moreover, selecting negative samples at random reduces the robustness of the predictors. Methods We first built a high-quality dataset by using a selection strategy that picks relatively reliable negative samples (i.e., lncRNA-disease pairs without an association) instead of sampling them at random. We then propose a novel method, NELDA, that predicts potential lncRNA-disease associations by building four deep auto-encoder models to learn low-dimensional network embedding features from the lncRNA/disease similarity networks and from the lncRNA-disease association network. NELDA takes the lncRNA/disease similarity network embedding features as the input of one support vector machine (SVM) classifier, and the lncRNA/disease association network embedding features as the input of another SVM classifier. The prediction results of the two SVM classifiers are fused by a weighted average strategy to obtain the final prediction. Results In 10-fold cross-validation, NELDA achieves an AUC of 0.9827 on the high-quality dataset, which is 0.0627 and 0.0207 higher than that of two state-of-the-art methods, LDASR and LDNFSGB, respectively. In the case studies of stomach cancer and breast cancer, 29 of the 40 (72.5%) newly predicted lncRNAs associated with stomach and breast cancers are supported by recent literature and public datasets. Conclusion These experimental results demonstrate that NELDA is a superior method for predicting potential lncRNA-disease associations and is able to discover new lncRNA-disease associations.
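
The weighted-average fusion of the two classifiers' outputs, and the AUC used to evaluate it, can be sketched in plain Python. The fusion weight, the helper names `fuse` and `auc`, and the toy scores are assumptions for illustration: the abstract does not report the weights NELDA uses, and the scores below merely stand in for SVM outputs.

```python
def fuse(scores_a, scores_b, weight):
    """Weighted average of two score lists; `weight` applies to scores_a."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = associated pair, 0 = negative sample.
labels = [1, 1, 0, 0]
sim_scores = [0.9, 0.4, 0.5, 0.1]    # stand-in for the similarity-embedding SVM
assoc_scores = [0.6, 0.8, 0.3, 0.7]  # stand-in for the association-embedding SVM

fused = fuse(sim_scores, assoc_scores, weight=0.5)

print(auc(labels, sim_scores))    # 0.75
print(auc(labels, assoc_scores))  # 0.75
print(auc(labels, fused))         # 1.0
```

Each toy classifier misranks a different pair, so averaging their scores corrects both mistakes. This is the situation in which decision-level fusion helps; when two classifiers make the same errors, fusion gains little.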

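Cite this article

Li Weina, Fan Xiaonan, Zhang Shaowu. NELDA: Prediction of lncRNA-disease associations with network embedding [J]. Progress in Biochemistry and Biophysics, 2022, 49(7): 1369-1380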

History
  • Received: 2021-05-10
  • Last revised: 2021-07-15
  • Accepted: 2021-09-02
  • Published online: 2022-07-20
  • Publication date: 2022-07-20