MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University; The First Affiliated Hospital of Nanchang University; Department of Statistics, University of California, Berkeley; MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University; MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University
This work was supported by grants from the National Natural Science Foundation of China (31171274) and the National Basic Research Program of China (2012CB725203).
Next-generation sequencing (NGS) technologies have boosted genomic and medical research, particularly the identification of disease-causing variants. Although most types of genetic variants can be identified through NGS data analysis, some remain difficult to detect, such as length variations of short tandem repeats (STRs). Many genetic diseases are known to be caused by STR expansions, especially neurological disorders such as Huntington disease. However, almost no existing tool can use NGS data to detect STRs that have expanded beyond the sequencing read length. To overcome this limitation, we developed a novel method that detects length variations of STRs and estimates the length of expansions from paired-end NGS data. We applied the method in a clinical study of motor neuron disease using whole-exome sequencing and successfully identified a disease-causing STR expansion. Our method is the first to use features of the depth of read coverage at STRs to address this variant-calling problem. It is broadly applicable to human genetic disease research and may inspire the development of new NGS data processing tools.
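Yan Zhangming, Wang Yao, Liu Ke, Xiang Shunian, Sun Zhirong. A novel method for identifying length variations of short tandem repeats based on next-generation sequencing and its application in human genetic disease research [J]. Progress in Biochemistry and Biophysics, 2016, 43(8): 768-777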