This work waqs supported by a grant from The National Natural Science Foundation of China(30330080).
随着后基因组时代的到来,批量的测序,特别是 EST 的测序,逐渐成为普通实验室的日常工作 . 这些新的序列往往需要进行批量的 Gene Ontology (GO) 的注释及随后的统计分析 . 但是目前除了 Goblet 以外,并没有软件适合对未知序列进行批量的 GO 注释,而 GoBlet 因为具有上载量的限制,以及仅仅利用 BLAST 作为预测工具,所以仍有许多不足之处 . 开发了一个软件包 GoPipe ,通过整合 BLAST 和 InterProScan 的结果来进行序列注释,并提供了进一步作统计比较的工具 . 主程序接收任意个 BLAST 和 InterProScan 的结果文件,并依次进行文本分析、数据整合、去除冗余、统计分析和显示等工作 . 还提供了统计的工具来比较不同输入对 GO 的分布来挖掘生物学意义 . 另外,在交集工作模式下,程序取 InterProScan 和 BLAST 结果的交集, 在测试数据集中,其精确度达到 99.1% ,这大大超过了 InterProScan 本身对 GO 预测的精确度,而敏感度只是稍微下降 . 较高的精确度、较快的速度和较大的灵活性使它成为对未知序列进行批量 Gene Ontology 注释的理想的工具 . 上述软件包可以在网站 (http://gopipe.fishgenome.org/ ) 免费获得或者与作者联系获取 .
Accelerated availability of new sequences, especially ESTs, calls for computational methods to link sequences with Gene Ontology (GO) terms in a batch mode. There is currently no program for such purpose except Goblet, an online tool which uses BLAST to interpret query sequence with proper GO terms, but has a restriction of upload sequence files less than 100 kilobytes in size. GoPipe is a standalone package that integrates BLAST and InterProScan results to obtain Gene Ontology annotation with built-in statistical options. GoPipe takes any number of BLAST and/or InterProScan output files simultaneously and launches jobs sequentially to perform parsing, data integration, redundancy removal, GO distributions calculation and graphic display. A very high annotation specificity of 99.1% was achieved for a test dataset when the program was run in the "intersection" mode, which intersects the BLAST and InterProScan results, outperforming the specificity (81.1%) obtained from the InterProScan only. Statistical tools are also provided to compare GO distributions between different inputs, so that GO distributions of different sets of batch sequences can be compared, and differentially represented GO terms can be easily displayed. High specificity, speed and flexibility make GoPipe an ideal tool for streamlined GO annotation for batch sequences. The package is freely available at http://gopipe.fishgenome.org/ or by contacting the authors.
陈作舟,薛成海,朱 晟,周丰丰,XUEFENG BRUCE LING,刘国平,陈良标. GoPipe: 批量序列的Gene Ontology 注释和统计分析[J].生物化学与生物物理进展,2005,32(2):187-191
复制生物化学与生物物理进展 ® 2024 版权所有 ICP:京ICP备05023138号-1 京公网安备 11010502031771号