School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100190, China
This work was supported by a grant from the National Natural Science Foundation of China (61431017).
Designing enzyme binding pockets that accommodate substrates with different chemical structures is a major challenge. Traditional wet-lab experiments must screen thousands or even millions of mutants to find one that binds a specific ligand, which consumes a large amount of time and resources. To accelerate screening, we propose a novel workflow that integrates molecular modeling with a data-driven machine learning method to generate mutant libraries with a high enrichment ratio for recognition of a specific substrate. M. jannaschii tyrosyl-tRNA synthetase (Mj. TyrRS) recognizes specific unnatural amino acids and catalyzes the formation of aminoacyl-tRNA, and different Mj. TyrRS mutants recognize unnatural amino acids of different structures. Because the sequences and structures of many unnatural-amino-acid-specific Mj. TyrRS mutants have been reported, we use Mj. TyrRS as a proof-of-concept example for this workflow. We collected all Mj. TyrRS mutants reported in the literature and compared the sequence and structural features of the mutants with those of wild-type Mj. TyrRS. Based on the crystal structures of different Mj. TyrRS mutants and the Rosetta modeling results, we found that D158G/P is the critical mutation influencing the backbone disruption of the α-helix spanning residues 158-163. In a test library of 687 mutants, ranking by Rosetta modeling and score function calculation raised the enrichment ratio of desired mutants 2-fold compared with random mutation; after calibration with a machine learning model trained on known data for Mj. TyrRS mutants and their ligands, the enrichment ratio was raised 11-fold. This workflow, which integrates molecular modeling and machine learning, is expected to benefit the design of Mj. TyrRS substrate specificity and to substantially reduce the time and cost of wet-lab experiments. The approach also has broad potential application in the field of computational protein design.
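To make the enrichment figures above concrete, the following is a minimal sketch of how an enrichment ratio could be computed for a ranked mutant library. The abstract does not give the exact definition used in the paper, so the definition here is an assumption: the hit rate among the `top_k` best-scored mutants divided by the hit rate of the whole library (the rate expected from random picking). The function name `enrichment_ratio`, the convention that a lower score is better, and the toy data are all hypothetical.

```python
from typing import Sequence


def enrichment_ratio(scores: Sequence[float],
                     is_hit: Sequence[bool],
                     top_k: int) -> float:
    """Hit rate in the top_k best-scored mutants over the library-wide hit rate.

    Assumption: lower score means a better predicted binder, as with
    energy-like score functions.
    """
    if len(scores) != len(is_hit):
        raise ValueError("scores and is_hit must have the same length")
    if not 0 < top_k <= len(scores):
        raise ValueError("top_k must be between 1 and the library size")
    total_hits = sum(is_hit)
    if total_hits == 0:
        raise ValueError("library contains no hits; ratio is undefined")

    # Rank mutant indices from best (lowest) to worst (highest) score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    top_hits = sum(is_hit[i] for i in order[:top_k])

    top_rate = top_hits / top_k
    baseline_rate = total_hits / len(scores)
    return top_rate / baseline_rate


# Toy library of 10 mutants with 2 known hits (hypothetical numbers).
toy_scores = [-9.1, -8.7, -8.2, -7.9, -7.5, -6.8, -6.1, -5.4, -4.9, -4.2]
toy_hits = [True, False, False, True, False, False, False, False, False, False]
ratio = enrichment_ratio(toy_scores, toy_hits, top_k=5)
print(f"enrichment ratio in top 5: {ratio:.1f}")
```

In this toy case both hits fall in the top five of ten mutants, so the top-five hit rate (2/5) is twice the library-wide rate (2/10). Under the assumed definition, a calibration step that moves more true binders toward the top of the ranking raises this ratio.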
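Duan Bingya, Sun Yingfei. Using machine learning to improve the accuracy of molecular modeling predictions of M. jannaschii tyrosyl-tRNA synthetase substrate specificity [J]. Progress in Biochemistry and Biophysics, 2021, 48(10): 1214-1232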