Investigating the volume & diversity of data needed for generalizable antibody-antigen ∆∆G prediction
Author: Boston Protein Design and Modeling Club
Uploaded: 2023-08-04
Views: 1587
Presented on August 3rd, 2023 by Alissa Hummer
Abstract:
Antibody-antigen binding affinity lies at the heart of therapeutic antibody development: efficacy is guided by specific binding and control of affinity. Here we present Graphinity, an equivariant graph neural network architecture built directly from antibody-antigen structures that achieves state-of-the-art performance on experimental ∆∆G prediction. However, our model, like previous methods, appears to be overtraining on the few hundred experimental data points available. To test if we could overcome this problem, we built a synthetic dataset of nearly 1 million FoldX-generated ∆∆G values. Graphinity achieved Pearson’s correlations nearing 0.9 and was robust to train-test cutoffs and noise on this dataset. The synthetic dataset also allowed us to investigate the role of dataset size and diversity in model performance. Our results indicate there is currently insufficient experimental data to accurately and robustly predict ∆∆G, with orders of magnitude more likely needed. Dataset size is not the only consideration – our tests demonstrate the importance of diversity. We also confirm that Graphinity can be used for experimental binding prediction by applying it to a dataset of over 36,000 Trastuzumab variants.