
Essay: Improve Protein-Protein Docking Models with ProQDock: Evaluating Performance and Correlation

Essay details:

  • Subject area(s): Sample essays
  • Published: 1 April 2019
  • Last Modified: 23 July 2024
  • Words: 2,074 (approx)


Introduction

Present-day bioinformatics and structural modelling face the question of how particular molecules of interest interact with cellular counterparts such as other proteins. Protein–protein interactions have been widely studied because proteins rarely act alone; their functions must be regulated through contact with partners. The physical contacts at an interface reflect the composition of the protein's surface, and interfaces tend to be enriched in hydrophobic residues. Researchers have proposed various theoretical simulations and models, known as "docking models", to address this problem.

Molecular docking has been an important method in structural biology, aiming to predict the most likely binding orientation between a ligand and a protein (Morris and Lim-Wilby, 2008). It has been used in structure-based drug design, exploring molecular interactions and binding as well as the conformational changes that occur (Ferreira et al., 2015). To judge whether a proposed orientation is statistically reliable, docking uses scoring, which evaluates a particular pose, and ranking, which orders interactions by how likely they are to occur based on affinity and favourable intermolecular interactions (Guedes et al., 2013). However, scoring and ranking are not always dependable, and intervention is required to improve their accuracy.
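As a minimal sketch of the scoring-and-ranking step described above (the pose names and score values here are invented purely for illustration, not taken from any docking program):

```python
# Rank hypothetical docking poses by score, where a higher score is
# taken to mean a more favourable predicted binding (made-up values).
poses = [
    ("pose_A", 0.31),
    ("pose_B", 0.78),
    ("pose_C", 0.55),
]

# Sort descending by score so the most plausible pose is ranked first.
ranked = sorted(poses, key=lambda p: p[1], reverse=True)

for rank, (name, score) in enumerate(ranked, start=1):
    print(f"{rank}. {name} (score={score:.2f})")
```

Real scoring functions such as ZRANK compute the score from physico-chemical terms rather than taking it as given; the ranking step itself, however, is exactly this kind of sort.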

Basu and Wallner (2016), in the paper "Finding correct protein–protein docking models using ProQDock", aimed to overcome the limitations of sampling and scoring and thereby enable the development of good-quality docking models. The scoring function is key to a docking model ranking correctly and accurately (Basu and Wallner, 2016). The paper presented a new algorithm, "ProQDock", which predicts the quality of docking models by combining structural information, scoring functions and predicted features using a support vector machine (Basu and Wallner, 2016). A support vector machine (SVM) is a supervised learning model, paired with a learning algorithm, that can analyse data for classification and regression analysis (Cortes and Vapnik, 1995). An SVM uses kernel methods, a class of algorithms for pattern analysis that find and study general types of relations in datasets, e.g. rankings and classifications (Cristianini and Shawe-Taylor, 2004).
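A kernel, as used by an SVM, measures the similarity of two feature vectors in an implicit high-dimensional space. A minimal sketch of the widely used Gaussian (RBF) kernel, with invented feature values standing in for docking-model descriptors:

```python
import numpy as np

# Toy feature vectors for two docking models (values invented for
# illustration), e.g. [interface area, shape complementarity, EC].
x = np.array([0.8, 0.6, 0.2])
y = np.array([0.7, 0.5, 0.3])

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: similarity decays with squared distance."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

k = rbf_kernel(x, y)
# Identical inputs give similarity 1.0; dissimilar inputs approach 0.
print(k)
```

An SVM trained with such a kernel can fit non-linear relations (here, between structural features and model quality) while operating only on these pairwise similarities.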

This method was developed to predict "DockQ", a continuous quality measure for protein–protein docking models that combines several established quality measures into a score between 0 and 1. ProQDock was trained on two combined benchmarks (CnM), CAPRI (Lensink and Wodak, 2014) and MOAL (Moal et al., 2013), containing predictions from different methods, which overcomes the small number of targets and models in either set alone (Basu and Wallner, 2016). Benchmark 5.0 was also used as an independent test set (Vreven et al., 2015). A benchmark, in this context, is the act of running a computer program to assess the performance of the subject (Fleming and Wallace, 1986). ProQDock was also trained with particular target functions, cross-validation test sets and, specifically, SVM training. The new algorithm's SVM results were then compared against the existing scoring functions ZRANK and ZRANK 2, both of which use the physico-chemical features of a protein to rescore established docking predictions (Pierce and Weng, 2007, 2008).

Whilst the ProQDock model is considered a success compared with existing models such as ZRANK, it can still be argued that the reliability of its evidence is open to doubt and that other measures could have been taken to make it a better algorithm.

Critical Analysis

The main findings of the paper were that ProQDock successfully detected correct docking models within a set containing incorrect ones, and that a hybrid of ProQDock and ZRANK (ProQDockZ) worked slightly better still. During training, ProQDock's constituent features individually showed little correlation with the DockQ values; however, when combined into subsets and then into the overall algorithm, performance increased to a correlation of 0.49 for ProQDock (see Figure 1). Additionally, at the 99% confidence level, ProQDock had a higher correlation with DockQ on all four datasets than ZRANK and ZRANK 2. ProQDockZ, the hybrid model, gave a slightly higher correlation still, highlighting that combining the different scoring functions gives better performance. Figure 2a shows that the ranking ability of ProQDock and ZRANK 2 on the cross-validated data (CnM set) is similar, with the hybrid method being slightly better overall.

Computational docking relies on the efficient sampling and scoring of possible poses of protein–protein interaction. A good scoring system allows ligands to be ranked reliably by their docked scores, correlated against benchmarks; however, this is proving increasingly challenging. The main drawback of docking models is that, as complexity increases, a more developed scoring function is required to distinguish right models from wrong ones. Even though ProQDock has been shown to improve the scoring system as a whole, justifying the paper's conclusion, the paper does not consider the expense of developing a good scoring function for a docking model. Mohan et al. (2005) found that when a scoring function's complexity is reduced, accuracy is lost at the same time. This means that even though the new algorithm may rank docking models better, it may still be doubted because of the sources of its information. Such models and programs tend to focus on certain areas rather than taking everything into account, reducing their accuracy.

One of the key strengths of the ProQDock research is its ability to combine the new scoring method with an existing one, ZRANK, to produce a hybrid that is more successful overall. The results consistently show that correlations are higher for ProQDockZ, but only by a marginal amount: in the CnM performance test, for example, there is a difference of 0.1 in the AUC values (see Figure 2) (Basu and Wallner, 2016). This suggests that the majority of the performance is down to the new algorithm, and that ZRANK's contribution has been overstated. That said, Basu and Wallner (2016) found notable complementarity between ZRANK and ProQDock: ZRANK can pick up correct models with low electrostatic complementarity at the interface (EC) that ProQDock misses. Combining the two models may therefore counteract the conflicting disadvantages of either one, contributing to the slight increase in performance.

Additionally, Basu and Wallner (2016) rely heavily on correlational data, using Pearson's coefficient as the statistical test. Correlational research lacks the control of experimental research: it cannot confidently explain why a relationship exists, so causation cannot be determined (Filipowich, 2018). It is correct to conclude that ProQDock correlates positively with DockQ, given its highest score of 0.49 across the datasets (see Figure 3); however, its performance could have been driven by a confounding factor. Over-reliance on correlational data may therefore make the conclusions less reliable.
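For reference, Pearson's coefficient measures the strength of a linear relationship between two sets of values. A minimal implementation, applied to invented score lists (not data from the paper):

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

predicted = [0.1, 0.4, 0.35, 0.8]   # e.g. ProQDock-style scores (invented)
observed  = [0.2, 0.5, 0.30, 0.9]   # e.g. DockQ-style values (invented)
print(round(pearson_r(predicted, observed), 3))
```

A value near 1 indicates a strong positive linear relationship; crucially, as noted above, even a strong r says nothing about what causes the relationship.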

The use of machine learning with different target functions, such as electrostatic complementarity (EC), the accessibility score (rGb) and shape complementarity (Sc), has been effective in separating correct, near-native poses of proteins from doubtful or incorrect ones (Williams, 2018). Basu and Wallner (2016) attempted to consider all the key features that affect protein structure and interaction when developing ProQDock. There are five key features affecting the former, including hydrogen bonds and disulphide bridges. Nonetheless, the research failed to mention or test hydrogen bonds at protein interfaces. Lo Conte et al. (1999) reported that water molecules contribute to the assembly and interaction of proteins by providing polar interactions, even though the hydrophobic effect remains prevalent. While computational design of models for protein–protein interaction is important, it does not always include all contributing factors, casting doubt on ProQDock's precision.

Future Experiments

One experiment that could be carried out to further investigate model ranking is to test ProQDock against other existing datasets of models and targets and then evaluate it. The aim would be to probe ProQDock's adaptability to different datasets and benchmarks; the paper limits itself to only four datasets, making its applicability somewhat partial. Using five-fold cross-validation and the same strict BLASTClust criteria (Altschul et al., 1990), with no homologous proteins shared between training and test sets, the methodology could simply replace the CAPRI or MOAL test set with another, such as the "Score_set" (Lensink and Wodak, 2014), an expanded benchmark set intended for testing scoring functions. Five-fold cross-validation, the statistical approach used in the paper (also known as k-fold cross-validation), splits the data into equal-sized subsets that are tested against each other. This would make it possible to elaborate on ProQDock's success and compare the resulting correlations against the hybrid model as well as ZRANK and ZRANK 2. The same method could then be applied to many different targets and models, increasing the number of results from which to draw a valid conclusion.
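The k-fold splitting described above can be sketched as an index-partitioning routine. This is a generic illustration (a toy dataset of 20 models identified by index, with k = 5), not the paper's actual pipeline:

```python
def k_fold_splits(n_items, k=5):
    """Partition item indices into k equal folds; each round holds one
    fold out for testing and trains on the remaining k-1 folds."""
    indices = list(range(n_items))
    fold_size = n_items // k
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_splits(20, k=5)
# With 20 models, each round trains on 16 and tests on the held-out 4.
for train, test in splits:
    assert len(train) == 16 and len(test) == 4
```

In practice the folds would also respect the homology constraint mentioned above, so that no protein in a test fold has a homologue in the training folds.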

Further research could involve substituting other quality measures. DockQ is a quality measure optimised to reproduce the CAPRI classification, combining Fnat, LRMS and iRMS. It is similar in design to the Interface Similarity (IS) score (Gao and Skolnick, 2011). It would therefore be of interest to evaluate ProQDock against other quality measures, such as the IS score, using both to assess targets from a dataset and comparing their effectiveness. The best test would be a large-scale benchmark analysis to compare and validate the individual scoring schemes, using the CAPRI and MOAL datasets for a confident comparison. This could highlight differences and similarities in what each quality measure is good at assessing in docking models. The IS score has been shown to be reasonably consistent with the official CAPRI assessment, so it would provide a fair comparison (Gao and Skolnick, 2011). Using another pre-existing quality measure also removes the possibility of research bias arising from DockQ being developed by the same researchers. The results could lead to further development of ProQDock by continually assessing and improving its training features and quality measures.
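To make the comparison concrete, DockQ combines Fnat with the two RMSD terms by rescaling each RMSD onto [0, 1] and averaging the three terms. A sketch based on my reading of the published definition, with the scaling constants 8.5 Å (LRMS) and 1.5 Å (iRMS) as reported by Basu and Wallner:

```python
def dockq(fnat, lrms, irms):
    """DockQ-style quality score in [0, 1]: the average of Fnat and two
    RMSD terms rescaled by 1 / (1 + (rms / d)**2).
    Scaling constants (8.5 A, 1.5 A) follow the published description."""
    def scaled(rms, d):
        return 1.0 / (1.0 + (rms / d) ** 2)
    return (fnat + scaled(lrms, 8.5) + scaled(irms, 1.5)) / 3.0

# A perfect model (all native contacts recovered, zero RMSD) scores 1.0;
# the score decreases smoothly as either RMSD grows or Fnat falls.
print(dockq(1.0, 0.0, 0.0))
```

The continuous [0, 1] range is what makes DockQ (and, analogously, the IS score) usable as a regression target for an SVM, rather than only as a discrete CAPRI class label.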

Aside from assessing docking models, the data obtained from ProQDock could be used to study the proteins in the laboratory, tested in a more realistic setting. The highest-ranked models from the original research would be the best candidates to carry forward. Experimental work could reinforce the computational results and take into account confounding factors, e.g. environmental ones, that could be affecting protein structure and interaction. X-ray crystallography or nuclear magnetic resonance (NMR) could provide direction and room for improvement in the model (Vakser, 2014). The protein used could then be refined and evaluated by benchmarking, checking that what has been proposed is statistically correct or similar. Kershaw et al. (2013) proposed a similar approach, screening a protein to detect protein–ligand interactions and then investigating further with computational modelling to assist refinement. The results of such an experiment would give a greater understanding of protein interactions by incorporating factors that may have been missed in ProQDock's highly ranked models. This, in turn, would help the development of the algorithms themselves, leading to better scoring models that capture what they are supposed to look for.

Conclusion

After analysing Basu and Wallner's (2016) work, it is evident that the premise of their data is reliable, notwithstanding the particular improvements that could be made. The research proves very useful in "finding correct protein–protein docking models", and the data may prove valuable when applied to potential drug targets (e.g. enzymes, antibodies) in the pharmaceutical industry, as it clarifies how and why certain molecules interact. There is still room for improvement in the field, but that will come with time and further development of the computational models used to tackle these problems.

Source: Essay Sauce, Improve Protein-Protein Docking Models with ProQDock: Evaluating Performance and Correlation. Available from: <https://www.essaysauce.com/sample-essays/2018-12-14-1544749641/> [Accessed 15-04-26].