Protein–ligand interactions are one of the fundamental types of molecular interactions in living systems. Ligands are small molecules that interact with protein molecules at specific regions on their surfaces called binding sites. Binding sites also determine the ADMET properties of a drug molecule. Tasks such as assessment of protein functional similarity and detection of side effects of drugs need identification of similar binding sites of disparate proteins across diverse pathways. To this end, methods for computing similarities between binding sites are still evolving and remain an active area of research. Machine learning methods for similarity assessment require feature descriptors of binding sites. Traditional methods based on hand-engineered motifs and atomic configurations do not scale across several thousands of sites. In this regard, deep neural network algorithms are now deployed, which can capture a very complex input feature space. However, one fundamental challenge in applying deep learning to structures of binding sites is the input representation and the reference frame. We report here a novel algorithm, Site2Vec, that derives a reference-frame-invariant vector embedding of a protein–ligand binding site. The method is based on pairwise distances between representative points and on the chemical composition of a site in terms of its constituent amino acids. The vector embedding serves as a locality-sensitive hash function for proximity queries and for determining similar sites. The method has been the top performer, with more than 95% quality scores, in extensive benchmarking studies carried out over 10 data sets and against 23 other site comparison methods in the field. The algorithm supports high-throughput processing and has been evaluated for stability with respect to reference frame shifts, coordinate perturbations and residue mutations. We also provide the method as a standalone executable and a web service hosted at ().

A protein performs its function by interacting with ligands and other small molecules. Protein–ligand interaction plays an important role in biological systems. A binding site is a concavity on the surface of a protein, composed of several amino acids whose side chains interact with a ligand molecule. Electrostatic and chemical complementarity, aromatic stacking and other forms of interaction between the atoms of a concavity and a ligand molecule result in chemical binding. A single protein may have a dozen concavities, some of which may be functional, resulting in moonlighting behaviour. A major reason for functional similarity between proteins is the similarity of their binding sites and hence of their interaction patterns with ligand molecules. Binding site comparison is one of the major methods in the field of structural bioinformatics and drug discovery.

A main source for the structure of a binding site is the structure of its protein, given as the 3D coordinates of all of its thousands of atoms. The structure of a protein is determined through a variety of means, including experimental techniques such as X-ray crystallography or nuclear magnetic resonance, and analytical methods such as sequence or structural alignment to known structures. Today about 167 000 high-quality protein structures are available in the Protein Data Bank. This volume of data has enabled the use of machine learning for comparison of protein–ligand binding sites. Traditional machine learning methodologies of the last two decades require careful feature engineering and sometimes involve hand-crafted features and atomic motifs. However, to scale to thousands of features originating in data such as medical texts, protein function, pathway information and other sources, deep learning methodologies have come into the picture.

There are two types of site comparison methods: alignment-dependent methods and alignment-free methods. Alignment-dependent methods provide detailed information on the atomic mapping of protein structures, whereas alignment-free methods output a similarity score. Several methods have been proposed to date for binding site comparison. A brief description of these methods is provided in Table 1.

Table 1. Brief descriptions of methods for binding site comparison.

Binding sites are aligned by matching residue groups between sites and then carrying out site alignment.

A geometric hashing based comparison technique that hashes pairs of residues, taken from the input structures, into a 3D hash table based on residue types and distances from geometric centers. Similarity is assessed through alignment of hash tables.

A binding site is encapsulated in a 3D voxel grid, which is passed through a convolutional neural network to map it to a vector descriptor. The method falls under supervised learning and requires categories of sites as input.

A sequence-order independent algorithm for binding site comparison.
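To make the idea of a reference-frame-invariant descriptor concrete, the following is a minimal sketch of a vector built from pairwise distances between residue representative points and from amino acid composition. It is not the Site2Vec implementation: the function names, the choice of a distance histogram, the bin width and the toy coordinates are all assumptions made for illustration. Because only inter-point distances and residue counts enter the vector, rotating or translating the site should leave it unchanged up to floating-point error.

```python
import math
from itertools import combinations

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_descriptor(residues, bin_width=4.0, n_bins=6):
    """Hypothetical descriptor, not the published Site2Vec embedding.

    residues: list of (one_letter_code, (x, y, z)) representative points.
    Returns a list of length n_bins + 20: a normalised histogram of
    pairwise distances followed by amino acid composition fractions.
    """
    hist = [0.0] * n_bins
    for (_, p), (_, q) in combinations(residues, 2):
        d = math.dist(p, q)
        # Distances beyond the last bin edge are clamped into the last bin.
        hist[min(int(d // bin_width), n_bins - 1)] += 1.0
    n_pairs = sum(hist) or 1.0
    hist = [h / n_pairs for h in hist]

    comp = [0.0] * len(AMINO_ACIDS)
    for code, _ in residues:
        idx = AMINO_ACIDS.find(code)
        if idx >= 0:  # non-standard residue codes are ignored
            comp[idx] += 1.0
    n_res = len(residues) or 1
    comp = [c / n_res for c in comp]
    return hist + comp


def rotate_z_and_shift(point, angle, shift):
    """Rotate a point about the z-axis by `angle` radians, then translate."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])


# Toy site with made-up coordinates (in angstroms).
toy_site = [
    ("H", (0.0, 0.0, 0.0)),
    ("D", (3.0, 0.0, 0.0)),
    ("S", (0.0, 5.0, 0.0)),
    ("G", (1.0, 1.0, 7.0)),
]
moved_site = [(c, rotate_z_and_shift(p, 1.0, (10.0, -3.0, 2.5))) for c, p in toy_site]

original_vec = site_descriptor(toy_site)
moved_vec = site_descriptor(moved_site)
print(math.dist(original_vec, moved_vec))  # distance between the two descriptors
```

A hard-binned histogram is only one way to summarise pairwise distances; a distance lying very close to a bin edge can switch bins under small coordinate perturbations, so a smoother encoding may be preferable in practice.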