Characterization and prediction of B-cell epitopes
Date of Issue: 2013
School of Computer Engineering
Bioinformatics Research Centre
A B-cell epitope is a set of antigen surface residues that can be recognized by an antibody. Identifying epitopes facilitates the understanding of the basic recognition mechanism of immune responses, which in turn guides disease diagnosis, vaccine design and drug development. However, the identification of epitopes is challenging due to the complicated nature of antigen-antibody interactions as well as the context-awareness principle behind the interaction. Context-awareness highlights the existence of multiple epitopes in antigens and the reconfiguration of epitope residues when an antigen interacts with a different antibody. A coarse binary classification of antigen regions into epitopes or non-epitopes without specifying antibodies may not accurately reflect this biological reality. To accurately and rationally detect the epitopes of an antigen, a comprehensive analysis of multiple epitopes is carried out. This is followed by antibody-specific epitope prediction, in line with the principle of context-awareness, and by antibody-agnostic epitope prediction, which is capable of predicting one or multiple epitopes that are consistent with the rule of context-dependence. A multi-interface domain is one that can present multiple distinct binding sites to contact many other domains, forming a hub in domain-domain interaction networks. Graph theory and algorithms are applied to discover fingerprints of interfaces, explore relations between interfaces, and establish associations between the interfaces and the functions of multi-interface domains retrieved from the PDB. Experimental results show that about 40% of proteins have multiple interfaces; however, the involved multi-interface domains account for only a tiny fraction (1.8%) of the total number of domains. The interfaces of these multi-interface domains are distinguishable in terms of their fingerprints, indicating the functional specificity of the multiple interfaces in a domain.
Furthermore, both cooperative and distinctive structural patterns are observed in the interfaces of multi-interface domains. Based on the fact that multiple interfaces exist in antigens and that these interfaces associate with different antibodies, a two-dimensional association-based model is established to predict antibody-specific epitopes. The two kinds of associations that reveal context-awareness are: (i) residue-residue pairing preference, and (ii) the dependence between sets of contacting residue pairs. Preference plays a bridging role in linking interacting paratope and epitope residues, while dependence is used to infer new interacting residue pairs. Experiments conducted on a non-redundant data set containing 80 antibody-antigen structural complexes show that the proposed model yields good performance in antibody-specific epitope prediction. In addition, this model predicts antibody-specific epitopes from antigen-antibody sequences, although it is trained on antigen-antibody structural complexes, indicating its broad applicability in epitope prediction. The two-dimensional association can capture the context-awareness of paratope-epitope interacting complexes, but it cannot cover the contacts within a paratope or an epitope. Thus, a new concept, the coupling graph, is introduced to include both inter-protein contacts between a paratope and an epitope and intra-protein contacts within a paratope or an epitope. The coupling graph is a two-layered graph in which each node in one layer connects with nodes in the other layer. The coupling graph can represent the context-awareness principle well; however, it is very challenging to mine the frequent coupling subgraphs that are used to reveal context-awareness. Therefore, a new algorithm for coupling graph mining, based on graph transformation, has been designed.
Experiments show that the new algorithm significantly reduces the time cost and memory consumption of coupling graph mining, and that its application to antibody-specific epitope prediction outperforms the association-based model. A novel graph-based antibody-agnostic epitope prediction model is built to predict one or multiple epitopes of an antigen. It overcomes the problem that existing models predict all the antigenic residues of an antigen as a single epitope, although these antigenic residues may belong to totally different epitopes. This model divides an antigen surface graph into subgraphs by using the Markov Clustering algorithm, and then a classifier is constructed to distinguish these subgraphs as epitopes or non-epitopes. The classifier is then used to predict epitopes for a test antigen. On a large data set of 92 antigen-antibody PDB complexes, the proposed method significantly outperforms the state-of-the-art epitope prediction methods, achieving a 24.7% higher average F-score than the best existing model. In particular, this model performs equally well on protrusive epitopes and planar epitopes which are hardly addressed by existing models. Furthermore, it can also detect multiple epitopes whenever they exist. In summary, we have comprehensively analyzed the properties of multi-interface domains from both structural and functional perspectives, and have built both antibody-specific and antibody-agnostic epitope prediction models which are consistent with the principle of context-awareness. We observe that multi-interface proteins are ubiquitous, consolidating the principle of context-awareness. Both the association-based and the coupling graph-based antibody-specific epitope prediction models are effective, and the graph-based antibody-agnostic epitope prediction model significantly improves prediction performance by identifying one or multiple epitopes.
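As an illustration of the graph-partitioning step mentioned in the abstract, the following is a minimal sketch of the standard Markov Clustering procedure (alternating expansion and inflation of a column-stochastic matrix) applied to a small undirected graph. It is not the thesis's implementation: the abstract does not say how the antigen surface graph is built, how its edges are weighted, or which parameters are used. The function names, the inflation value of 2.0, and the toy graphs are all assumptions made for this example, and nodes merely stand in for surface residues.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    adjacency = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        adjacency[i][j] = 1.0
        adjacency[j][i] = 1.0
    return adjacency


def normalize_columns(matrix):
    """Rescale every column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_clustering(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Cluster the nodes of a graph with non-negative edge weights.

    inflation, max_iterations and tol are illustrative defaults, not
    values taken from the thesis.
    """
    n = len(adjacency)
    # Self-loops keep every column sum positive and stabilise the iteration.
    matrix = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    matrix = normalize_columns(matrix)
    for _ in range(max_iterations):
        expanded = matmul(matrix, matrix)                 # expansion step
        inflated = normalize_columns(                     # inflation step
            [[value ** inflation for value in row] for row in expanded])
        delta = max(abs(inflated[i][j] - matrix[i][j])
                    for i in range(n) for j in range(n))
        matrix = inflated
        if delta < tol:
            break
    # Each distinct non-empty row support is read off as one cluster.
    clusters = []
    for i in range(n):
        members = frozenset(j for j in range(n) if matrix[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Toy "surface graph": two triangles of residues joined by one bridging edge.
bridged = adjacency_from_edges(
    6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
print(markov_clustering(bridged))
```

In a pipeline of the kind the abstract outlines, each resulting node group would then be passed to a classifier that labels it as epitope or non-epitope; that classifier and its features are not specified in the abstract and are not sketched here.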
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences