
Substrate prediction for enzyme Tm0936

A pressing challenge in biology is predicting the function of the proteins that genes encode. It would be useful to be able to probe for such function directly, based on protein structures. In an effort to do so, Johannes Hermann in the Shoichet Lab attempted to predict substrates of the enzyme Tm0936 from Thermotoga maritima. The x-ray structure of this enzyme had been determined as part of a structural genomics effort (PDB codes 1p1m and 1j6p), and it can be assigned to the Amidohydrolase Superfamily (AHS) by fold classification and by the identity of certain active site groups. These metalloenzymes are united by the attack of a nucleophilic hydroxide or water on an electrophilic center; beyond this, their mechanisms are quite diverse. Consequently, the substrates for Tm0936 were anything but clear. By sequence similarity, Tm0936 most resembles the large chlorohydrolase and cytosine deaminase subgroup, which is often used to annotate amidohydrolases of unknown function. However, Raushel's group had tested 14 cytosine derivatives as Tm0936 substrates and observed no turnover.

Site & Ligand Innovations

In an effort to find the true substrate, Johannes therefore docked a database of high-energy intermediates into the structure of Tm0936, sampling thousands of configurations and conformations of each molecule. Each of these was scored by electrostatic and van der Waals complementarity, corrected for ligand desolvation energy, and ranked accordingly. There were two important innovations here: first, the decision to restrict ourselves to the KEGG metabolites database of about 10,000 possible molecules, and second, the decision to dock these in their high-energy intermediate forms rather than in their ground-state structures. We reasoned that we were more likely to find substrates among primary metabolites than in any other source of compounds, which dramatically reduced the search space. We further reasoned that the enzyme was pre-organized to recognize the excited, high-energy intermediate form of the substrate. Docking that form would tell us which reaction was being performed and would also improve the fidelity of the modeled interactions.
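The scoring-and-ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the docking program's actual implementation: the `Pose` class, the `rank_molecules` function, the sign convention (desolvation entered as a positive penalty added to the interaction energy), and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docked pose of a molecule (hypothetical container, energies in kcal/mol)."""
    molecule: str
    electrostatic: float        # more negative means better complementarity
    van_der_waals: float        # more negative means better complementarity
    desolvation_penalty: float  # assumed >= 0, cost of stripping water from the ligand

    @property
    def score(self) -> float:
        # Assumed form: interaction energy corrected by the desolvation cost.
        return self.electrostatic + self.van_der_waals + self.desolvation_penalty


def rank_molecules(poses):
    """Keep the best-scoring pose per molecule, then rank molecules by that score.

    Returns (rank, molecule, score relative to the top-ranked molecule) tuples.
    """
    best = {}
    for pose in poses:
        current = best.get(pose.molecule)
        if current is None or pose.score < current.score:
            best[pose.molecule] = pose
    ranked = sorted(best.values(), key=lambda p: p.score)
    if not ranked:
        return []
    top = ranked[0].score
    return [(i + 1, p.molecule, p.score - top) for i, p in enumerate(ranked)]


# Made-up poses for two placeholder molecules.
example_poses = [
    Pose("mol_a", -30.0, -20.0, 8.0),
    Pose("mol_a", -25.0, -25.0, 5.0),
    Pose("mol_b", -40.0, -10.0, 12.0),
]
for rank, name, relative in rank_molecules(example_poses):
    print(rank, name, relative)
```

Reporting each score relative to the top-ranked molecule mirrors the "relative docking score" column in the results table, where the best hit is assigned zero.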

Substrate tested             Dock rank   Relative docking score   Km (µM)     kcat (s^-1)   kcat/Km (M^-1 s^-1)
                                         (kcal mol^-1)
S-adenosyl-homocysteine      5           0                        210 ± 40    12.2 ± 0.8    5.8 x 10^4
5-Methyl-thioadenosine       6           4.4                      44 ± 4      7.2 ± 0.2     1.4 x 10^5
Adenosine                    14          9.5                      250 ± 40    2.3 ± 0.2     9.2 x 10^3
Adenosine-5-monophosphate    80          20.2                     nd          < 10^-3       nd
S-adenosyl-l-methionine      511         35.2                     nd          < 10^-3       nd
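As a quick check on the units in the table above, the catalytic efficiency kcat/Km can be recomputed from the kcat and Km columns once Km is converted from µM to M. The sketch below is only arithmetic on the rounded table entries; the function name `catalytic_efficiency` is introduced here for illustration.

```python
def catalytic_efficiency(kcat_per_s: float, km_micromolar: float) -> float:
    """Return kcat/Km in M^-1 s^-1, given kcat in s^-1 and Km in micromolar."""
    km_molar = km_micromolar * 1e-6  # 1 µM = 1e-6 M
    return kcat_per_s / km_molar


# (kcat in s^-1, Km in µM) as listed in the table for the three active substrates.
table_values = {
    "SAH": (12.2, 210.0),
    "MTA": (7.2, 44.0),
    "adenosine": (2.3, 250.0),
}

for name, (kcat, km) in table_values.items():
    print(f"{name}: {catalytic_efficiency(kcat, km):.1e} M^-1 s^-1")
```

For SAH and adenosine the ratios reproduce the tabulated 5.8 x 10^4 and 9.2 x 10^3. For MTA the rounded entries give roughly 1.6 x 10^5, somewhat above the reported 1.4 x 10^5; the page does not say how the reported ratio was obtained, so the tabulated value is left as reported.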


In the docking calculation, 4,207 metabolites bearing functionality known to be recognized by AHS enzymes were docked into the structure of Tm0936. Since we could not assume stereochemistry, and because multiple protomers and tautomers had to be considered, the number of molecular structures docked was over 22,000, each in several thousand low-energy conformations. The best-ranked molecules were dominated by adenine and adenosine analogs, which make up 9 of the 10 top-scoring docking hits. For all of these, an exocyclic nitrogen has been transformed into a tetrahedral, high-energy center, as would occur in a deamination reaction. The dominance of adenine and adenosine analogs in this form owes to their nearly ideal interactions with the active site. An example is the docked structure of the high-energy intermediate for the deamination of 5-methyl-thioadenosine (MTA), the 6th-ranked molecule (figure 1).
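The growth from a few thousand metabolites to over 22,000 docked structures comes from enumerating the forms each metabolite could take. A minimal sketch of that bookkeeping follows, assuming for simplicity that stereoisomers, protomers, and tautomers vary independently (which real chemistry does not guarantee). The function `enumerate_forms` and the labels are hypothetical, not the tooling actually used.

```python
from itertools import product


def enumerate_forms(name, stereoisomers, protomers, tautomers):
    """List every (name, stereoisomer, protomer, tautomer) combination to dock.

    Assumes the three kinds of variation are independent, which is a
    simplification made only for this illustration.
    """
    return [
        (name, s, p, t)
        for s, p, t in product(stereoisomers, protomers, tautomers)
    ]


# A placeholder metabolite with two options of each kind gives 2 * 2 * 2 = 8 structures.
forms = enumerate_forms(
    "example_metabolite",
    stereoisomers=["R", "S"],
    protomers=["neutral", "protonated"],
    tautomers=["amino", "imino"],
)
print(len(forms))

# Using the counts quoted with the docking ranks: at least 22,000 structures
# for 4,207 metabolites is a little over five forms per metabolite on average.
print(round(22000 / 4207, 1))
```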

Based on the docking ranks and compound availability, we selected four potential substrates for deamination by Tm0936: S-adenosyl-homocysteine (SAH), MTA, adenosine and adenosine monophosphate (AMP). All four scored well (5th, 6th, 14th and 80th, respectively, out of 4,207 docked metabolites), would undergo the same reaction, and chemically resembled one another. By extension, we also investigated the well-known metabolite S-adenosyl-l-methionine (SAM), a close analog of SAH, even though its docking rank, at 511th, was poor.

These five molecules were tested as substrates in the lab of Frank Raushel, and three had substantial activity, with MTA and SAH reaching kcat/Km values of 1.4 x 10^5 and 5.8 x 10^4 M^-1 s^-1 respectively, and adenosine close to 10^4 M^-1 s^-1. Tm0936 is relatively active compared to other adenosine deaminases, especially since the optimal temperature for this thermophilic enzyme is almost certainly higher than the 30°C at which it was assayed. Consistent with the docking predictions, SAM was not deaminated by Tm0936, despite its close similarity to SAH. Conversely, AMP, which did rank relatively well (80th of 4,207), was also not a substrate. The inability of the docking program to fully deprioritize AMP reflects some of the well-known problems in docking scoring functions, in this case balancing ionic interactions against desolvation penalties for the highly charged phosphate group of AMP.

To further investigate the mechanism, the structure of Tm0936 in complex with the purified product of the SAH deamination reaction, S-inosylhomocysteine (SIH), was determined to 2.1 Å resolution by x-ray crystallography by Sasha Fedorov in Steve Almo's lab (figure 1). The differences between the docked prediction and the crystallographic result are inconsequential, with every key polar and non-polar interaction represented in both structures. Indeed, the correspondence between the docked and crystallographic structures is closer than one might expect for inhibitor predictions, where docking has been more commonly used. This may reflect the advantages of docking substrates in high-energy intermediate geometries, which encode more of the information necessary to specify fit.

Lessons & Caveats

This work describes one successful function prediction by docking, and it would be imprudent to assume that the technique will always be reliable to this end. Raushel's recognition of Tm0936 as an amidohydrolase limited the number of possible reactions to be considered. When even the gross mechanistic details of an enzyme cannot be inferred, such a restriction will not be possible. Restricting ourselves to metabolites was also helpful, but this too will not always be appropriate. Finally, we were fortunate that Tm0936 underwent little conformational change on substrate binding. Enzymes that undergo large conformational changes along their reaction coordinates will be more challenging for docking. Notwithstanding these points, this work does suggest that molecular docking is a useful tool for substrate prediction, one that will find increasing use as the structures of more proteins of unknown function are determined.


The paper describing this work is: JC Hermann, R Marti-Arbona, AA Fedorov, E Fedorov, SC Almo, BK Shoichet, FM Raushel. Structure-based activity prediction for an enzyme of unknown function. Nature, Aug 16 (2007).
