Synthesis at the Interface of Chemistry and Biology

Overview
The focus of our research is on synthesis of molecules and molecular assemblies with novel physical, chemical or biological properties and functions. Although chemists are quite sophisticated in their ability to synthesize complex molecular architectures, our ability to rationally design and synthesize molecules with a desired molecular function is still in its infancy. Nature, on the other hand, has generated a vast array of molecules with remarkable properties – from the antibody molecule (molecular recognition) and enzymes (catalysis) to the photosynthetic center (energy harvesting). Given Nature’s “synthetic” prowess, we have undertaken a biologically inspired approach to synthesis in which the molecules and processes of living organisms are combined with the principles and tools of chemistry to create molecules with new functions difficult to generate by either approach alone. By studying the properties of the resulting molecules, new insights are gained into the molecular mechanisms of complex biological and chemical systems.
This theme runs through all of the projects in the lab. Current efforts focus on: (1) the generation of catalytic antibodies and the characterization of their mechanisms and immunological evolution; (2) the development and application of general methods to expand the genetic codes of living organisms; (3) the application of combinatorial methods to the generation of small molecules, proteins, nucleic acids, and even solid state materials with novel properties; and (4) the application of chemical and genomics tools (including the use of arrayed small molecule, siRNA and cDNA libraries in cellular pathway and phenotypic screens, forward mouse genetics screens, and protein/mRNA expression profiling) to better understand and ultimately control (both in vitro and in vivo) the biological processes involved in stem cell self-renewal and differentiation, cellular reprogramming, oncogenesis and degenerative diseases.
Catalytic Antibodies
One example of the synergy between chemistry and biology in the synthesis of new molecular function is the development and application of combinatorial strategies. This approach, in which large, diverse collections or “libraries” of molecules are generated and subsequently screened or selected for novel functions, stems from combinatorial processes in nature. For example, the humoral immune system has developed highly sophisticated combinatorial mechanisms for generating large libraries of antibodies and selecting from this diversity those antibodies that recognize foreign antigens with high affinity and selectivity. The notion that natural immunological diversity can be used to generate novel chemical function was first illustrated with the generation of catalytic antibodies. The early experiments involved the generation of esterolytic antibodies using phosphonate/phosphate transition state analogues. Since those experiments, antibodies have been developed that catalyze a wide array of chemical and biological reactions, from acyl transfer to water oxidation, with specificities rivaling or exceeding those of enzymes. In a number of cases, antibodies have been found to have rates and mechanisms comparable to those of known enzymes (e.g., catalysts for pericyclic, acyl transfer, metallation, and aldol reactions). In addition to transition state stabilization, many other strategies have been developed to generate catalytic antibodies, including general base and covalent catalysis, proximity effects, and the use of strain. Most recently, we have focused on the development of novel chemical screens, and genetic and phage-based selections, for identifying mutants with enhanced catalytic function.

Figure 1. The structural plasticity of a germline antibody is fixed by somatic mutations during affinity maturation.
The characterization of catalytic antibodies is also providing insights into the mechanisms and evolution of catalytic function in nature. For example, structural comparisons of germline and affinity-matured antibodies are providing new insights into the molecular nature of the immune response, including the critical role of structural plasticity in determining the tremendous binding potential of the germline repertoire. This work underscores the importance of conformational flexibility in the ability of the immune system, and very likely primitive proteins, to evolve a large number of different binding and catalytic activities. These structures have also revealed important insights into the role of mutations removed from the active site in controlling active site structure and catalytic activity.
The recent detailed characterization of the immunological evolution, three-dimensional structures and mechanisms of antibodies that catalyze oxy-Cope, Diels-Alder and metallation reactions is helping to dissect the relationship between binding energy and catalysis in the evolution of catalytic function. For example, we have recently solved the structure of a “ferrochelatase” antibody Michaelis complex which reveals that the substrate is bound in a strained conformation, providing direct structural evidence for the theory of substrate strain proposed by Haldane over 70 years ago. Currently, we are applying phage-based screens and selections to evolve antibodies with increased aldolase, hydrolase and oxidase activity.
Other Applications of Molecular Diversity
The demonstration that the vast structural diversity of antibody molecules can be redirected with the proper chemical instruction to generate selective catalysts shows the synthetic power of using Nature’s molecular diversity (the antibody repertoire in this case) to produce new function. We have also applied this combinatorial approach to many other problems in chemistry and biology. For example, we have used diversity-based approaches to generate (1) sequence specific DNA binding molecules and site specific recombinases; (2) small molecules that act as molecular switches to regulate macromolecular interactions (growth factor-receptor and transcription factor-DNA interactions); (3) novel protein folds from random libraries of secondary structural elements; and (4) cellular sensors of RNA localization and cellular respiration. We have also extended diversity-based approaches to the generation of solid state materials with novel properties. This has involved the development of methods for the parallel synthesis, processing and screening of large libraries of solid state inorganic and organic materials (electronic, magnetic, optical, and catalytic) and even devices for new properties. Novel magnetoresistive, luminescent and ferroelectric materials have been identified using this approach. These experiments have enhanced our ability to mine the periodic table for new materials with novel properties. In other projects in the materials science area, we have used biomolecules to control the three dimensional structures of nanoclusters in order to further explore this novel form of matter. Most recently, we have synthesized DNAs containing metallo-base pairs and are exploring their ability to act as molecular wires.

Figure 2. A library of novel luminescent molecules generated by laser ablation of metal oxides through a series of physical masks.
Expanding the Genetic Code
The genetic codes of every known organism encode the same 20 amino acid building blocks using triplet codons generated from A, G, C and T. These twenty amino acids contain a limited number of functional groups including acids, amides, alcohols, basic amines and thiols. Clearly there is a need for additional functional groups in proteins, as evidenced by the large number of posttranslational modifications and cofactors, as well as the rare occurrence of the noncanonical amino acids selenocysteine and pyrrolysine. Why then are there only 20 genetically encoded amino acids? Is this the ideal number, or would additional amino acids allow the generation of proteins or even entire organisms with enhanced properties? The ability to introduce amino acids with precisely tailored steric and electronic properties into proteins would also allow us to carry out “physical organic” studies of proteins in much the same way as has historically been done with small molecules.
To this end, we have developed a methodology that allows one to genetically encode novel amino acids, beyond the common twenty, in prokaryotic and eukaryotic organisms. This methodology involves the generation of a unique codon-tRNA pair and corresponding aminoacyl-tRNA synthetase. Specifically, an orthogonal tRNA is constructed that is not a substrate for any natural aminoacyl-tRNA synthetase and which inserts its cognate amino acid in response to the amber nonsense codon. A cognate synthetase is then generated which recognizes this unique tRNA and no other; the substrate specificity of this synthetase is then evolved to recognize a desired “twenty-first” amino acid, and no endogenous amino acid. We have shown that this methodology can be used to efficiently incorporate a large number of amino acids into proteins in E. coli and yeast with fidelity and efficiency rivaling that of the common amino acids. Using this methodology, we have added a variety of novel amino acids to the genetic code of E. coli.
These include heavy atom containing amino acids to facilitate X-ray crystallographic studies; amino acids with novel steric/packing and electronic properties; photocrosslinking amino acids which can be used to probe protein-protein interactions in vitro or in vivo; keto, acetylene, azide, thioester and selenide containing amino acids which can be used to selectively introduce a large number of biophysical probes, tags, and novel chemical functional groups into proteins in vitro or in vivo, or to generate crosslinked or cyclic proteins and peptides; glycosylated amino acids which allow the selective glycosylation of proteins in vivo; redox active amino acids to probe and modulate electron transfer in proteins; photocaged and photoisomerizable amino acids to photoregulate biological processes; metal binding amino acids for catalysis, self-assembly and metal ion sensing; amino acids that contain fluorescent or IR active side chains to probe protein structure and dynamics in vitro and in vivo; and sulfated amino acids and mimetics of phosphorylated amino acids as probes of protein posttranslational modifications.
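The logic of reassigning the amber nonsense codon can be illustrated with a toy translation routine. This is only a minimal sketch and not a model of the actual selection system: the function name `translate`, the example gene, and the one-letter symbol "X" for the novel amino acid are hypothetical choices made for illustration. Without the orthogonal tRNA/synthetase pair, the amber codon (TAG) terminates translation; with the pair present, the same codon is read as the "twenty-first" amino acid.

```python
from itertools import product

# Standard genetic code, written in the DNA alphabet (T instead of U).
# Codons are enumerated in T, C, A, G order at each position; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

AMBER = "TAG"  # the amber nonsense codon


def translate(dna, suppressor=None):
    """Translate a coding sequence, reading codons left to right.

    If `suppressor` is given (a one-letter symbol for the novel amino
    acid; hypothetical notation), the amber codon is decoded as that
    amino acid instead of terminating translation.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == AMBER and suppressor is not None:
            protein.append(suppressor)
            continue
        aa = STANDARD_CODE[codon]
        if aa == "*":  # ochre, opal, or unsuppressed amber: stop
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical gene: Met-Ala-(amber)-Lys-(ochre stop)
gene = "ATGGCTTAGAAATAA"
truncated = translate(gene)                     # amber read as a stop codon
full_length = translate(gene, suppressor="X")   # amber read as the novel amino acid
```

In this sketch the other two stop codons still terminate translation, which reflects why a nonsense codon is a convenient "blank" codon to reassign: only one of three stop signals is repurposed.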

Figure 3. Scheme for evolving aminoacyl tRNA synthetases with novel specificities.
We have also shown that one can “synthesize” a completely autonomous bacterium that not only genetically encodes a novel amino acid, but also biosynthesizes this amino acid from basic carbon sources. This is the first example of the creation of a twenty-one amino acid organism, and allows us to explore its ability to evolve under a variety of growth conditions in comparison to a twenty amino acid bacterium. We have also generalized this methodology to yeast and mammalian cells, and are currently developing strategies to apply this methodology to multicellular organisms. A consensus-based approach has also been developed for generating new orthogonal tRNA-synthetase pairs and has been used (together with a novel selection scheme to identify efficient four base codon decoding tRNAs) to genetically encode novel amino acids in bacteria in response to four base codons. In addition, we are attempting to remove redundancy in the existing genetic code of E. coli to encode additional amino acids. Finally, we are applying this methodology to studies of protein structure and function in vitro and in vivo, as well as the evolution of proteins with novel properties, including therapeutic peptides (cyclic peptides, oligomerized peptides, etc.), proteins (antibody libraries containing unnatural amino acids, modified and bivalent antibodies, etc.) and vaccines (containing immunogenic amino acids to break tolerance).

Figure 4. Novel amino acids that have been or are being genetically encoded in prokaryotic and eukaryotic organisms.
In collaboration with Floyd Romesberg, we have also extended the above ideas to the genetic lexicon by asking whether the Watson-Crick A-T and G-C pairs can be augmented (ultimately in vivo) with unnatural base pairs. We have developed a series of hydrophobic and metallo-base pairs that are as stable and selective as A-T and G-C, and have used polymerases to enzymatically incorporate a novel base pair into DNA. This work suggests that Watson-Crick hydrogen bonding may not be a unique solution to the problem of information storage and replication. We are also evolving a DNA polymerase (using “catalytic phage selection methods”) that incorporates unnatural base pairs into DNA with good fidelity.

Figure 5. A metallo-base pair.
Experiments in Functional Genomics
We have recently begun to use “rationally designed” chemical libraries together with phenotypic and pathway based screens to identify and characterize small molecules with novel biological activities. Chemistries have been developed to efficiently synthesize combinatorial libraries of over 100,000 heterocyclic compounds designed around a large number of kinase-directed molecular scaffolds, including substituted purines, pyrimidines, quinazolines, pyrazines, pyrrolopyrimidines, pyrazolopyrimidines, phthalazines, pyridazines, and quinoxalines.

Figure 6. A highly efficient strategy for the synthesis of combinatorial libraries of heterocyclic compounds.
These libraries are being screened for molecules that control stem cell fate and self-renewal (embryonic and adult), as well as molecules that induce reprogramming of lineage committed cells. For example, molecules have been identified that: (1) selectively induce neurogenesis or oligodendrocyte formation in murine adult neural stem cells and embryonic stem cells (ESCs); (2) efficiently induce cardiomyogenesis in murine ESCs (mESCs) and selectively differentiate mESCs to germ cells; (3) selectively induce osteogenesis in mesenchymal stem cells; (4) allow embryonic stem cells to be propagated in the absence of feeder cells or LIF; (5) induce differentiation of hematopoietic stem cells to megakaryocytes; and (6) induce differentiation of keratinocytes. Not only may it be possible to use such molecules to control stem cell proliferation and differentiation in animal models of neurodegeneration, cardiac disease or diabetes, but they should also provide new insights into the biology of these complex processes. For example, by using a combination of biochemical and genomic techniques (cDNA complementation, siRNA knockdown, phosphoproteome and mRNA expression analysis, etc.), we have shown that one of these molecules acts on the sonic hedgehog pathway and another on Wnt signaling, two key developmental pathways. The action of other molecules involves simultaneous inhibition of two signaling molecules (RasGAP and Erk1 for mESC self-renewal, and NMMII and Mek1 for myoblast reprogramming). An exciting recent result is the identification of molecules that induce reprogramming of lineage committed cells (myoblasts in one case and oligodendrocyte progenitor cells in another) to multipotent progenitor cells that can be differentiated to other cell types. This opens the possibility of regenerating tissue from fully differentiated cells rather than from ESCs.
In addition, we are carrying out cellular screens to identify molecules that modulate hedgehog, Wnt and Notch signaling pathways (cancer), inhibit malarial kinases, activate antioxidant genes (aging and neurodegeneration), inhibit translocated kinases (cancer), modulate cell migration (multiple sclerosis and metastasis), upregulate fetal hemoglobin (sickle cell anemia), degrade A2E and β-amyloid (neurodegeneration), and induce proliferation of β cells (diabetes).

Figure 7. Inducing cardiomyogenesis in embryonic stem cells.
In addition to small molecules, we are screening large arrayed cDNA libraries and siRNA libraries and have identified novel gene products involved in p53 activation, Wnt signaling, HCV replication, cell cycle regulation, cell migration, neurogenesis and osteogenesis. Arrayed cDNA libraries are also being used in complementation experiments to determine the mechanism of action of small molecules. We are also carrying out cellular screens of siRNA libraries that target a large number of noncoding RNAs conserved between mouse and human in order to characterize their function, and are investigating the localization and modifications of noncoding RNAs.
We are also carrying out forward genetics experiments with ENU mutagenized mice to identify novel genes involved in tissue regeneration and cancer resistance. Other functional genomics efforts involve the use of mRNA and phosphoprotein profiling to gain a better understanding of physiological and disease processes such as aging, viral infection, stem cell self-renewal, and cell cycle progression and regulation. Finally, we have begun a major effort to identify and characterize endogenous small molecules that modulate developmental processes in mammalian cells, and in preliminary studies have isolated molecules that control the fate of a number of adult stem cells.

Figure 8. An image-based siRNA screen of the mammalian cell cycle.
Many of the above experiments in functional genomics are being carried out in collaboration with investigators at the Genomics Institute of the Novartis Research Foundation (GNF) (www.gnf.org), and with Drs. Charles Cho and Xu Wu, who are adjunct Assistant Professors at TSRI and Principal Investigators at GNF.