A Perspective of Synthetic Biology: Assembling Building Blocks For Novel Functions
DOI 10.1002/biot.200600019
Review
Synthetic biology is a recently emerging field that applies engineering formalisms to design and construct new biological parts, devices, and systems for novel functions or life forms that do not exist in nature. Synthetic biology relies on and shares tools from genetic engineering, bioengineering, systems biology and many other engineering disciplines. It also differs from these subjects in both insights and approach. Applications of synthetic biology have great potential for novel contributions to established fields and for offering opportunities to answer fundamentally new biological questions. This article does not aim to survey the literature thoroughly or to detail progress in every direction. Instead, it is intended to communicate a way of thinking for synthetic biology in which basic functional elements are defined and assembled into living systems or biomaterials with new properties and behaviors. Four major application areas with a common theme are discussed and a procedure (or protocol) for a standard synthetic biology project is suggested.
Keywords: Biomaterials · Genetic circuits · Minimal genome · Synthetic biology · Systems biology
Correspondence: Professor Pengcheng Fu, Department of Molecular Biosciences and Bioengineering, University of Hawaii at Manoa, 1955 East-West Road, Honolulu, HI 96822, USA
E-mail: pengchen@hawaii.edu
Fax: +1-808-956-3542
Abbreviation: AHL, acylhomoserine lactone
www.biotechnology-journal.com
1 Introduction
Over the last five decades, engineering sciences have inspired numerous successful applications in the fields of manufacturing, electronics, communications, transportation, computers and networks, and so on. Systems and control engineers have made substantial contributions to modeling, analysis, design, and implementation of a wide variety of very complicated engineering systems in use today. The multi-scale, holistic approaches that have leveraged technology improvements in sensing and computation with breakthroughs in the underlying principles and mathematics have characterized the non-biological sciences in the 20th century. Compared to engineering systems, biological systems are more complex and their mechanisms are less well known. Historically, biological questions have been approached by a reductionist paradigm that is completely different from the methodologies applied to engineering systems. This reductionist way of thinking was based on the assumption that by unraveling the function of all the different component parts the information gained could be used to piece together the puzzle of complex cellular networks [1]. This research paradigm has dominated mainstream biology, with enormous progress in accumulating biological information at the genetic and protein levels. However, this is a slow and exhaustive process that fails to adequately approach the true complexities of living phenomena, and is of limited relevance to biological systems as a whole. With the fast-growing applications of genomics and high-throughput technologies, it has been found that a new paradigm is needed in biology for the next level of understanding the functions of genes and proteins, and the regulation of intracellular networks, which cannot be obtained by studying the individual constituents on a part-by-part basis. It is also realized that there is great similarity between biology and engineering at the system level, despite their obviously different physical implementation, and that important research challenges in biology may have parallels with those in complicated engineering systems [2]. This similarity forms a basis for the introduction of synthetic biology, or engineering applications within biological systems.
The emergence of synthetic biology is a result of numerous advances in experimental and computational biology over the last couple of decades or so. However, the idea of applying engineering principles to biological systems is not new. More than half a century ago, Wiener tried to explicitly consider biological systems with a cybernetics approach in the same way as for technical systems [3]. Since then, many attempts have been made to apply engineering approaches, such as systems and control design, to biology for developing synthetic technologies [4]. Bioengineering [5], or biological engineering, is one example in which engineering knowledge is transferred to the fields of medicine and biology [6], and the engineering design process is employed to treat biological systems [7]. Since the 1970s, recombinant DNA methods [8, 9] have been developed as an enabling technology to genetically engineer biological systems for numerous bioproducts and bioproperties. The term synthetic biology was first used in the literature in 1980 by Barbara Hobom for the genetic engineering of bacteria using recombinant DNA technology [10]. Researchers from different fields have developed different perspectives on synthetic biology. For example, computer scientists have borrowed the term for their study of artificial life by digital computers [11]. On the other hand, chemists have used synthetic biology to describe the method to obtain artificially synthesized organic molecules that function in cellular systems [12]. The engineering community views synthetic biology as a technology for processing chemicals, energy, information and material using living systems.
More recently, synthetic biology has been redefined as: (i) the design and construction of new biological parts, devices, and systems that do not already exist in nature, and (ii) the re-design of existing, natural biological systems for useful purposes (http://syntheticbiology.org/). More specifically, synthetic biology aims to design and build engineered biological systems that process information, manipulate chemicals, fabricate materials, produce energy, provide food, and maintain and enhance human health and our environment (see Wikipedia: http://en.wikipedia.org/wiki/Synthetic_biology). However, the scope and framework of synthetic biology are not yet fully defined.
and assembled into living systems or biomaterials for new properties and behavior. We focus on current synthetic biology research in the following four areas, detailed below: (i) design and redesign of cellular networks, (ii) genetic circuit engineering, (iii) synthesis of biomaterials, and (iv) quest for the minimal organism.
2.1 Design and redesign of cellular networks
Cells can be viewed as networks of interactions among proteins, DNA, and metabolites involved in signaling, material and energy transfer. Among them, metabolic networks are highly complex nonlinear reaction webs with continuous dynamics, which are tightly coordinated to meet the physiological demands of living systems and to adapt to the cellular environment. Studies on metabolic networks have been ongoing for more than three decades [13]. Potentially useful metabolic pathways and their regulatory networks are likely to remain undiscovered until particular growth conditions or inducing agents trigger their expression. Discovery in the absence of expression is done by genome-wide cloning into bacterial artificial chromosomes, cosmids, or other vectors using random sequence tags (see, for example, [14]), followed by sequencing and computational motif searches. Yeger-Lotem et al. [15] worked from the premise that networks can be identified by concurrently expressed combinations of transcription factors and the proteins they regulate, and/or by members of the set of protein-protein interactions. They showed that from these two data sets, there were 5 possible directed interaction motifs between two proteins, 13 possible between three proteins, and more than 3000 possible between any four proteins. They also demonstrated that most of the four-protein motifs were combinations of three-protein motifs. Numerous examples of these types of network were found in the yeast genome. In parallel to the metabolic networks, genetic circuits are used to represent relationships among genes. These relationships may be established by observing how the expression level of each gene affects the expression level of the others [16]. With the increasing complexity of genetic circuits, we can no longer grasp such gene interactions by gazing at a 2-D map of biological regulatory networks.
Advanced ways of analyzing and visualizing biological regulatory networks are needed, especially methods that integrate experimental information such as transcriptional and protein-interaction array data. It was found that the dynamics of prokaryotic gene circuits are determined by system behavior descriptors, such as the number, type, and placement of the regulatory protein binding sites [17]. Currently, hundreds to thousands of diverse genetic regulatory circuits, most of which are from bacteria, have been characterized experimentally [18]. Researchers are seeking to understand and design genetic networks to gain insights into how the organisms work. A comprehensive but still incomplete picture of transcriptional regulatory networks is available for a small number of organisms such as E. coli and S. cerevisiae [19]. Qualitative tools are being developed for the dynamic analysis of gene regulatory networks [20]. The study of genetic networks spans the broader areas of molecular biology, computer science and systems engineering. It aims to provide predictive power about the behavior of individual genes under given conditions and the overall behavior of the system based on the interactions of genes and proteins. Network theory and nomenclature can be used to determine the modular structure of biological systems. For example, Dennis Bray described the characteristics of scale-free (as opposed to regular or random) networks, and cited the evidence that metabolic networks have such a scale-free structure [21]. Other key properties, including their modular nature, recurring use of simple elements, and sequence motifs that are distinctive for those elements, were presented by Alon [22]. An approach to computational identification of gene modules was developed by Bar-Joseph et al. [23], based on the assumption that groups of co-expressed genes are likely to be associated with DNA binding of the same sets of transcription factors.
Synthetic biology aims to create novel functions and life forms by modifying/integrating biological/non-biological components into metabolic/genetic networks. Mathematical tools should be used to predict and direct the integration of heterologous components into cellular networks. The model predictions should be validated by experiments. Furthermore, high-throughput approaches such as microarrays, proteomics, and RNAi screens should be used to generate large amounts of information, which are then used to improve the modeling accuracy.
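To make the idea of a predictive network model concrete, the following is a minimal sketch of a Boolean network, one of the simplest formalisms used for gene regulatory networks. The three-gene wiring, the gene names A, B and C, and the synchronous update scheme are all illustrative assumptions, not a model taken from the cited work.

```python
from itertools import product

# Hypothetical three-gene circuit; the wiring is an illustrative assumption.
# Each rule maps the current state (dict of gene -> bool) to the gene's next value.
RULES = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B together activate C
}
GENES = sorted(RULES)


def step(state):
    """Synchronously update every gene from the current state."""
    return {g: bool(RULES[g](state)) for g in GENES}


def find_attractor(state, max_steps=100):
    """Iterate until a state repeats; return the cycle of states that recurs."""
    seen = []
    for _ in range(max_steps):
        key = tuple(state[g] for g in GENES)
        if key in seen:
            return seen[seen.index(key):]
        seen.append(key)
        state = step(state)
    raise RuntimeError("no attractor found within max_steps")


def all_attractors():
    """Collect the distinct attractors reached from every initial state."""
    found = set()
    for values in product([False, True], repeat=len(GENES)):
        cycle = find_attractor(dict(zip(GENES, values)))
        # Rotate each cycle to a canonical start so duplicates collapse.
        start = cycle.index(min(cycle))
        found.add(tuple(cycle[start:] + cycle[:start]))
    return found
```

Calling `all_attractors()` enumerates the long-term behaviors of this toy circuit. Because the circuit contains a negative feedback loop (C represses A), every initial state ends up on a single repeating cycle rather than at a fixed point. A model of a real network would need rules derived from experimental data.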
Analysis of microarray experimental data (Dutilh and Hogeweg, 1999, report Binf.1999.11.01, Bioinformatics, Utrecht University; http://www-binf.bio.uu.nl/dutilh/gene-networks) is an example of how the combination of computation and experiments can lead to the reconstruction of genetic networks. In this work, models for gene networks were established based on Boolean networks [24, 25], differential equations [26], etc. The advent of microarrays, reporter gene systems, two-hybrid and immunoprecipitation assays, and other technologies has produced vast and diverse pools of information about transcriptional networks and subsequent protein interactions, modifications and turnover. In many cases, the background of constitutive activity and the different ranges of response make it difficult to compare results and clearly identify/quantify the modified signals. Nevertheless, three major concepts have emerged: (a) basic biological circuit elements (subnetworks) may be identified by methods that measure transcription regulation (TR) or protein-protein (PP) interactions; (b) it is possible to define basic circuit elements that mediate seven types of processes: transcription and
turnover of regulator and effector mRNAs; synthesis and turnover of transcription factors and enzymes; variations in amounts of reaction substrates and intracellular signaling molecules; transport of extracellular signal molecules; and degradation or dilution of reaction intermediates, cofactors, and products [18]; and (c) subnetworks connect to form finite sets of larger networks with predictable, recurring motifs, and it is possible to identify motifs that include both TR and PP interactions [15]. The availability of complete genome sequences has led to the development of microarray-based methods to concurrently identify and locate the sites of transcriptional activation or repression. Genome-wide Location Analysis (GWLA) was first described by Ren et al. [27] to correlate DNA-binding protein data with microarray profiles of genes expressed in yeast. Qualitative simulation is an important approach to analyzing and designing genetic regulatory systems. It aims to provide predictive power about the behavior of individual genes under given conditions, and the overall behavior of the system based on the interactions of genes and proteins. A qualitative simulation of large and complex genetic circuits by an algorithm using a class of piece-wise linear differential equations has been carried out. The model was constrained on possible trajectories in the phase space. Eighteen genes in complex feedback loops were used for the simulation [28]. The method has been shown to be capable of dealing with networks of large size and complexity. Gardner et al. [29] used multiple linear regression of quantitative PCR data from transcriptional perturbation experiments to formulate a steady-state model of the nine-gene SOS DNA repair network based on an engineering formalism called system identification theory, without prior information about the network's structure. Experimental data were used as training sets for the simulation.
The model accurately predicted the expression of the individual genes under various conditions, and identified the gene affected by an inducing compound, mitomycin C. Furthermore, the accuracy of the predictive model could be improved using a less noisy method (MS) to quantify mRNA levels. Similar experiments have been used to define transcriptional networks in terms of nonlinear relationships [30]. Although progress has been made in modeling and simulation of natural biological systems, these systems are not optimized to be easy to model [31]. As with other networks, biological networks function in time and space. Complex examples include the control of cell division (mitosis), and processes of embryonic development and morphogenesis. Remarkable progress has been made in simulating some of these phenomena. New genes and control elements are continually being discovered and added to the simulations, which in turn have led to modified experiments. A current model for regulation of differentiation between endoderm and mesoderm in the sea urchin embryo includes more than 40 genes [32]. Existing bacterial regulatory networks can be characterized
in terms of the more advanced performance characteristics applied to engineered circuits. These include stability (the range of variation in which the network maintains its normal function), robustness (sensitivity of the network to changes in individual effectors), responsiveness (the rise time or decay time for a change in a particular perturbing variable), gain (magnitude of linear or logarithmic change in response to an effector), coupling (direct, inverse, delayed, or no effect of the output upon the regulator gene), transient or constitutive response, and stochasticity (effect of noise, and probabilistic response) ([18] and [33], also available at http://helix-web.stanford.edu/psb98). More recently, a computational method was reported that searches genome-wide sequence data, identifies enzymes, and assigns them to appropriate metabolic pathways [34]. The gene-pathway relationship places gene variations in a more biochemical context. Pathway assignment is useful for verifying genomic annotations. It reveals missing enzymes, co-regulated enzymes, enzymes that function in more than one pathway, evolutionary changes in pathways, cryptic pathways that exist but do not coordinately operate, the requirement for particular cofactors, genes that encode subunits of a particular enzyme, and linkages that exist between groups of pathways. The genome-pathway linkage also provides information about symbiosis, adaptation of the organism to different growth conditions, and the ability to create and respond to various external signals.
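The pathway-assignment idea described above can be illustrated with simple set operations. This is only a sketch under stated assumptions: the pathway names, enzyme labels and memberships are hypothetical placeholders, and the method reported in [34] is not reproduced here.

```python
# Hypothetical reference pathways: pathway name -> set of required enzyme labels.
# Names and memberships are made up for illustration, not taken from a database.
PATHWAYS = {
    "pathway_1": {"enzA", "enzB", "enzC"},
    "pathway_2": {"enzC", "enzD"},
    "pathway_3": {"enzE", "enzF"},
}


def assign_pathways(genome_enzymes, pathways=PATHWAYS):
    """Report, per pathway, which required enzymes are present or missing."""
    report = {}
    for name, required in pathways.items():
        present = required & genome_enzymes
        if present:  # skip pathways for which the genome shows no evidence at all
            report[name] = {"present": present, "missing": required - present}
    return report


def shared_enzymes(genome_enzymes, pathways=PATHWAYS):
    """Enzymes found in the genome that participate in more than one pathway."""
    counts = {}
    for required in pathways.values():
        for enzyme in required & genome_enzymes:
            counts[enzyme] = counts.get(enzyme, 0) + 1
    return {enzyme for enzyme, n in counts.items() if n > 1}


# Hypothetical annotation result for one genome.
genome = {"enzA", "enzC", "enzD"}
report = assign_pathways(genome)
```

Here `report["pathway_1"]["missing"]` flags an enzyme that annotation may have overlooked, and `shared_enzymes(genome)` lists enzymes acting in more than one pathway. These are two of the kinds of information the text attributes to pathway assignment.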
2.2 Genetic circuit engineering
While systems biologists have been using the ever-expanding experimental and computational tools to create a more complete big picture of organismal function, others have been taking a traditional engineering approach, attempting to build functional modules from nucleic acid and protein parts lists [35]. This work is largely based on the analogy between biological regulatory and information transfer networks and their electronic, mechanical, and computational counterparts. Such similarity of transcriptional regulation to electrical network and circuit theory was first made evident by the pioneering work of Jacob and Monod [36]. In that work, they established the prokaryotic operon model and stated that "It is obvious from analysis of these mechanisms that their known elements could be connected into a wide variety of circuits endowed with any desired degree of stability" [36]. Using electrical circuits as their paradigm, synthetic biologists have established genetic circuit engineering that starts with the simplest elements, such as activators and repressors, to actualize digital functions in cellular systems [37]. Genetic circuit engineering requires tools from multiple disciplines, such as physics and engineering principles, dynamic systems and control theories, bifurcation analysis, computational biology and genetic engineering
etc. The design steps include: (i) establishment of genetic circuitry that demonstrates a rudimentary control behavior, such as oscillations, bistability, step activation, spikes, etc.; (ii) simulations that predict the dynamic behavior of the genetic circuitry; (iii) construction of plasmids containing the genetic parts with the desired behaviors predicted by the simulation, and transformation of the plasmids into the living system to be modified; and (iv) observation of the actual behaviors of the system to see whether satisfactory results have been reached or not [38]. There are two pioneering applications of genetic circuit engineering. Gardner et al. [39] reported the construction of a bi-stable genetic toggle switch, consisting of genes for two repressors, sequences of two constitutive promoters, and a well-characterized green fluorescent protein (GFP), which acted as a reporter. The function of the toggle switch was produced by engineered cellular genetic machinery in which two constitutively active and repressible promoters were configured to mutually repress the gene expression of each other. Two stable expression states for the cells were observed. The system could be flipped from one state to the other via external signaling. The constructed toggle switch demonstrated robust bistability and reversibility. Switching times ranged from 35 min to 6 h after a pulse, depending on the switching mode. In the second report, Elowitz and Leibler at Princeton described a biological oscillator constructed from transcriptional promoter and repressor sequences, with GFP as the reporter [40]. This genetic construct, called a repressilator, when expressed in E. coli, alternately turned production of the GFP reporter on and off. In the researchers' initial work, the frequency of the oscillator was adjusted by attaching a small peptide tag to one repressor and the reporter protein, making them more rapidly degradable in the host cell.
It was found that 40% of the cells showed oscillatory behavior, with the period of oscillations ranging from 120 to 200 min. The oscillations were found to depend on the cell cycle and to halt in the stationary phase. Problems with stochasticity (differences in performance in individual host cells) and noise (factors that affected amplitude and frequency of expression) common to electronic circuits were observed, and steps were taken to eliminate them [41–43]. These were among the first functional modules to be constructed from engineering of regulatory pathways, rather than individual protein or DNA sequences. Recently, a modified form of the repressilator was made to regulate expression of a receptor for an acylhomoserine lactone (AHL), the small ligand that acts as an indicator of cell density in bacterial cultures (known as quorum sensing). This system functions as a biological clock to trigger cell division in the host bacteria [44, 45]. The recent literature includes examples of bacterial regulatory pathways that are the counterparts of digital inverters and logic gates [46, 47]. One of the objectives of synthetic biology is to create cells that respond to endogenous, as well as external,
stimuli. To achieve this, the engineered module must be able to integrate seamlessly with the natural regulatory circuits. Levskaya et al. [48] have engineered a photosensitive E. coli that is switched between different states by red light. The bacterium contains a synthetic sensor kinase that allows a lawn of bacteria to function as a biological film. The system was tweaked so that gene expression would be regulated by exposure to an external light source. This spatial control of bacterial gene expression could be used to print complex biological materials as a high-definition (about 100 megapixels per square inch), 2-D chemical image. Comparative experiments were conducted to show that the E. coli that had been under the light had developed a pigment, while those that were kept in the dark did not. Voigt's team [31] is working with heat- and cold-shock promoters to engineer a temperature-sensitive organism. Another recent demonstration involved interfacing the genetic toggle switch with the DNA-damage-inducible SOS repair pathway and with the quorum sensing pathway from the bacterium Vibrio fischeri expressed in E. coli [49]. The authors described their approach as a step toward the development of plug-and-play genetic circuitry that can be used to create cells with programmable behaviors. Quorum sensing is the regulation of gene expression in response to fluctuations in cell-population density [50]. Quorum sensing has been exploited to create a system of sensing between bacteria, in which the genetic equivalent of a pulse generator transiently activated, and then shut down, expression of a reporter protein in response to steady-state levels of AHL. This system could be adjusted so that the signal-receiving cells responded differently with respect to the distance from the AHL-producing cells and the length of time that the AHL concentration remained constant [51].
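The simulation step of the design procedure in this section can be illustrated with the mutual-repression toggle switch. The sketch below uses a widely used dimensionless two-repressor model, du/dt = α1/(1 + v^β) − u and dv/dt = α2/(1 + u^γ) − v, where u and v are the two repressor concentrations. The parameter values, the forward-Euler integrator and the function names are illustrative assumptions, not values reported in the text.

```python
def toggle_derivatives(u, v, alpha1=10.0, alpha2=10.0, beta=2.0, gamma=2.0):
    """Rates of change of two mutually repressing proteins (dimensionless units).

    alpha1/alpha2 are assumed maximal synthesis rates; beta/gamma are assumed
    cooperativity exponents. Each protein decays at unit rate.
    """
    du = alpha1 / (1.0 + v ** beta) - u
    dv = alpha2 / (1.0 + u ** gamma) - v
    return du, dv


def simulate(u0, v0, t_end=50.0, dt=0.01, **params):
    """Integrate the model with a simple forward-Euler scheme; return final (u, v)."""
    u, v = u0, v0
    for _ in range(int(t_end / dt)):
        du, dv = toggle_derivatives(u, v, **params)
        u += dt * du
        v += dt * dv
    return u, v


# Two different starting points settle into two different stable states.
state_u_high = simulate(5.0, 1.0)
state_v_high = simulate(1.0, 5.0)
```

With these assumed parameters the same circuit ends up with either repressor dominant depending only on its starting point, which is the bistability that the toggle switch exploits. A transient external signal that pushes the state across the boundary between the two basins would flip the switch.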
2.3 Synthesis of biomaterials
The interest in biosynthesis of proteins with non-natural amino acids is spurred by the prospects for new, more robust biomimetic materials for novel applications. Production of proteins with non-natural amino and keto acids [52] has been achieved by forming an acylated transfer RNA chemically [53, 54], by use of modified aminoacyl tRNA synthetases [55, 56], or by producing new codons with four or five bases [57, 58], or codons containing nonnatural base pairs [59]. This technology has made it possible to create proteins with desirable new functions [56] and to investigate why the genetic code has remained virtually invariant throughout evolution. Useful devices or systems at the whole-cell level will require multiple interacting regulatory elements more complex than elementary gates and switches. For many applications, it would be preferable to build in non-natural control mechanisms that would not result in perturbation of natural cell functions due to the consequences of
receptor cross-talk and secondary effects of signals that originate externally. The operation of ATP-energized intracellular molecular motors provides an example [60]. Prokaryotes and eukaryotes have evolved several proteins and protein complexes that transduce chemical energy into lateral or rotary mechanical motion. These include the actin-myosin contractile system, bacterial flagella, the dyneins and kinesins that translocate along microtubules, RNA and DNA polymerases, helicases, related DNA-processing enzymes, and the proton-coupled F0/F1 ATP synthase/ATPase, which is a ubiquitous energy-producing enzyme. These motor proteins are increasingly important elements of the synthetic biologist's toolkit. Perhaps the most versatile is the F0/F1 ATP synthase/ATPase, in which the outer shell of α and β subunits rotates about the central protein in one direction when protons flow into the complex and ATP is made, and rotates in the opposite direction when ATP is hydrolyzed and protons are released. Moreover, under appropriate reaction conditions, immobilized F1 ATPase acts as a stepper motor, with the γ subunit axle rotating in discrete increments of 120°, with hydrolysis of approximately one ATP per step [61]. This study also provided evidence that (i) the conversion of chemical energy to mechanical work (approx. 90 piconewton-nanometers, pN·nm) is close to 100% efficient, (ii) the work done is essentially constant over a wide range of loads, and (iii) the rotation is relatively smooth under appropriate reaction conditions. F1 ATPase has been incorporated into nanomechanical devices that rotate propeller-like loads, and could be started by addition of ATP and stopped by addition of sodium azide (an ATPase inhibitor) [62, 63]. Montemagno and co-workers [64] went a step further by engineering allosterically active Zn2+-binding sites into the F1 ATPase, to create a switch that reversibly turns the motor on upon addition of Zn2+ and off with the addition of a Zn2+ chelator.
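As a rough consistency check on the stepper-motor figures quoted above, the work per step (about 90, read here as pN·nm) can be converted into an average torque and compared with the free energy of ATP hydrolysis. The ATP free energy of 55 kJ/mol used in the test is an assumed, condition-dependent input chosen only for illustration, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def kj_per_mol_to_pn_nm(energy_kj_per_mol):
    """Convert a molar energy to energy per molecule in pN·nm (1 pN·nm = 1e-21 J)."""
    joules_per_molecule = energy_kj_per_mol * 1e3 / AVOGADRO
    return joules_per_molecule / 1e-21


def mean_torque(work_pn_nm, step_degrees=120.0):
    """Average torque in pN·nm per radian if the work is spread evenly over one step."""
    return work_pn_nm / math.radians(step_degrees)


def atp_per_revolution(step_degrees=120.0):
    """Number of steps, and hence ATP molecules at one ATP per step, per full turn."""
    return round(360.0 / step_degrees)


def efficiency(work_pn_nm, atp_energy_kj_per_mol):
    """Fraction of the assumed ATP free energy that appears as mechanical work."""
    return work_pn_nm / kj_per_mol_to_pn_nm(atp_energy_kj_per_mol)
```

Under these assumptions, `mean_torque(90.0)` gives the average torque implied by 90 pN·nm of work per 120° step. `efficiency(90.0, 55.0)` shows how an efficiency close to 100% follows only if the free energy of ATP hydrolysis under the experimental conditions is itself close to 90 pN·nm per molecule.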
The complexity of these macromolecular motors, their high energy efficiency, and the innovative methods used to observe their motion in real time, are beyond the scope of this article. Interested readers are referred to the brief introductions and overviews by Block [65] and Perkel [66], the recent comprehensive review by Schmidt and Montemagno [67], and to several papers detailing the structure and function of the F0/F1 ATP synthase/ATPase [60, 68–70]. Further advances in development of biomolecular motors have been made with kinesins, which are myosin-like cytoplasmic proteins of approximately 120 kDa that translocate organelles and genetic material along microtubules in eukaryotic cells, using ATP hydrolysis to provide the energy [71]. Microtubules are long polymers of self-assembled asymmetric α-β-tubulin dimers, which give the polymer a ratchet-like directionality that guides the transport of loads by kinesin molecules. In vitro assays have been devised in which microtubules aligned on solid surfaces act as tracks for movement of kinesin-coated microspheres or asymmetric glass chips in the presence
of ATP. In this assay, kinesins have been shown to transport their cargoes (beads of 0.1-µm diameter) for distances up to 2.2 mm [72]. An alternate assay has been described in which microtubules can be transported across surfaces coated with immobilized kinesin [73, 74]. Several important observations were made in these and similar studies. First, the cargoes could be transported much farther than the 10–40-µm length of individual microtubules. Kinesin-coated beads were able to cross short open spaces between ends of individual microtubules. Second, larger cargo particles coated with kinesin could be transported by straddling two parallel microtubules. Third, transport speeds averaged 0.5–0.6 µm/s, with a maximum of about 1.5 µm/s. Fourth, glutaraldehyde-fixed and unfixed microtubules were able to support transport equally well. Lastly, kinesin-coated cargoes occasionally encountered others that for some reason had stopped moving. In some instances, the moving unit continued, pushing the previously stopped one. In other cases, the moving unit transferred to an adjacent microtubule and proceeded forward. Studies in other laboratories established that movement of kinesin along microtubules did not involve rotation. Like myosin, kinesin molecules have two globular domains that contact the microtubule, and these promote a hand-over-hand movement similar to a person climbing a rope [66, 75]. The rotary motor that drives bacterial flagella is composed of several proteins. Each complex is roughly 40 × 60 nm, about the size of some viruses. Bacterial flagellar motors reach speeds approaching 1000 Hz and can propel bacteria at speeds of 100 µm/s [65]. Another cytoplasmic motor protein, dynein, is much larger than the kinesins. Unlike the other rotary and linear motors, dynein is a microtubule-binding, ATP-driven motor protein that is involved in chromosome segregation during mitosis.
According to Montemagno, the development of more sophisticated living/non-living machines will require three enabling technologies: the ability to produce large amounts of functional biomolecular motors; the ability to fabricate mechanical components with desirable properties, such as spring constants, at an appropriate scale; and the development of methods for observing and measuring performance parameters such as rotation speed, torque, and energy efficiency [61, 76].
2.4 Quest for the minimal organism
In the quest to understand and build life forms, synthetic biologists as well as systems biologists are interested in determining the smallest set of genes, molecules and structures for replication, growth, metabolism, and regulation that comprise life. Study of such a minimal gene set and its features may shed light on the basics of cellular functions and help to determine the subset of essential genes in most species. Furthermore, in silico minimal organism reconstructions are experimentally testable, so
that computational and experimental efforts can be combined for the analysis and design of life forms [77]. There are two approaches to investigating the minimum genetic repertoires of living cells. The first we call the subtraction method (deleting genes from genomes to form minimal organisms), and the second is the addition method (building genomes from the ground up). Theoretical and experimental efforts have been made in comparative genomics and systems analysis to determine the list of essential genes for minimal functions that many organisms have in common [78]. The smallest possible group of genes obtained from small genomes by subtraction is presumed necessary and sufficient for sustaining functional growth of cells in the presence of a full complement of essential nutrients and in the absence of environmental stress [77]. Experimental methods for estimating the minimal gene set include saturating transposon mutagenesis (gene knockout) [79] and gene silencing with antisense RNA [80]. These genes can also be identified computationally from well-studied organisms with small genomes by comparison of essential and non-essential proteins across related genera [81], and by using databases of essential genes [82]. A comparative study of the parasites Haemophilus influenzae and Mycoplasma genitalium has produced a minimal set of about 250 essential genes [83, 84]. In another study, analysis of viable gene knockouts in Bacillus subtilis, Mycoplasma genitalium, and Mycoplasma pneumoniae has resulted in a similar estimate [85]. It was found that approximately 80 genes out of the 250 in the original minimal gene set are represented by orthologs in all life forms. For ~15% of the genes in the minimal gene set, viable knockouts were obtained in M. genitalium [78]. E. coli is also used as a model system for gene knockout to create a reduced clean genome.
Fred Blattner's team [31] has removed about 750 redundant genes and planned to delete 500–600 more to approach the core genome that may be common to all organisms. After the gene removal, the constructs were reported to be more genetically stable and to exhibit increased protein synthesis and electroporation efficiency [31]. In contrast to the subtraction approach, which starts with existing life forms, minimal organisms may also be built from scratch by the addition approach. Synthetic genomics [31] is one example, which aims to design from the ground up a modular system that can be given functions. This approach can be traced back to synthetic efforts as early as 1953, when Stanley Miller designed an apparatus that passed an electric current through a chamber containing methane, ammonia, hydrogen and water. Organic compounds including amino acids, the building blocks of life, were synthesized as a result. Since then, many attempts have been made to create primitive life forms [86]. For example, David Deamer and others sought to build up a protocell,
attempting to meet requirements such as having membrane enclosures that can (i) capture energy, (ii) maintain ion gradients, and (iii) encapsulate macromolecules and divide [87]. Additional requirements are that the cell must contain genes and enzymes that can be replicated, and that these must be shared among daughter cells [88]. It was realized that more delicate work is needed to mimic functioning cells containing genes and proteins that can be replicated. With the focus on creating cells closer to natural biological systems, hypotheses about the first life forms may be tested experimentally, and the transition from nonliving to living matter may become possible [89]. Synthetic viruses able to replicate, to make self-assembling proteins and/or a membrane coat, and to infect hosts have also been attempted. The genomes of bacteriophage φX174 [90] and poliovirus [91] have been completely synthesized and shown to produce infectious progeny virus when introduced into appropriate host cells. The replication of prions is receiving similar attention. Among prokaryotes, the mycoplasmas have the smallest known genomes. M. genitalium has 517 genes and about 480 proteins. Its close relative, M. pneumoniae, has 677 proteins. These genomes have been compared by global transposon mutagenesis, a method that identifies nonessential genes as those in which a transposon insertion is not lethal [79, 81]. Comparison of the results in the two Mycoplasma species indicated that between 265 and 350 protein-encoding genes are essential for viability and growth under laboratory conditions. Notably, about 100 of these genes had no known function [79].
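The logic of a transposon-mutagenesis screen and of a two-species comparison can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the analysis used in the cited studies: the function names, gene identifiers and ortholog map are all hypothetical, and a gene with no recovered insertion is treated only as a candidate essential gene, since an unsaturated screen can miss insertions by chance.

```python
from typing import Dict, Iterable, Set


def candidate_essential(all_genes: Iterable[str],
                        disrupted_in_viable_clones: Iterable[str]) -> Set[str]:
    """Return the genes never found disrupted in a viable clone.

    A gene that tolerates a transposon insertion is called nonessential.
    The remaining genes are only *candidates* for essentiality.
    """
    nonessential = set(disrupted_in_viable_clones)
    return set(all_genes) - nonessential


def shared_essential(essential_a: Set[str],
                     essential_b: Set[str],
                     ortholog_a_to_b: Dict[str, str]) -> Set[str]:
    """Return genes of species A whose ortholog in species B is also a candidate."""
    return {gene for gene in essential_a
            if ortholog_a_to_b.get(gene) in essential_b}


# Toy data: every identifier below is invented for illustration.
genes_a = ["a1", "a2", "a3", "a4"]
hits_a = ["a2", "a2", "a4"]      # insertions recovered from viable clones
genes_b = ["b1", "b2", "b3"]
hits_b = ["b3"]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3"}   # a4 has no ortholog in B

essential_a = candidate_essential(genes_a, hits_a)
essential_b = candidate_essential(genes_b, hits_b)
core = shared_essential(essential_a, essential_b, orthologs)
```

In this toy case only `a1` survives both filters: `a3` is a candidate in species A, but its ortholog `b3` tolerated an insertion in species B.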
3 Conclusions
The synthetic capability of synthetic biology rests on the quantitative prediction of system behavior and the engineering implementation of cellular modification. It relies on and shares tools from genetic engineering, biological engineering and systems biology. These tools include recombinant DNA techniques, pathway analysis and modification, genome-scale mathematical modeling and in silico simulation, and hypothesis-driven experimental tests. A frequently asked question is how synthetic biology differs from the ever more sophisticated genetic engineering strategies that have been developed since the advent of recombinant DNA technology. The two differ in both insights and approach. In a recent commentary, Roger Brent [92] suggested that results that genetic engineers can exploit to particular advantage (for example, increased protein synthesis, production of useful products, or desirable attributes such as resistance to infection) are useful to the synthetic biologist only in the context of higher-level functions not previously associated with living organisms. Examples include biological systems that exhibit complex dynamical behavior, logical behavior, the ability to exist in a number of states, or the ability to execute small numbers of programmed steps, such as those in complex chemical synthesis. Brent avers that synthetic biologists follow doctrine from other branches of engineering, including rigorous distinctions and interfaces between parts, devices, and systems; thorough design and simulation before construction; and novel combinations of biological and non-biological materials such as unnatural amino acids. For example, the effort to use combinations of regulatory proteins and sites to engineer a cell that can add two numbers together might be considered synthetic biology, whereas the effort to modify the glycosylation machinery of maize to enable production of a human antibody might not [92]. Synthetic biology will thus not only investigate the effects of genetic and pathway modification, or cellular responses to genetic variation and environmental perturbation, but also design and build biological systems with novel cellular functions by combining in silico and in vivo experimental approaches. The synthetic biology applications reviewed in this article are spread across many significantly different disciplines. What do these applications have in common? I believe that the theme of synthetic biology is that it applies engineering formalisms to design and build functional modules and then to integrate such modules into existing living systems for novel functions, or to create entirely new life forms by assembling cellular signaling, regulatory and metabolic parts as building blocks. The focus of synthetic biology is on how to build artificial biological systems for engineering applications.
This approach may expand the repertoire of biological chemistry, create new insights into the origins of life and the simplest life forms on earth and perhaps elsewhere, and open new possibilities for built-in early-warning diagnostic and therapeutic inventions at the molecular, subcellular, and cellular levels, to list just some of the potential applications. Synthetic biology is still in its infancy. So far it is more an exploratory research area than a solid body of results. As we ponder the future of this field, it is important to define a procedure (or protocol) for the practice of synthetic biology. So far, neither such a definition nor a standard method for exploring the capabilities and uses of artificial biological systems is available. Based on this review, I envision a procedure for the combined computational and experimental synthetic biology effort, as described by the schematic flowchart in Fig. 1. As the figure shows, synthetic biology work can start with the establishment of a library of in silico building blocks, consisting of a parts list for the assembly of metabolic pathways, genetic circuits, protein interactions, etc. A design/selection unit then decides which building blocks should be integrated into a genome-scale model for simulation and prediction of the phenotypic outcomes of the synthetic biological system.
[Figure 1 flowchart labels: Experimental Biology; Computational Biology; Library of Building Blocks; Bioinformatics; Modeling/Simulation; Design/Selection Unit; Gene Knockout/Deletion; Insertion; Modules using building blocks; Cell; Function; Property; Product; Evaluation Unit]
Figure 1. Schematic description of the procedure for synthetic biology practice. Synthetic biology research that combines experimental biology and computational biology is a cyclical process: the two continuously enhance, drive and evolve into each other. An evaluation step is needed to assess the functions, products and/or properties of the artificial life forms created by synthetic biology.
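The cycle in Fig. 1 can be sketched in a few lines of Python. This is only one possible reading of the flowchart, written under simplifying assumptions: `BuildingBlock`, `design_cycle`, the toy scoring functions and the numeric threshold are all hypothetical names and values introduced for illustration, and the "experiment" is a stand-in function rather than real laboratory work.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class BuildingBlock:
    """One entry of the in silico parts library (hypothetical structure)."""
    name: str
    kind: str   # e.g. "metabolic pathway", "genetic circuit"


Design = Tuple[BuildingBlock, ...]


def design_cycle(library: Sequence[BuildingBlock],
                 simulate: Callable[[Design], float],
                 build_and_measure: Callable[[Design], float],
                 is_satisfactory: Callable[[float], bool],
                 max_blocks: int = 2,
                 shortlist: int = 3) -> Optional[Tuple[Design, float]]:
    """Design/selection -> simulation -> experiment -> evaluation, iterated."""
    # Design/selection unit: enumerate combinations of building blocks.
    candidates = [combo
                  for size in range(1, max_blocks + 1)
                  for combo in combinations(library, size)]
    # Computational biology: rank candidates by their simulated outcome.
    ranked = sorted(candidates, key=simulate, reverse=True)
    # Experimental biology + evaluation unit: test the best few in turn.
    for design in ranked[:shortlist]:
        measured = build_and_measure(design)
        if is_satisfactory(measured):
            return design, measured
    return None   # nothing satisfactory: revise the library or model and repeat


# Toy stand-ins for a genome-scale model and a wet-lab measurement.
TOY_SCORES = {"A": 1.0, "B": 2.0, "C": 0.5}
library = [BuildingBlock("A", "metabolic pathway"),
           BuildingBlock("B", "genetic circuit"),
           BuildingBlock("C", "protein interaction")]


def toy_simulate(design: Design) -> float:
    return sum(TOY_SCORES[block.name] for block in design)


def toy_measure(design: Design) -> float:
    return 0.9 * toy_simulate(design)   # pretend the cell underperforms the model


result = design_cycle(library, toy_simulate, toy_measure, lambda v: v >= 2.5)
```

With these toy numbers the combination of blocks A and B ranks first in simulation and passes the evaluation step. A `None` return corresponds to the case in which the cycle must be repeated with a revised library or model.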
Based on the results of computational biology, we can assemble biological/non-biological parts into a module to be inserted into the cell, using protocols such as those described in the MIT registry of BioBricks modules (see MIT Synthetic Biology Group, BioBrick standard assembly guide: http://rosalind.csail.mit.edu/r/parts/htdocs/Assembly/BB_Assembly.htm; BioBrick alpha logic guide: http://rosalind.csail.mit.edu/r/parts/htdocs/BB_General.htm; and the MIT registry of standard biological parts: http://rosalind.csail.mit.edu/r/parts/partsdb/index.cgi). We can also delete genes from a given genome to reach the core of the genome. The novel functions, products or properties are then assessed by an evaluation process to determine whether a satisfactory result has been achieved. If not, the process may be iterated for further improvement. Synthetic biology promises great opportunities. The field is evolving at such a rate that even today's synthetic biologists may not recognize its tools, products and concepts a decade from now. On the other hand, the potential risks, benefits, and ethical issues of synthetic biology need to be carefully assessed [93, 94]. The ethical and moral issues raised by this new engineering discipline are similar to those that have been discussed for genetically modified organisms (GMO, see Block et al. [95]) and for the synthesis of genomic DNA sequences [94]. However, since synthetic biology has drastically increased the pace at which some biological work can be performed, more precise and effective methods are required to identify risks, monitor activities, and communicate concerns within the synthetic biology community. It has thus been suggested that a professional society be formed and a code of ethics developed to ensure a positive future impact for synthetic biology [31].
I dedicate this article to the late Dr. Alex Karu, who passed away several days before the submission of this manuscript. Dr. Karu contributed significantly to this review article through his stimulating discussions with the author and his constructive comments on the manuscript. I am also very grateful to an anonymous reviewer for critical feedback that has improved this manuscript.
References
[1] Hofmeyr, J. H., Westerhoff, H. V., Building the cellular puzzle: control in multi-level reaction networks. J. Theor. Biol. 2001, 208, 261–285.
[2] Doyle, J. C., Robustness and dynamics in biological networks, in: The First International Conference on Systems Biology. Japan Science and Technology Corporation, MIT Press, New York 2000.
[3] Wiener, N., Cybernetics, or Control and Communication in the Animal and the Machine. John Wiley & Sons, New York 1948.
[4] Kitano, H., Systems biology: A brief overview. Science 2002, 295, 1662–1664.
[5] Dorf, R. C., Mandell, J. D., Bioengineering at the University of Santa Clara. Proceedings of the Annual Conference on Engineering in Medicine and Biology. The Conference Committee for the 19th ACEMB, San Francisco, USA, Aug. 9–12, 1966.
[6] Encyclopaedia Britannica, 15th edn. Encyclopaedia Britannica, Inc., Chicago 1994.
[7] Cuello, J. L., The Descent of Biological Engineering. Proceedings of the Institute of Biological Engineering, IBE Publications, Athens 2002.
[8] Cohen, S. N., Chang, A. C. Y., Boyer, H., Helling, R. B., Construction of biologically functional bacterial plasmids in vitro. Proc. Natl. Acad. Sci. USA 1973, 70, 3240–3244.
[9] Rodgers, M., The Pandora's box congress. Rolling Stone 1975, 189, 37–77.
[10] Hobom, B., Surgery of genes. At the doorstep of synthetic biology. Med. Klin. 1980, 75, 14–21.
Pengcheng (Patrick) Fu is currently an assistant professor in the Department of Molecular Biosciences and Bioengineering, University of Hawaii, Manoa (UHM) in Honolulu, Hawaii, USA. He received his Ph.D. in Biochemical Engineering from the University of Sydney, Australia in 1996. He then performed postdoctoral research in Japan and the USA during 1996–2000. He was employed at Diversa Corp (San Diego, CA) in 2001–2002, and became Assistant Professor at the UHM in 2002. His areas of interest include Metabolic Engineering, Bioprocess Control, Metabolomics, and Systems and Synthetic Biology.
[11] Sullins, J., Synthetic biology: the technoscience of artificial life, in: Proceedings of the 20th World Congress of Philosophy, Boston, Massachusetts, USA, 10–15 August, 1998.
[12] Rawls, R., Synthetic biology makes its debut. Chem. Eng. News 2000, 49–51.
[13] Covert, M. W., Schilling, C. H., Famili, I., Edwards, J. S. et al., Metabolic modeling of microbial strains in silico. Trends Biochem. Sci. 2001, 26, 179–186.
[14] Zazopoulos, E., Huang, K., Staffa, A., Liu, W. et al., A genomics-guided approach for discovering and expressing cryptic metabolic pathways. Nat. Biotechnol. 2003, 21, 187–190.
[15] Yeger-Lotem, E., Sattath, S., Kashtan, N., Itzkovitz, S. et al., Network motifs in integrated cellular networks of transcription-regulation and protein-protein interaction. Proc. Natl. Acad. Sci. USA 2004, 101, 5934–5939.
[16] Brazhnik, P., de la Fuente, A., Mendes, P., Gene networks: how to put the function in genomics. Trends Biotechnol. 2002, 11, 467–472.
[17] Wolf, D. M., Eeckman, F. H., On the relationship between genomic regulatory element organization and gene regulatory dynamics. J. Theor. Biol. 1998, 195, 167–186.
[18] Wall, M. E., Hlavacek, W. S., Savageau, M. A., Design of gene circuits: lessons from bacteria. Nat. Rev. Genet. 2004, 5, 34–42.
[19] Alm, E., Arkin, A. P., Biological networks. Curr. Opin. Struct. Biol. 2003, 13, 193–202.
[20] McAdams, H. H., Arkin, A. P., Gene regulation: towards a circuit engineering discipline. Curr. Biol. 2000, 10, R318–R320.
[21] Bray, D., Molecular networks: the top-down view. Science 2003, 301, 1864–1865.
[22] Alon, U., Biological networks: the tinkerer as an engineer. Science 2003, 301, 1866–1867.
[23] Bar-Joseph, Z., Gerber, G. K., Lee, T. I., Rinaldi, N. J. et al., Computational discovery of gene modules and regulatory networks. Nat. Biotechnol. 2003, 21, 1337–1342.
[24] Liang, S., Fuhrman, S., Somogyi, R., REVEAL, a general reverse engineering algorithm for inference of genetic network architectures. Biocomputing '98: Proc. Pac. Symp. Biocomput., Maui, Hawaii, January 4–9, 1998, 3, 18–29.
[25] Akutsu, T., Miyano, S., Kuhara, S., Identification of genetic networks from a small number of gene expression patterns under the Boolean network model. Biocomputing '99: Proc. Pac. Symp. Biocomput., Mauna Lani, Hawaii, January 4–9, 1999, 4, 17–28.
[26] Wahde, M., Hertz, J., Coarse-grained reverse engineering of genetic regulatory networks. BioSystems 2000, 55, 129–136.
[27] Ren, B., Robert, F., Wyrick, J. J., Aparicio, O. et al., Genome-wide location and function of DNA binding proteins. Science 2000, 290, 2306–2309.
[28] de Jong, H., Page, M., Qualitative simulation of large and complex genetic regulatory systems. Proceedings of the 14th European Conference on Artificial Intelligence (ECAI), Berlin, Germany, August 20–25, 2000, IOS Press.
[29] Gardner, T. S., di Bernardo, D., Lorenz, D., Collins, J. J., Inferring genetic networks and identifying compound mode of action via expression profiling. Science 2003, 301, 102–105.
[30] Imoto, S., Goto, T., Miyano, S., Estimation of genetic networks and functional structures between genes by using Bayesian networks and nonparametric regression, in: Proceedings of the 7th Pacific Symposium on Biocomputing (PSB 2002), Lihue, Hawaii, USA, January 3–7, 2002.
[31] Salisbury, M. W., Get ready for synthetic biology. Genome Technol. 2006, Jan/Feb, 26–33.
[32] Davidson, E. H. et al., A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo. Dev. Biol. 2002, 246, 162–190.
[33] Savageau, M. A., Rules for the evolution of gene circuitry. Biocomputing '98: Proc. Pac. Symp. Biocomput., Maui, Hawaii, January 4–9, 1998, 3, 54–65.
[34] Romero, P. et al., Computational prediction of human metabolic pathways from the complete human genome. Genome Biol. 2004, 6, R2.1–17.
[35] Simpson, M. L., Rewiring the cell: synthetic biology moves towards higher functional complexity. Trends Biotechnol. 2004, 22, 555–557.
[36] Jacob, F., Monod, J., Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 1961, 3, 318–356.
[37] Kaern, M., Blake, W. J., Collins, J. J., The engineering of gene regulatory networks. Annu. Rev. Biomed. Eng. 2003, 5, 179–206.
[38] Canton, B., Engineering the Interface Between Cellular Chassis and Integrated Biological Systems. MIT Synthetic Biology Working Group Reports, published online 8 August 2005 (doi: 1721.1/19813).
[39] Gardner, T. S., Cantor, C. R., Collins, J. J., Construction of a genetic toggle switch in Escherichia coli. Nature 2000, 403, 339–342.
[40] Elowitz, M. B., Leibler, S., A synthetic oscillatory network of transcriptional regulators. Nature 2000, 403, 335–338.
[41] Elowitz, M. B., Levine, A. J., Siggia, E. D. et al., Stochastic gene expression in a single cell. Science 2002, 297, 1183–1186.
[42] Rosenfeld, N., Elowitz, M. B., Alon, U., Negative autoregulation speeds the response times of transcription networks. J. Mol. Biol. 2002, 323, 785–793.
[43] Swain, P. S., Elowitz, M. B., Siggia, E. D., Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. USA 2002, 99, 12795–12800.
[44] Garcia-Ojalvo, J., Elowitz, M. B., Strogatz, S. H., Modeling a synthetic multicellular clock: repressilators coupled by quorum sensing. Proc. Natl. Acad. Sci. USA 2004, 101, 10955–10960.
[45] Atkinson, M. R., Savageau, M. A., Myers, J. T. et al., Development of genetic circuitry exhibiting toggle switch or oscillatory behavior in Escherichia coli. Cell 2003, 113, 597–607.
[46] Weiss, R., Basu, S., Kalmbach, A., Hooshangi, S. et al., Genetic circuit building blocks for cellular computation, communications, and signal processing. Natural Comput. 2003, 2, 47–84.
[47] Yokobayashi, Y., Collins, C. H., Leadbetter, J. R., Weiss, R. et al., Evolutionary design of genetic circuits and cell-cell communications. Adv. Complex Syst. 2003, 6, 37–45.
[48] Levskaya, A., Chevalier, A. A., Tabor, J. J., Simpson, Z. B. et al., Synthetic biology: engineering Escherichia coli to see light. Nature 2005, 438, 441–442.
[49] Kobayashi, H., Kaern, M., Araki, M., Chung, K. et al., Programmable cells: interfacing natural and engineered gene networks. Proc. Natl. Acad. Sci. USA 2004, 101, 8414–8419.
[50] Miller, M. B., Bassler, B. L., Quorum sensing in bacteria. Annu. Rev. Microbiol. 2001, 55, 165–199.
[51] Basu, S., Mehreja, R., Thiberge, S., Chen, M. T. et al., Spatiotemporal control of gene expression with pulse-generating networks. Proc. Natl. Acad. Sci. USA 2004, 101, 6355–6360.
[52] Wang, L., Zhang, Z., Brock, A., Schultz, P. G., Addition of the keto functional group to the genetic code of Escherichia coli. Proc. Natl. Acad. Sci. USA 2003, 100, 56–61.
[53] Noren, C. J., Anthony-Cahill, S. J., Suich, D. J., Noren, K. A. et al., In vitro suppression of an amber mutation by a chemically aminoacylated transfer RNA prepared by runoff transcription. Nucleic Acids Res. 1990, 18, 83–88.
[54] Robertson, S. A., Noren, C. J., Anthony-Cahill, S. J., Griffith, M. C. et al., The use of 5′-phospho-2′-deoxyribocytidylylriboadenosine as a facile route to chemical aminoacylation of transfer RNA. Nucleic Acids Res. 1989, 17, 9649–9660.
[55] Mehl, R. A., Anderson, J. C., Santoro, S. W. et al., Generation of a bacterium with a 21 amino acid genetic code. J. Am. Chem. Soc. 2003, 125, 935–939.
[56] Chin, J. W., Cropp, T. A., Anderson, J. C., Mukherji, M. et al., An expanded eukaryotic genetic code. Science 2003, 301, 964–967.
[57] Hohsaka, T., Ashizuka, Y., Taira, H., Murakami, H. et al., Incorporation of nonnatural amino acids into proteins by using various four-base codons in an Escherichia coli in vitro translation system. Biochemistry 2001, 40, 11060–11064.
[58] Hohsaka, T., Ashizuka, Y., Murakami, H., Sisido, M., Five-base codons for incorporation of nonnatural amino acids into proteins. Nucleic Acids Res. 2001, 29, 3646–3651.
[59] Wang, L., Schultz, P. G., Expanding the genetic code. Chem. Commun. (Camb.) 2002, 1–11.
[60] Boyer, P. D., The ATP synthase – a splendid molecular machine. Annu. Rev. Biochem. 1997, 66, 717–749.
[61] Yasuda, R., Noji, H., Kinosita, K., Yoshida, M., F1-ATPase is a highly efficient molecular motor that rotates with discrete 120° steps. Cell 1998, 93, 1117–1124.
[62] Soong, R. K., Bachand, G. D., Neves, H. P., Olkhovets, A. G. et al., Powering an inorganic nanodevice with a biomolecular motor. Science 2000, 290, 1555–1558.
[63] Soong, R. E., Neves, H. P., Schmidt, J. J., Bachand, G. D. et al., Engineering issues in the fabrication of a hybrid nano-propeller system powered by F1-ATPase. Biomed. Microdevices 2001, 3, 71–73.
[64] Liu, H., Schmidt, J. J., Bachand, G. G., Rizk, S. S. et al., Control of a biomolecular motor-powered nanodevice with an engineered chemical switch. Nat. Mater. 2003, 1, 173–177.
[65] Block, S. M., Real engines of creation. Nature 1997, 386, 217–219.
[66] Perkel, J. M., Investigating molecular motors step by step: recent discoveries begin to answer how dyneins, kinesins, and myosins actually work. Scientist 2004, 18, 19–31.
[67] Schmidt, J. J., Montemagno, C. D., Bionanomechanical systems. Annu. Rev. Mater. Res. 2004, 34, 315–337.
[68] Abrahams, J. P., Leslie, A. G. W., Lutter, R., Walker, J. E., Structure at 2.8 Å resolution of F1-ATPase from bovine heart mitochondria. Nature 1994, 370, 621–628.
[69] Noji, H., Yasuda, R., Yoshida, M., Kinosita Jr., K., Direct observation of the rotation of F1-ATPase. Nature 1997, 386, 299–302.
[70] Duncan, T., Bulygin, V., Zhou, Y., Hutcheon, M. L. et al., Rotation of subunits during catalysis by Escherichia coli F1-ATPase. Proc. Natl. Acad. Sci. USA 1995, 92, 10964–10968.
[71] Vale, R. D., Reese, T. S., Sheetz, M. P., Identification of a novel force-generating protein, kinesin, involved in microtubule-based motility. Cell 1985, 42, 39–50.
[72] Böhm, K. J., Stracke, R., Mühlig, P., Unger, E., Motor protein-driven unidirectional transport of micrometer-sized cargoes across isopolar microtubule arrays. Nanotechnology 2001, 12, 238–244.
[73] Kuo, S. C., Gelles, J., Steuer, E., Sheetz, M. P., A model for kinesin movement from nanometer-level movements of kinesin and cytoplasmic dynein and force measurements. J. Cell Sci. Suppl. 1991, 14, 135–138.
[74] Dennis, J. R., Howard, J., Vogel, V., Molecular shuttles: directed motion of microtubules across nanoscale kinesin tracks. Nanotechnology 1999, 10, 232–236.
[75] Schief, W. R., Clark, R. H., Crevenna, A. H., Howard, J., Inhibition of kinesin motility by ADP and phosphate supports a hand-over-hand mechanism. Proc. Natl. Acad. Sci. USA 2004, 101, 1183–1188.
[76] Montemagno, C. D., Nanomachines: A roadmap for realizing the vision. J. Nanoparticle Res. 2001, 3, 1–3.
[77] Koonin, E. V., How many genes can make a cell: the minimal-gene-set concept. Annu. Rev. Genomics Hum. Genet. 2000, 1, 99–116.
[78] Luisi, P. L., Oberholzer, T., Lazcano, A., The notion of a DNA minimal cell: a general discourse and some guidelines for an experimental approach. Helvet. Chim. Acta 2002, 85, 1759–1777.
[79] Hutchison, C. A., III, Peterson, S. N., Gill, S. R., Cline, R. T., Global transposon mutagenesis and a minimal mycoplasma genome. Science 1999, 286, 2165–2169.
[80] Ji, Y., Zhang, B., von Horn, S. F., Warren, P. et al., Identification of critical staphylococcal genes using conditional phenotypes generated by antisense RNA. Science 2001, 293, 2266–2269.
[81] Chandonia, J.-M., Konerding, D. E., Allen, D. G., Computational structural genomics of a complete minimal organism. Genome Informatics 2002, 13, 390–391.
[82] Zhang, R., Ou, H.-Y., Zhang, C. T., DEG: a database of essential genes. Nucleic Acids Res. 2004, 32, D271–272.
[83] Kobayashi, K., Ehrlich, S. D., Albertini, A., Amati, G. et al., Essential Bacillus subtilis genes. Proc. Natl. Acad. Sci. USA 2003, 100, 4678–4683.
[84] Mushegian, A. R., Koonin, E. V., A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc. Natl. Acad. Sci. USA 1996, 93, 10268–10273.
[85] Arigoni, F., Talabot, F., Peitsch, M., Edgerton, M. D., Meldrum, E., A genome-based approach for the identification of essential bacterial genes. Nat. Biotechnol. 1998, 16, 851–856.
[86] Lazcano, A., The never-ending story. Am. Scientist 2003, 91, no. 5.
[87] Deamer, D., A giant step towards artificial life? Trends Biotechnol. 2005, 23, 336–338.
[88] Deamer, D., Is this life? Scientist 2006, 20, 30.
[89] Rasmussen, S., Chen, L., Deamer, D., Krakauer, D. C. et al., Transitions from nonliving to living matter. Science 2004, 303, 963–965.
[90] Smith, H. O., Hutchison, C. A. III, Pfannkoch, C., Venter, J. C., Generating a synthetic genome by whole genome assembly: φX174 bacteriophage from synthetic oligonucleotides. Proc. Natl. Acad. Sci. USA 2003, 100, 15440–15445.
[91] Cello, J., Paul, A. V., Wimmer, E., Chemical synthesis of poliovirus cDNA: generation of infectious virus in the absence of natural template. Science 2002, 297, 1016–1018.
[92] Brent, R., A partnership between biology and engineering. Nat. Biotechnol. 2004, 22, 1211–1214.
[93] Rinaldi, A., A new code for life: Expansion of the genetic code and artificial assembly of functional genomes are raising important questions about ethics and safety. EMBO Rep. 2004, 5, 336–339.
[94] Cho, M. K., Magnus, D., Caplan, A. L., McGee, D. et al., Ethical considerations in synthesizing a minimal genome. Science 1999, 286, 2087–2090.
[95] Block, S., Donoho, D., Hwa, T., Joyce, G. et al., DNA barcodes and watermarks, in: Jason Report JSR-03-305, The Mitre Corporation, McLean 2004.