Proteins
The reactive or functional groups of the amino acid molecule are the basic amino group (–NH2) and the
acidic carboxyl group (–COOH), both bonded to the same carbon atom (see Figure B1.2.1, page 206). It is
these groups that participate in the condensation reactions that form peptide bonds, linking amino acid
residues in polypeptides and proteins. The remainder of the amino acid molecule, the side chain or
R-group, varies considerably between different amino acids. The variety of amino acids is shown in
Figure B1.2.6. While amino acids share the same basic structure, they differ in character
because of the different R-groups they carry. The categories of amino acids found in cell proteins are:
• acidic amino acids: having additional carboxyl groups (e.g. aspartic acid)
• basic amino acids: having additional amino groups (e.g. lysine)
• amino acids with hydrophilic properties (water soluble): have polar or charged R-groups (e.g. serine)
• amino acids with hydrophobic properties (insoluble): have non-polar R-groups (e.g. alanine).
Joining different amino acids in different combinations produces proteins with very different
properties. The properties of proteins depend on their amino acid side chains (Figure B1.2.6). Some
proteins are hydrophobic or non-polar (the electrical charge is evenly distributed across the
molecule); some are hydrophilic or polar (positive and negative poles are formed in the molecule;
remember, water is a polar molecule). Some amino acids are basic (with an amino group in the side
chain) and some are acidic (with a carboxyl group in the side chain).
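The four categories above can be sketched as a simple lookup. This is a minimal illustration in Python that uses only the four example amino acids named in the text; the dictionary and function names are hypothetical, and the categories are not mutually exclusive in practice (charged R-groups, for example, are also hydrophilic).

```python
# Hypothetical lookup built only from the examples named in the text.
R_GROUP_CATEGORY = {
    "aspartic acid": "acidic",    # additional carboxyl group
    "lysine": "basic",            # additional amino group
    "serine": "hydrophilic",      # polar R-group
    "alanine": "hydrophobic",     # non-polar R-group
}


def categorize(amino_acid: str) -> str:
    """Return the R-group category for a named amino acid, or 'unknown'."""
    return R_GROUP_CATEGORY.get(amino_acid.strip().lower(), "unknown")


def group_by_category(names):
    """Group a list of amino acid names by their R-group category."""
    groups = {}
    for name in names:
        groups.setdefault(categorize(name), []).append(name)
    return groups
```

Calling `group_by_category(["alanine", "serine", "glycine"])` would place glycine under `"unknown"`, because the sketch deliberately covers only the examples given in the text.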
Water molecules force hydrophobic R-groups together, which minimizes the disruption these groups cause
to the surrounding water molecules. Hydrophobic amino acids have little or no polarity in their side
chains, so they cannot interact with highly polar water molecules, which are themselves held together
in the liquid state by hydrogen bonds. Hydrophobic groups held together in this way
are sometimes said to be held together by ‘hydrophobic forces’, but the apparent attraction is actually
caused by their repulsion from water molecules. The repulsion of hydrophobic groups from water molecules
is also important for the assembly of lipid vesicles and membranes, and for protein folding. Polar amino
acid side chains tend to be displayed on the outside of the folded protein, where they can interact with
water; the non-polar amino acid side chains are buried on the inside.
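One common way to put numbers on this tendency is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is not part of this text and is brought in here as an assumption; positive values indicate more hydrophobic side chains. The function names are hypothetical, and a simple average like this only hints at which stretches of a chain are more likely to be buried. It does not predict how a protein folds.

```python
# Kyte-Doolittle hydropathy values (assumption: not taken from this text).
# Keys are one-letter amino acid codes; higher values = more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a sequence written in one-letter codes."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[r] for r in sequence.upper()) / len(sequence)


def window_hydropathy(sequence: str, window: int = 5):
    """Mean hydropathy for each sliding window along the sequence."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
```

A stretch with a clearly positive window average is a candidate for the hydrophobic core or a membrane-spanning region, while a negative average suggests a surface-exposed stretch.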
Protein structure
The sequence of amino acids determines the precise position of each amino acid within a structure and
the three-dimensional shape of the protein. Proteins therefore have precise, predictable and repeatable
structures, despite their complexity. There are four levels of protein structure, each of significance in
biology (Figure B1.2.7).
• The primary structure of a protein is the linear sequence of the amino acids in the molecule. This
determines the shape of the protein.
• The secondary structure occurs when the protein chain interacts with itself through hydrogen bonds to
form regions that are helix-shaped or folded.
• The tertiary structure is the overall three-dimensional shape of a protein, which is formed by
hydrogen bonding and other intermolecular forces.
• Some proteins have a quaternary structure, where two or more polypeptides are combined (via
intermolecular forces) to form a larger protein. An example is haemoglobin, present in mammalian blood,
which consists of four polypeptide chains held together to form a single larger protein.
The tertiary and secondary structures of proteins are held together by weak intermolecular forces.
These are broken at high pH (very alkaline) or low pH (very acidic), or at high or low temperatures.
Under these conditions the protein loses its shape (conformation) and is denatured (see page 210).
Proteins vary greatly in sequence, structure and function and are therefore challenging to study.
Proteins behave differently in different conditions: some proteins or protein complexes do not fold or
assemble without other assembly factors; some proteins cannot function without co-factors present.
Modification of proteins by the cell, e.g., phosphorylation, can greatly affect their structures and
functions. We will now look at the four levels of protein structure in more detail.
The primary structure of a protein is the sequence of amino acid residues joined by peptide linkages (see
Figure B1.2.7 above). Proteins differ in the variety, number and order of their constituent amino acids.
We have seen how in the living cell the sequence of amino acids in the polypeptide chain is controlled
by the coded instructions stored in the DNA, mediated via mRNA. Changing just one amino acid in the
sequence of a protein can alter its properties, sometimes quite drastically. This sort of change arises by mutation.
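The effect of a single base change on the primary structure can be sketched in a few lines of Python. The short DNA sequence, the partial codon table (restricted to the codons used here) and the function names are all illustrative assumptions, not taken from this text.

```python
# Partial codon table (mRNA codon -> amino acid); only the codons used here.
CODON_TABLE = {
    "GUG": "Val", "CAU": "His", "CUG": "Leu",
    "ACU": "Thr", "CCU": "Pro", "GAG": "Glu",
}


def transcribe(coding_dna: str) -> str:
    """Convert a DNA coding-strand sequence into the matching mRNA."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str):
    """Read the mRNA three bases at a time and look up each codon."""
    usable = len(mrna) - len(mrna) % 3
    return [CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3)]


def substitute(dna: str, position: int, base: str) -> str:
    """Return a copy of dna with one base replaced (0-based position)."""
    return dna[:position] + base + dna[position + 1:]


# Hypothetical coding sequence of seven codons.
original_dna = "GTGCATCTGACTCCTGAGGAG"
# Change the middle base of the sixth codon: GAG -> GTG.
mutant_dna = substitute(original_dna, 16, "T")

original_protein = translate(transcribe(original_dna))
mutant_protein = translate(transcribe(mutant_dna))
```

Here `original_protein` ends in Glu, Glu, whereas `mutant_protein` has Val in the sixth position: one base substitution replaces an acidic residue with a non-polar one.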
The secondary structure of a protein develops when parts of the polypeptide chain take up a particular
shape immediately after formation at the ribosome. Parts of the chain become folded or twisted, or
both, in various ways. The most common shapes are formed either by coiling to produce an α-helix or
folding into β-sheets. These shapes are stable, held in place by hydrogen bonds.
Super-secondary structure
Super-secondary structure describes the different patterns that α-helices or β-sheets commonly adopt in
proteins. For example, four α-helices often form a coiled-coil arrangement known as a four-helix bundle
(Figure B1.2.10a). Pairs of adjacent helices are often additionally stabilized by salt bridges (ionic
bonding) between charged amino acids. Two β-sheets may stack on top of each other in a ‘beta
sandwich’ such as in the immunoglobulin fold (Figure B1.2.10b).
Super-secondary structure often corresponds to a particular domain, an independent structural unit
found in proteins. Protein domains are defined as compact, folded structures within a polypeptide chain
and they usually have a specific function, such as DNA-binding capability or the ability to induce
dimerisation between two proteins containing the domain. A protein dimer is a macromolecular
complex formed by two protein monomers.
■ Tertiary structure
The tertiary structure of a protein is the precise, compact structure, unique to that protein, which arises
when the molecule is further folded and held in a particular complex shape. This shape is stabilized by
four different types of bonding, which are established by interactions between R-groups and other
adjacent parts of the chain (Figure B1.2.11). Amine (–NH2) and carboxyl (–COOH) groups in R-groups
can become positively or negatively charged by binding or dissociation of hydrogen ions and can then
participate in ionic bonding.
Effect of polar and non-polar amino acids on tertiary structure of proteins
Amino acids with polar R-groups have hydrophilic properties. When these amino acids are built into protein in prominent
positions, they may influence the properties and functioning of the proteins in cells. Similarly, amino
acids with non-polar R-groups have hydrophobic properties. Hydrophobic amino acids are clustered in
the core of globular proteins that are soluble in water. Integral proteins have regions with hydrophobic
amino acids, helping them to embed in membranes. Integral membrane proteins are permanently
embedded within the plasma membrane. The portions of the proteins found inside the membrane are
hydrophobic, while those exposed to the cytoplasm or extracellular fluid tend to be hydrophilic. The
fatty acid tails that form the interior of the membrane are non-polar and do not repel the hydrophobic
(non-polar) parts of the integral proteins. Examples of these outcomes are illustrated in Figures B1.2.13
(for cell membrane proteins) and B1.2.14 (for an enzyme that occurs in the cytoplasm).
The quaternary structure of proteins arises when two or more polypeptide chains or proteins are held
together, forming a complex, biologically active molecule.
Conjugated proteins
Haemoglobin is known as a conjugated protein, which is a combination of a protein and a non-protein
component (the prosthetic group).
Conjugated proteins have other chemical groups attached, including carbohydrates (such as in the
glycocalyx of cell membranes – see page 195), lipids, bound metal ions and other organic groups.
Haemoglobin consists of four polypeptide chains (two α-chains and two β-chains). Each polypeptide
chain in the haemoglobin molecule is held around a non-protein haem group (the prosthetic group), in
which an atom of iron occurs (Figure B1.2.15).
Non-conjugated proteins
Proteins that are not associated with prosthetic groups are known as non-conjugated proteins. Insulin
and collagen are examples of non-conjugated proteins.
Insulin
In 1951, English biochemist Frederick
Sanger made a major discovery when he determined the first amino acid sequence of a protein, insulin.
This was important because it showed that a protein has a precisely defined amino acid sequence.
Before this finding, it was not known that one amino acid sequence characterises one protein. Insulin is
a hormone involved in glucose regulation (Chapter D3.3, page 761). It is an example of a globular protein
(see page 221, below). Insulin is composed of two chains, an A chain and a B chain (see Figure B1.2.18).
The A and B chains are linked together by two disulfide bonds, and an additional disulfide bond is
formed within the A chain. Figure B1.2.19 shows the formation of a disulfide bridge.
The cysteine (Cys) residues in the A chain at positions 7 and 20 form inter-chain disulfide bridges to
the insulin B chain (a short polypeptide of 30 amino acids). Figure B1.2.20 shows the combined
quaternary structure of an insulin protein.
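The disulfide pattern described above can be checked against the chain sequences. The sketch below assumes the commonly cited human insulin A-chain (21 residues) and B-chain (30 residues) sequences. It also assumes B-chain partner positions 7 and 19 and an intra-chain bridge between A6 and A11; none of these details are given in this text. All names are hypothetical.

```python
# Assumed human insulin sequences in one-letter codes (not from this text).
A_CHAIN = "GIVEQCCTSICSLYQLENYCN"
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"
CHAINS = {"A": A_CHAIN, "B": B_CHAIN}

# Each bridge joins two (chain, 1-based position) cysteines.
DISULFIDE_BRIDGES = [
    (("A", 7), ("B", 7)),    # inter-chain (B position is an assumption)
    (("A", 20), ("B", 19)),  # inter-chain (B position is an assumption)
    (("A", 6), ("A", 11)),   # intra-chain within A (positions assumed)
]


def residue(chain: str, position: int) -> str:
    """Return the residue at a 1-based position in the named chain."""
    return CHAINS[chain][position - 1]


def is_valid_bridge(bridge) -> bool:
    """A disulfide bridge is only possible between two cysteines (C)."""
    return all(residue(chain, pos) == "C" for chain, pos in bridge)


def classify_bridge(bridge) -> str:
    """Label a bridge as linking two chains or lying within one chain."""
    (chain_1, _), (chain_2, _) = bridge
    return "inter-chain" if chain_1 != chain_2 else "intra-chain"
```

Running `is_valid_bridge` over `DISULFIDE_BRIDGES` confirms that every listed position is a cysteine, and `classify_bridge` gives two inter-chain bridges and one intra-chain bridge, matching the description in the text.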
Collagen
Collagen is the most abundant protein in animals. It is the substance that gives structure and essentially
holds the body together. It is found in bones, muscles, skin and tendons, and is an example of a fibrous
protein (see page 221). The quaternary structure of collagen consists of three left-handed helices
twisted into a right-handed coil (Figure B1.2.21).
Each polypeptide chain has a repeated triplet sequence of Gly-X-Y, where X and Y can be any amino acid
but are frequently proline (in the X position) and hydroxyproline (in the Y position). Every third amino
acid is a glycine residue, which allows each helical chain to make a turn every three amino acids and
intertwine around two other chains to form a compact triple helix, as only glycine is small enough to fit
into the centre. The three helical polypeptide chains are held together by interchain hydrogen bonds,
forming tropocollagen. Many triple helices lie parallel in a staggered pattern to form fibrils, with covalent
bonds between neighbouring triple helices. The fibrils unite to form fibres.
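The Gly-X-Y rule lends itself to a simple check: glycine must occupy the first position of every triplet. The following is a minimal sketch in Python; the function names and the example sequence are hypothetical, and `Hyp` is used here as shorthand for hydroxyproline.

```python
def triplets(residues):
    """Split a list of residue names into consecutive groups of three."""
    return [tuple(residues[i:i + 3]) for i in range(0, len(residues), 3)]


def is_gly_x_y_repeat(residues) -> bool:
    """True if the chain is made of complete triplets that each start with Gly."""
    if len(residues) == 0 or len(residues) % 3 != 0:
        return False
    return all(triplet[0] == "Gly" for triplet in triplets(residues))


# Hypothetical fragment: X and Y vary, but every third residue is glycine.
example = ["Gly", "Pro", "Hyp", "Gly", "Pro", "Ala", "Gly", "Ala", "Hyp"]
```

`is_gly_x_y_repeat(example)` returns `True`, whereas swapping any of the glycines for a larger residue makes it return `False`, mirroring the point that only glycine fits into the centre of the triple helix.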
■ Fibrous proteins
Some proteins take up a tertiary structure that is a long, much-coiled chain; these are called fibrous
proteins. They have long, narrow shapes. Examples of fibrous proteins are collagen, a component of
bone and tendons (Figure B1.2.23), and keratin, found in hair, horn and nails. Fibrous proteins are often
insoluble. The collagen molecule consists of three polypeptide chains, each in the shape of a helix (see
Figure B1.2.21, previous page). The chains are wound together as a triple helix, forming a stiff cable
strengthened by numerous hydrogen bonds. The ends of individual collagen molecules are staggered so
there are no weak points in collagen fibres, giving the whole structure high tensile strength. This makes
the protein well suited to provide structural support in skin, tendons and cartilage.
■ Globular proteins
Other proteins take up a more spherical shape and are known as globular proteins. They are mostly
highly soluble in water. Examples include enzymes, such as lysozyme and catalase, and hormones, such
as insulin (Figure B1.2.24 and page 218). Insulin is a very small protein, allowing it to move quickly
through the blood. Its shape is recognized by specific receptors on its target cell surfaces. It is difficult to
make a small protein that will fold into a stable structure. This problem is solved by synthesizing a longer
protein chain, which folds into the proper structure. The extra pieces are removed (Figure B1.2.24),
leaving two small chains in the mature form. The structure is further stabilized by three disulfide
bridges.