Amino Acids and Proteins
Proteins, from the Greek proteios, meaning first, are a class of organic compounds which are present in and vital to every living cell. In the form of skin, hair, callus, cartilage, muscles, tendons and ligaments, proteins hold together, protect, and provide structure to the body of a multicelled organism. In the form of enzymes, hormones, antibodies, and globulins, they catalyze, regulate, and protect the body chemistry. In the form of hemoglobin, myoglobin and various lipoproteins, they effect the transport of oxygen and other substances within an organism.
Proteins are generally regarded as beneficial, and are a necessary part of the diet of all animals. Humans can become seriously ill if they do not eat enough suitable protein, the disease kwashiorkor being an extreme form of protein deficiency. Protein-based antibiotics and vaccines help to fight disease, and we warm and protect our bodies with clothing and shoes that are often protein in nature (e.g. wool, silk).
The deadly properties of protein toxins and venoms are less widely appreciated. Botulinum toxin A, from Clostridium botulinum, is regarded as the most powerful poison known. Based on toxicology studies, a teaspoon of this toxin would be sufficient to kill a fifth of the world's population. The toxins produced by tetanus and diphtheria microorganisms are nearly as poisonous. A list of highly toxic proteins or peptides would also include the venoms of many snakes, and ricin, the toxic protein found in castor beans.
Despite the variety of their physiological function and differences in physical properties (silk is a flexible fiber, horn a tough rigid solid, and the enzyme pepsin forms water-soluble crystals), proteins are sufficiently similar in molecular structure to warrant treating them as a single chemical family. When compared with carbohydrates and lipids, the proteins are obviously different in fundamental composition. The lipids are largely hydrocarbon in nature, generally being 75 to 85% carbon. Carbohydrates are roughly 50% oxygen, and like the lipids, usually have less than 5% nitrogen (often none at all). Proteins and peptides, on the other hand, are composed of 15 to 25% nitrogen and about an equal amount of oxygen. The distinction between proteins and peptides is their size. Peptides are in a sense small proteins, having molecular weights less than 10,000.
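The composition figures quoted above can be checked with a short calculation. The following is a minimal sketch that computes mass percentages from a molecular formula. The example compounds (glycine and alanine residues as they occur in a peptide chain, glucose, and the fat tristearin) are chosen here for illustration and are not taken from the text. The atomic masses are rounded standard values.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def mass_percent(composition):
    """Return {element: mass percent} for a dict of element counts."""
    total = sum(ATOMIC_MASS[el] * n for el, n in composition.items())
    return {el: 100.0 * ATOMIC_MASS[el] * n / total
            for el, n in composition.items()}

# Illustrative formulas (assumptions of this sketch, not data from the text).
samples = {
    "glycine residue, C2H3NO": {"C": 2, "H": 3, "N": 1, "O": 1},
    "alanine residue, C3H5NO": {"C": 3, "H": 5, "N": 1, "O": 1},
    "glucose, C6H12O6": {"C": 6, "H": 12, "O": 6},
    "tristearin, C57H110O6": {"C": 57, "H": 110, "O": 6},
}

for name, composition in samples.items():
    percents = mass_percent(composition)
    print(name, {el: round(p, 1) for el, p in percents.items()})
```

For these examples the two amino acid residues fall in the 15 to 25% nitrogen range, glucose is a little over 50% oxygen, and tristearin is a little over 75% carbon, in line with the ranges given above.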
Hydrolysis of proteins by boiling aqueous acid or base yields an assortment of small molecules identified as α-aminocarboxylic acids. More than twenty such components have been isolated, and the most common of these are listed in the following table. Those amino acids having green colored names are essential diet components, since they are not synthesized by human metabolic processes. The best food source of these nutrients is protein, but it is important to recognize that not all proteins have equal nutritional value. For example, peanuts have a higher weight content of protein than fish or eggs, but the proportion of essential amino acids in peanut protein is only a third of that from the two other sources. For reasons that will become evident when discussing the structures of proteins and peptides, each amino acid is assigned a one or three letter abbreviation.
Some common features of these amino acids should be noted. With the exception of proline, they are all 1º-amines; and with the exception of glycine, they are all chiral. The configurations of the chiral amino acids are the same when written as a Fischer projection formula, as in the drawing on the right, and this was defined as the L-configuration by Fischer. The R-substituent in this structure is the remaining structural component that varies from one amino acid to another, and in proline R is a three-carbon chain that joins the nitrogen to the alpha-carbon in a five-membered ring. Applying the Cahn-Ingold-Prelog notation, all these natural chiral amino acids, with the exception of cysteine, have an S-configuration.
For the first seven compounds in the left column the R-substituent is a hydrocarbon. The last three entries in the left column have hydroxyl functional groups, and the first two amino acids in the right column incorporate thiol and sulfide groups respectively. Lysine and arginine have basic amine functions in their side-chains; histidine and tryptophan have less basic nitrogen heterocyclic rings as substituents. Finally, carboxylic acid side-chains are substituents on aspartic and glutamic acid, and the last two compounds in the right column are their corresponding amides.
The formulas for the amino acids written above are simple covalent bond representations based upon previous understanding of mono-functional analogs. The formulas are in fact incorrect. This is evident from a comparison of the physical properties listed in the following table. All four compounds in the table are roughly the same size, and all have moderate to excellent water solubility. The first two are simple carboxylic acids, and the third is an amino alcohol. All three compounds are soluble in organic solvents (e.g. ether) and have relatively low melting points. The carboxylic acids have pKa's near 4.5, and the conjugate acid of the amine has a pKa of 10. The simple amino acid alanine is the last entry. By contrast, it is very high melting (with decomposition), insoluble in organic solvents, and roughly a hundred thousand times weaker as an acid than ordinary carboxylic acids (a pKa near 9.7, compared with about 4.5).
These differences all point to internal salt formation by a proton transfer from the acidic carboxyl function to the basic amino group. The resulting ammonium carboxylate structure, commonly referred to as a zwitterion, is also supported by the spectroscopic characteristics of alanine.
As expected from its ionic character, the alanine zwitterion is high melting, insoluble in nonpolar solvents and has the acid strength of a 1º-ammonium ion. Other amino acids also exist in a favored, overall neutral zwitterionic form. Note that in lysine the amine function farthest from the carboxyl group is more basic than the alpha-amine. Consequently, the positively charged ammonium moiety formed at the chain terminus is attracted to the negative carboxylate, resulting in a coiled conformation.
Since amino acids, as well as peptides and proteins, incorporate both acidic and basic functional groups, the predominant molecular species present in an aqueous solution will depend on the pH of the solution. In order to determine the nature of the molecular and ionic species that are present in aqueous solutions at different pH's, we make use of the Henderson-Hasselbalch equation: pH = pKa + log([A(-)]/[HA]). Here, the pKa represents the acidity of a specific conjugate acid function (HA). When the pH of the solution equals pKa, the concentrations of HA and A(-) must be equal (log 1 = 0).
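The relationship between pH, pKa and the proportions of HA and A(-) can be illustrated numerically. The sketch below rearranges the equation to give the fraction of an acid present as its conjugate base. The function name is our own, and the pKa of 4.5 is simply the typical carboxylic acid value mentioned earlier.

```python
def fraction_deprotonated(pH, pKa):
    """Fraction of an acid HA present as A(-), from pH = pKa + log10([A-]/[HA])."""
    ratio = 10.0 ** (pH - pKa)  # [A-]/[HA]
    return ratio / (1.0 + ratio)

pKa = 4.5  # typical carboxylic acid value, used here only as an example
for pH in (2.5, 3.5, 4.5, 5.5, 6.5):
    percent = 100.0 * fraction_deprotonated(pH, pKa)
    print(f"pH {pH}: {percent:.1f}% present as A(-)")
```

Each unit of pH away from the pKa changes the [A(-)]/[HA] ratio by a factor of ten, so an acid is about 9% ionized one unit below its pKa and about 91% ionized one unit above it.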
The titration curve for alanine, shown below, demonstrates this relationship. At a pH lower than 2, both the carboxylate and amine functions are protonated, so the alanine molecule has a net positive charge. At a pH greater than 10, the amine exists as a neutral base and the carboxyl as its conjugate base, so the alanine molecule has a net negative charge. At intermediate pH's the zwitterion concentration increases, and at a characteristic pH, called the isoelectric point (pI), the negatively and positively charged molecular species are present in equal concentration. This behavior is general for simple (difunctional) amino acids. Starting from a fully protonated state, the pKa's of the acidic functions range from 1.8 to 2.4 for -CO2H, and 8.8 to 9.7 for -NH3(+). The isoelectric points range from 5.5 to 6.2. Titration curves show the neutralization of these acids by added base, and the change in pH during the titration.
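The change in net charge with pH can be sketched by applying the Henderson-Hasselbalch relationship to each function separately. The example below is a simplified model that treats the two groups of alanine as independent and uses pKa values of 2.34 (carboxyl) and 9.69 (ammonium). The function names and the bisection search are our own choices for illustration.

```python
def fraction_deprotonated(pH, pKa):
    """Fraction of an acidic group that has lost its proton at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))

def alanine_net_charge(pH, pKa_carboxyl=2.34, pKa_ammonium=9.69):
    """Average net charge: +1 per protonated ammonium, -1 per carboxylate."""
    positive = 1.0 - fraction_deprotonated(pH, pKa_ammonium)
    negative = fraction_deprotonated(pH, pKa_carboxyl)
    return positive - negative

def find_isoelectric_point(charge_fn, low=0.0, high=14.0, tolerance=1e-6):
    """Bisection search for the pH of zero net charge.

    Relies on the net charge decreasing steadily as the pH rises.
    """
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if charge_fn(middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0

for pH in (1.0, 2.34, 6.0, 9.69, 12.0):
    print(f"pH {pH}: net charge {alanine_net_charge(pH):+.3f}")
print(f"pI is approximately {find_isoelectric_point(alanine_net_charge):.2f}")
```

In this model the net charge is close to +1 at very low pH, close to -1 at very high pH, and passes through zero midway between the two pKa values.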
The distribution of charged species in a sample can be shown experimentally by observing the movement of solute molecules in an electric field, using the technique of electrophoresis. For such experiments an ionic buffer solution is incorporated in a solid matrix layer, composed of paper or a crosslinked gelatin-like substance. A small amount of the amino acid, peptide or protein sample is placed near the center of the matrix strip and an electric potential is applied at the ends of the strip, as shown in the following diagram. The solid structure of the matrix retards the diffusion of the solute molecules, which will remain where they are inserted, unless acted upon by the electrostatic potential. In the example shown here, four different amino acids are examined simultaneously in a pH 6.00 buffered medium.
At pH 6.00 alanine and isoleucine exist on average as neutral zwitterionic molecules, and are not influenced by the electric field. Arginine is a basic amino acid. Both base functions exist as "onium" conjugate acids in the pH 6.00 matrix. The solute molecules of arginine therefore carry an excess positive charge, and they move toward the cathode. The two carboxyl functions in aspartic acid are both ionized at pH 6.00, and the negatively charged solute molecules move toward the anode in the electric field.
It should be clear that the result of this experiment is critically dependent on the pH of the matrix buffer. If we were to repeat the electrophoresis of these compounds at a pH of 3.00, the aspartic acid would remain at its point of origin, and the other amino acids would move toward the cathode. Ignoring differences in molecular size and shape, the arginine would move considerably faster than the alanine and isoleucine because its solute molecules carry an additional full positive charge on the side chain.
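The direction of migration at a given buffer pH can be estimated from the average net charge of each solute. The following sketch assumes independent ionizable groups. The pKa values for alanine (2.34, 9.69), the two carboxyl groups of aspartic acid (2.1, 3.9) and the two basic functions of arginine (9.0, 12.5) are those used elsewhere in this section. The remaining values (isoleucine, the arginine carboxyl and the aspartic acid ammonium group) are approximate literature values assumed here, and the 0.05 charge threshold for "stays at the origin" is an arbitrary choice.

```python
def net_charge(pH, groups):
    """Average net charge of a solute at a given pH.

    groups is a list of (pKa, protonated_charge) pairs: protonated_charge is 0
    for acids such as -CO2H and +1 for acids such as -NH3(+).
    """
    total = 0.0
    for pKa, protonated_charge in groups:
        deprotonated = 1.0 / (1.0 + 10.0 ** (pKa - pH))
        total += protonated_charge - deprotonated
    return total

def migration(pH, groups, threshold=0.05):
    """Cations move to the cathode, anions to the anode."""
    charge = net_charge(pH, groups)
    if charge > threshold:
        return "cathode"
    if charge < -threshold:
        return "anode"
    return "origin"

AMINO_ACIDS = {
    "alanine": [(2.34, 0), (9.69, 1)],
    "isoleucine": [(2.36, 0), (9.60, 1)],             # assumed values
    "arginine": [(2.2, 0), (9.0, 1), (12.5, 1)],      # 2.2 is assumed
    "aspartic acid": [(2.1, 0), (3.9, 0), (9.8, 1)],  # 9.8 is assumed
}

for pH in (6.0, 3.0):
    for name, groups in AMINO_ACIDS.items():
        charge = net_charge(pH, groups)
        print(f"pH {pH}: {name} {charge:+.2f} -> {migration(pH, groups)}")
```

With these values, alanine and isoleucine carry an average charge of roughly +0.2 at pH 3, because the alpha-carboxyl group is still largely ionized at that pH, while arginine carries roughly +1.1.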
As noted earlier, the titration curves of simple amino acids display two inflection points, one due to the strongly acidic carboxyl group (pKa1 = 1.8 to 2.4), and the other for the less acidic ammonium function (pKa2 = 8.8 to 9.7). For the 2º-amino acid proline, pKa2 is 10.6, reflecting the greater basicity of 2º-amines. Some amino acids have additional acidic or basic functions in their side chains. These compounds are listed in the table on the right. A third pKa, representing the acidity or basicity of the extra function, is listed in the fourth column of the table. The pI's of these amino acids (last column) are often very different from those noted above for the simpler members. As expected, such compounds display three inflection points in their titration curves, illustrated by the titrations of arginine and aspartic acid shown below. Each of these compounds can exist in four differently charged states, one of which has no overall charge. Formulas for these species are written to the right of the titration curves, together with the pH at which each is expected to predominate. The very high pH required to remove the last acidic proton from arginine reflects the exceptionally high basicity of the guanidine moiety at the end of the side chain.
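The four charge states of a trifunctional amino acid, and the approximate pH range in which each predominates, follow directly from the three pKa values: each pKa marks the boundary between two neighboring states. The sketch below lists these ranges. The aspartic acid carboxyl pKa's (2.1, 3.9) and the arginine ammonium and guanidinium pKa's (9.0, 12.5) are the values used in this section, while the aspartic acid ammonium pKa (9.8) and the arginine carboxyl pKa (2.2) are approximate values assumed for this example.

```python
def predominant_species(pKas, fully_protonated_charge):
    """List (net charge, lower pH bound, upper pH bound) for each charge state.

    Each successive deprotonation lowers the net charge by one, and the
    boundaries between states are the pKa values in ascending order.
    """
    bounds = [float("-inf")] + sorted(pKas) + [float("inf")]
    return [
        (fully_protonated_charge - i, bounds[i], bounds[i + 1])
        for i in range(len(bounds) - 1)
    ]

# Fully protonated aspartic acid has one cationic group (charge +1);
# fully protonated arginine has two (charge +2).
aspartic_acid = predominant_species([2.1, 3.9, 9.8], fully_protonated_charge=1)
arginine = predominant_species([2.2, 9.0, 12.5], fully_protonated_charge=2)

for name, states in (("aspartic acid", aspartic_acid), ("arginine", arginine)):
    for charge, low, high in states:
        print(f"{name}: net charge {charge:+d} predominates from pH {low} to {high}")
```

The neutral state of aspartic acid lies between the two carboxyl pKa's, whereas the neutral state of arginine lies between the ammonium and guanidinium pKa's, which is why their isoelectric points are so different.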
Isoelectric Point (pI):
The isoelectric point, pI, is the pH of an aqueous solution of an amino acid (or peptide) at which the molecules on average have no net charge. In other words, the positively charged groups are exactly balanced by the negatively charged groups. For simple amino acids such as alanine, the pI is an average of the pKa's of the carboxyl (2.34) and ammonium (9.69) groups. Thus, the pI for alanine is calculated to be: (2.34 + 9.69)/2 = 6.02, the experimentally determined value. If additional acidic or basic groups are present as side-chain functions, the pI is the average of the pKa's of the two most similar acids. To assist in determining similarity we define two classes of acids. The first consists of acids that are neutral in their protonated form (e.g. CO2H & SH). The second includes acids that are positively charged in their protonated state (e.g. -NH3+). In the case of aspartic acid, the similar acids are the alpha-carboxyl function (pKa = 2.1) and the side-chain carboxyl function (pKa = 3.9), so pI = (2.1 + 3.9)/2 = 3.0. For arginine, the similar acids are the guanidinium species on the side-chain (pKa = 12.5) and the alpha-ammonium function (pKa = 9.0), so the calculated pI = (12.5 + 9.0)/2 = 10.75.
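The rule for selecting the two pKa's can be written as a short function. The sketch below uses the two classes of acids defined above: the number of acids that are positively charged when protonated gives the charge of the fully protonated molecule, and therefore the number of protons that must be removed to reach the neutral species. The pI is then the average of the two pKa's on either side of that species. The function name is our own. The aspartic acid ammonium pKa (9.8) and the arginine carboxyl pKa (2.2) are assumed approximate values, and they do not enter the averages.

```python
def isoelectric_point(neutral_acid_pKas, cationic_acid_pKas):
    """Estimate pI as the mean of the two pKa's that bracket the neutral species.

    neutral_acid_pKas: acids that are neutral when protonated (e.g. -CO2H, -SH)
    cationic_acid_pKas: acids that are positive when protonated (e.g. -NH3(+))
    """
    pKas = sorted(list(neutral_acid_pKas) + list(cationic_acid_pKas))
    n = len(cationic_acid_pKas)  # charge of the fully protonated species
    if n == 0 or n == len(pKas):
        raise ValueError("no neutral species bracketed by two pKa values")
    # After n deprotonations (lowest pKa first) the net charge is zero.
    return (pKas[n - 1] + pKas[n]) / 2.0

print("alanine:", isoelectric_point([2.34], [9.69]))
print("aspartic acid:", isoelectric_point([2.1, 3.9], [9.8]))
print("arginine:", isoelectric_point([2.2], [9.0, 12.5]))
```

For aspartic acid the function averages the two carboxyl pKa's, and for arginine it averages the ammonium and guanidinium pKa's, reproducing the pairings described above.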
Other Natural α-Amino Acids:
The twenty alpha-amino acids listed above are the primary components of proteins, their incorporation being governed by the genetic code. Many other naturally occurring amino acids exist, and the structures of a few of these are displayed below. Some, such as hydroxylysine and hydroxyproline, are simply functionalized derivatives of a previously described compound. These two amino acids are found only in collagen, a common structural protein. Homoserine and homocysteine are higher homologs of their namesakes. The amino group in beta-alanine has moved to the end of the three-carbon chain. It is a component of pantothenic acid, HOCH2C(CH3)2CH(OH)CONHCH2CH2CO2H, a member of the vitamin B complex and an essential nutrient. Acetyl coenzyme A is a pyrophosphorylated derivative of a pantothenic acid amide. The gamma-amino homolog GABA is an inhibitory neurotransmitter and antihypertensive agent.
Many unusual amino acids, including D-enantiomers of some common acids, are produced by microorganisms. These include ornithine, which is a component of the antibiotic bacitracin A, and statine, found as part of a pentapeptide that inhibits the action of the digestive enzyme pepsin.
Reactions of α-Amino Acids:
1. Carboxylic acid esterification:
Amino acids undergo most of the chemical reactions characteristic of each function, assuming the pH is adjusted to an appropriate value. Esterification of the carboxylic acid is usually conducted under acidic conditions, as shown in the two equations written below. Under such conditions, amine functions are converted to their ammonium salts and carboxylic acids are not dissociated. The first equation is a typical Fischer esterification involving methanol. The initial product is a stable ammonium salt. The amino ester formed by neutralization of this salt is unstable, due to acylation of the amine by the ester function. The second reaction illustrates benzylation of the two carboxylic acid functions of aspartic acid, using p-toluenesulfonic acid as an acid catalyst. Once the carboxyl function is esterified, zwitterionic species are no longer possible and the product behaves like any 1º-amine.
2. Amine Acylation:
In order to convert the amine function of an amino acid into an amide, the pH of the solution must be raised to 10 or higher so that free amine nucleophiles are present in the reaction system. Carboxylic acids are all converted to carboxylate anions at such a high pH, and do not interfere with amine acylation reactions. The following two reactions are illustrative. In the first, an acid chloride serves as the acylating reagent. This is a good example of the superior nucleophilicity of nitrogen in acylation reactions, since water and hydroxide anion are also present as competing nucleophiles. A similar selectivity favoring amines was observed in the Hinsberg test. The second reaction employs an anhydride-like reagent for the acylation. This is a particularly useful procedure in peptide synthesis, thanks to the ease with which the t-butoxycarbonyl (t-BOC) group can be removed at a later stage. Since amides are only weakly basic (the pKa of the conjugate acid is about -1), the resulting amino acid derivatives do not display zwitterionic character, and may be converted to a variety of carboxylic acid derivatives.
3. Ninhydrin Reaction:
In addition to these common reactions of amines and carboxylic acids, common alpha-amino acids, except proline, undergo a unique reaction with the triketohydrindene hydrate known as ninhydrin. Among the products of this unusual reaction (shown on the left below) is a purple colored imino derivative, which provides a useful color test for these amino acids, most of which are colorless. A common application of the ninhydrin test is the visualization of amino acids in paper chromatography. As shown in the graphic on the right, samples of amino acids or mixtures thereof are applied along a line near the bottom of a rectangular sheet of paper (the baseline). The bottom edge of the paper is immersed in an aqueous buffer, and this liquid climbs slowly toward the top edge. As the solvent front passes the sample spots, the compounds in each sample are carried along at a rate which is characteristic of their functionality, size and interaction with the cellulose matrix of the paper. Some compounds move rapidly up the paper, while others may scarcely move at all. The ratio of the distance a compound moves from the baseline to the distance of the solvent front from the baseline is defined as the retardation (or retention) factor Rf. Different amino acids usually have different Rf's under suitable conditions. In the example on the right, the three sample compounds (1, 2 & 3) have respective Rf values of 0.54, 0.36 & 0.78.
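The Rf calculation is a simple ratio of two distances measured from the baseline. In the sketch below, the distances are hypothetical measurements (in cm) chosen so that they reproduce the three Rf values quoted above. They are not read from the original chromatogram.

```python
def retention_factor(compound_distance, solvent_front_distance):
    """Rf = distance moved by the compound / distance moved by the solvent front.

    Both distances are measured from the baseline, in the same units.
    """
    if solvent_front_distance <= 0:
        raise ValueError("solvent front distance must be positive")
    if not 0 <= compound_distance <= solvent_front_distance:
        raise ValueError("a spot cannot travel farther than the solvent front")
    return compound_distance / solvent_front_distance

# Hypothetical measurements in cm, assumed for illustration only.
solvent_front = 10.0
spot_distances = {1: 5.4, 2: 3.6, 3: 7.8}

for sample, distance in spot_distances.items():
    rf = retention_factor(distance, solvent_front)
    print(f"sample {sample}: Rf = {rf:.2f}")
```

Because Rf is a ratio, it is independent of how far the solvent front is allowed to travel, provided the solvent, paper and temperature are kept the same.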
4. Specific Oxidation:
The mild oxidant iodine reacts selectively with certain amino acid side groups. These include the phenolic ring in tyrosine, and the heterocyclic rings in tryptophan and histidine, which all yield products of electrophilic iodination. In addition, the sulfur groups in cysteine and methionine are also oxidized by iodine. Quantitative measurement of iodine consumption has been used to determine the number of such residues in peptides. The basic functions in lysine and arginine are onium cations at pH less than 8, and are unreactive in that state. Cysteine is a thiol, and like most thiols it is oxidatively dimerized to a disulfide, which is sometimes listed as a distinct amino acid under the name cystine. Disulfide bonds of this kind are found in many peptides and proteins. For example, the two peptide chains that constitute insulin are held together by two disulfide links. Our hair consists of a fibrous protein called keratin, which contains an unusually large proportion of cysteine. In the manipulation called "permanent waving", disulfide bonds are first broken and then re-formed after the hair has been reshaped. Treatment with dilute aqueous iodine oxidizes the methionine sulfur atom to a sulfoxide.