Quantitative Mass Spectrometry in Proteomics: A Critical Review


Anal Bioanal Chem (2007) 389:1017–1031

DOI 10.1007/s00216-007-1486-6

REVIEW

Quantitative mass spectrometry in proteomics: a critical review
Marcus Bantscheff & Markus Schirle &
Gavain Sweetman & Jens Rick & Bernhard Kuster

Received: 30 April 2007 / Revised: 25 June 2007 / Accepted: 29 June 2007 / Published online: 1 August 2007
© Springer-Verlag 2007

M. Bantscheff · M. Schirle · G. Sweetman · J. Rick · B. Kuster (*)
Cellzome AG, Meyerhofstrasse 1, 69254 Heidelberg, Germany
e-mail: bernhard.kuester@cellzome.com

Abstract The quantification of differences between two or more physiological states of a biological system is among the most important but also most challenging technical tasks in proteomics. In addition to the classical methods of differential protein gel or blot staining by dyes and fluorophores, mass-spectrometry-based quantification methods have gained increasing popularity over the past five years. Most of these methods employ differential stable isotope labeling to create a specific mass tag that can be recognized by a mass spectrometer and at the same time provide the basis for quantification. These mass tags can be introduced into proteins or peptides (i) metabolically, (ii) by chemical means, (iii) enzymatically, or (iv) provided by spiked synthetic peptide standards. In contrast, label-free quantification approaches aim to correlate the mass spectrometric signal of intact proteolytic peptides or the number of peptide sequencing events with the relative or absolute protein quantity directly. In this review, we critically examine the more commonly used quantitative mass spectrometry methods for their individual merits and discuss challenges in arriving at meaningful interpretations of quantitative proteomic data.

Keywords Quantitative proteomics · Mass spectrometry · Stable isotope labeling

Introduction

There is a clear trend in the life sciences towards the study of biological entities at the system level. This requires analytical tools that can identify the component parts of the system and measure their responses to a changing environment. Towards this end, a multitude of transcriptomic, proteomic, and metabolomic profiling technologies have been developed, and proteomics in particular is continuing to evolve rapidly. Still, out of the many thousand proteomic studies published to date, only a small minority has attempted to provide a comprehensive quantitative description of the biological system under investigation. Despite the phenomenal impact of mass spectrometry and peptide separation techniques on proteomics, the identification and quantification of all of the proteins in a biological system is still an unmet technical challenge (Fig. 1). While for unicellular organisms proteomic coverage of the genome has been occasionally achieved beyond 50%, coverage for higher organisms rarely exceeds 10%. For protein quantification, these figures are significantly smaller due to the fact that the data quality, in terms of information content, required for quantification by far exceeds that for protein identification.

The classical proteomic quantification methods utilizing dyes, fluorophores, or radioactivity have provided very good sensitivity, linearity, and dynamic range, but they suffer from two important shortcomings: first, they require high-resolution protein separation typically provided by 2D gels, which limits their applicability to abundant and soluble proteins; and second, they do not reveal the identity of the underlying protein. Both of these problems are overcome by modern LC-MS/MS techniques. However, mass spectrometry is not inherently quantitative because proteolytic peptides exhibit a wide range of physicochemical properties

such as size, charge, hydrophobicity, etc. which lead to large differences in mass spectrometric response. For accurate quantification, it is therefore generally required to compare each individual peptide between experiments. In most proteomic workflows, this can technically be achieved in a number of ways (Fig. 2). One major approach is based on stable isotope dilution theory which states that a stable isotope-labeled peptide is chemically identical to its native counterpart and therefore the two peptides also behave identically during chromatographic and/or mass spectrometric analysis. Given that a mass spectrometer can recognize the mass difference between the labeled and unlabeled forms of a peptide, quantification is achieved by comparing their respective signal intensities. Stable isotope labeling was introduced into proteomics in 1999 by three independent laboratories [1–3] and has since been adopted widely in the field (for earlier reviews see, e.g., Refs. [4–11]). Isotope labels can be introduced as an internal standard into amino acids (i) metabolically, (ii) chemically, or (iii) enzymatically or, alternatively, as an external standard using spiked synthetic peptides [11]. More recently, alternative strategies, often referred to as label-free quantification, have emerged. Label-free methods aim to compare two or more experiments by (i) comparing the direct mass spectrometric signal intensity for any given peptide or (ii) using the number of acquired spectra

Fig. 1 Schematic representation of the fraction of a proteome that can be identified or quantified by mass-spectrometry-based proteomics. Cellular proteins span a wide range of expression and current mass spectrometric technologies typically sample only a fraction of all the proteins present in a sample. Due to limited data quality, only a fraction of all identified proteins can also be reliably quantified

Fig. 2 Common quantitative mass spectrometry workflows. Boxes in blue and yellow represent two experimental conditions. Horizontal lines indicate when samples are combined. Dashed lines indicate points at which experimental variation and thus quantification errors can occur (adapted with permission from Ref. [11])
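
As a minimal illustration of the isotope dilution principle outlined in the Introduction, the following Python sketch computes heavy-to-light ratios for individual peptide pairs and summarises them per protein. The intensities are hypothetical, and the use of the median of log2 ratios as a protein-level summary is an assumption made for illustration only; it is not prescribed by the methods reviewed here.

```python
import math
from statistics import median


def peptide_log2_ratio(heavy_intensity, light_intensity):
    """Return log2(heavy/light) for one co-eluting peptide pair."""
    if heavy_intensity <= 0 or light_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(heavy_intensity / light_intensity)


def protein_ratio(pairs):
    """Summarise the peptide pairs of one protein as a heavy/light ratio.

    The median of log2 ratios is an assumed aggregation rule chosen for
    this sketch because it is robust to a single outlying peptide.
    """
    logs = [peptide_log2_ratio(heavy, light) for heavy, light in pairs]
    return 2 ** median(logs)


# Hypothetical (heavy, light) signal intensities for three peptides
# assigned to the same protein.
pairs = [(2.0e6, 1.0e6), (8.0e5, 4.0e5), (3.0e5, 1.0e5)]
ratio = protein_ratio(pairs)
print(f"protein heavy/light ratio: {ratio:.2f}")
```

Each peptide is compared only with its own labeled counterpart, which reflects the point made above that peptides differ widely in mass spectrometric response and cannot be compared with one another directly.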

matching to a peptide/protein as an indicator for their respective amounts in a given sample. As we will discuss in the following sections, all of the mass-spectrometry-based quantification methods have their particular strengths and weaknesses (Table 1) but they are beginning to mature to an extent that they can be meaningfully applied to the study of biological systems on a proteomic scale. In contrast, the statistical treatment and subsequent interpretation of quantitative proteomic data are still in their infancy, as the field is only beginning to experience the particular challenges associated with transforming qualitative protein identification and post-translational modification data into reliable quantitative information.

Metabolic labeling

The earliest possible point for introducing a stable isotope signature into proteins is by metabolic labeling during cell growth and division. Initially described for total labeling of bacteria using 15N-enriched cell culture medium [2], it has gained wider popularity in the form of the stable isotope labeling by amino acids in cell culture (SILAC) approach introduced by Mann and co-workers in 2002 [12]. In the most commonly used implementation of the method, the medium contains 13C6-arginine and 13C6-lysine which ensures that all tryptic cleavage products of a protein (except for the very C-terminal peptide) carry at least one labeled amino acid resulting in a constant mass increment over the non-labeled counterpart. Protein identification is based on fragmentation spectra of at least one of the co-eluting ‘heavy’ and ‘light’ peptides and relative quantitation is performed by comparing the intensities of isotope clusters of the intact peptide in the survey spectrum. In contrast to full metabolic protein labeling by 15N, the number of incorporated labels in SILAC is defined and not dependent on the peptide sequence thus facilitating data analysis. The main advantage of all metabolic labeling strategies is that the differentially treated samples can be combined at the level of intact cells. This excludes all sources of quantification error introduced by biochemical and mass spectrometric procedures as these will affect both protein populations in the same way. Despite a number of cases that demonstrate the feasibility of total 15N metabolic protein labeling of higher organisms in vivo such as C. elegans, Drosophila melanogaster [13], rat [14], or plants [15], it is neither possible nor practical to apply this strategy routinely. The cost and time required for creating and

Table 1 Characteristics and applications of quantitative mass spectrometry methods

| Method | Application | Accuracy (process) | Quantitative proteome coverage | Linear dynamic range (a) |
|---|---|---|---|---|
| Metabolic protein labeling | Complex biochemical workflows; comparison of 2–3 states; cell culture systems only | +++ | ++ | 1–2 logs |
| Chemical protein labeling (MS) | Medium to complex biochemical workflows; comparison of 2–3 states | +++ | ++ | 1–2 logs |
| Chemical peptide labeling (MS) | Medium complexity biochemical workflows; comparison of 2–3 states | ++ | ++ | 2 logs |
| Chemical peptide labeling (MS/MS) | Medium complexity biochemical workflows; comparison of 2–8 states | ++ | ++ | 2 logs |
| Enzymatic labeling (MS) | Medium complexity biochemical workflows; comparison of 2 states | ++ | ++ | 1–2 logs |
| Spiked peptides | Medium complexity biochemical workflows; targeted analysis of few proteins | ++ | + | 2 logs |
| Label free (ion intensity) | Simple biochemical workflows; whole proteome analysis; comparison of multiple states | + | +++ | 2–3 logs |
| Label free (spectrum counting) | Simple biochemical workflows; whole proteome analysis; comparison of multiple states | + | +++ | 2–3 logs |

(a) In MRM mode, dynamic range may be extended to 4–5 logs [65]
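
The constant mass increment that SILAC introduces per labeled residue can be worked out from isotope mass differences. The short Python sketch below does this for the 13C6 and 13C6 15N4 labels mentioned in the section on metabolic labeling. The function names are hypothetical, and the sketch assumes complete incorporation and exactly one labeled residue per peptide unless stated otherwise.

```python
# Mass differences between heavy and light isotopes, in Da.
DELTA_13C = 1.0033548  # 13C minus 12C
DELTA_15N = 0.9970349  # 15N minus 14N


def label_mass_shift(n_13c=0, n_15n=0, n_residues=1):
    """Mass increment (Da) for a peptide carrying n_residues labeled residues."""
    return n_residues * (n_13c * DELTA_13C + n_15n * DELTA_15N)


def mz_shift(mass_shift, charge):
    """Spacing of the light and heavy isotope clusters on the m/z axis."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return mass_shift / charge


shift_13c6 = label_mass_shift(n_13c=6)
shift_13c6_15n4 = label_mass_shift(n_13c=6, n_15n=4)

print(f"13C6:      {shift_13c6:.4f} Da, {mz_shift(shift_13c6, 2):.4f} m/z at 2+")
print(f"13C6 15N4: {shift_13c6_15n4:.4f} Da")
```

For a doubly charged peptide with a single 13C6-labeled residue, the clusters are separated by about 3 m/z units, which corresponds to a mass difference of about 6 Da.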

maintaining these systems is often incommensurate with the value of the information provided. As a result, the main application of metabolic labeling in higher eukaryotes to date is SILAC in immortalized cell lines. Protein labeling in excess of 90% is often achieved by 6–8 passages in medium supplemented with heavy amino acids [12]. While many cell lines can be converted quite readily, some do require special attention. For example, some cell lines require careful titration of the amount of arginine in the medium in order to prevent metabolic conversion of excess arginine into proline which in turn complicates data analysis [16]. Cell lines that are sensitive to changes in media composition or are otherwise difficult to grow or maintain in culture may not be amenable to metabolic labeling at all. A further limitation of metabolic labeling is the restricted number of available labels. For SILAC, a maximum of three conditions can be compared in one experiment (unlabeled, 13C6, and 13C6 15N4-labeled amino acids) which, albeit possible, complicates the analysis of, e.g., time-course experiments. Because of the early combination of samples, metabolic labeling, and SILAC in particular, is probably the most accurate quantitative MS method in terms of overall experimental process. This makes it particularly suitable for assessing relatively small changes in protein levels or those of post-translational modifications [17–19]. For the latter, it should be noted, though, that quantification on the peptide level is far from trivial because all information is derived from a single or a few observations.

Protein and peptide labeling

Post-biosynthetic labeling of proteins and peptides is performed by chemical or enzymatic derivatization in vitro. An elegant and specific way to introduce an isotope label into peptides is the use of trypsin- or Glu-C-catalyzed incorporation of 18O during protein digestion [20, 21]. This was originally employed to aid de novo sequencing of peptides by mass spectrometry [22] but has recently also been applied to quantitative proteomic applications (for a recent review see Ref. [23]). Enzymatic labeling can be performed either during proteolytic digestion or, more commonly, after proteolysis in a second incubation step with the protease. Incorporation of 18O into C-termini of peptides results in a mass shift of 2 Da per 18O atom. While trypsin and Glu-C introduce two oxygen atoms resulting in a 4 Da mass shift which is generally sufficient for differentiation of isotopomers, Lys-N and other enzymes incorporate only one 18O atom and should therefore be avoided [24]. Acid- and base-catalyzed back-exchange with concomitant loss of the isotope label can occur at extreme pH values [25], but under the mild acidic conditions typically employed for ESI- and MALDI-MS 18O-containing carboxyl groups of peptides are sufficiently stable. Because peptides are enzymatically labeled, artifacts (i.e., side reactions) common to chemical labeling can be avoided. A practical disadvantage is that full labeling is rarely achieved and that different peptides incorporate the label at different rates which complicates data analysis [26, 27].

In principle, every reactive amino acid side chain can be used to incorporate an isotope-coded mass tag by chemical means (reviewed by Ong and Mann [11]). In practice, however, side chains of lysine and cysteine are primarily used for this purpose. In their pioneering work Gygi et al. [1] developed the isotope-coded affinity tag (ICAT) approach in which cysteine residues are specifically derivatized with a reagent containing either zero or eight deuterium atoms as well as a biotin group for affinity purification of cysteine-derivatized peptides and subsequent MS analysis. Following the initial success of the ICAT approach, several variations on this chemical reagent class emerged to improve, e.g., recovery of labeled peptides or chromatographic properties [28–31]. Other thiol-specific reagents typically contain halogen-substituted carboxylic acids or amides [32–35] or employ the Michael-type addition reaction to carbonyl groups (e.g., maleimide esters and vinylpyridine) [36, 37]. As cysteine is a rare amino acid, ICAT and related methods significantly reduce the complexity of the peptide mixture which can be advantageous when highly complex samples are analyzed. However, ICAT is obviously not suitable for quantifying the significant number of proteins that do not contain any (or only a few) cysteine residues and is of limited use for analysis of post-translational modifications and splice isoforms. Despite these drawbacks, ICAT and similar approaches will continue to be useful in a number of broad (e.g., body fluid) or targeted (e.g., cysteine protease) analyses.

Another group of labeling reagents targets the peptide N-terminus and the epsilon-amino group of lysine residues. Most of the time, this is realized via the very specific N-hydroxysuccinimide (NHS) chemistry or other active esters and acid anhydrides as in, e.g., the isotope-coded protein label (ICPL) [38], isotope tags for relative and absolute quantification (iTRAQ) [39], tandem mass tags (TMT) [40], and acetic/succinic anhydride [41–44]. Isocyanates or isothiocyanates have also been employed, albeit to a lesser extent [45, 46]. In recent studies, formaldehyde has been used for methylation of lysine residues via Schiff base formation and subsequent reduction by cyanoborohydride [47–49]. This reaction is very fast, very specific, and very cheap. However, a sufficiently large mass shift between ‘heavy’ and ‘light’ labeled peptides can only be achieved with deuterated formaldehyde which in turn leads to partial LC separation of labeled and non-labeled peptides, thus complicating data analysis (discussed below).

In most of the aforementioned chemical modification techniques, relative quantification is achieved by integration of MS signal over isotopomers of ‘heavy’ and ‘light’ labeled peptides in survey spectra. Isobaric mass tagging, initially introduced by Thompson and co-workers [40], differs from this concept in that the tags produce isobaric labeled peptides which precisely co-migrate in liquid chromatography separations. Only upon peptide fragmentation are the different tags distinguished by the mass spectrometer. This permits the simultaneous determination of both identity and relative abundance of peptide pairs in tandem mass spectra. The commercially available iTRAQ reagent [39] provides a further refinement of this approach, allowing multiplexed quantitation of up to eight samples. This has turned out to be particularly useful for following biological systems over multiple time points or, more generally, for comparing multiple treatments in the same experiment.

Carboxylic acids in side chains of glutamic and aspartic acid residues as well as the C-termini of polypeptide chains can be isotopically labeled by esterification using deuterated alcohols [50, 51]. This reaction is particularly attractive for the quantification of phospho-peptides because esterification has been shown to reduce binding of acidic peptides to immobilized metal affinity chromatography (IMAC) columns, thus improving the specificity of this enrichment procedure [52]. Other, more tailored labeling techniques have been developed, e.g., for quantification of phosphorylated and glycosylated peptides. For the former, β-elimination of phosphoric acid followed by Michael addition using, e.g., ethanedithiol derivatives is typically employed [53–56]. For glycopeptides, hydrazide chemistry replaces the carbohydrate moiety with a labeled chemical group [57].

Broadly speaking, the chemical properties of amino acid side chains of proteins and peptide chains are rather similar. Consequently, almost all chemical labeling methods may also be applied to intact proteins. For example, the ICPL reagent [38] has been employed for N-terminal peptide labeling as well as lysine side chain labeling of intact proteins. A similar protocol has been described for iTRAQ [58]. In most cases, full protein denaturation improves labeling results but care has to be taken to avoid protein precipitation (by, e.g., the use of charged reagents). Labeling of intact proteins can be quite advantageous since it allows for further protein separation steps on the combined samples. This may facilitate characterization of protein isoforms by, e.g., 2D gel electrophoresis [38]. However, there are two important caveats to protein labeling: first, trypsin does not cleave at modified lysine residues, which leads to significantly longer peptides that generally are more difficult to identify by MS; second, very high labeling efficiencies are required in case further protein separation is desired prior to MS analysis, since incomplete labeling impairs the resolving power achievable with, e.g., 1D and 2D gel electrophoresis. A general drawback of all chemical labeling approaches is that they are prone to side reactions that can lead to unexpected products and which may adversely influence quantification results.

Absolute quantification using internal standards

The use of isotope-labeled synthetic standards has a long history in quantitative mass spectrometry. Originally described in the early 1980s [59], it is now becoming more broadly applied as a method commonly known as AQUA (absolute quantification of proteins) [60]. In the simplest case, absolute quantification can be achieved by the addition of a known quantity of a stable isotope-labeled standard peptide to a protein digest and subsequent comparison of the mass spectrometric signal to the endogenous peptide in the sample. Unlike in metabolic labeling, where relative quantitative information is acquired for a large number of the proteins present in a mixture, the addition of synthetic peptides to a proteome digest focuses on the determination of the quantity of one or a few particular proteins of interest. This approach is attractive for studies aimed at, e.g., the analysis and validation of potential biomarkers in a large number of clinical samples [61] or at measuring the levels of particular peptide modifications such as ubiquitinylation [62].

The approach has been refined by constructing synthetic genes that express concatenated standard peptides which upon tryptic digestion either provide multiple peptides of the same protein for quantification or quantification standards for a group of proteins of interest [63]. Not only does the provision of multiple peptides increase confidence in quantification, the synthetic protein can also be added earlier in the process than individual peptides, thus controlling any potential bias encountered during protein digestion. One notable example of following the synthetic gene strategy is the determination of the stoichiometry of the eight-membered eIF2B-eIF2 protein complex [64].

Given that tryptic digests of entire proteomes are very complex mixtures, and that most mass spectrometers have a rather limited dynamic detection range, there are a number of limitations to the AQUA approach. One practical drawback is that one has to ‘guess’ how much of the labeled standard should be added to a sample. This amount may be different for all proteins of interest as their expression levels (used here in the sense of protein

abundance rather than protein synthesis) may differ greatly within a sample. Another limitation is the specificity of the spiked standard as there are likely multiple isobaric peptides present in the mixture. Both of these issues can be greatly improved by a method called multiple reaction monitoring (MRM) [62] in which the (triple quadrupole) mass spectrometer monitors both the intact peptide mass and one or more specific fragment ions of that peptide over the course of an LC-MS experiment. The combination of retention time, peptide mass, and fragment mass practically eliminates ambiguities in peptide assignments and extends the quantification range to 4–5 orders of magnitude [65]. Obviously, the choice of synthetic peptide standard is important and is mostly determined empirically. However, recent data suggest that it is possible to predict which of a protein’s tryptic peptides will be most frequently observed for a given proteomic platform and thus would be a suitable quantification standard [66]. Despite the ability to calculate protein amounts from an AQUA experiment, there are still question marks as to how absolute these values are as any sample manipulation prior to adding the synthetic standard may bias the results (losses or enrichment). Consequently, the amount of a protein in an experiment determined by AQUA may not reflect the true expression levels of this protein in a cell.

LC-MS/MS analysis of stable isotope labeled peptides

As described above, quantitation based on stable isotope labeling can be achieved by signal integration in survey MS spectra (e.g., SILAC) or tandem MS spectra (e.g., iTRAQ). For both approaches, several points have to be considered in the design and analysis of an experiment. Although the assumption that stable isotope labeling does not alter the physicochemical properties of a peptide is generally valid, it has been observed that deuterated peptides show small but significant retention time differences in reversed-phase HPLC compared to their non-deuterated counterparts [67]. This complicates data analysis because the relative quantities of the two peptide species cannot be determined accurately from one spectrum but require integration across the chromatographic time scale. Retention time shifts are far less pronounced for labels such as 13C, 15N, or 18O isotopes [68], so that the additional signal integration step over retention time can generally be omitted.

Another requirement for any stable isotope labeling approach is that the heavy label can be clearly distinguished from the unlabeled peptide or any other unrelated ion species (Fig. 3a). For quantification in survey MS spectra, it is essential that the mass shift introduced by the label is at least 4 Da in order to distinguish the isotopomer clusters of the labeled and unlabeled forms of the peptide. As isotopomer clusters increase in width with increasing peptide mass, the application of labeling methods such as methylation and enzymatic 18O labeling becomes limited for larger peptides. Reporter ions used for quantification in tandem MS spectra should be designed such that interference by ordinary peptide fragments is minimal. For the iTRAQ label, the m/z region of 114–117 was chosen for this reason. Still, some interferences have been identified (notably the 116.1 Da y1 fragment ion of peptides containing a C-terminal proline residue [69]) and these data points have to be carefully removed in the data analysis process.

A further parameter impacting accuracy and dynamic range of quantification is the mass spectrometric detection system itself. In survey MS spectra, the definition of very low and very strong signals can be problematic. At very low signal, peptide ions are often difficult to distinguish from background noise (Fig. 3b) and for very strong signals, the detector may become saturated (Fig. 3c). In practice, saturation is more often observed for quadrupole TOF instruments than ion traps because these latter devices can control the number of ions before detection [70]. In any case, the relatively recent introduction of high-resolution/high mass accuracy mass spectrometers in proteomics has

Fig. 3 Examples illustrating mass spectral features relevant for quantification. a Example of a SILAC-labeled peptide pair suitable for quantification. The spectrum displays the characteristic 6 Da (3 m/z) mass difference between light and heavy forms of the peptide, good signal to noise ratio and no interfering signals. Signal intensities indicate a 1:1 abundance ratio. b Example of a peptide and other interfering signals with signal to noise ratios too low for reliable quantification. c Example of a peptide signal saturating the detector and thus distorting the isotope pattern to a degree that the spectrum is not suitable for quantification
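
The effect of a retention time shift between deuterated and non-deuterated peptides, discussed in the section on LC-MS/MS analysis, can be illustrated with a small Python sketch. It compares the ratio read from a single survey spectrum with the ratio of peak areas integrated over the chromatographic time scale. The two extracted ion chromatograms are hypothetical, idealized triangular peaks of equal abundance, with the heavy form assumed to elute 2 s earlier; the trapezoidal rule is one simple choice of integration and is an assumption of this sketch.

```python
def trapezoid_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have equal length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


# Hypothetical chromatograms sampled every 2 s. Both forms are equally
# abundant, but the heavy (deuterated) peak is shifted 2 s earlier.
times = [0, 2, 4, 6, 8, 10, 12]
light = [0, 0, 50, 100, 50, 0, 0]
heavy = [0, 50, 100, 50, 0, 0, 0]

area_ratio = trapezoid_area(times, heavy) / trapezoid_area(times, light)
single_scan_ratio = heavy[2] / light[2]  # ratio read from the scan at t = 4 s

print(f"ratio from integrated peak areas: {area_ratio:.2f}")
print(f"ratio from the single scan at 4 s: {single_scan_ratio:.2f}")
```

In this constructed case the integrated areas recover the 1:1 ratio, whereas the single scan taken at 4 s suggests a twofold difference that is caused by the elution shift alone.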

greatly facilitated the ability to quantify proteins in complex proteomes because the increased instrument performance enables the exact discrimination of peptide isotope clusters from interfering signals caused by, e.g., co-eluting and near-isobaric peptides and other chemical entities [71–73]. For quantification in tandem MS spectra, saturation effects are rarely a problem. Instead, low-intensity spectra are frequently obtained and may result in less robust quantitation values due to poor ion statistics. Unlike for quantification in survey spectra, the contribution of peptidic or chemical background noise to quantification does not depend on the mass resolution of the mass spectrometer but on the size of the m/z window chosen for isolation of peptides for sequencing (typically 2–6 m/z). All ions present in this window will contribute to the signal of the, e.g., iTRAQ reporter ions. As a result, it is not always clear to what extent the reporter ion signal originates from the peptide of interest or from background. This can sometimes lead to a large underestimation of true changes, especially for very weak peptide signals.

Taken together, the limit to quantification of complex proteomes by stable isotopes is first and foremost an issue of signal interference caused by co-eluting components of similar mass. Therefore, the most straightforward way of optimizing quantitative analyses is to decrease sample complexity by increasing HPLC gradient times or by biochemical fractionation prior to LC-MS analysis.

Label-free quantification

Currently, two widely used but fundamentally different label-free quantification strategies can be distinguished: (a) measuring and comparing the mass spectrometric signal intensity of peptide precursor ions belonging to a particular protein and (b) counting and comparing the number of fragment spectra identifying peptides of a given protein. In the former approach, the ion chromatograms for every peptide are extracted from an LC-MS/MS run and their mass spectrometric peak areas are integrated over the chromatographic time scale. For low-resolution mass spectra this is typically done by creating extracted ion chromatograms (XICs) for the mass to charge ratios determined for each peptide [74]. More recently, this concept has been extended to high-resolution data to include contributions of 13C isotopes to the overall signal intensities [75]. The intensity value for each peptide in one experiment can then be compared to the respective signals in one or more other experiments to yield relative quantitative information [74, 76–80]. For proteomic analysis of very complex peptide mixtures, three important experimental parameters affect the analytical accuracy of quantification by ion intensities. (i) It is advantageous to employ a high mass accuracy mass spectrometer because the influence of interfering signals of similar but distinct mass can be minimized. (ii) The peptide chromatographic profile should be optimized for reproducibility to ease finding corresponding peptides between different experiments. This is not a trivial task and special software has been developed to align LC runs prior to identifying corresponding peptides [81–84]. (iii) The right balance between acquisition of survey and fragment spectra has to be found. While extensive peptide sequencing by tandem MS is required to identify as many proteins as possible in complex mixtures, a robust quantitative reading by ion intensities requires multiple sampling of the chromatographic peak by survey mass spectra. Typically, multiple fragment spectra are acquired for every survey spectrum at acquisition rates ranging from 0.2 s/spectrum (ion traps) to 1–3 s/spectrum (quadrupole-TOF instruments). Given that chromatographic peak widths are in the order of 10–30 s for nano-LC separations, ion traps have an inherent advantage over QTOFs because many more MS to MS/MS cycles can be performed within the available chromatographic time. Still, even for fast sampling instruments, better quantification accuracy will inevitably mean poorer proteome coverage and vice versa. This dilemma has led some laboratories to conduct two separate experiments for each sample: one which focuses on identifying as many peptides as possible by MS/MS and a second performed in MS-only mode in order to optimize sampling of intact peptide signals. In these approaches, matching of integrated peak intensities to identified peptides is performed by using a combination of accurate mass and retention time [84–86]. An alternative has been proposed in which the mass spectrometer no longer cycles between MS and MS/MS mode but aims to detect and fragment all peptides in a chromatographic window simultaneously by rapidly alternating between high- and low-energy conditions in the mass spectrometer [87–90]. Obviously, there are challenges with analyzing such data from complex samples as many fragmentation spectra will be populated with sequence ions from multiple peptides each contributing differently to the overall spectral content.

The peptide or more recently introduced spectral counting approach [91–93] is based on the empirical observation that the more of a particular protein is present in a sample, the more tandem MS spectra are collected for peptides of that protein. Hence, relative quantification can be achieved by comparing the number of such spectra between a set of experiments. In contrast to quantification by peptide ion intensities, spectral counting benefits from extensive MS/MS data acquisition across the chromatographic time scale both for protein identification as well as protein quantification. However, the commonly employed dynamic exclusion of ions that have already been selected
for fragmentation is detrimental for accurate quantification [94]. Although very intuitive and attractive in practical terms, the spectrum counting approach is still controversial because it does not measure any direct physical property of a peptide. It further assumes that the linearity of response is the same for every protein. In fact, the spectrum count response is different for every peptide because, e.g., the chromatographic behavior (retention time, peak width) varies for every peptide. Therefore, even reasonable quantification requires the observation of many spectra for a given protein. Old et al. [94] have shown that although it is possible to detect threefold protein changes with as few as four spectra, this number increases exponentially for smaller changes (ca. 15 spectra for twofold). At the same time, saturation effects will be observed at higher spectral counts and saturation levels will be different for all proteins which renders the assessment of the dynamic range of observed changes difficult.

Nevertheless, the correlation between amount of protein and number of tandem mass spectra does hold and has led researchers to extend the concept to the estimation of absolute protein expression levels. In the first of a series of papers, Rappsilber et al. [95] computed a protein abundance index (PAI) by dividing the number of observed peptides by the number of all possible tryptic peptides from a particular protein that are within the mass range of the employed mass spectrometer. In a subsequent refinement, the same group transformed the PAI into an exponentially modified form (emPAI) [96] which showed a better correlation to known protein amounts. Further advances have been made by using computational models that predict which peptides of a given protein are likely to be detected by the mass spectrometer in the first place and thus would form a better basis for quantification [97–99, 66]. For example, results obtained by the absolute protein expression profiling (APEX) method [99] suggest that absolute protein expression can be determined to within the correct order of magnitude.

Label-free approaches are certainly the least accurate among the mass spectrometric quantification techniques when considering the overall experimental process because all the systematic and non-systematic variations between experiments are reflected in the obtained data (Fig. 2). Consequently, the number of experimental steps should be kept to a minimum and every effort should be made to control reproducibility at each step. Nonetheless, label-free quantification is worth considering for a number of reasons. In simple practical terms, the time-consuming steps of introducing a label into proteins or peptides can be omitted and there are no costs for labeling reagents. In terms of analytical strategy, the following points may also be important: (i) there is no limit in principle to the number of experiments that can be compared. This is certainly an advantage over stable isotope labeling techniques that are typically limited to 2–8 experiments that can be directly compared. (ii) Unlike for most stable isotope labeling techniques, mass spectral complexity (in terms of detected peptide species within a particular chromatographic time window) is not increased which, in turn, might provide for more analytical depth (i.e., number of detected peptides/proteins in an experiment) because the mass spectrometer is not occupied with fragmenting all forms of the labeled peptide. (iii) There is evidence that label-free methods provide higher dynamic range of quantification than stable isotope labeling (Table 1) and therefore may be advantageous when large and global protein changes between experiments are observed. However, particularly for spectral counting, this comes at the cost of unclear linearity and relatively poor accuracy [94].

Analysis of quantitative MS data

When contemplating a data analysis strategy for proteomic data generated by quantitative mass spectrometry, it is worth reconsidering a couple of principal points. Quantitative proteomic data are typically very complex, and often of variable quality. This is in part because the data are incomplete: even the most advanced mass spectrometers, which can acquire several tandem MS spectra per second, are often overwhelmed by the number of peptides present in a sample. As a consequence, only a subset of all proteins present can be identified in any one analysis [100]. For protein quantification, it is further mandatory to detect a protein in all experiments that should be compared. As a result, often only a subset of identified proteins can actually be quantified (Fig. 1) [92]. Identification and quantification rates are direct functions of sample complexity. While a large fraction of proteins present in, e.g., affinity purifications can be identified and quantified using a reasonable number of acquired spectra, a much smaller fraction of the content of whole proteome shotgun experiments will be covered and with fewer spectra for each protein. This clearly limits the confidence in quantification results.

These general considerations aside, practitioners of proteomics will soon face a number of practical challenges in analyzing quantitative mass spectrometric data: (i) quantitative readings must be extracted from MS or MS/MS spectra; (ii) peptide and protein identification must be performed; (iii) the two types of information must be merged and quality controlled; (iv) the applicable statistical methods have to be identified; and (v) the individual steps have to be combined into a workflow which bridges gaps between commercially available software and custom-built tools and which ideally also allows for automating most of the tasks (Fig. 4).
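The protein abundance index described in this section lends itself to a short illustration. The following is a minimal sketch, not the published implementation: PAI is computed exactly as stated above (observed peptides divided by observable tryptic peptides), while the exponential form emPAI = 10^PAI - 1 is an assumption taken from the commonly cited definition and should be checked against Ref. [96]. The function names and the example counts are hypothetical.

```python
def pai(n_observed: int, n_observable: int) -> float:
    """Protein abundance index: observed peptides / theoretically observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return n_observed / n_observable


def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified PAI, ASSUMING the form emPAI = 10**PAI - 1."""
    return 10.0 ** pai(n_observed, n_observable) - 1.0


# Hypothetical peptide counts for two proteins (illustrative values only).
for name, seen, possible in [("protein_A", 5, 10), ("protein_B", 12, 15)]:
    print(name, round(pai(seen, possible), 3), round(empai(seen, possible), 3))
```

In such a sketch the number of observable peptides would have to come from an in-silico tryptic digest restricted to the mass range of the instrument, which is not shown here.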
For protein quantification based on spectrum counting, the data processing steps are basically identical to the general protein identification workflow in proteomics which is one of the reasons why this approach has become so popular. Researchers can choose from a variety of methods available for automated protein identification and subsequent (probabilistic) validation of spectrum-to-peptide matches (for a recent review see Ref. [101]). It should be emphasized that for any quantification method it is mandatory to consider only those spectrum-to-peptide matches that are unique for a particular protein [11].

Extracting quantitative information from MS and MS/MS spectra

Quantification methods based on ion intensities, regardless of whether employing stable isotope labeling or not, require a number of additional steps prior to protein quantification (boxed area in Fig. 4). Two particular elements are important to mention here: intensity integration (i) within the mass spectrum (centroiding) and (ii) across the chromatographic peak. For low-resolution MS data, both aspects are carried out in one operation by extracting the ion chromatograms from the LC-MS data. For high-resolution MS data, the procedure is more complex and typically performed in two steps. Signal intensity integration within the mass spectrum can either utilize the intensity/area of the monoisotopic peak or the sum of the intensities/areas of all isotopomers of a peptide. Each method has its merits and detractions: monoisotopic peak integration is relatively straightforward to implement but not very sensitive particularly for larger peptides for which the monoisotopic peaks only constitute a minority of the total signal intensity. In addition, the use of heavy isotopes distorts the relative isotope distribution of peptides which leads to inaccuracies. In contrast, the summed area of the entire isotope cluster is the most sensitive and accurate method [102] as it utilizes all of the data but is more difficult to implement computationally. As discussed in a previous section, signal intensity integration over the chromatographic time scale is primarily required for label-free quantification as well as those stable isotope reagents that lead to significant differences in chromatographic behavior. For methods which do not suffer from this shortcoming, time integration can be performed but is not required. Instead, collection of several spectra for each peptide is generally useful in order to obtain several quantitative readings.

Quality control of raw MS data

There are several sources of potential error in the mass spectrometric readout of an LC-MS experiment that can negatively affect the results of peptide quantification. Spectra for which these errors are detected should be filtered out prior to computing quantification values. The first of these issues is the presence and variability of

Fig. 4 Generic data processing and analysis workflow for quantitative mass spectrometry. Yellow icons indicate steps common to all quantification approaches with or without the use of stable isotopes. Blue icons in the boxed area refer to extra steps required when using mass spectrometric signal intensity values for quantification
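The two integration steps in the boxed area of Fig. 4 (summing signal within an m/z window, then integrating across the chromatographic peak) can be illustrated with a small sketch. It assumes a deliberately simple data layout, a list of `(retention_time, [(mz, intensity), ...])` tuples, a fixed m/z tolerance and trapezoidal integration; all names and values are hypothetical and no particular vendor format or software package is implied.

```python
def extract_xic(scans, target_mz, tol=0.5):
    """Build an extracted ion chromatogram.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed intensity within target_mz +/- tol).
    """
    xic = []
    for rt, peaks in scans:
        signal = sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)
        xic.append((rt, signal))
    return xic


def peak_area(xic):
    """Integrate the chromatogram over retention time with the trapezoidal rule."""
    area = 0.0
    for (t0, y0), (t1, y1) in zip(xic, xic[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


# Three hypothetical survey scans; the peak at m/z 600.0 is an unrelated ion.
scans = [
    (0.0, [(500.2, 0.0)]),
    (1.0, [(500.3, 10.0), (600.0, 99.0)]),
    (2.0, [(500.1, 0.0)]),
]
xic = extract_xic(scans, target_mz=500.2)
print(xic, peak_area(xic))
```

A real implementation would additionally need peak boundary detection and, for high-resolution data, summation over the isotope cluster, both of which are omitted from this sketch.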
spectral background noise (Fig. 3b) which can be filtered out by most if not all available commercial and academic data processing packages. A second common issue is the presence of interfering signals other than background noise (Fig. 3b). For very complex peptide mixtures, these often constitute co-eluting peptides of very similar m/z values which in turn will render the correct assignment of signal intensities to particular peptide ions difficult. This is true for quantification in both MS and MS/MS spectra and such spectra should be removed from the analysis. Third, strong signal intensities can lead to detector saturation for some mass spectrometers (particularly quadrupole TOF instruments, Fig. 3c) which distorts the natural isotope intensity distribution and thus leads to false quantitative readings.

For stable isotope labeling, further quality criteria must be considered. One very simple and often incurred problem is systematic bias introduced by imperfections in mixing the two protein populations. Mixing errors can most of the time be determined experimentally and apply uniformly to all protein quantification values and are thus easily corrected for. A second systematic error is represented by the isotope purity of the employed labeling reagent which rarely exceeds 95–98%. Although this may not appear to be a significant source of uncertainty and, again, can be easily corrected for, isotope impurities lead to increased spectral interferences and, more importantly, limit the dynamic range of detectable differences between samples. A similar argument applies to incomplete incorporation of the isotope label into proteins and peptides. Again, while isotope incorporation can be measured and correction factors can be applied, the combination of the above items limits the dynamic range of detectable differences between samples to approximately 20–30:1. Consequently, determined changes are often smaller than their true values. It is important to keep in mind that this effect can be much more pronounced when spectral background contributes significantly to overall spectral intensity.

From spectra to relative protein quantification

For the spectrum counting approach, relative protein quantification between two or more samples is simply performed by comparing the respective numbers. If ten spectra are observed for a protein under condition 1 and 15 spectra under condition 2, the change between the two conditions is 1.5-fold. In contrast, for all approaches that measure signal intensities of peptide spectra, a quantitative reading is obtained for each spectrum. Obviously the accuracy

Fig. 5 a Distribution of measured changes from peptide spectra as a function of spectrum intensity for a single protein mixed in a 2:1 ratio. Diamonds represent intensity readings from individual spectra. The red line indicates the expected ratio of 2. It is evident that variations in change determination are much larger for low-intensity spectra than for medium- or high-intensity spectra. b Protein change determination by linear regression analysis. Diamonds represent intensity readings from individual spectra for samples 1 and 2 (same data as in a). The slope of the two-sided regression line approximates the expected twofold difference in protein quantity between the two samples. c Histogram showing the relationship between precision of quantification (expressed as relative standard deviation, RSD) and the number of observed peptide spectra for a given protein from replicate experiments. Not surprisingly, precision increases with increasing number of spectra. d Change distribution for approximately 1,000 proteins identified and quantified between two experimental conditions in a single experiment. Diamonds represent individual protein fold changes in ascending order. In the absence of replicate experiments, data points between yellow lines (arbitrarily set at 2σ) are typically not considered to change significantly. However, these data points may contain many false negatives (small but significant changes)
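The 2σ band mentioned for panel d of Fig. 5 can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used for the figure: fold changes are assumed to be log2-transformed before the mean and standard deviation are computed, the population standard deviation is used, and the protein names and values are hypothetical.

```python
import math
import statistics


def flag_outside_band(fold_changes, n_sigma=2.0):
    """Return the proteins whose log2 fold change lies outside mean +/- n_sigma * sd.

    fold_changes: dict mapping protein name -> fold change (ratio > 0).
    ASSUMPTION: the band is computed on log2 ratios from a single experiment.
    """
    logs = {name: math.log2(fc) for name, fc in fold_changes.items()}
    mu = statistics.mean(logs.values())
    sigma = statistics.pstdev(logs.values())
    return sorted(name for name, v in logs.items() if abs(v - mu) > n_sigma * sigma)


# Twenty hypothetical unchanged proteins plus one strongly up- and one down-regulated.
example = {f"P{i}": 1.0 for i in range(20)}
example.update({"up": 8.0, "down": 0.125})
print(flag_outside_band(example))
```

As the caption itself cautions, such a cut-off is arbitrary and proteins inside the band may still change by small but significant amounts.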
of the protein quantification is determined by the accuracy of each peptide (spectrum) determination. The resulting data are spectrum-related quantity measures of varying precision. As an experiment typically produces a number of spectra per protein, these measurements have to be aggregated in a way that returns the best (i.e., most precise) protein quantification measure. Most publications to date rely on simple averaging of ratios [103], but as exemplified in Fig. 5a, variation of change determination is a function of signal intensity. Thus, low-intensity or noisy data may easily distort the mean value of computed ratios [104]. To overcome this problem, intensity thresholds have been employed [65]. However, these mostly arbitrary thresholds may also lead to arbitrary reduction of proteins that can be quantified. As an alternative, results can be improved either by calculation of an intensity weighted average, by summing up of all measured quantities followed by calculation of the protein ratio [103, 75], or by calculating a linear regression (allowing for two degrees of freedom) to determine the protein ratio (Fig. 5b) [105]. Apart from mass spectrometric signal strength, accuracy of quantification also benefits from the availability of multiple spectra for a given protein (Fig. 5c).

Statistical analysis of experimental data

Proteomic experiments comparing a number of states of a biological system typically generate complex data. An understanding of the experimental setup and the nature and quality of the obtained data is required to devise appropriate statistical methods. Experiments typically fall into two distinct categories: either the interrelation between a protein’s abundance (or another property) and a certain sample condition is examined or the interaction between proteins is analyzed. Table 2 lists examples of such questions and some appropriate statistical strategies that have been applied to answer them. The detection of protein abundance changes is discussed in more detail below as it represents one of the major applications of proteomics. Most of the available statistical methods have previously been applied to gene expression analysis but can often also be applied to quantitative MS data. However, the required data preparation steps such as normalization might be significantly different.

Data preparation

Raw data from quantitative MS experiments are generally not suitable for statistical analysis, thus a number of preparative steps are required. First, raw data are typically not normally distributed, an assumption made by many statistical tests. Therefore, data are frequently log-transformed assuming that the data are lognormal-distributed. This operation typically also harmonizes the variance of data (otherwise high values would have large variances and vice versa). If replicates of the experiment have been generated, normalization of their data is mandatory because technical bias may overshadow the underlying biological effects (for details on normalization techniques, see Refs. [106–108]). As discussed above, technical effects include sample mixing errors, incomplete isotope incorporation, or isotope impurity. In many cases, systematic technical bias can be measured directly but in some cases requires dedicated experimentation (e.g., by a label swap experiment [109, 110]) to determine its source. The resulting information is used to build correction functions that are consecutively applied to the data. It should be noted that it is very likely that not all manifesting sources of systematic error have been described yet or that these are not readily amenable to determination (e.g., background contribution in iTRAQ experiments). It can be expected though that with the rapid evolution of proteomic technologies, many of these yet unknown sources of error will be uncovered and the resulting insights subsequently used to refine the data which, in turn, increases data quality.

Another challenge to a statistical treatment of proteomic data is the mostly random sequencing of peptides by the mass spectrometer. As a result, not every available peptide is identified in every experiment. This effect is more pronounced for peptides of low abundance and poor detectability, resulting in many missing values in an experiment.
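The aggregation strategies named under "From spectra to relative protein quantification" (simple averaging of ratios, an intensity weighted average, the ratio of summed intensities, and a regression slope) can be compared on a toy data set. The sketch below is illustrative only: the weights (summed intensity of both channels) and the use of ordinary least squares with a free intercept are assumptions, since the text does not specify either, and the intensity pairs are invented so that two intense spectra report 2:1 while one weak, noisy spectrum reports 5:1.

```python
def mean_of_ratios(pairs):
    """Simple average of per-spectrum ratios (sample / reference)."""
    return sum(s / r for r, s in pairs) / len(pairs)


def intensity_weighted_ratio(pairs):
    """Average of ratios weighted by summed intensity (ASSUMED weighting scheme)."""
    weights = [r + s for r, s in pairs]
    return sum(w * s / r for w, (r, s) in zip(weights, pairs)) / sum(weights)


def ratio_of_sums(pairs):
    """Sum all intensities per channel first, then form a single ratio."""
    return sum(s for _, s in pairs) / sum(r for r, _ in pairs)


def regression_slope(pairs):
    """Ordinary least-squares slope of sample vs. reference with a free intercept."""
    n = len(pairs)
    mx = sum(r for r, _ in pairs) / n
    my = sum(s for _, s in pairs) / n
    sxy = sum((r - mx) * (s - my) for r, s in pairs)
    sxx = sum((r - mx) ** 2 for r, _ in pairs)
    return sxy / sxx


# (reference intensity, sample intensity) per spectrum; hypothetical values.
pairs = [(1000.0, 2000.0), (500.0, 1000.0), (10.0, 50.0)]
for fn in (mean_of_ratios, intensity_weighted_ratio, ratio_of_sums, regression_slope):
    print(fn.__name__, round(fn(pairs), 3))
```

On these invented numbers the plain mean is pulled towards the noisy low-intensity reading, whereas the other three estimates stay close to the 2:1 ratio of the intense spectra, which is the behavior the text attributes to Fig. 5a and 5b.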

Table 2 Statistical methods for proteomics

Category                           Question                                                            Analysis suggestions

Protein change between conditions  Does a protein behave significantly different between two samples?  Multiple hypothesis testing
                                   Does a protein exhibit time-dependent change?                       Analysis of variance (ANOVA)
                                   Is the sample a member of a defined class of samples?               Classification methods (e.g., linear discriminant analysis, support vector machines)
Dependencies between proteins      Which proteins behave similarly in the experiment?                  Cluster analysis
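The data preparation steps described above (log transformation followed by correction of a uniform technical bias such as a mixing error) can be sketched as follows. Median centering is used here as one simple normalization choice; it is an assumption of this example, not a method prescribed by the text, and it presumes that most proteins do not change between the two samples. The ratios are hypothetical.

```python
import math
import statistics


def log2_ratios(ratios):
    """Log-transform ratios so that up- and down-regulation become symmetric."""
    return [math.log2(r) for r in ratios]


def median_center(log_ratios):
    """Subtract the median log ratio to remove a uniform bias (e.g., a mixing error)."""
    m = statistics.median(log_ratios)
    return [v - m for v in log_ratios]


# Hypothetical protein ratios from a sample pair mixed 1.5:1 instead of 1:1.
raw = [1.5, 3.0, 0.75, 1.5, 1.5]
corrected = median_center(log2_ratios(raw))
print([round(v, 3) for v in corrected])
```

After centering, the unchanged proteins sit at a log2 ratio of zero and the two regulated proteins appear as symmetric twofold changes.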
However, statistical methods often require complete data. In such cases, missing values may be estimated by, e.g., averaging available values of the protein from other replicates or using related values from other proteins from the same experiment. It should be noted though that estimating values inevitably results in decreased statistical power [111, 112].

Values that are grossly different from comparable observations (outliers) require special attention. They can either indicate a true observation of a particular peptide species, e.g., a regulated post-translational modification, or a false reading. In both cases, these data points should initially be excluded from the calculation of protein quantities but not categorically rejected. A common way to spot outliers is visual inspection by the investigator, leaving considerable room for subjective judgement. During calculation of protein values from individual spectra by linear regression (see above) outlier detection on the spectrum level is possible using established methods [113, 114] but may result in loss of valuable data. For data correction at the protein level, methods for multivariate data can also be adapted [115, 116].

Detection of differential protein expression

It is not uncommon that publications reporting results of proteomic experiments using quantitative mass spectrometry base conclusions on measurements generated in one or two experiments. This is understandable given the often limited availability of specimens as well as the cost and time required to perform and analyze these samples. However, in light of the often considerable experimental variation, it is likely that those studies will not realise their full potential. For example, the graph shown in Fig. 5d represents a rank order list of the observed changes between two experiments for approximately 1,000 proteins. Proteins at the extremes of the distribution change the most and are therefore often considered to be the most interesting. While this might often be true when these observations are backed by many spectra indicating this change, there are two important caveats. In this representation, small but potentially significant changes go unnoticed (false negatives) and, in the absence of repeating the experiment, there is no way of assessing if the observed large protein changes that are backed by few spectral observations can be reproduced (false positives). Even small numbers of repetitions can increase confidence in the results considerably. In addition, the use of statistical testing methods adds options to determine the probability of false decisions. A typical situation is the comparison of protein levels between two different samples with the goal to detect those proteins that are significantly changed between conditions. This biological question can be formulated as a problem in multiple hypothesis testing that describes a simultaneous test for each protein on the null hypothesis of no change in protein measure between the two conditions. A standard approach to such a multiple testing problem consists of two aspects: (i) computing a test statistic and (ii) applying a multiple testing procedure to determine which hypothesis to reject (change or no change) while controlling a defined false positive error rate [117]. Computing the test statistic for each protein can be carried out, e.g., by employing the frequently used t-test. This test expects the data to be normally distributed, an assumption that is not always justified and requires a significant number of replicates in order to return reliable results (Table 3). For lower replication numbers (2–3) the so-called local-pooled-error test (LPE) has been found to be useful provided that protein changes are not too small [118–120]. For data with unknown distribution characteristics, non-parametric tests can also be used that are agnostic towards the data’s distribution but come at the expense of statistical power [121].

In the proteomics case where many proteins are tested simultaneously, the probability of committing an error increases often dramatically. For example, when considering a list of hundreds of proteins at a defined error rate of, e.g., 0.01, it is likely that several false positives will occur by chance. However, when setting the thresholds too

Table 3 Characteristics and applications of statistical tests

Test                 Requirements                                        Statistical power  Application

Tests for experiments with replicates
t-test               Replications, n>3; data normally distributed        +++                All quantitation methods
LPE-test             Replications, n>1; 2–3 replicates; strong changes   ++                 All quantitation methods

Tests for experiments without replicates
G-test               (Very) large number of peptide spectra              +                  Spectrum counting
Fisher’s exact test  (Very) large number of peptide spectra              +                  Spectrum counting
AC-test              (Very) large number of peptide spectra              +                  Spectrum counting
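To make the G-test entry of Table 3 more concrete, the following sketch applies it to spectrum counts of one protein in two conditions without replicates. It is a minimal illustration under stated assumptions: the counts are arranged as a 2 × 2 table (protein vs. all other spectra, condition 1 vs. condition 2), the statistic G = 2 Σ O ln(O/E) is compared with a chi-square distribution with one degree of freedom, and no continuity correction is applied. The function name and the counts are hypothetical.

```python
import math


def g_test_2x2(count1, total1, count2, total2):
    """G-test of independence for one protein's spectrum counts in two conditions.

    count1, count2: spectra assigned to the protein in condition 1 and 2.
    total1, total2: total spectrum counts of the two experiments.
    Returns (G statistic, p-value for 1 degree of freedom).
    """
    observed = [[count1, total1 - count1], [count2, total2 - count2]]
    row_sums = [total1, total2]
    col_sums = [count1 + count2, (total1 - count1) + (total2 - count2)]
    grand = total1 + total2
    g = 0.0
    for r in range(2):
        for c in range(2):
            o = observed[r][c]
            e = row_sums[r] * col_sums[c] / grand
            if o > 0:  # empty cells contribute nothing to the sum
                g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative rounding errors
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Hypothetical counts: 10 of 1,000 spectra vs. 40 of 1,000 spectra.
print(g_test_2x2(10, 1000, 40, 1000))
```

Because every protein in an experiment would be tested in this way, the resulting p-values would still need to be adjusted for multiple testing.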
conservatively to minimize the false positive rate (i.e., the rate that truly null features are called significant), this often leads to an unacceptable increase in the false negative rate (i.e., the rate that truly significant features are called null). Commonly used alternative measures of error rates in multiple testing procedures are the family-wise error rate (FWER; i.e., the probability that at least one truly null feature is called significant among all tests) and the false discovery rate (FDR; i.e., the rate that features called significant are truly null) which break up the direct dependency between false positive and false negative rates. Instead of simply reporting rejection or acceptance of the specified hypothesis using these methods, a p-value connected to the test can be defined which describes the significance of a test as the smallest possible significance level at which the null hypothesis would be rejected. Various procedures for deriving adjusted p-values for multiple hypothesis testing have been suggested, e.g., the Bonferroni adjusted p-value for FWER and the q-value for FDR [122]. q-Values have since also been adopted in proteomics research [123, 124]. A detailed overview of multiple hypothesis testing has been given by Dudoit and co-workers [125].

Sampling statistics

For a number of proteomic applications, sampling statistics (e.g., spectrum count, peptide count, sequence coverage) shows increasing potential. Zhang and co-workers [120] recently compared the aforementioned three approaches and found that the spectrum counting approach offered the greatest reproducibility. This is probably not surprising given that this approach generates many more data points than peptide counting or measuring sequence coverage. In addition, this paper explores a number of statistical methods for data analysis. For experiments that feature three or more replicates of each condition, statistical difference can be assessed by the t-test as described above. However, if repetitions are not available, other statistical options have to be considered. To that end, tests may be applicable that attempt to mimic replicates by pooling certain features. For example, for each detected protein, spectral counts from a pair-wise experiment can be arranged in a two-way table (proteins vs. conditions). A protein is then called differentially expressed if its proportion of spectrum counts to the total spectrum count in the experiment is significantly different between both conditions. There are a number of possible statistical tests using different hypotheses for this approach (Table 3, bottom). The authors of the aforementioned paper conclude that Fisher’s exact test, the AC-test, and the G-test return comparable results. However, the G-test is computationally simpler and can be generalized for multi-condition experiments and thus may be the more versatile approach. Results typically improve with increased sampling (total number of spectrum counts in an experiment). Despite the fact that the commonly used dynamic exclusion option during LC-MS analysis violates random sampling, Zhang et al. showed that the approach can be generally useful [120].

In contrast to statistical estimation, the performance of a chosen statistical test can often also be assessed experimentally by means other than multiple repetitions. One way of measuring errors directly and under the same analytical conditions is to offset the measurement of a particular sample to a dilution of the very same sample [126]. Also, spiked proteins have been used to generate reference data for a set of proteins with known behavior that can be utilized for ‘calibrating’ an experiment type [92]. Once the statistical parameters have been learned, these may be applied to subsequent experiments without the need for repetition. Although the statistical power of such approaches is lower than that of approaches based on multiple repetitions of the same experiment, the former may be sufficient particularly for samples of low protein complexity (e.g., affinity purifications). Further assessment of data significance may be provided by curve fitting methods (e.g., the LOWESS fit) which can reveal regions of random experimental error in the observed dataset [123].

Concluding remarks

A multitude of methods has emerged for the analysis of simple and complex (sub-)proteomes using quantitative mass spectrometry, and the field is beginning to learn for which type of study these methods can be meaningfully applied. However, significant further improvements to experimental strategies are required particularly for the quantitative analysis of post-translational modifications. It is probably fair to say that the field is still far from being able to generate quantitative proteomic data at a scale which would allow the comprehensive investigation of a biological phenomenon. At the same time, the recent exponential increase in data volume and complexity demands the development of appropriate statistical approaches in order to arrive at meaningful interpretations of the results. This can only be achieved if the influence of the employed technologies on the results obtained is well understood and by ensuring that experimental design follows the biological context so that the ‘right statistics’ can be developed for the problem at hand in order to generate scientific insight.

Acknowledgements The authors wish to thank David Simmons and Ulrich Kruse for critically reading the manuscript and Frank Weisbrodt for help with preparing the figures. We are grateful to Nature Publishing Group for granting permission to reproduce and adapt previously published material.
References 36. Qiu Y, Sousa EA, Hewick RM, Wang JH (2002) Anal Chem
74:4969–4979
37. Sebastiano R, Citterio A, Lapadula M, Righetti PG (2003) Rapid
1. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R Commun Mass Spectrom 17:2380–2386
(1999) Nat Biotechnol 17:994–999 38. Schmidt A, Kellermann J, Lottspeich F (2005) Proteomics 5:
2. Oda Y, Huang K, Cross FR, Cowburn D, Chait BT (1999) Proc 4–15
Natl Acad Sci U S A 96:6591–6596 39. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K,
3. Pasa-Tolic L, Jensen PK, Anderson GA, Lipton MS, Peden KK, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, Purkayastha
Martinovic S, Tolic N, Bruce JE, Smith RD (1999) J Am Chem S, Juhasz P, Martin S, Bartlet-Jones M, He F, Jacobson A,
Soc 121:7949–7950 Pappin DJ (2004) Mol Cell Proteomics 3:1154–1169
4. Aebersold R, Mann M (2003) Nature 422:198–207 40. Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, Schmidt
5. Gygi SP, Rist B, Aebersold R (2000) Curr Opin Biotechnol G, Neumann T, Johnstone R, Mohammed AK, Hamon C (2003)
11:396–401 Anal Chem 75:1895–1904
6. Heck AJ, Krijgsveld J (2004) Expert Rev Proteomics 1:317–326 41. Ji J, Chakraborty A, Geng M, Zhang X, Amini A, Bina M,
Regnier F (2000) J Chromatogr B Biomed Sci Appl 745:197–210
7. Ong SE, Foster LJ, Mann M (2003) Methods 29:124–130
8. Righetti PG, Campostrini N, Pascali J, Hamdan M, Astner H (2004) Eur J Mass Spectrom (Chichester, Eng) 10:335–348
9. Sechi S, Oda Y (2003) Curr Opin Chem Biol 7:70–77
10. Tao WA, Aebersold R (2003) Curr Opin Biotechnol 14:110–118
11. Ong SE, Mann M (2005) Nat Chem Biol 1:252–262
12. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M (2002) Mol Cell Proteomics 1:376–386
13. Krijgsveld J, Ketting RF, Mahmoudi T, Johansen J, Artal-Sanz M, Verrijzer CP, Plasterk RH, Heck AJ (2003) Nat Biotechnol 21:927–931
14. Wu CC, MacCoss MJ, Howell KE, Matthews DE, Yates JR III (2004) Anal Chem 76:4951–4959
15. Gruhler A, Schulze WX, Matthiesen R, Mann M, Jensen ON (2005) Mol Cell Proteomics 4:1697–1709
16. Ong SE, Kratchmarova I, Mann M (2003) J Proteome Res 2:173–181
17. Blagoev B, Ong SE, Kratchmarova I, Mann M (2004) Nat Biotechnol 22:1139–1145
18. Park KS, Mohapatra DP, Misonou H, Trimmer JS (2006) Science 313:976–979
19. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M (2006) Cell 127:635–648
20. Yao X, Freas A, Ramirez J, Demirev PA, Fenselau C (2001) Anal Chem 73:2836–2842
21. Reynolds KJ, Yao X, Fenselau C (2002) J Proteome Res 1:27–33
22. Rose K, Simona MG, Offord RE, Prior CP, Otto B, Thatcher DR (1983) Biochem J 215:273–277
23. Miyagi M, Rao KC (2007) Mass Spectrom Rev 26:121–136
24. Rao KC, Carruth RT, Miyagi M (2005) J Proteome Res 4:507–514
25. Schnolzer M, Jedrzejewski P, Lehmann WD (1996) Electrophoresis 17:945–953
26. Johnson KL, Muddiman DC (2004) J Am Soc Mass Spectrom 15:437–445
27. Ramos-Fernandez A, Lopez-Ferrer D, Vazquez J (2007) Mol Cell Proteomics 6(7):1274–1286
28. Oda Y, Owa T, Sato T, Boucher B, Daniels S, Yamanaka H, Shinohara Y, Yokoi A, Kuromitsu J, Nagasu T (2003) Anal Chem 75:2159–2165
29. Hansen KC, Schmitt-Ulms G, Chalkley RJ, Hirsch J, Baldwin MA, Burlingame AL (2003) Mol Cell Proteomics 2:299–314
30. Li J, Steen H, Gygi SP (2003) Mol Cell Proteomics 2:1198–1204
31. Yi EC, Li XJ, Cooke K, Lee H, Raught B, Page A, Aneliunas V, Hieter P, Goodlett DR, Aebersold R (2005) Proteomics 5:380–387
32. Shen M, Guo L, Wallace A, Fitzner J, Eisenman J, Jacobson E, Johnson RS (2003) Mol Cell Proteomics 2:315–324
33. Pasquarello C, Sanchez JC, Hochstrasser DF, Corthals GL (2004) Rapid Commun Mass Spectrom 18:117–127
34. Shi Y, Xiang R, Crawford JK, Colangelo CM, Horvath C, Wilkins JA (2004) J Proteome Res 3:104–111
35. Shi Y, Xiang R, Horvath C, Wilkins JA (2005) J Proteome Res 4:1427–1433
42. Che FY, Fricker LD (2002) Anal Chem 74:3190–3198
43. Zhang X, Jin QK, Carr SA, Annan RS (2002) Rapid Commun Mass Spectrom 16:2325–2332
44. Glocker MO, Borchers C, Fiedler W, Suckau D, Przybylski M (1994) Bioconjug Chem 5:583–590
45. Mason DE, Liebler DC (2003) J Proteome Res 2:265–272
46. Lee YH, Han H, Chang SB, Lee SW (2004) Rapid Commun Mass Spectrom 18:3019–3027
47. Hsu JL, Huang SY, Chow NH, Chen SH (2003) Anal Chem 75:6843–6852
48. Ji C, Guo N, Li L (2005) J Proteome Res 4:2099–2108
49. Hsu JL, Huang SY, Chen SH (2006) Electrophoresis 27:3652–3660
50. Goodlett DR, Keller A, Watts JD, Newitt R, Yi EC, Purvine S, Eng JK, von Haller P, Aebersold R, Kolker E (2001) Rapid Commun Mass Spectrom 15:1214–1221
51. Syka JE, Marto JA, Bai DL, Horning S, Senko MW, Schwartz JC, Ueberheide B, Garcia B, Busby S, Muratore T, Shabanowitz J, Hunt DF (2004) J Proteome Res 3:621–626
52. Salomon AR, Ficarro SB, Brill LM, Brinker A, Phung QT, Ericson C, Sauer K, Brock A, Horn DM, Schultz PG, Peters EC (2003) Proc Natl Acad Sci U S A 100:443–448
53. Goshe MB, Conrads TP, Panisko EA, Angell NH, Veenstra TD, Smith RD (2001) Anal Chem 73:2578–2586
54. Goshe MB, Veenstra TD, Panisko EA, Conrads TP, Angell NH, Smith RD (2002) Anal Chem 74:607–616
55. Qian WJ, Goshe MB, Camp DG, Yu LR, Tang K, Smith RD (2003) Anal Chem 75:5441–5450
56. Tao WA, Wollscheid B, O’Brien R, Eng JK, Li XJ, Bodenmiller B, Watts JD, Hood L, Aebersold R (2005) Nat Methods 2:591–598
57. Zhang H, Li XJ, Martin DB, Aebersold R (2003) Nat Biotechnol 21:660–666
58. Wiese S, Reidegeld KA, Meyer HE, Warscheid B (2007) Proteomics 7:1004
59. Desiderio DM, Kai M (1983) Biomed Mass Spectrom 10:471–479
60. Gerber SA, Rush J, Stemman O, Kirschner MW, Gygi SP (2003) Proc Natl Acad Sci U S A 100:6940–6945
61. Pan S, Zhang H, Rush J, Eng J, Zhang N, Patterson D, Comb MJ, Aebersold R (2005) Mol Cell Proteomics 4:182–190
62. Kirkpatrick DS, Gerber SA, Gygi SP (2005) Methods 35:265–273
63. Beynon RJ, Doherty MK, Pratt JM, Gaskell SJ (2005) Nat Methods 2:587–589
64. Kito K, Ota K, Fujita T, Ito T (2007) J Proteome Res 6:792–800
65. Wolf-Yadlin A, Hautaniemi S, Lauffenburger DA, White FM (2007) Proc Natl Acad Sci U S A 104:5860–5865
66. Mallick P, Schirle M, Chen SS, Flory MR, Lee H, Martin D, Ranish J, Raught B, Schmitt R, Werner T, Kuster B, Aebersold R (2007) Nat Biotechnol 25:125–131
67. Zhang R, Sioma CS, Wang S, Regnier FE (2001) Anal Chem 73:5142–5149
68. Zhang R, Regnier FE (2002) J Proteome Res 1:139–147
69. Roepstorff P, Fohlman J (1984) Biomed Mass Spectrom 11:601
70. Belov ME, Rakov VS, Nikolaev EN, Goshe MB, Anderson GA, Smith RD (2003) Rapid Commun Mass Spectrom 17:627–636
71. Olsen JV, de Godoy LM, Li G, Macek B, Mortensen P, Pesch R, Makarov A, Lange O, Horning S, Mann M (2005) Mol Cell Proteomics 4:2010–2021
72. Zubarev R, Mann M (2007) Mol Cell Proteomics 6:377–381
73. Venable JD, Wohlschlegel J, McClatchy DB, Park SK, Yates JR III (2007) Anal Chem 79:3056–3064
74. Bondarenko PV, Chelius D, Shaler TA (2002) Anal Chem 74:4741–4749
75. Ono M, Shitashige M, Honda K, Isobe T, Kuwabara H, Matsuzuki H, Hirohashi S, Yamada T (2006) Mol Cell Proteomics 5:1338–1347
76. Chelius D, Bondarenko PV (2002) J Proteome Res 1:317–323
77. Wang W, Zhou H, Lin H, Roy S, Shaler TA, Hill LR, Norton S, Kumar P, Anderle M, Becker CH (2003) Anal Chem 75:4818–4826
78. Wiener MC, Sachs JR, Deyanova EG, Yates NA (2004) Anal Chem 76:6085–6096
79. Higgs RE, Knierman MD, Gelfanova V, Butler JP, Hale JE (2005) J Proteome Res 4:1442–1450
80. Wang G, Wu WW, Zeng W, Chou CL, Shen RF (2006) J Proteome Res 5:1214–1223
81. Bylund D, Danielsson R, Malmquist G, Markides KE (2002) J Chromatogr A 961:237–244
82. Wang P, Tang H, Fitzgibbon MP, McIntosh M, Coram M, Zhang H, Yi E, Aebersold R (2007) Biostatistics 8:357–367
83. Jaitly N, Monroe ME, Petyuk VA, Clauss TR, Adkins JN, Smith RD (2006) Anal Chem 78:7397–7409
84. Strittmatter EF, Ferguson PL, Tang K, Smith RD (2003) J Am Soc Mass Spectrom 14:980–991
85. Silva JC, Denny R, Dorschel CA, Gorenstein M, Kass IJ, Li GZ, McKenna T, Nold MJ, Richardson K, Young P, Geromanos S (2005) Anal Chem 77:2187–2200
86. Zimmer JS, Monroe ME, Qian WJ, Smith RD (2006) Mass Spectrom Rev 25:450–482
87. Bateman RH, Carruthers R, Hoyes JB, Jones C, Langridge JI, Millar A, Vissers JP (2002) J Am Soc Mass Spectrom 13:792–803
88. Nakamura T, Dohmae N, Takio K (2004) Proteomics 4:2558–2566
89. Niggeweg R, Kocher T, Gentzel M, Buscaino A, Taipale M, Akhtar A, Wilm M (2006) Proteomics 6:41–53
90. Silva JC, Denny R, Dorschel C, Gorenstein MV, Li GZ, Richardson K, Wall D, Geromanos SJ (2006) Mol Cell Proteomics 5:589–607
91. Washburn MP, Wolters D, Yates JR III (2001) Nat Biotechnol 19:242–247
92. Liu H, Sadygov RG, Yates JR III (2004) Anal Chem 76:4193–4201
93. Gilchrist A, Au CE, Hiding J, Bell AW, Fernandez-Rodriguez J, Lesimple S, Nagaya H, Roy L, Gosline SJ, Hallett M, Paiement J, Kearney RE, Nilsson T, Bergeron JJ (2006) Cell 127:1265–1281
94. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, Mendoza A, Sevinsky JR, Resing KA, Ahn NG (2005) Mol Cell Proteomics 4:1487–1502
95. Rappsilber J, Ryder U, Lamond AI, Mann M (2002) Genome Res 12:1231–1245
96. Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M (2005) Mol Cell Proteomics 4:1265–1272
97. Craig R, Cortens JP, Beavis RC (2005) Rapid Commun Mass Spectrom 19:1844–1850
98. Tang H, Arnold RJ, Alves P, Xun Z, Clemmer DE, Novotny MV, Reilly JP, Radivojac P (2006) Bioinformatics 22:e481–e488
99. Lu P, Vogel C, Wang R, Yao X, Marcotte EM (2007) Nat Biotechnol 25:117–124
100. Aebersold R (2003) Nature 422:115–116
101. Nesvizhskii AI (2006) Methods Mol Biol 367:87–120
102. Chalkley RJ, Hansen KC, Baldwin MA (2005) Methods Enzymol 402:289–312
103. Saito A, Nagasaki M, Oyama M, Kozuka-Hata H, Semba K, Sugano S, Yamamoto T, Miyano S (2007) BMC Bioinformatics 8:15
104. Carrillo B, Yanofsky C, Boismenu D, Latterich M, Kearney RE (2006) Statistical limits of isotopic/isobaric quantification in counting detectors. Proceedings of the American Society for Mass Spectrometry 174
105. Parish RC (1989) Ann Pharmacother 23:891–989
106. Li C, Wong WH (2001) Proc Natl Acad Sci U S A 98:31–36
107. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP (2002) Nucleic Acids Res 30:e15
108. Kreil DP, Karp NA, Lilley KS (2004) Bioinformatics 20:2026–2034
109. Wang YK, Ma Z, Quinn DF, Fu EW (2001) Anal Chem 73:3742–3750
110. Wang YK, Ma Z, Quinn DF, Fu EW (2002) Rapid Commun Mass Spectrom 16:1389–1397
111. Jung K, Gannoun A, Stühler K, Sitek B, Meyer HE, Urfer W (2005) RevStat-Statistical Journal 3:99–111
112. Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB (2001) Bioinformatics 17:520–525
113. Ellenberg JH (1976) Biometrics 32:637–645
114. Wisnowski JW, Montgomery DC, Simpson JR (2001) Comput Stat Data Anal 36:351–382
115. Zhao HY, Yue PY, Fang KT (2004) J Biopharm Stat 14:629–646
116. Egan WJ, Morgan SL (1998) Anal Chem 70:2372–2379
117. Dudoit S, van der Laan MJ, Pollard KS (2004) Stat Appl Genet Mol Biol 3:Article13
118. Jain N, Thatte J, Braciale T, Ley K, O’Connell M, Lee JK (2003) Bioinformatics 19:1945–1951
119. Jain N, Cho H, O’Connell M, Lee JK (2005) BMC Bioinformatics 6:187
120. Zhang B, VerBerkmoes NC, Langston MA, Uberbacher E, Hettich RL, Samatova NF (2006) J Proteome Res 5:2909–2918
121. Dudbridge F, Gusnanto A, Koeleman BP (2006) Hum Genomics 2:310–317
122. Storey JD, Tibshirani R (2003) Proc Natl Acad Sci U S A 100:9440–9445
123. Xia Q, Wang T, Park Y, Lamont RJ, Hackett M (2007) Int J Mass Spectrom 259:105–116
124. Hendrickson EL, Xia Q, Wang T, Leigh JA, Hackett M (2006) Analyst 131:1335–1341
125. Dudoit S, Shaffer JP, Boldrick JC (2003) Stat Sci 18:71–103
126. Rinner O, Mueller LN, Hubalek M, Muller M, Gstaiger M, Aebersold R (2007) Nat Biotechnol 25:345–352
