Computational and Statistical Methods for Protein Quantification by Mass Spectrometry

Hardcover, English, 2013

By Ingvar Eidhammer, Harald Barsnes, Geir Egil Eide, Lennart Martens

The definitive introduction to data analysis in quantitative proteomics

This book provides all the necessary knowledge about mass spectrometry-based proteomics methods and about the computational and statistical approaches needed to plan, design and analyze quantitative proteomics experiments. The authors’ carefully constructed approach allows readers to easily make the transition into the field of quantitative proteomics. Through detailed descriptions of wet-lab methods, computational approaches and statistical tools, this book covers the full scope of a quantitative experiment, allowing readers to acquire new knowledge while also serving as a useful reference work for more advanced readers.

Computational and Statistical Methods for Protein Quantification by Mass Spectrometry:

Introduces the use of mass spectrometry in protein quantification and shows how the bioinformatics challenges in this field can be solved using statistical methods and various software programs.
Is illustrated by a large number of figures and examples as well as numerous exercises.
Provides both clear and rigorous descriptions of methods and approaches.
Is thoroughly indexed and cross-referenced, combining the strengths of a textbook with the utility of a reference work.
Features detailed discussions of both wet-lab approaches and statistical and computational methods.

With clear and thorough descriptions of the various methods and approaches, this book is accessible to biologists, informaticians, and statisticians alike and is aimed at readers across the academic spectrum, from advanced undergraduate students to postdoctoral researchers entering the field.

Product information

Publication date: 2013-01-04
Dimensions: 160 x 239 x 23 mm
Weight: 576 g
Format: Hardcover
Language: English
Number of pages: 360
Publisher: John Wiley & Sons Inc
ISBN: 9781119964001

Belongs to the following categories

Mathematical statistics in Science and technology
Analytical chemistry in Science and technology
Biology in Science and technology
Biochemistry in Science and technology

Preface
Terminology
Acknowledgements
1 Introduction
1.1 The composition of an organism
1.1.1 A simple model of an organism
1.1.2 Composition of cells
1.2 Homeostasis, physiology, and pathology
1.3 Protein synthesis
1.4 Site, sample, state, and environment
1.5 Abundance and expression – protein and proteome profiles
1.5.1 The protein dynamic range
1.6 The importance of exact specification of sites and states
1.6.1 Biological features
1.6.2 Physiological and pathological features
1.6.3 Input features
1.6.4 External features
1.6.5 Activity features
1.6.6 The cell cycle
1.7 Relative and absolute quantification
1.7.1 Relative quantification
1.7.2 Absolute quantification
1.8 In vivo and in vitro experiments
1.9 Goals for quantitative protein experiments
1.10 Exercises
2 Correlations of mRNA and protein abundances
2.1 Investigating the correlation
2.2 Codon bias
2.3 Main results from experiments
2.4 The ideal case for mRNA-protein comparison
2.5 Exploring correlation across genes
2.6 Exploring correlation within one gene
2.7 Correlation across subsets
2.8 Comparing mRNA and protein abundances across genes from two situations
2.9 Exercises
2.10 Bibliographic notes
3 Protein level quantification
3.1 Two-dimensional gels
3.1.1 Comparing results from different experiments – DIGE
3.2 Protein arrays
3.2.1 Forward arrays
3.2.2 Reverse arrays
3.2.3 Detection of binding molecules
3.2.4 Analysis of protein array readouts
3.3 Western blotting
3.4 ELISA – Enzyme-Linked Immunosorbent Assay
3.5 Bibliographic notes
4 Mass spectrometry and protein identification
4.1 Mass spectrometry
4.1.1 Peptide mass fingerprinting (PMF)
4.1.2 MS/MS – tandem MS
4.1.3 Mass spectrometers
4.2 Isotope composition of peptides
4.2.1 Predicting the isotope intensity distribution
4.2.2 Estimating the charge
4.2.3 Revealing isotope patterns
4.3 Presenting the intensities – the spectra
4.4 Peak intensity calculation
4.5 Peptide identification by MS/MS spectra
4.5.1 Spectral comparison
4.5.2 Sequential comparison
4.5.3 Scoring
4.5.4 Statistical significance
4.6 The protein inference problem
4.6.1 Determining maximal explanatory sets
4.6.2 Determining minimal explanatory sets
4.7 False discovery rate for the identifications
4.7.1 Constructing the decoy database
4.7.2 Separate or composite search
4.8 Exercises
4.9 Bibliographic notes
5 Protein quantification by mass spectrometry
5.1 Situations, protein, and peptide variants
5.1.1 Situation
5.1.2 Protein variants – peptide variants
5.2 Replicates
5.3 Run – experiment – project
5.3.1 LC-MS/MS run
5.3.2 Quantification run
5.3.3 Quantification experiment
5.3.4 Quantification project
5.3.5 Planning quantification experiments
5.4 Comparing quantification approaches/methods
5.4.1 Accuracy
5.4.2 Precision
5.4.3 Repeatability and reproducibility
5.4.4 Dynamic range and linear dynamic range
5.4.5 Limit of blank – LOB
5.4.6 Limit of detection – LOD
5.4.7 Limit of quantification – LOQ
5.4.8 Sensitivity
5.4.9 Selectivity
5.5 Classification of approaches for quantification using LC-MS/MS
5.5.1 Discovery or targeted protein quantification
5.5.2 Label based vs. label free quantification
5.5.3 Abundance determination – ion current vs. peptide identification
5.5.4 Classification
5.6 The peptide (occurrence) space
5.7 Ion chromatograms
5.8 From peptides to protein abundances
5.8.1 Combined single abundance from single abundances
5.8.2 Relative abundance from single abundances
5.8.3 Combined relative abundance from relative abundances
5.9 Protein inference and protein abundance calculation
5.9.1 Use of the peptides in protein abundance calculation
5.9.2 Classifying the proteins
5.9.3 Can shared peptides be used for quantification?
5.10 Peptide tables
5.11 Assumptions for relative quantification
5.12 Analysis for differentially abundant proteins
5.13 Normalization of data
5.14 Exercises
5.15 Bibliographic notes
6 Statistical normalization
6.1 Some illustrative examples
6.2 Non-normally distributed populations
6.2.1 Skewed distributions
6.2.2 Measures of skewness
6.2.3 Steepness of the peak – kurtosis
6.3 Testing for normality
6.3.1 Normal probability plot
6.3.2 Some test statistics for normality testing
6.4 Outliers
6.4.1 Test statistics for the identification of a single outlier
6.4.2 Testing for more than one outlier
6.4.3 Robust statistics for mean and standard deviation
6.4.4 Outliers in regression
6.5 Variance inequality
6.6 Normalization and logarithmic transformation
6.6.1 The logarithmic function
6.6.2 Choosing the base
6.6.3 Logarithmic normalization of peptide/protein ratios
6.6.4 Pitfalls of logarithmic transformations
6.6.5 Variance stabilization by logarithmic transformation
6.6.6 Logarithmic scale for presentation
6.7 Exercises
6.8 Bibliographic notes
7 Experimental normalization
7.1 Sources of variation and level of normalization
7.2 Spectral normalization
7.2.1 Scale based normalization
7.2.2 Rank based normalization
7.2.3 Combining scale based and rank based normalization
7.2.4 Reproducibility of the normalization methods
7.3 Normalization at the peptide and protein level
7.4 Normalizing using sum, mean, and median
7.5 MA-plot for normalization
7.5.1 Global intensity normalization
7.5.2 Linear regression normalization
7.6 Local regression normalization – LOWESS
7.7 Quantile normalization
7.8 Overfitting
7.9 Exercises
7.10 Bibliographic notes
8 Statistical analysis
8.1 Use of replicates for statistical analysis
8.2 Using a set of proteins for statistical analysis
8.2.1 Z-variable
8.2.2 G-statistic
8.2.3 Fisher–Irwin exact test
8.3 Missing values
8.3.1 Reasons for missing values
8.3.2 Handling missing values
8.4 Prediction and hypothesis testing
8.4.1 Prediction errors
8.4.2 Hypothesis testing
8.5 Statistical significance for multiple testing
8.5.1 False positive rate control
8.5.2 False discovery rate control
8.6 Exercises
8.7 Bibliographic notes
9 Label based quantification
9.1 Labeling techniques for label based quantification
9.2 Label requirements
9.3 Labels and labeling properties
9.3.1 Quantification level
9.3.2 Label incorporation
9.3.3 Incorporation level
9.3.4 Number of compared samples
9.3.5 Common labels
9.4 Experimental requirements
9.5 Recognizing corresponding peptide variants
9.5.1 Recognizing peptide variants in MS spectra
9.5.2 Recognizing peptide variants in MS/MS spectra
9.6 Reference free vs. reference based
9.6.1 Reference free quantification
9.6.2 Reference based quantification
9.7 Labeling considerations
9.8 Exercises
9.9 Bibliographic notes
10 Reporter based MS/MS quantification
10.1 Isobaric labels
10.2 iTRAQ
10.2.1 Fragmentation
10.2.2 Reporter ion intensities
10.2.3 iTRAQ 8-plex
10.3 TMT – Tandem Mass Tag
10.4 Reporter based quantification runs
10.5 Identification and quantification
10.6 Peptide table
10.7 Reporter based quantification experiments
10.7.1 Normalization across LC-MS/MS runs – use of a reference sample
10.7.2 Normalizing within an LC-MS/MS run
10.7.3 From reporter intensities to protein abundances
10.7.4 Finding differentially abundant proteins
10.7.5 Distributing the replicates on the quantification runs
10.7.6 Protocols
10.8 Exercises
10.9 Bibliographic notes
11 Fragment based MS/MS quantification
11.1 The label masses
11.2 Identification
11.3 Peptide and protein quantification
11.4 Exercises
11.5 Bibliographic notes
12 Label based quantification by MS spectra
12.1 Different labeling techniques
12.1.1 Metabolic labeling – SILAC
12.1.2 Chemical labeling
12.1.3 Enzymatic labeling – 18O
12.2 Experimental setup
12.3 MaxQuant as a model
12.3.1 HL-pairs
12.3.2 Reliability of HL-pairs
12.3.3 Reliable protein results
12.4 The MaxQuant procedure
12.4.1 Recognize HL-pairs
12.4.2 Estimate HL-ratios
12.4.3 Identify HL-pairs by database search
12.4.4 Infer protein data
12.5 Exercises
12.6 Bibliographic notes
13 Label free quantification by MS spectra
13.1 An ideal case – two protein samples
13.2 The real world
13.2.1 Multiple samples
13.3 Experimental setup
13.4 Forms
13.5 The quantification process
13.6 Form detection
13.7 Pair-wise retention time correction
13.7.1 Determining potentially corresponding forms
13.7.2 Linear corrections
13.7.3 Nonlinear corrections
13.8 Approaches for form tuple detection
13.9 Pair-wise alignment
13.9.1 Distance between forms
13.9.2 Finding an optimal alignment
13.10 Using a reference run for alignment
13.11 Complete pair-wise alignment
13.12 Hierarchical progressive alignment
13.12.1 Measuring the similarity or the distance of two runs
13.12.2 Constructing static guide trees
13.12.3 Constructing dynamic guide trees
13.12.4 Aligning subalignments
13.12.5 SuperHirn
13.13 Simultaneous iterative alignment
13.13.1 Constructing the initial alignment in XCMS
13.13.2 Changing the initial alignment
13.14 The end result and further analysis
13.15 Exercises
13.16 Bibliographic notes
14 Label free quantification by MS/MS spectra
14.1 Abundance measurements
14.2 Normalization
14.3 Proposed methods
14.4 Methods for single abundance calculation
14.4.1 emPAI
14.4.2 PMSS
14.4.3 NSAF
14.4.4 SI
14.5 Methods for relative abundance calculation
14.5.1 PASC
14.5.2 RIBAR
14.5.3 xRIBAR
14.6 Comparing methods
14.6.1 An analysis by Griffin
14.6.2 An analysis by Colaert
14.7 Improving the reliability of spectral count quantification
14.8 Handling shared peptides
14.9 Statistical analysis
14.10 Exercises
14.11 Bibliographic notes
15 Targeted quantification – Selected Reaction Monitoring
15.1 Selected Reaction Monitoring – the concept
15.2 A suitable instrument
15.3 The LC-MS/MS run
15.3.1 Sensitivity and accuracy
15.4 Label free and label based quantification
15.4.1 Label free SRM based quantification
15.4.2 Label based SRM based quantification
15.5 Requirements for SRM transitions
15.5.1 Requirements for the peptides
15.5.2 Requirements for the fragment ions
15.6 Finding optimal transitions
15.7 Validating transitions
15.7.1 Testing linearity
15.7.2 Determining retention time
15.7.3 Limit of detection/quantification
15.7.4 Dealing with low abundant proteins
15.7.5 Checking for interference
15.8 Assay development
15.9 Exercises
15.10 Bibliographic notes
16 Absolute quantification
16.1 Performing absolute quantification
16.1.1 Linear dependency between the calculated and the real abundances
16.2 Label based absolute quantification
16.2.1 Stable isotope-labeled peptide standards
16.2.2 Stable isotope-labeled concatenated peptide standards
16.2.3 Stable isotope-labeled intact protein standards
16.3 Label free absolute quantification
16.3.1 Quantification by MS spectra
16.3.2 Quantification by the number of MS/MS spectra
16.4 Exercises
16.5 Bibliographic notes
17 Quantification of post-translational modifications
17.1 PTM and mass spectrometry
17.2 Modification degree
17.3 Absolute modification degree
17.3.1 Reversing the modification
17.3.2 Use of two standards
17.3.3 Label free modification degree analysis
17.4 Relative modification degree
17.5 Discovery based modification stoichiometry
17.5.1 Separate LC-MS/MS experiments for modified and unmodified peptides
17.5.2 Common LC-MS/MS experiment for modified and unmodified peptides
17.5.3 Reliable results and significant differences
17.6 Exercises
17.7 Bibliographic notes
18 Biomarkers
18.1 Evaluation of potential biomarkers
18.1.1 Taking disease prevalence into account
18.2 Evaluating threshold values for biomarkers
18.3 Exercises
18.4 Bibliographic notes
19 Standards and databases
19.1 Standard data formats for (quantitative) proteomics
19.1.1 Controlled vocabularies (CVs)
19.1.2 Benefits of using CV terms to annotate metadata
19.1.3 A standard for quantitative proteomics data
19.1.4 HUPO PSI
19.2 Databases for proteomics data
19.3 Bibliographic notes
20 Appendix A: Statistics
20.1 Samples, populations, and statistics
20.2 Population parameter estimation
20.2.1 Estimating the mean of a population
20.3 Hypothesis testing
20.3.1 Two types of errors
20.4 Performing the test – test statistics and p-values
20.4.1 Parametric test statistics
20.4.2 Nonparametric test statistics
20.4.3 Confidence intervals and hypothesis testing
20.5 Comparing means of populations
20.5.1 Analyzing the mean of a single population
20.5.2 Comparing the means from two populations
20.5.3 Comparing means of paired populations
20.5.4 Multiple populations
20.5.5 Multiple testing
20.6 Comparing variances
20.6.1 Testing the variance of a single population
20.6.2 Testing the variances of two populations
20.7 Percentiles and quantiles
20.7.1 A straightforward method for estimating the percentiles
20.7.2 Quantiles
20.7.3 Box plots
20.8 Correlation
20.8.1 Pearson’s product-moment correlation coefficient
20.8.2 Spearman’s rank correlation coefficient
20.8.3 Correlation line
20.9 Regression analysis
20.9.1 Regression line
20.9.2 Relation between Pearson’s correlation coefficient and the regression parameters
20.10 Types of values and variables
21 Appendix B: Clustering and discriminant analysis
21.1 Clustering
21.1.1 Distances and similarities
21.1.2 Distance measures
21.1.3 Similarity measures
21.1.4 Distances between an object and a class
21.1.5 Distances between two classes
21.1.6 Missing data
21.1.7 Clustering approaches
21.1.8 Sequential clustering
21.1.9 Hierarchical clustering
21.2 Discriminant analysis
21.2.1 Step-wise feature selection
21.2.2 Linear discriminant analysis using original features
21.2.3 Canonical discriminant analysis
21.3 Bibliographic notes
Bibliography
Index

“Computational and Statistical Methods for Protein Quantification by Mass Spectrometry is a book that can be used by undergraduate students in both analytical chemistry and biochemistry, as well as by scientists who are familiar with the field. The book teaches the reader how to perform proteomic analysis by mass spectrometry and how to interpret the large amount of data collected.” (Analytical and Bioanalytical Chemistry, 10 January 2014)