Biostatistical Design and Analysis Using R

A Practical Guide

Paperback, English, 2010

959 kr

R, the statistical and graphical environment, is rapidly emerging as an important set of teaching and research tools for biologists. This book draws upon the popularity and free availability of R to couple the theory and practice of biostatistics into a single treatment, so as to provide a textbook for biologists learning statistics, R, or both. An abridged description of biostatistical principles and analysis sequence keys are combined with worked examples of the practical use of R into a complete practical guide to designing and analyzing real biological research.

Topics covered include: simple hypothesis testing and graphing; exploratory data analysis and graphical summaries; regression (linear, multiple and non-linear); simple and complex ANOVA and ANCOVA designs (including nested, factorial, blocking, split-plot and repeated measures); and frequency analysis and generalized linear models. Linear mixed effects modeling is also incorporated extensively throughout as an alternative to traditional modeling techniques.

The book is accompanied by a companion website, www.wiley.com/go/logan/r, with an extensive set of resources comprising all R scripts and data sets used in the book, additional worked examples, the biology package, and other instructional materials and links.

Product information

Publication date: 2010-04-13
Dimensions: 160 x 241 x 31 mm
Weight: 907 g
Format: Paperback
Language: English
Number of pages: 576
Publisher: John Wiley and Sons Ltd
ISBN: 9781405190084

Belongs to the following categories

Biology within Science and Technology

Preface
R quick reference card
General key to statistical methods

1 Introduction to R
1.1 Why R?
1.2 Installing R
1.2.1 Windows
1.2.2 Unix/Linux
1.2.3 MacOSX
1.3 The R environment
1.3.1 The console (command line)
1.4 Object names
1.5 Expressions, Assignment and Arithmetic
1.6 R Sessions and workspaces
1.6.1 Cleaning up
1.6.2 Workspaces
1.6.3 Current working directory
1.6.4 Quitting R
1.7 Getting help
1.8 Functions
1.9 Precedence
1.10 Vectors - variables
1.10.1 Regular or patterned sequences
1.10.2 Character vectors
1.10.3 Factors
1.11 Matrices, lists and data frames
1.11.1 Matrices
1.11.2 Lists
1.11.3 Data frames - data sets
1.12 Object information and conversion
1.12.1 Object information
1.12.2 Object conversion
1.13 Indexing vectors, matrices and lists
1.13.1 Vector indexing
1.13.2 Matrix indexing
1.13.3 List indexing
1.14 Pattern matching and replacement (character search and replace)
1.14.1 grep - pattern searching
1.14.2 regexpr - position and length of match
1.14.3 gsub - pattern replacement
1.15 Data manipulation
1.15.1 Sorting
1.15.2 Formatting data
1.16 Functions that perform other functions repeatedly
1.16.1 Along matrix margins
1.16.2 By factorial groups
1.16.3 By objects
1.17 Programming in R
1.17.1 Grouped expressions
1.17.2 Conditional execution – if and ifelse
1.17.3 Repeated execution – looping
1.17.4 Writing functions
1.18 An introduction to the R graphical environment
1.18.1 The plot() function
1.18.2 Graphical devices
1.18.3 Multiple graphics devices
1.19 Packages
1.19.1 Manual package management
1.19.2 Loading packages
1.20 Working with scripts
1.21 Citing R in publications
1.22 Further reading

2 Datasets
2.1 Constructing data frames
2.2 Reviewing a data frame - fix()
2.3 Importing (reading) data
2.3.1 Import from text file
2.3.2 Importing from the clipboard
2.3.3 Import from other software
2.4 Exporting (writing) data
2.5 Saving and loading of R objects
2.6 Data frame vectors
2.6.1 Factor levels
2.7 Manipulating data sets
2.7.1 Subsets of data frames – data frame indexing
2.7.2 The %in% matching operator
2.7.3 Pivot tables and aggregating datasets
2.7.4 Sorting datasets
2.7.5 Accessing and evaluating expressions within the context of a dataframe
2.7.6 Reshaping dataframes
2.8 Dummy data sets - generating random data

3 Introductory Statistical Principles
3.1 Distributions
3.1.1 The normal distribution
3.1.2 Log-normal distribution
3.2 Scale transformations
3.3 Measures of location
3.4 Measures of dispersion and variability
3.5 Measures of the precision of estimates - standard errors and confidence intervals
3.6 Degrees of freedom
3.7 Methods of estimation
3.7.1 Least squares (LS)
3.7.2 Maximum likelihood (ML)
3.8 Outliers
3.9 Further reading

4 Sampling and Experimental Design with R
4.1 Random sampling
4.2 Experimental design
4.2.1 Fully randomized treatment allocation
4.2.2 Randomized complete block treatment allocation

5 Graphical Data Presentation
5.1 The plot() function
5.1.1 The type parameter
5.1.2 The xlim and ylim parameters
5.1.3 The xlab and ylab parameters
5.1.4 The axes and ann parameters
5.1.5 The log parameter
5.2 Graphical Parameters
5.2.1 Plot dimensional and layout parameters
5.2.2 Axis characteristics
5.2.3 Character sizes
5.2.4 Line characteristics
5.2.5 Plotting character parameter - pch
5.2.6 Fonts
5.2.7 Text orientation and justification
5.2.8 Colors
5.3 Enhancing and customizing plots with low-level plotting functions
5.3.1 Adding points - points()
5.3.2 Adding text within a plot - text()
5.3.3 Adding text to plot margins - mtext()
5.3.4 Adding a legend - legend()
5.3.5 More advanced text formatting
5.3.6 Adding axes - axis()
5.3.7 Adding lines and shapes within a plot
5.4 Interactive graphics
5.4.1 Identifying points - identify()
5.4.2 Retrieving coordinates - locator()
5.5 Exporting graphics
5.5.1 Postscript - postscript() and pdf()
5.5.2 Bitmaps - jpeg() and png()
5.5.3 Copying devices - dev.copy()
5.6 Working with multiple graphical devices
5.7 High-level plotting functions for univariate (single variable) data
5.7.1 Histogram
5.7.2 Density functions
5.7.3 Q-Q plots
5.7.4 Boxplots
5.7.5 Rug charts
5.8 Presenting relationships
5.8.1 Scatterplots
5.9 Presenting grouped data
5.9.1 Boxplots
5.9.2 Boxplots for grouped means
5.9.3 Interaction plots - means plots
5.9.4 Bargraphs
5.9.5 Violin plots
5.10 Presenting categorical data
5.10.1 Mosaic plots
5.10.2 Association plots
5.11 Trellis graphics
5.11.1 scales() parameters
5.12 Further reading

6 Simple Hypothesis Testing – One and Two Population Tests
6.1 Hypothesis testing
6.2 One- and two-tailed tests
6.3 t-tests
6.4 Assumptions
6.5 Statistical decision and power
6.6 Robust tests
6.7 Further reading
6.8 Key for simple hypothesis testing
6.9 Worked examples of real biological data sets

7 Introduction to Linear Models
7.1 Linear models
7.2 Linear models in R
7.3 Estimating linear model parameters
7.3.1 Linear models with factorial variables
7.3.2 Linear model hypothesis testing
7.4 Comments about the importance of understanding the structure and parameterization of linear models

8 Correlation and Simple Linear Regression
8.1 Correlation
8.1.1 Product moment correlation coefficient
8.1.2 Null hypothesis
8.1.3 Assumptions
8.1.4 Robust correlation
8.1.5 Confidence ellipses
8.2 Simple linear regression
8.2.1 Linear model
8.2.2 Null hypotheses
8.2.3 Assumptions
8.2.4 Multiple responses for each level of the predictor
8.2.5 Model I and II regression
8.2.6 Regression diagnostics
8.2.7 Robust regression
8.2.8 Power and sample size determination
8.3 Smoothers and local regression
8.4 Correlation and regression in R
8.5 Further reading
8.6 Key for correlation and regression
8.7 Worked examples of real biological data sets

9 Multiple and Curvilinear Regression
9.1 Multiple linear regression
9.2 Linear models
9.3 Null hypotheses
9.4 Assumptions
9.5 Curvilinear models
9.5.1 Polynomial regression
9.5.2 Nonlinear regression
9.5.3 Diagnostics
9.6 Robust regression
9.7 Model selection
9.7.1 Model averaging
9.7.2 Hierarchical partitioning
9.8 Regression trees
9.9 Further reading
9.10 Key and analysis sequence for multiple and complex regression
9.11 Worked examples of real biological data sets

10 Single Factor Classification (ANOVA)
10.0.1 Fixed versus random factors
10.1 Null hypotheses
10.2 Linear model
10.3 Analysis of variance
10.4 Assumptions
10.5 Robust classification (ANOVA)
10.6 Tests of trends and means comparisons
10.7 Power and sample size determination
10.8 ANOVA in R
10.9 Further reading
10.10 Key for single factor classification (ANOVA)
10.11 Worked examples of real biological data sets

11 Nested ANOVA
11.1 Linear models
11.2 Null hypotheses
11.2.1 Factor A - the main treatment effect
11.2.2 Factor B - the nested factor
11.3 Analysis of variance
11.4 Variance components
11.5 Assumptions
11.6 Pooling denominator terms
11.7 Unbalanced nested designs
11.8 Linear mixed effects models
11.9 Robust alternatives
11.10 Power and optimisation of resource allocation
11.11 Nested ANOVA in R
11.11.1 Error strata (aov)
11.11.2 Linear mixed effects models (lme and lmer)
11.12 Further reading
11.13 Key for nested ANOVA
11.14 Worked examples of real biological data sets

12 Factorial ANOVA
12.1 Linear models
12.2 Null hypotheses
12.2.1 Model 1 - fixed effects
12.2.2 Model 2 - random effects
12.2.3 Model 3 - mixed effects
12.3 Analysis of variance
12.3.1 Quasi F-ratios
12.3.2 Interactions and main effects tests
12.4 Assumptions
12.5 Planned and unplanned comparisons
12.6 Unbalanced designs
12.6.1 Missing observations
12.6.2 Missing combinations - missing cells
12.7 Robust factorial ANOVA
12.8 Power and sample sizes
12.9 Factorial ANOVA in R
12.10 Further reading
12.11 Key for factorial ANOVA
12.12 Worked examples of real biological data sets

13 Unreplicated Factorial Designs – Randomized Block and Simple Repeated Measures
13.1 Linear models
13.2 Null hypotheses
13.2.1 Factor A - the main within block treatment effect
13.2.2 Factor B - the blocking factor
13.3 Analysis of variance
13.4 Assumptions
13.4.1 Sphericity
13.4.2 Block by treatment interactions
13.5 Specific comparisons
13.6 Unbalanced un-replicated factorial designs
13.7 Robust alternatives
13.8 Power and blocking efficiency
13.9 Unreplicated factorial ANOVA in R
13.10 Further reading
13.11 Key for randomized block and simple repeated measures ANOVA
13.12 Worked examples of real biological data sets

14 Partly Nested Designs: Split Plot and Complex Repeated Measures
14.1 Null hypotheses
14.1.1 Factor A - the main between block treatment effect
14.1.2 Factor B - the blocking factor
14.1.3 Factor C - the main within block treatment effect
14.1.4 AC interaction - the within block interaction effect
14.1.5 BC interaction - the within block interaction effect
14.2 Linear models
14.2.1 One between (α), one within (γ) block effect
14.2.2 Two between (α, γ), one within (δ) block effect
14.2.3 One between (α), two within (γ, δ) block effects
14.3 Analysis of variance
14.4 Assumptions
14.5 Other issues
14.5.1 Robust alternatives
14.6 Further reading
14.7 Key for partly nested ANOVA
14.8 Worked examples of real biological data sets

15 Analysis of Covariance (ANCOVA)
15.1 Null hypotheses
15.1.1 Factor A - the main treatment effect
15.1.2 Factor B - the covariate effect
15.2 Linear models
15.3 Analysis of variance
15.4 Assumptions
15.4.1 Homogeneity of slopes
15.4.2 Similar covariate ranges
15.5 Robust ANCOVA
15.6 Specific comparisons
15.7 Further reading
15.8 Key for ANCOVA
15.9 Worked examples of real biological data sets

16 Simple Frequency Analysis
16.1 The chi-square statistic
16.1.1 Assumptions
16.2 Goodness of fit tests
16.2.1 Homogeneous frequencies tests
16.2.2 Distributional conformity - Kolmogorov-Smirnov tests
16.3 Contingency tables
16.3.1 Odds ratios
16.3.2 Residuals
16.4 G-tests
16.5 Small sample sizes
16.6 Alternatives
16.7 Power analysis
16.8 Simple frequency analysis in R
16.9 Further reading
16.10 Key for Analysing frequencies
16.11 Worked examples of real biological data sets

17 Generalized Linear Models (GLM)
17.1 Dispersion (over or under)
17.2 Binary data - logistic (logit) regression
17.2.1 Logistic model
17.2.2 Null hypotheses
17.2.3 Analysis of deviance
17.2.4 Multiple logistic regression
17.3 Count data - Poisson generalized linear models
17.3.1 Poisson regression
17.3.2 Log-linear Modelling
17.4 Assumptions
17.5 Generalized additive models (GAMs) - non-parametric GLM
17.6 GLM and R
17.7 Further reading
17.8 Key for GLM
17.9 Worked examples of real biological data sets

Bibliography
R index
Statistics index

"If you want to do more than just the basics then Biostatistical Design and Analysis Using R is an excellent guide, helping you climb the steep learning curve." (British Ecological Society Bulletin, 1 March 2012)

"Overall, this is an excellent reference for biologists and biostatisticians; it is also a very good supplemental textbook for a graduate-level biostatistics course." (The Quarterly Review of Biology, 2011)