Criterion-referenced Test Development
Technical and Legal Guidelines for Corporate Training
Paperback, English, 2014
By Sharon A. Shrock and William C. Coscarelli
SEK 659
Product information
- Publication date: 2014-08-08
- Dimensions: 152 x 229 x 25 mm
- Weight: 656 g
- Format: Paperback
- Language: English
- Number of pages: 494
- Edition: 3
- Publisher: John Wiley & Sons Inc
- ISBN: 9781118943403
Sharon Shrock is professor of Instructional Design and Technology at Southern Illinois University, Carbondale, where she coordinates graduate programs in ID/IT. She is the former co-director of the Hewlett-Packard World Wide Test Development Center. She is a past president of the Association for Educational Communications and Technology's Division of Instructional Development and has served on the editorial boards of most of the major academic journals in the instructional design field.

Bill Coscarelli is professor in the Instructional Design specialization at Southern Illinois University Carbondale's Department of Curriculum & Instruction and the former co-director of the Hewlett-Packard World Wide Test Development Center. Bill has been elected president of the International Society for Performance Improvement and of the Association for Educational Communications and Technology's Division for Instructional Development. He was the founding editor of Performance Improvement Quarterly and ISPI's first vice-president of Publications.
Table of contents
- List of Figures, Tables, and Sidebars
- Introduction: A Little Knowledge Is Dangerous
  - Why Test?
  - Why Read This Book?
  - A Confusing State of Affairs
  - Misleading Familiarity
  - Inaccessible Technology
  - Procedural Confusion
  - Testing and Kirkpatrick's Levels of Evaluation
  - Certification in the Corporate World
  - Corporate Testing Enters the New Millennium
  - What Is to Come...
- Part I: Background: The Fundamentals
- 1 Test Theory
  - What Is Testing?
  - What Does a Test Score Mean?
  - Reliability and Validity: A Primer
  - Reliability
  - Equivalence Reliability
  - Test-Retest Reliability
  - Inter-Rater Reliability
  - Validity
  - Face Validity
  - Content Validity
  - Concurrent Validity
  - Predictive Validity
  - Concluding Comment
- 2 Types of Tests
  - Criterion-Referenced Versus Norm-Referenced Tests
  - Frequency Distributions
  - Criterion-Referenced Test Interpretation
  - Six Purposes for Tests in Training Settings
  - Three Methods of Test Construction (One of Which You Should Never Use)
  - Topic-Based Test Construction
  - Statistically Based Test Construction
  - Objectives-Based Test Construction
- Part II: Overview: The CRTD Model and Process
- 3 The CRTD Model and Process
  - Relationship to the Instructional Design Process
  - The CRTD Process
  - Plan Documentation
  - Analyze Job Content
  - Establish Content Validity of Objectives
  - Create Items
  - Create Cognitive Items
  - Create Rating Instruments
  - Establish Content Validity of Items and Instruments
  - Conduct Initial Test Pilot
  - Perform Item Analysis
  - Difficulty Index
  - Distractor Pattern
  - Point-Biserial
  - Create Parallel Forms or Item Banks
  - Establish Cut-Off Scores
  - Informed Judgment
  - Angoff
  - Contrasting Groups
  - Determine Reliability
  - Determine Reliability of Cognitive Tests
  - Equivalence Reliability
  - Test-Retest Reliability
  - Determine Reliability of Performance Tests
  - Report Scores
  - Summary
- Part III: The CRTD Process: Planning and Creating the Test
- 4 Plan Documentation
  - Why Document?
  - What to Document
  - The Documentation
- 5 Analyze Job Content
  - Job Analysis
  - Job Analysis Models
  - Summary of the Job Analysis Process
  - DACUM
  - Hierarchies
  - Hierarchical Analysis of Tasks
  - Matching the Hierarchy to the Type of Test
  - Prerequisite Test
  - Entry Test
  - Diagnostic Test
  - Posttest
  - Equivalency Test
  - Certification Test
  - Using Learning Task Analysis to Validate a Hierarchy
  - Bloom's Original Taxonomy
  - Knowledge Level
  - Comprehension Level
  - Application Level
  - Analysis Level
  - Synthesis Level
  - Evaluation Level
  - Using Bloom's Original Taxonomy to Validate a Hierarchy
  - Bloom's Revised Taxonomy
  - Gagné's Learned Capabilities
  - Intellectual Skills
  - Cognitive Strategies
  - Verbal Information
  - Motor Skill
  - Attitudes
  - Using Gagné's Intellectual Skills to Validate a Hierarchy
  - Merrill's Component Design Theory
  - The Task Dimension
  - Types of Learning
  - Using Merrill's Component Design Theory to Validate a Hierarchy
  - Data-Based Methods for Hierarchy Validation
  - Who Killed Cock Robin?
- 6 Content Validity of Objectives
  - Overview of the Process
  - The Role of Objectives in Item Writing
  - Characteristics of Good Objectives
  - Behavior Component
  - Conditions Component
  - Standards Component
  - A Word from the Legal Department About Objectives
  - The Certification Suite
  - Certification Levels in the Suite
  - Level A—Real World
  - Level B—High-Fidelity Simulation
  - Level C—Scenarios
  - Quasi-Certification
  - Level D—Memorization
  - Level E—Attendance
  - Level F—Affiliation
  - How to Use the Certification Suite
  - Finding a Common Understanding
  - Making a Professional Decision
  - The correct level to match the job
  - The operationally correct level
  - The consequences of lower fidelity
  - Converting Job-Task Statements to Objectives
  - In Conclusion
- 7 Create Cognitive Items
  - What Are Cognitive Items?
  - Classification Schemes for Objectives
  - Bloom's Cognitive Classifications
  - Types of Test Items
  - Newer Computer-Based Item Types
  - The Six Most Common Item Types
  - True/False Items
  - Matching Items
  - Multiple-Choice Items
  - Fill-In Items
  - Short Answer Items
  - Essay Items
  - The Key to Writing Items That Match Jobs
  - The Single Most Useful Improvement You Can Make in Test Development
  - Intensional Versus Extensional Items
  - Show Versus Tell
  - The Certification Suite
  - Guidelines for Writing Test Items
  - Guidelines for Writing the Most Common Item Types
  - How Many Items Should Be on a Test?
  - Test Reliability and Test Length
  - Criticality of Decisions and Test Length
  - Resources and Test Length
  - Domain Size of Objectives and Test Length
  - Homogeneity of Objectives and Test Length
  - Research on Test Length
  - Summary of Determinants of Test Length
  - A Cookbook for the SME
  - Deciding Among Scoring Systems
  - Hand Scoring
  - Optical Scanning
  - Computer-Based Testing
  - Computerized Adaptive Testing
- 8 Create Rating Instruments
  - What Are Performance Tests?
  - Product Versus Process in Performance Testing
  - Four Types of Rating Scales for Use in Performance Tests (Two of Which You Should Never Use)
  - Numerical Scales
  - Descriptive Scales
  - Behaviorally Anchored Rating Scales
  - Checklists
  - Open Skill Testing
- 9 Establish Content Validity of Items and Instruments
  - The Process
  - Establishing Content Validity—The Single Most Important Step
  - Face Validity
  - Content Validity
  - Two Other Types of Validity
  - Concurrent Validity
  - Predictive Validity
  - Summary Comment About Validity
- 10 Initial Test Pilot
  - Why Pilot a Test?
  - Six Steps in the Pilot Process
  - Determine the Sample
  - Orient the Participants
  - Give the Test
  - Analyze the Test
  - Interview the Test-Takers
  - Synthesize the Results
  - Preparing to Collect Pilot Test Data
  - Before You Administer the Test
  - Sequencing Test Items
  - Test Directions
  - Test Readability Levels
  - Lexile Measure
  - Formatting the Test
  - Setting Time Limits—Power, Speed, and Organizational Culture
  - When You Administer the Test
  - Physical Factors
  - Psychological Factors
  - Giving and Monitoring the Test
  - Special Considerations for Performance Tests
  - Honesty and Integrity in Testing
  - Security During the Training-Testing Sequence
  - Organization-Wide Policies Regarding Test Security
- 11 Statistical Pilot
  - Standard Deviation and Test Distributions
  - The Meaning of Standard Deviation
  - The Five Most Common Test Distributions
  - Problems with Standard Deviations and Mastery Distributions
  - Item Statistics and Item Analysis
  - Item Statistics
  - Difficulty Index
  - P-Value
  - Distractor Pattern
  - Point-Biserial Correlation
  - Item Analysis for Criterion-Referenced Tests
  - The Upper-Lower Index
  - Phi
  - Choosing Item Statistics and Item Analysis Techniques
  - Garbage In-Garbage Out
- 12 Parallel Forms
  - Paper-and-Pencil Tests
  - Computerized Item Banks
  - Reusable Learning Objects
- 13 Cut-Off Scores
  - Determining the Standard for Mastery
  - The Outcomes of a Criterion-Referenced Test
  - The Necessity of Human Judgment in Setting a Cut-Off Score
  - Consequences of Misclassification
  - Stakeholders
  - Revisability
  - Performance Data
  - Three Procedures for Setting the Cut-Off Score
  - The Issue of Substitutability
  - Informed Judgment
  - A Conjectural Approach, the Angoff Method
  - Contrasting Groups Method
  - Borderline Decisions
  - The Meaning of Standard Error of Measurement
  - Reducing Misclassification Errors at the Borderline
  - Problems with Correction-for-Guessing
  - The Problem of the Saltatory Cut-Off Score
- 14 Reliability of Cognitive Tests
  - The Concepts of Reliability, Validity, and Correlation
  - Correlation
  - Types of Reliability
  - Single-Test-Administration Reliability Techniques
  - Internal Consistency
  - Squared-Error Loss
  - Threshold-Loss
  - Calculating Reliability for Single-Test-Administration Techniques
  - Livingston's Coefficient kappa (κ²)
  - The Index Sc
  - Outcomes of Using the Single-Test-Administration Reliability Techniques
  - Two-Test-Administration Reliability Techniques
  - Equivalence Reliability
  - Test-Retest Reliability
  - Calculating Reliability for Two-Test-Administration Techniques
  - The Phi Coefficient
  - Description of Phi
  - Calculating Phi
  - How High Should Phi Be?
  - The Agreement Coefficient
  - Description of the Agreement Coefficient
  - Calculating the Agreement Coefficient
  - How High Should the Agreement Coefficient Be?
  - The Kappa Coefficient
  - Description of Kappa
  - Calculating the Kappa Coefficient
  - How High Should the Kappa Coefficient Be?
  - Comparison of φ, ρ0, and κ
  - The Logistics of Establishing Test Reliability
  - Choosing Items
  - Sample Test-Takers
  - Testing Conditions
  - Recommendations for Choosing a Reliability Technique
  - Summary Comments
- 15 Reliability of Performance Tests
  - Reliability and Validity of Performance Tests
  - Types of Rating Errors
  - Error of Standards
  - Halo Error
  - Logic Error
  - Similarity Error
  - Central Tendency Error
  - Leniency Error
  - Inter-Rater Reliability
  - Calculating and Interpreting Kappa (κ)
  - Calculating and Interpreting Phi (φ)
  - Repeated Performance and Consecutive Success
  - Procedures for Training Raters
  - What If a Rater Passes Everyone Regardless of Performance?
  - What Should You Do?
  - What If You Get a High Percentage of Agreement Among Raters But a Negative Phi Coefficient?
- 16 Report Scores
  - CRT Versus NRT Reporting
  - Summing Subscores
  - What Should You Report to a Manager?
  - Is There a Legal Reason to Archive the Tests?
  - A Final Thought About Testing and Teaching
- Part IV: Legal Issues in Criterion-Referenced Testing
- 17 Criterion-Referenced Testing and Employment Selection Laws
  - What Do We Mean by Employment Selection Laws?
  - Who May Bring a Claim?
  - A Short History of the Uniform Guidelines on Employee Selection Procedures
  - Purpose and Scope
  - Legal Challenges to Testing and the Uniform Guidelines
  - Reasonable Reconsideration
  - In Conclusion
  - Balancing CRTs with Employment Discrimination Laws
  - Watch Out for Blanket Exclusions in the Name of Business Necessity
  - Adverse Impact, the Bottom Line, and Affirmative Action
  - Adverse Impact
  - The Bottom Line
  - Affirmative Action
  - Record-Keeping of Adverse Impact and Job-Relatedness of Tests
  - Accommodating Test-Takers with Special Needs
  - Testing, Assessment, and Evaluation for Disabled Candidates
  - Test Validation Criteria: General Guidelines
  - Test Validation: A Step-by-Step Guide
  - 1. Obtain Professional Guidance
  - 2. Select a Legally Acceptable Validation Strategy for Your Particular Test
  - 3. Understand and Employ Standards for Content-Valid Tests
  - 4. Evaluate the Overall Test Circumstances to Assure Equality of Opportunity
  - Keys to Maintaining Effective and Legally Defensible Documentation
  - Why Document?
  - What Is Documentation?
  - Why Is Documentation an Ally in Defending Against Claims?
  - How Is Documentation Used?
  - Compliance Documentation
  - Documentation to Avoid Regulatory Penalties or Lawsuits
  - Use of Documentation in Court
  - Documentation to Refresh Memory
  - Documentation to Attack Credibility
  - Disclosure and Production of Documentation
  - Pay Attention to Document Retention Policies and Protocols
  - Use Effective Word Management in Your Documentation
  - Use Objective Terms to Describe Events and Compliance
  - Avoid Inflammatory and Off-the-Cuff Commentary
  - Develop and Enforce Effective Document Retention Policies
  - Make Sure Your Documentation Is Complete
  - Make Sure Your Documentation Is Capable of "Authentication"
  - In Conclusion
  - Is Your Criterion-Referenced Testing Legally Defensible? A Checklist
  - A Final Thought
- Epilogue: CRTD as Organizational Transformation
- References
- Index
- About the Authors