Big Data For Dummies
Paperback, English, 2013
By Judith S. Hurwitz, Alan Nugent, Fern Halper, Marcia Kaufman
329 kr
Special-order item. Ships within 5-8 business days.
Free shipping for members on purchases of at least 249 kr.

Find the right big data solution for your business or organization

Big data management is one of the major challenges facing business, industry, and not-for-profit organizations. Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work.

- Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals
- Authors are experts in information management, big data, and a variety of solutions
- Explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more
- Provides essential information in a no-nonsense, easy-to-understand style that is empowering

Big Data For Dummies cuts through the confusion and helps you take charge of big data solutions for your organization.
Product information
- Publication date: 2013-04-19
- Dimensions: 185 x 231 x 23 mm
- Weight: 431 g
- Language: English
- Number of pages: 336
- Publisher: John Wiley & Sons Inc
- EAN: 9781118504222
Judith Hurwitz is an expert in cloud computing, information management, and business strategy.
Alan Nugent has extensive experience in cloud-based big data solutions.
Dr. Fern Halper specializes in big data and analytics.
Marcia Kaufman specializes in cloud infrastructure, information management, and analytics.
- Introduction
  - About This Book
  - Foolish Assumptions
  - How This Book Is Organized
    - Part I: Getting Started with Big Data
    - Part II: Technology Foundations for Big Data
    - Part III: Big Data Management
    - Part IV: Analytics and Big Data
    - Part V: Big Data Implementation
    - Part VI: Big Data Solutions in the Real World
    - Part VII: The Part of Tens
    - Glossary
  - Icons Used in This Book
  - Where to Go from Here
- Part I: Getting Started with Big Data
  - Chapter 1: Grasping the Fundamentals of Big Data
    - The Evolution of Data Management
    - Understanding the Waves of Managing Data
      - Wave 1: Creating manageable data structures
      - Wave 2: Web and content management
      - Wave 3: Managing big data
    - Defining Big Data
    - Building a Successful Big Data Management Architecture
      - Beginning with capture, organize, integrate, analyze, and act
      - Setting the architectural foundation
      - Performance matters
      - Traditional and advanced analytics
    - The Big Data Journey
  - Chapter 2: Examining Big Data Types
    - Defining Structured Data
      - Exploring sources of big structured data
      - Understanding the role of relational databases in big data
    - Defining Unstructured Data
      - Exploring sources of unstructured data
      - Understanding the role of a CMS in big data management
    - Looking at Real-Time and Non-Real-Time Requirements
    - Putting Big Data Together
      - Managing different data types
      - Integrating data types into a big data environment
  - Chapter 3: Old Meets New: Distributed Computing
    - A Brief History of Distributed Computing
      - Giving thanks to DARPA
      - The value of a consistent model
    - Understanding the Basics of Distributed Computing
      - Why we need distributed computing for big data
      - The changing economics of computing
      - The problem with latency
      - Demand meets solutions
    - Getting Performance Right
- Part II: Technology Foundations for Big Data
  - Chapter 4: Digging into Big Data Technology Components
    - Exploring the Big Data Stack
    - Layer 0: Redundant Physical Infrastructure
      - Physical redundant networks
      - Managing hardware: Storage and servers
      - Infrastructure operations
    - Layer 1: Security Infrastructure
    - Interfaces and Feeds to and from Applications and the Internet
    - Layer 2: Operational Databases
    - Layer 3: Organizing Data Services and Tools
    - Layer 4: Analytical Data Warehouses
    - Big Data Analytics
    - Big Data Applications
  - Chapter 5: Virtualization and How It Supports Distributed Computing
    - Understanding the Basics of Virtualization
      - The importance of virtualization to big data
      - Server virtualization
      - Application virtualization
      - Network virtualization
      - Processor and memory virtualization
      - Data and storage virtualization
    - Managing Virtualization with the Hypervisor
    - Abstraction and Virtualization
    - Implementing Virtualization to Work with Big Data
  - Chapter 6: Examining the Cloud and Big Data
    - Defining the Cloud in the Context of Big Data
    - Understanding Cloud Deployment and Delivery Models
      - Cloud deployment models
      - Cloud delivery models
    - The Cloud as an Imperative for Big Data
    - Making Use of the Cloud for Big Data
    - Providers in the Big Data Cloud Market
      - Amazon’s Public Elastic Compute Cloud
      - Google big data services
      - Microsoft Azure
      - OpenStack
      - Where to be careful when using cloud services
- Part III: Big Data Management
  - Chapter 7: Operational Databases
    - RDBMSs Are Important in a Big Data Environment
      - PostgreSQL relational database
    - Nonrelational Databases
    - Key-Value Pair Databases
      - Riak key-value database
    - Document Databases
      - MongoDB
      - CouchDB
    - Columnar Databases
      - HBase columnar database
    - Graph Databases
      - Neo4J graph database
    - Spatial Databases
      - PostGIS/OpenGEO Suite
    - Polyglot Persistence
  - Chapter 8: MapReduce Fundamentals
    - Tracing the Origins of MapReduce
    - Understanding the map Function
    - Adding the reduce Function
    - Putting map and reduce Together
    - Optimizing MapReduce Tasks
      - Hardware/network topology
      - Synchronization
      - File system
  - Chapter 9: Exploring the World of Hadoop
    - Explaining Hadoop
    - Understanding the Hadoop Distributed File System (HDFS)
      - NameNodes
      - Data nodes
      - Under the covers of HDFS
    - Hadoop MapReduce
      - Getting the data ready
      - Let the mapping begin
      - Reduce and combine
  - Chapter 10: The Hadoop Foundation and Ecosystem
    - Building a Big Data Foundation with the Hadoop Ecosystem
    - Managing Resources and Applications with Hadoop YARN
    - Storing Big Data with HBase
    - Mining Big Data with Hive
    - Interacting with the Hadoop Ecosystem
      - Pig and Pig Latin
      - Sqoop
      - Zookeeper
  - Chapter 11: Appliances and Big Data Warehouses
    - Integrating Big Data with the Traditional Data Warehouse
      - Optimizing the data warehouse
      - Differentiating big data structures from data warehouse data
      - Examining a hybrid process case study
    - Big Data Analysis and the Data Warehouse
      - The integration lynchpin
      - Rethinking extraction, transformation, and loading
    - Changing the Role of the Data Warehouse
    - Changing Deployment Models in the Big Data Era
      - The appliance model
      - The cloud model
    - Examining the Future of Data Warehouses
- Part IV: Analytics and Big Data
  - Chapter 12: Defining Big Data Analytics
    - Using Big Data to Get Results
      - Basic analytics
      - Advanced analytics
      - Operationalized analytics
      - Monetizing analytics
    - Modifying Business Intelligence Products to Handle Big Data
      - Data
      - Analytical algorithms
      - Infrastructure support
    - Studying Big Data Analytics Examples
      - Orbitz
      - Nokia
      - NASA
    - Big Data Analytics Solutions
  - Chapter 13: Understanding Text Analytics and Big Data
    - Exploring Unstructured Data
    - Understanding Text Analytics
      - The difference between text analytics and search
    - Analysis and Extraction Techniques
      - Understanding the extracted information
      - Taxonomies
    - Putting Your Results Together with Structured Data
    - Putting Big Data to Use
      - Voice of the customer
      - Social media analytics
    - Text Analytics Tools for Big Data
      - Attensity
      - Clarabridge
      - IBM
      - OpenText
      - SAS
  - Chapter 14: Customized Approaches for Analysis of Big Data
    - Building New Models and Approaches to Support Big Data
      - Characteristics of big data analysis
    - Understanding Different Approaches to Big Data Analysis
      - Custom applications for big data analysis
      - Semi-custom applications for big data analysis
    - Characteristics of a Big Data Analysis Framework
    - Big to Small: A Big Data Paradox
- Part V: Big Data Implementation
  - Chapter 15: Integrating Data Sources
    - Identifying the Data You Need
      - Exploratory stage
      - Codifying stage
      - Integration and incorporation stage
    - Understanding the Fundamentals of Big Data Integration
    - Defining Traditional ETL
      - Data transformation
    - Understanding ELT: Extract, Load, and Transform
    - Prioritizing Big Data Quality
    - Using Hadoop as ETL
    - Best Practices for Data Integration in a Big Data World
  - Chapter 16: Dealing with Real-Time Data Streams and Complex Event Processing
    - Explaining Streaming Data and Complex Event Processing
    - Using Streaming Data
      - Data streaming
      - The need for metadata in streams
    - Using Complex Event Processing
    - Differentiating CEP from Streams
    - Understanding the Impact of Streaming Data and CEP on Business
  - Chapter 17: Operationalizing Big Data
    - Making Big Data a Part of Your Operational Process
      - Integrating big data
      - Incorporating big data into the diagnosis of diseases
    - Understanding Big Data Workflows
      - Workload in context to the business problem
    - Ensuring the Validity, Veracity, and Volatility of Big Data
      - Data validity
      - Data volatility
  - Chapter 18: Applying Big Data within Your Organization
    - Figuring the Economics of Big Data
      - Identification of data types and sources
      - Business process modifications or new process creation
      - The technology impact of big data workflows
      - Finding the talent to support big data projects
      - Calculating the return on investment (ROI) from big data investments
    - Enterprise Data Management and Big Data
    - Defining Enterprise Data Management
    - Creating a Big Data Implementation Road Map
      - Understanding business urgency
      - Projecting the right amount of capacity
      - Selecting the right software development methodology
      - Balancing budgets and skill sets
      - Determining your appetite for risk
    - Starting Your Big Data Road Map
  - Chapter 19: Security and Governance for Big Data Environments
    - Security in Context with Big Data
      - Assessing the risk for the business
      - Risks lurking inside big data
    - Understanding Data Protection Options
    - The Data Governance Challenge
      - Auditing your big data process
      - Identifying the key stakeholders
    - Putting the Right Organizational Structure in Place
      - Preparing for stewardship and management of risk
      - Setting the right governance and quality policies
    - Developing a Well-Governed and Secure Big Data Environment
- Part VI: Big Data Solutions in the Real World
  - Chapter 20: The Importance of Big Data to Business
    - Big Data as a Business Planning Tool
      - Stage 1: Planning with data
      - Stage 2: Doing the analysis
      - Stage 3: Checking the results
      - Stage 4: Acting on the plan
    - Adding New Dimensions to the Planning Cycle
      - Stage 5: Monitoring in real time
      - Stage 6: Adjusting the impact
      - Stage 7: Enabling experimentation
    - Keeping Data Analytics in Perspective
    - Getting Started with the Right Foundation
      - Getting your big data strategy started
    - Planning for Big Data
    - Transforming Business Processes with Big Data
  - Chapter 21: Analyzing Data in Motion: A Real-World View
    - Understanding Companies’ Needs for Data in Motion
      - The value of streaming data
    - Streaming Data with an Environmental Impact
      - Using sensors to provide real-time information about rivers and oceans
      - The benefits of real-time data
    - Streaming Data with a Public Policy Impact
    - Streaming Data in the Healthcare Industry
      - Capturing the data stream
    - Streaming Data in the Energy Industry
      - Using streaming data to increase energy efficiency
      - Using streaming data to advance the production of alternative sources of energy
    - Connecting Streaming Data to Historical and Other Real-Time Data Sources
  - Chapter 22: Improving Business Processes with Big Data Analytics: A Real-World View
    - Understanding Companies’ Needs for Big Data Analytics
    - Improving the Customer Experience with Text Analytics
      - The business value to the big data analytics implementation
    - Using Big Data Analytics to Determine Next Best Action
    - Preventing Fraud with Big Data Analytics
    - The Business Benefit of Integrating New Sources of Data
- Part VII: The Part of Tens
  - Chapter 23: Ten Big Data Best Practices
    - Understand Your Goals
    - Establish a Road Map
    - Discover Your Data
    - Figure Out What Data You Don’t Have
    - Understand the Technology Options
    - Plan for Security in Context with Big Data
    - Plan a Data Governance Strategy
    - Plan for Data Stewardship
    - Continually Test Your Assumptions
    - Study Best Practices and Leverage Patterns
  - Chapter 24: Ten Great Big Data Resources
    - Hurwitz & Associates
    - Standards Organizations
    - The Open Data Foundation
    - The Cloud Security Alliance
    - National Institute of Standards and Technology
    - Apache Software Foundation
    - Oasis
    - Vendor Sites
    - Online Collaborative Sites
    - Big Data Conferences
  - Chapter 25: Ten Big Data Do’s and Don’ts
    - Do Involve All Business Units in Your Big Data Strategy
    - Do Evaluate All Delivery Models for Big Data
    - Do Think about Your Traditional Data Sources as Part of Your Big Data Strategy
    - Do Plan for Consistent Metadata
    - Do Distribute Your Data
    - Don’t Rely on a Single Approach to Big Data Analytics
    - Don’t Go Big Before You Are Ready
    - Don’t Overlook the Need to Integrate Data
    - Don’t Forget to Manage Data Securely
    - Don’t Overlook the Need to Manage the Performance of Your Data
- Glossary
- Index