Apache HBase Primer

Author : Deepak Vohra
Publisher : Apress
ISBN 13 : 1484224248
Total Pages : 147 pages
Book Rating : 4.4/5 (842 download)

Book Synopsis Apache HBase Primer by : Deepak Vohra

Download or read book Apache HBase Primer written by Deepak Vohra and published by Apress. This book was released on 2016-11-17 with a total of 147 pages. Available in PDF, EPUB and Kindle. Book excerpt: Learn the fundamentals and core concepts of the Apache HBase (NoSQL) open source database. It covers the HBase data model, architecture, schema design, API, and administration. Apache HBase is the database for the Apache Hadoop framework. HBase is a column family based NoSQL database that provides a flexible schema model. What You'll Learn: work with the core concepts of HBase; discover the HBase data model, schema design, and architecture; use the HBase API and administration. Who This Book Is For: Apache HBase (NoSQL) database users, designers, developers, and admins.

Handbook of Big Geospatial Data

Author : Martin Werner
Publisher : Springer Nature
ISBN 13 : 3030554627
Total Pages : 641 pages
Book Rating : 4.0/5 (35 download)

Book Synopsis Handbook of Big Geospatial Data by : Martin Werner

Download or read book Handbook of Big Geospatial Data written by Martin Werner and published by Springer Nature. This book was released on 2021-05-07 with a total of 641 pages. Available in PDF, EPUB and Kindle. Book excerpt: This handbook covers a wide range of topics related to the collection, processing, analysis, and use of geospatial data in their various forms. It provides an overview of how spatial computing technologies for big data can be organized and implemented to solve real-world problems. Diverse subdomains, from indoor mapping and navigation through trajectory computing to earth observation from space, are also represented. The handbook combines fundamental contributions focusing on spatio-textual analysis, uncertain databases, and spatial statistics with application examples such as road network detection or colocation detection using GPUs. In summary, it gives an essential introduction to and overview of the rich field of spatial information science and big geospatial data. It introduces three perspectives that together define the field of big geospatial data. The first is a societal, governmental, and governance perspective, which discusses how the acquisition, distribution, and exploitation of big geospatial data must be organized on the scale of both companies and countries. The second is a theory-oriented set of contributions on arbitrary spatial data, including introductions to the field of spatial statistics and to uncertain databases. The third is a practical perspective, with chapters that describe how big geospatial data infrastructures can be implemented and how specific applications can be built on top of big geospatial data. These include, for example, research in historic map data, road network extraction, damage estimation from remote sensing imagery, and the analysis of spatio-textual collections and social media. This multi-disciplinary approach makes the book unique. The handbook can be used as a reference by undergraduate students, graduate students, and researchers focused on big geospatial data, as well as by professionals and practitioners facing big collections of geospatial data.

Research Anthology on Big Data Analytics, Architectures, and Applications

Author : Management Association, Information Resources
Publisher : IGI Global
ISBN 13 : 1668436639
Total Pages : 1988 pages
Book Rating : 4.6/5 (684 download)

Book Synopsis Research Anthology on Big Data Analytics, Architectures, and Applications by : Management Association, Information Resources

Download or read book Research Anthology on Big Data Analytics, Architectures, and Applications written by Management Association, Information Resources and published by IGI Global. This book was released on 2021-09-24 with total page 1988 pages. Available in PDF, EPUB and Kindle. Book excerpt: Society is now completely driven by data with many industries relying on data to conduct business or basic functions within the organization. With the efficiencies that big data bring to all institutions, data is continuously being collected and analyzed. However, data sets may be too complex for traditional data-processing, and therefore, different strategies must evolve to solve the issue. The field of big data works as a valuable tool for many different industries. The Research Anthology on Big Data Analytics, Architectures, and Applications is a complete reference source on big data analytics that offers the latest, innovative architectures and frameworks and explores a variety of applications within various industries. Offering an international perspective, the applications discussed within this anthology feature global representation. Covering topics such as advertising curricula, driven supply chain, and smart cities, this research anthology is ideal for data scientists, data analysts, computer engineers, software engineers, technologists, government officials, managers, CEOs, professors, graduate students, researchers, and academicians.

Architecting Modern Data Platforms

Author : Jan Kunigk
Publisher : "O'Reilly Media, Inc."
ISBN 13 : 1491969229
Total Pages : 688 pages
Book Rating : 4.4/5 (919 download)

Book Synopsis Architecting Modern Data Platforms by : Jan Kunigk

Download or read book Architecting Modern Data Platforms written by Jan Kunigk and published by "O'Reilly Media, Inc.". This book was released on 2018-12-05 with total page 688 pages. Available in PDF, EPUB and Kindle. Book excerpt: There’s a lot of information about big data technologies, but splicing these technologies into an end-to-end enterprise data platform is a daunting task not widely covered. With this practical book, you’ll learn how to build big data infrastructure both on-premises and in the cloud and successfully architect a modern data platform. Ideal for enterprise architects, IT managers, application architects, and data engineers, this book shows you how to overcome the many challenges that emerge during Hadoop projects. You’ll explore the vast landscape of tools available in the Hadoop and big data realm in a thorough technical primer before diving into: Infrastructure: Look at all component layers in a modern data platform, from the server to the data center, to establish a solid foundation for data in your enterprise Platform: Understand aspects of deployment, operation, security, high availability, and disaster recovery, along with everything you need to know to integrate your platform with the rest of your enterprise IT Taking Hadoop to the cloud: Learn the important architectural aspects of running a big data platform in the cloud while maintaining enterprise security and high availability

Pro Apache Phoenix

Author : Shakil Akhtar
Publisher : Apress
ISBN 13 : 1484223705
Total Pages : 148 pages
Book Rating : 4.4/5 (842 download)

Book Synopsis Pro Apache Phoenix by : Shakil Akhtar

Download or read book Pro Apache Phoenix written by Shakil Akhtar and published by Apress. This book was released on 2016-12-29 with a total of 148 pages. Available in PDF, EPUB and Kindle. Book excerpt: Leverage Phoenix as an ANSI SQL engine built on top of the highly distributed and scalable NoSQL framework HBase. Learn the basics and best practices that are being adopted in Phoenix to enable a high write and read throughput in a big data space. This book includes real-world cases such as Internet of Things devices that send continuous streams to Phoenix, and the book explains how key features such as joins, indexes, transactions, and functions help you understand the simple, flexible, and powerful API that Phoenix provides. Examples are provided using real-time data and data-driven businesses that show you how to collect, analyze, and act in seconds. Pro Apache Phoenix covers the nuances of setting up a distributed HBase cluster with Phoenix libraries, running performance benchmarks, configuring parameters for production scenarios, and viewing the results. The book also shows how Phoenix plays well with other key frameworks in the Hadoop ecosystem such as Apache Spark, Pig, Flume, and Sqoop. You will learn how to: handle a petabyte data store by applying familiar SQL techniques; store, analyze, and manipulate data in a NoSQL Hadoop ecosystem with HBase; apply best practices while working with a scalable data store on Hadoop and HBase; integrate popular frameworks (Apache Spark, Pig, Flume) to simplify big data analysis; demonstrate real-time use cases and big data modeling techniques. Who This Book Is For: Data engineers, Big Data administrators, and architects.

HBase: The Definitive Guide

Author : Lars George
Publisher : "O'Reilly Media, Inc."
ISBN 13 : 1449315224
Total Pages : 555 pages
Book Rating : 4.4/5 (493 download)

Book Synopsis HBase: The Definitive Guide by : Lars George

Download or read book HBase: The Definitive Guide written by Lars George and published by "O'Reilly Media, Inc.". This book was released on 2011-08-29 with total page 555 pages. Available in PDF, EPUB and Kindle. Book excerpt: If you're looking for a scalable storage solution to accommodate a virtually endless amount of data, this book shows you how Apache HBase can fulfill your needs. As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away. Discover how tight integration with Hadoop makes scalability with HBase easier Distribute large datasets across an inexpensive cluster of commodity servers Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks

Handbook of Research on Pattern Engineering System Development for Big Data Analytics

Author : Tiwari, Vivek
Publisher : IGI Global
ISBN 13 : 1522538712
Total Pages : 425 pages
Book Rating : 4.5/5 (225 download)

Book Synopsis Handbook of Research on Pattern Engineering System Development for Big Data Analytics by : Tiwari, Vivek

Download or read book Handbook of Research on Pattern Engineering System Development for Big Data Analytics written by Tiwari, Vivek and published by IGI Global. This book was released on 2018-04-20 with total page 425 pages. Available in PDF, EPUB and Kindle. Book excerpt: Due to the growing use of web applications and communication devices, the use of data has increased throughout various industries. It is necessary to develop new techniques for managing data in order to ensure adequate usage. The Handbook of Research on Pattern Engineering System Development for Big Data Analytics is a critical scholarly resource that examines the incorporation of pattern management in business technologies as well as decision making and prediction process through the use of data management and analysis. Featuring coverage on a broad range of topics such as business intelligence, feature extraction, and data collection, this publication is geared towards professionals, academicians, practitioners, and researchers seeking current research on the development of pattern management systems for business applications.

Architecting Modern Data Platforms

Author : Jan Kunigk
Publisher : O'Reilly Media
ISBN 13 : 1491969245
Total Pages : 633 pages
Book Rating : 4.4/5 (919 download)

Book Synopsis Architecting Modern Data Platforms by : Jan Kunigk

Download or read book Architecting Modern Data Platforms written by Jan Kunigk and published by O'Reilly Media. This book was released on 2018-12-05 with total page 633 pages. Available in PDF, EPUB and Kindle. Book excerpt: There’s a lot of information about big data technologies, but splicing these technologies into an end-to-end enterprise data platform is a daunting task not widely covered. With this practical book, you’ll learn how to build big data infrastructure both on-premises and in the cloud and successfully architect a modern data platform. Ideal for enterprise architects, IT managers, application architects, and data engineers, this book shows you how to overcome the many challenges that emerge during Hadoop projects. You’ll explore the vast landscape of tools available in the Hadoop and big data realm in a thorough technical primer before diving into: Infrastructure: Look at all component layers in a modern data platform, from the server to the data center, to establish a solid foundation for data in your enterprise Platform: Understand aspects of deployment, operation, security, high availability, and disaster recovery, along with everything you need to know to integrate your platform with the rest of your enterprise IT Taking Hadoop to the cloud: Learn the important architectural aspects of running a big data platform in the cloud while maintaining enterprise security and high availability

HBase in Action

Author : Amandeep Khurana
Publisher : Simon and Schuster
ISBN 13 : 1638355355
Total Pages : 507 pages
Book Rating : 4.6/5 (383 download)

Book Synopsis HBase in Action by : Amandeep Khurana

Download or read book HBase in Action written by Amandeep Khurana and published by Simon and Schuster. This book was released on 2012-11-01 with total page 507 pages. Available in PDF, EPUB and Kindle. Book excerpt: Summary HBase in Action has all the knowledge you need to design, build, and run applications using HBase. First, it introduces you to the fundamentals of distributed systems and large scale data handling. Then, you'll explore real-world applications and code samples with just enough theory to understand the practical techniques. You'll see how to build applications with HBase and take advantage of the MapReduce processing framework. And along the way you'll learn patterns and best practices. About the Technology HBase is a NoSQL storage system designed for fast, random access to large volumes of data. It runs on commodity hardware and scales smoothly from modest datasets to billions of rows and millions of columns. About this Book HBase in Action is an experience-driven guide that shows you how to design, build, and run applications using HBase. First, it introduces you to the fundamentals of handling big data. Then, you'll explore HBase with the help of real applications and code samples and with just enough theory to back up the practical techniques. You'll take advantage of the MapReduce processing framework and benefit from seeing HBase best practices in action. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book. What's Inside When and how to use HBase Practical examples Design patterns for scalable data systems Deployment, integration, and design Written for developers and architects familiar with data storage and processing. No prior knowledge of HBase, Hadoop, or MapReduce is required. 
Table of Contents
PART 1 HBASE FUNDAMENTALS
  Introducing HBase
  Getting started
  Distributed HBase, HDFS, and MapReduce
PART 2 ADVANCED CONCEPTS
  HBase table design
  Extending HBase with coprocessors
  Alternative HBase clients
PART 3 EXAMPLE APPLICATIONS
  HBase by example: OpenTSDB
  Scaling GIS on HBase
PART 4 OPERATIONALIZING HBASE
  Deploying HBase
  Operations

Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data

Author : Paul Zikopoulos
Publisher : McGraw Hill Professional
ISBN 13 : 0071790543
Total Pages : 176 pages
Book Rating : 4.0/5 (717 download)

Book Synopsis Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data by : Paul Zikopoulos

Download or read book Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data written by Paul Zikopoulos and published by McGraw Hill Professional. This book was released on 2011-10-22 with total page 176 pages. Available in PDF, EPUB and Kindle. Book excerpt: Big Data represents a new era in data exploration and utilization, and IBM is uniquely positioned to help clients navigate this transformation. This book reveals how IBM is leveraging open source Big Data technology, infused with IBM technologies, to deliver a robust, secure, highly available, enterprise-class Big Data platform. The three defining characteristics of Big Data--volume, variety, and velocity--are discussed. You'll get a primer on Hadoop and how IBM is hardening it for the enterprise, and learn when to leverage IBM InfoSphere BigInsights (Big Data at rest) and IBM InfoSphere Streams (Big Data in motion) technologies. Industry use cases are also included in this practical guide. Learn how IBM hardens Hadoop for enterprise-class scalability and reliability Gain insight into IBM's unique in-motion and at-rest Big Data analytics platform Learn tips and tricks for Big Data use cases and solutions Get a quick Hadoop primer

Moving Hadoop to the Cloud

Author : Bill Havanki
Publisher : "O'Reilly Media, Inc."
ISBN 13 : 1491959584
Total Pages : 320 pages
Book Rating : 4.4/5 (919 download)

Book Synopsis Moving Hadoop to the Cloud by : Bill Havanki

Download or read book Moving Hadoop to the Cloud written by Bill Havanki and published by "O'Reilly Media, Inc.". This book was released on 2017-07-14 with total page 320 pages. Available in PDF, EPUB and Kindle. Book excerpt: Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines. This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features—not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them. Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require Explore use cases for high availability, relational data with Hive, and complex analytics with Spark Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance

Big Scientific Data Management

Author : Jianhui Li
Publisher : Springer
ISBN 13 : 3030280616
Total Pages : 346 pages
Book Rating : 4.0/5 (32 download)

Book Synopsis Big Scientific Data Management by : Jianhui Li

Download or read book Big Scientific Data Management written by Jianhui Li and published by Springer. This book was released on 2019-08-06 with a total of 346 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book constitutes the refereed proceedings of the First International Conference on Big Scientific Data Management, BigSDM 2018, held in Beijing, China, in November/December 2018. The 24 full papers presented together with 7 short papers were carefully reviewed and selected from 86 submissions. The topics covered include application cases in big scientific data management, paradigms for enhancing scientific discovery through big data, data management challenges posed by big scientific data, machine learning methods to facilitate scientific discovery, science platforms and storage systems for large scale scientific applications, data cleansing and quality assurance of science data, and data policies.

Secure Data Science

Author : Bhavani Thuraisingham
Publisher : CRC Press
ISBN 13 : 1000557502
Total Pages : 457 pages
Book Rating : 4.0/5 (5 download)

Book Synopsis Secure Data Science by : Bhavani Thuraisingham

Download or read book Secure Data Science written by Bhavani Thuraisingham and published by CRC Press. This book was released on 2022-04-27 with total page 457 pages. Available in PDF, EPUB and Kindle. Book excerpt: Secure data science, which integrates cyber security and data science, is becoming one of the critical areas in both cyber security and data science. This is because the novel data science techniques being developed have applications in solving such cyber security problems as intrusion detection, malware analysis, and insider threat detection. However, the data science techniques being applied not only for cyber security but also for every application area—including healthcare, finance, manufacturing, and marketing—could be attacked by malware. Furthermore, due to the power of data science, it is now possible to infer highly private and sensitive information from public data, which could result in the violation of individual privacy. This is the first such book that provides a comprehensive overview of integrating both cyber security and data science and discusses both theory and practice in secure data science. After an overview of security and privacy for big data services as well as cloud computing, this book describes applications of data science for cyber security applications. It also discusses such applications of data science as malware analysis and insider threat detection. Then this book addresses trends in adversarial machine learning and provides solutions to the attacks on the data science techniques. In particular, it discusses some emerging trends in carrying out trustworthy analytics so that the analytics techniques can be secured against malicious attacks. Then it focuses on the privacy threats due to the collection of massive amounts of data and potential solutions. 
Following a discussion on the integration of services computing, including cloud-based services for secure data science, it looks at applications of secure data science to information sharing and social media. This book is a useful resource for researchers, software developers, educators, and managers who want to understand both the high level concepts and the technical details on the design and implementation of secure data science-based systems. It can also be used as a reference book for a graduate course in secure data science. Furthermore, this book provides numerous references that would be helpful for the reader to get more details about secure data science.

Big Data

Author : Hrushikesha Mohanty
Publisher : Springer
ISBN 13 : 8132224949
Total Pages : 195 pages
Book Rating : 4.1/5 (322 download)

Book Synopsis Big Data by : Hrushikesha Mohanty

Download or read book Big Data written by Hrushikesha Mohanty and published by Springer. This book was released on 2015-06-29 with total page 195 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book is a collection of chapters written by experts on various aspects of big data. The book aims to explain what big data is and how it is stored and used. The book starts from the fundamentals and builds up from there. It is intended to serve as a review of the state-of-the-practice in the field of big data handling. The traditional framework of relational databases can no longer provide appropriate solutions for handling big data and making it available and useful to users scattered around the globe. The study of big data covers a wide range of issues including management of heterogeneous data, big data frameworks, change management, finding patterns in data usage and evolution, data as a service, service-generated data, service management, privacy and security. All of these aspects are touched upon in this book. It also discusses big data applications in different domains. The book will prove useful to students, researchers, and practicing database and networking engineers.

Getting Started with Kudu

Author : Jean-Marc Spaggiari
Publisher : "O'Reilly Media, Inc."
ISBN 13 : 1491980206
Total Pages : 158 pages
Book Rating : 4.4/5 (919 download)

Book Synopsis Getting Started with Kudu by : Jean-Marc Spaggiari

Download or read book Getting Started with Kudu written by Jean-Marc Spaggiari and published by "O'Reilly Media, Inc.". This book was released on 2018-07-09 with total page 158 pages. Available in PDF, EPUB and Kindle. Book excerpt: Fast data ingestion, serving, and analytics in the Hadoop ecosystem have forced developers and architects to choose solutions using the least common denominator—either fast analytics at the cost of slow data ingestion or fast data ingestion at the cost of slow analytics. There is an answer to this problem. With the Apache Kudu column-oriented data store, you can easily perform fast analytics on fast data. This practical guide shows you how. Begun as an internal project at Cloudera, Kudu is an open source solution compatible with many data processing frameworks in the Hadoop environment. In this book, current and former solutions professionals from Cloudera provide use cases, examples, best practices, and sample code to help you get up to speed with Kudu. Explore Kudu’s high-level design, including how it spreads data across servers Fully administer a Kudu cluster, enable security, and add or remove nodes Learn Kudu’s client-side APIs, including how to integrate Apache Impala, Spark, and other frameworks for data manipulation Examine Kudu’s schema design, including basic concepts and primitives necessary to make your project successful Explore case studies for using Kudu for real-time IoT analytics, predictive modeling, and in combination with another storage engine

Spark: The Definitive Guide

Author : Bill Chambers
Publisher : "O'Reilly Media, Inc."
ISBN 13 : 1491912294
Total Pages : 594 pages
Book Rating : 4.4/5 (919 download)

Book Synopsis Spark: The Definitive Guide by : Bill Chambers

Download or read book Spark: The Definitive Guide written by Bill Chambers and published by "O'Reilly Media, Inc.". This book was released on 2018-02-08 with total page 594 pages. Available in PDF, EPUB and Kindle. Book excerpt: Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Spark’s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation

Apache Spark in 24 Hours, Sams Teach Yourself

Author : Jeffrey Aven
Publisher : Sams Publishing
ISBN 13 : 0134445821
Total Pages : 1353 pages
Book Rating : 4.1/5 (344 download)

Book Synopsis Apache Spark in 24 Hours, Sams Teach Yourself by : Jeffrey Aven

Download or read book Apache Spark in 24 Hours, Sams Teach Yourself written by Jeffrey Aven and published by Sams Publishing. This book was released on 2016-08-31 with total page 1353 pages. Available in PDF, EPUB and Kindle. Book excerpt: Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark’s amazing speed, scalability, simplicity, and versatility. This book’s straightforward, step-by-step approach shows you how to deploy, program, optimize, manage, integrate, and extend Spark–now, and for years to come. You’ll discover how to create powerful solutions encompassing cloud computing, real-time stream processing, machine learning, and more. Every lesson builds on what you’ve already learned, giving you a rock-solid foundation for real-world success. Whether you are a data analyst, data engineer, data scientist, or data steward, learning Spark will help you to advance your career or embark on a new career in the booming area of Big Data. 
Learn how to • Discover what Apache Spark does and how it fits into the Big Data landscape • Deploy and run Spark locally or in the cloud • Interact with Spark from the shell • Make the most of the Spark Cluster Architecture • Develop Spark applications with Scala and functional Python • Program with the Spark API, including transformations and actions • Apply practical data engineering/analysis approaches designed for Spark • Use Resilient Distributed Datasets (RDDs) for caching, persistence, and output • Optimize Spark solution performance • Use Spark with SQL (via Spark SQL) and with NoSQL (via Cassandra) • Leverage cutting-edge functional programming techniques • Extend Spark with streaming, R, and Sparkling Water • Start building Spark-based machine learning and graph-processing applications • Explore advanced messaging technologies, including Kafka • Preview and prepare for Spark’s next generation of innovations Instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; "Did You Know?" tips offer insider advice and shortcuts; and "Watch Out!" alerts help you avoid pitfalls. By the time you're finished, you'll be comfortable using Apache Spark to solve a wide spectrum of Big Data problems.