Using HBase for Real-time Access to your Big Data Cognitive Class Exam Answers

Get an overview of HBase, learn how to use the HBase API and clients, and see how it integrates with Hadoop.

HBase is the open source Hadoop database used for random, real-time read/write access to your Big Data.

  • Learn how to set it up as a source or sink for MapReduce jobs, and details about its architecture and administration, including labs for practice and hands-on learning.
  • Learn how HBase runs on a distributed architecture on top of commodity hardware, including practice and hands-on learning of the following features:
    • Linear and modular scalability
    • Strictly consistent reads and writes
    • Automatic and configurable sharding of tables
    • Automatic failover support between RegionServers
    • Easy-to-use Java API for client access
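
To make the sharding feature above more concrete, here is a minimal sketch in plain Python. It is not the HBase API: the function `region_for` and the split keys are hypothetical, and the sketch only assumes that a table is divided into regions holding contiguous, sorted row-key ranges whose start key is inclusive and whose end key is exclusive.

```python
from bisect import bisect_right

def region_for(row_key, split_keys):
    """Return the index of the region whose key range contains row_key.

    split_keys must be sorted. N split keys define N + 1 regions:
    region 0 holds keys below split_keys[0], region i holds keys in
    [split_keys[i - 1], split_keys[i]), and the last region holds the rest.
    """
    return bisect_right(split_keys, row_key)

# Hypothetical split points giving three regions: [.., g), [g, p), [p, ..)
splits = [b"g", b"p"]

for key in (b"apple", b"g", b"mango", b"zebra"):
    print(key, "-> region", region_for(key, splits))
```

Because the lookup is only a binary search over the split points, adding a split key (for example when a region grows too large) changes which region a key maps to without touching the keys themselves.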

COURSE SYLLABUS

  • Module 1 – Introduction to HBase
    1. HBase Overview, CAP Theorem and ACID properties
    2. Role of HBase and how it differs from an RDBMS
    3. HBase Shell and Tables
  • Module 2 – HBase Client API – The Basics
    1. Use of the Java API for Batch and Scan operations
  • Module 3 – Client API: Administrative and Advanced Features
    1. Use of administrative operations and schemas
    2. Use of Filters, Counters, and ImportTSV tool
  • Module 4 – Available HBase Clients
    1. Understand how interactive and batch clients interact with HBase
  • Module 5 – HBase and MapReduce Integration
    1. Understand how MapReduce works in the Hadoop framework
    2. How to set up HBase as a source and a sink
  • Module 6 – HBase Configuration and Administration
    1. Configuration of HBase to optimize it for different environments
    2. Architecture and administrative tasks

Using HBase for Real-time Access to your Big Data

Module 1: Introduction to HBase

Question 1: What are some of the key properties of HBase? Select all that apply.

  • All HBase data is stored as bytes
  • HBase can run up to 1000 queries per second at the most
  • HBase is ACID compliant across all rows and tables
  • HBase is a NoSQL technology
  • HBase is an open source Apache project

Question 2: Which HBase component is responsible for storing the rows of a table?

  • HDFS
  • Region
  • API
  • ZooKeeper
  • Master

Question 3: What is NOT a characteristic of an HBase table?

  • Columns are grouped into column families
  • Columns can have multiple timestamps
  • Each row must have a unique row key
  • NULL columns aren’t supported
  • Columns can be added on the fly
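
The table characteristics this question touches on can be illustrated with a small in-memory model. This is a sketch in plain Python, not the HBase client API: the class `ToyTable` and its methods are hypothetical. It assumes only the commonly documented data model, in which column families are declared with the table, column qualifiers can be added at write time, each cell can hold several timestamped versions, values are raw bytes, and an absent cell simply stores nothing.

```python
from collections import defaultdict

class ToyTable:
    """In-memory stand-in for an HBase table (illustrative only)."""

    def __init__(self, families):
        # Column families are fixed when the table is declared.
        self.families = set(families)
        # row key -> (family, qualifier) -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if not isinstance(value, bytes):
            raise TypeError("values are stored as bytes")
        # Qualifiers need no prior declaration; each write adds a version.
        self.rows[row][(family, qualifier)][ts] = value

    def get(self, row, family, qualifier):
        """Return the newest version of a cell, or None if it was never written."""
        versions = self.rows.get(row, {}).get((family, qualifier))
        if not versions:
            return None
        return versions[max(versions)]

table = ToyTable(["info"])
table.put(b"row1", "info", "name", b"Ada", ts=1)
table.put(b"row1", "info", "name", b"Ada L.", ts=2)  # second version, same cell
table.put(b"row1", "info", "email", b"ada@example.org", ts=2)  # new column on the fly
print(table.get(b"row1", "info", "name"))   # b'Ada L.'
print(table.get(b"row1", "info", "phone"))  # None
```

Note that the missing `phone` cell takes up no space in the model; nothing is stored for columns a row does not have.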

Module 2: HBase Client API – The Basics 

Question 1: Which HBase command is used to update existing data in a table?

  • Put
  • Scan
  • Get
  • Batch
  • Delete

Question 2: The batch command allows the user to determine the order of execution. True or false?

  • True
  • False

Question 3: Which of the following statements are true of the scan operation? Select all that apply.

  • Scanner caching is enabled by default
  • The startRow and endRow parameters are both inclusive
  • The addColumn() method can be used to restrict a scan
  • Scanning is a resource-intensive operation
  • Scan operations are used to iterate over HBase tables
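
A scan walks row keys in sorted order between two boundaries. The sketch below models that in plain Python rather than with the HBase client; the function `scan` is hypothetical. It assumes the default behaviour of the HBase Scan, where the start row is inclusive and the stop row is exclusive.

```python
from bisect import bisect_left

def scan(sorted_keys, start_row=None, stop_row=None):
    """Yield row keys in [start_row, stop_row) from an already sorted list."""
    lo = 0 if start_row is None else bisect_left(sorted_keys, start_row)
    hi = len(sorted_keys) if stop_row is None else bisect_left(sorted_keys, stop_row)
    for key in sorted_keys[lo:hi]:
        yield key

keys = sorted([b"row1", b"row2", b"row3", b"row4"])
print(list(scan(keys, b"row2", b"row4")))  # [b'row2', b'row3']
```

Leaving both boundaries unset iterates over every row, which is why an unrestricted scan of a large table is expensive.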

Module 3: Client API: Administrative and Advanced Features

Question 1: Which statement about HBase tables is incorrect?

  • HColumnDescriptor is used to describe columns, not column families
  • A table requires two descriptor classes
  • Performance may suffer if a table has more than three column families
  • Everything in HBase is stored within tables
  • Each table must contain at least one column family

Question 2: When using a CompareFilter, you must specify what to include as part of the scan, rather than what to exclude. True or false?

  • True
  • False

Question 3: Which of the following are examples of a Dedicated Filter? Select all that apply.

  • SingleColumnValueFilter
  • QualifierFilter
  • ColumnPrefixFilter
  • TimestampsFilter
  • FamilyFilter

Module 4: Available HBase Clients

Question 1: Which statements accurately describe the HBase interactive clients? Select all that apply.

  • Thrift is included with HBase
  • Thrift and Avro both support C++
  • With REST, data transport is always performed in binary
  • Avro has a dynamic schema
  • REST needs to be compiled before it can run

Question 2: Unlike an interactive client, a batch client is used to run a large set of operations in the background. True or false?

  • True
  • False

Question 3: Which of the following is an example of a batch client?

  • PyHBase
  • HBql
  • Pig
  • JRuby
  • AsyncHBase

Module 5: HBase and MapReduce Integration

Question 1: HBase can act as both a source and a sink of a MapReduce job. True or false?

  • False
  • True

Question 2: Which HBase class is responsible for splitting the source data?

  • TableReducer
  • TableOutputFormat
  • TableMapReduceUtil
  • TableMapper
  • TableInputFormat

Question 3: Which of the following is NOT a component of the MapReduce framework?

  • Reducer
  • OutputFormat
  • Mapper
  • InputFormat
  • All of the above are part of the MapReduce framework

Module 6: HBase Configuration and Administration

Question 1: Which of the following statements accurately describe the HBase run modes? Select all that apply.

  • The standalone mode is suited for a production environment
  • The pseudo-distributed mode is used for performance evaluation
  • The standalone mode uses local file systems
  • The distributed mode is suited for a production environment
  • The distributed mode requires HDFS

Question 2: Which is NOT a component of a region server?

  • StoreFile
  • MemStore
  • HFile
  • ZooKeeper
  • HLog

Question 3: Which of the following are examples of operational tasks? Select all that apply.

  • BulkImport
  • CopyTable
  • Adding Servers
  • Node decommissioning
  • Import and export

Final Exam

Question 1: Which statements accurately describe column families in HBase? Select all that apply.

  • You aren’t required to specify any column families when declaring a table
  • Each region contains multiple column families
  • You typically want no more than two or three column families per table
  • Column families have their own compression methods
  • Column families can be defined dynamically after table creation

Question 2: Which of the following is NOT a component of HBase?

  • Master
  • Region
  • ZooKeeper
  • Pig
  • Region Server

Question 3: Which programming language is supported by Thrift?

  • PHP
  • C#
  • Python
  • Perl
  • All of the above

Question 4: Which HBase command is used to retrieve data from a table?

  • Delete
  • Get
  • Scan
  • Batch
  • Put

Question 5: The HBase Shell and the native Java API are the only available tools for interacting with HBase. True or false?

  • True
  • False

Question 6: Without this filter, a scan will need to check every file to see if a piece of data exists.

  • WhileMatchFilter
  • TimestampsFilter
  • PageFilter
  • SkipFilter
  • BloomFilter
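
The idea behind a Bloom filter can be shown with a short, self-contained sketch. The class `ToyBloomFilter`, its sizes, and its hashing scheme are assumptions chosen for illustration, not HBase's implementation. A Bloom filter can answer "definitely not present" or "possibly present", which is what allows a read to skip files that cannot contain the requested key.

```python
import hashlib

class ToyBloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key was never added; True means "maybe".
        return all(self.bits[pos] for pos in self._positions(key))

bloom = ToyBloomFilter()
bloom.add(b"row1")
print(bloom.might_contain(b"row1"))  # True
```

In this model, one filter would be kept per storage file; a lookup consults the filter first and opens the file only when `might_contain` returns True.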

Question 7: What are the characteristics of the Avro client? Select all that apply.

  • Avro is included with HBase
  • Data transport is performed in binary
  • Avro needs to be compiled before running
  • Avro is a batch client
  • Avro supports Python and PHP, among others

Question 8: Deleting an internal table in Hive automatically deletes the corresponding HBase table. True or false?

  • True
  • False

Question 9: What is the main purpose of an HBase Counter?

  • To count the number of regions
  • To increment column values for statistical data collection
  • To count the number of region servers
  • To count the number of column families
  • All of the above
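
A counter is a cell whose value is read, incremented, and written back as a single operation. The sketch below imitates that in plain Python; the function `increment` and the dictionary standing in for a row are hypothetical. It assumes the counter is stored as an 8-byte big-endian signed long, which is how HBase's byte utilities encode a Java long. Unlike the real server-side operation, this toy version is not atomic under concurrent access.

```python
import struct

def increment(cells, key, amount=1):
    """Add amount to the 8-byte big-endian counter stored under key."""
    # A missing counter is treated as zero.
    current = struct.unpack(">q", cells.get(key, b"\x00" * 8))[0]
    updated = current + amount
    cells[key] = struct.pack(">q", updated)
    return updated

row = {}
increment(row, "daily:hits")
print(increment(row, "daily:hits", 5))  # 6
```

The stored value stays a fixed-width byte string, so the counter remains consistent with the rule that all cell data is kept as bytes.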

Question 10: Which file is used to specify configurations for HBase, HDFS, and ZooKeeper?

  • RegionServer
  • hbase-site.xml
  • log4j.properties
  • hbase-default.xml
  • hbase-env.sh

Question 11: Which HBase component manages the race to add a backup master?

  • Primary master
  • Region
  • ZooKeeper
  • HDFS
  • Region Server

Question 12: Which component of a region server is the actual storage file of the data?

  • HFile
  • Store
  • StoreFile
  • HLog
  • HRegion

Question 13: When the master node is updated, which file can be used to automatically update the other nodes in the cluster?

  • syncconf.sh
  • synchbase.sh
  • hbase-default.xml
  • hbase-site.xml
  • hbase-env.sh

Question 14: There is a single HLog for each region server. True or false?

  • True
  • False

Question 15: What is the main purpose of the Write-Ahead log?

  • To store HBase configuration details
  • To store HDFS configuration details
  • To flush data when the system reaches its capacity
  • To prevent data loss in the event of a system crash
  • To store performance details
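
The write-ahead principle can be sketched in a few lines. The class `ToyRegionServer` below is hypothetical and greatly simplified: a Python list stands in for the durable log and a dictionary stands in for the in-memory store. It assumes only the general pattern that every edit is appended to the log before it is applied in memory, so the log can be replayed after a crash.

```python
class ToyRegionServer:
    """Toy model of write-ahead logging (illustrative only)."""

    def __init__(self, wal=None):
        self.wal = wal if wal is not None else []  # stands in for a durable log
        self.memstore = {}                         # in-memory state, lost on crash

    def put(self, row, value):
        self.wal.append((row, value))  # 1. record the edit in the log first
        self.memstore[row] = value     # 2. then apply it in memory

    @classmethod
    def recover(cls, wal):
        """Rebuild the in-memory state by replaying the log in order."""
        server = cls(wal=list(wal))
        for row, value in server.wal:
            server.memstore[row] = value
        return server

server = ToyRegionServer()
server.put(b"row1", b"v1")
server.put(b"row1", b"v2")
surviving_log = list(server.wal)  # pretend only the log survived a crash
restored = ToyRegionServer.recover(surviving_log)
print(restored.memstore)  # {b'row1': b'v2'}
```

Replaying the entries in their original order is what makes the restored state match the state before the crash.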
