Here are some of the top HBase interview questions for freshers and experienced candidates. Hadoop professionals preparing for an interview can use the following HBase question set to improve their chances.
HBase Questions Covered are:
- Explain Apache HBase Or What is Apache HBase?
- What are the advantages of using HBase?
- Compare traditional RDBMS with HBase.
- Name some more column-based databases like HBase
- Explain Filters of HBase database
- What are the data model operations in HBase?
- How is an HBase cluster backup performed?
- Does HBase support SQL-like syntax?
- Can we iterate HBase database rows in reverse order?
- List the key components of HBase.
- What is the difference between HBase and Hive?
- Explain HBase data model.
- Define column families.
- What is standalone mode in HBase?
- List some data manipulation commands of HBase.
- Explain delete operation of HBase and mention three types of tombstone markers of HBase.
HBase Interview Questions and Answers
Here are some popular questions for freshers and experienced candidates that can help you crack the interview. The most frequently asked questions are included below:
HBase Questions and Answers for Freshers
Q1). Explain Apache HBase or What is Apache HBase?
Apache HBase is a column-based database that can store sparse data sets. It is a NoSQL column-oriented database that runs on top of HDFS (the Hadoop Distributed File System) and is capable of storing any type of data. HBase data can be accessed either through a native Java API or through a REST gateway, which makes it reachable from any language. Some of the key properties of HBase are listed below:
- NoSQL: HBase is not like a traditional relational database (RDBMS). To achieve greater scalability, HBase relaxes the ACID properties (Atomicity, Consistency, Isolation and Durability) of a traditional database. Because HBase data is not stored in rigid schemas, it is well suited to storing both structured and unstructured data.
- Distributed and Scalable: The rows of an HBase table are grouped into regions. These groups allow the table data to be spread over several nodes of a cluster. Any region that grows too large is split automatically, and the load is shared among several servers.
- Wide-Column: HBase data is stored in a table-like format. HBase tables can hold billions of rows and millions of columns. Columns are grouped into "column families", and the data of each column family is stored together physically.
- Consistent: HBase offers strongly consistent reads and writes, unlike many NoSQL databases that are eventually consistent. This means that once a write operation has completed, all subsequent read requests return the same value.
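The automatic region splitting mentioned above can be pictured with a small simulation. This is a toy sketch in plain Python, not HBase code: the `Region` and `Table` classes, the `MAX_ROWS` threshold, and the split-at-the-middle-key rule are all assumptions made for illustration. Real HBase decides when to split based on store size and a configurable split policy, not on a row count.

```python
from bisect import insort

MAX_ROWS = 4  # hypothetical threshold; real HBase splits on store size, not row count


class Region:
    """A contiguous, sorted range of row keys (toy model)."""

    def __init__(self, keys=None):
        self.keys = sorted(keys or [])

    def split(self):
        """Split into two regions at the middle row key."""
        mid = len(self.keys) // 2
        return Region(self.keys[:mid]), Region(self.keys[mid:])


class Table:
    """A table is an ordered list of regions (toy model)."""

    def __init__(self):
        self.regions = [Region()]

    def put(self, row_key):
        # Pick the last region whose first key is <= row_key,
        # falling back to the first region.
        target = self.regions[0]
        for region in self.regions:
            if region.keys and region.keys[0] <= row_key:
                target = region
        if row_key not in target.keys:
            insort(target.keys, row_key)
        # Split a region that has grown past the threshold.
        if len(target.keys) > MAX_ROWS:
            index = self.regions.index(target)
            self.regions[index:index + 1] = list(target.split())


table = Table()
for key in ["row1", "row2", "row3", "row4", "row5", "row6"]:
    table.put(key)

for region in table.regions:
    print(region.keys)
```

With six rows and a threshold of four, the single starting region splits once, leaving two regions that could be served by different servers.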
Q2). What are the advantages of using HBase?
Following are a few of the advantages of HBase:
- Greater consistency of records
- Built-in versioning of data
- RDBMS-like triggers and stored procedures are provided in the form of coprocessors. Through coprocessors, custom code can be run on the region servers.
Q3). Compare traditional RDBMS with HBase.
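Following are a few differences between an RDBMS and HBase:
- Schema: An RDBMS stores data in rigid schemas, whereas HBase data is not bound to a rigid schema and columns are grouped into column families.
- Storage: An RDBMS is typically row-oriented, whereas HBase is column-oriented and suited to sparse data sets.
- Transactions: An RDBMS provides full ACID properties, whereas HBase relaxes them to achieve greater scalability.
- Queries: An RDBMS is queried with SQL, whereas HBase has no SQL-like support of its own and is accessed through its API.
- Scaling: HBase spreads table data over several nodes by splitting tables into regions.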
Q4). Name some more column-based databases like HBase
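Cassandra is another column-based (wide-column) database, as are Google Bigtable, on which HBase is modeled, and Apache Accumulo. CouchDB and MongoDB are often named alongside HBase because they are also NoSQL databases, but they are document-oriented rather than column-based.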
Q5). Explain Filters of HBase database.
Filters can be attached to HBase queries so that programmers can eliminate unneeded data from large datasets before it is returned. Filters can be applied to the data of complete rows. Following are some of the most commonly used HBase filters:
- Page Filter
- Timestamps Filter
- Family Filter
- Row Filter
- Qualifier Filter
- Column Pagination Filter
- Value Filter
- Prefix Filter
- Single Column Value Filter
- Multiple Column Prefix Filter
- Column Count Get Filter
- Inclusive Stop Filter
- First Key Only Filter
- Key Only Filter
- Dependent Column Filter
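To make the idea of filters concrete, here is a minimal sketch of what two of them do conceptually. It is a toy model in plain Python, not the HBase client API: the `rows` data and the `scan`, `prefix_filter`, and `page_filter` functions are hypothetical names that only mimic the behavior of the Prefix Filter and Page Filter listed above.

```python
from itertools import islice

# Toy table: row key -> {column: value}. Hypothetical data, not the HBase API.
rows = {
    "user#001": {"info:name": "Ana"},
    "user#002": {"info:name": "Raj"},
    "order#001": {"info:total": "40"},
    "user#003": {"info:name": "Li"},
}


def scan(table):
    """Yield (row_key, columns) in sorted row-key order."""
    for key in sorted(table):
        yield key, table[key]


def prefix_filter(results, prefix):
    """Keep only rows whose key starts with the given prefix."""
    return (item for item in results if item[0].startswith(prefix))


def page_filter(results, page_size):
    """Stop after page_size rows."""
    return islice(results, page_size)


# Filters compose: first keep the "user#" rows, then return one page of two rows.
matching = list(page_filter(prefix_filter(scan(rows), "user#"), 2))
for key, columns in matching:
    print(key, columns)
```

The point of the sketch is that a filter narrows the result of a scan before it reaches the caller, so only the first two rows with the `user#` prefix are returned.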
Q6). What are the data model operations in HBase?
Following are the most commonly used data operations:
- Put Method: Used to store data in HBase
- Delete Method: Used to delete data from the database
- Get Method: Used to retrieve the data of a row from HBase
- Scan Method: Used to iterate over multiple rows, either the entire table or a range of row keys
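The four operations above can be sketched with a small in-memory stand-in. This is a toy model in plain Python under stated assumptions, not the HBase client API: `ToyTable` and its method signatures are hypothetical and exist only to show what each operation does.

```python
class ToyTable:
    """In-memory stand-in for an HBase table (toy model, not the HBase client API)."""

    def __init__(self):
        self._rows = {}  # row key -> {"family:qualifier": value}

    def put(self, row, column, value):
        """Store a value in one column of one row."""
        self._rows.setdefault(row, {})[column] = value

    def get(self, row):
        """Return a copy of all columns of a single row (empty if absent)."""
        return dict(self._rows.get(row, {}))

    def delete(self, row, column):
        """Remove one column; drop the row once it has no columns left."""
        columns = self._rows.get(row, {})
        columns.pop(column, None)
        if not columns:
            self._rows.pop(row, None)

    def scan(self, start_row="", stop_row=None):
        """Yield rows in sorted key order with start_row <= key < stop_row."""
        for key in sorted(self._rows):
            if key < start_row:
                continue
            if stop_row is not None and key >= stop_row:
                break
            yield key, dict(self._rows[key])


table = ToyTable()
table.put("row1", "cf:name", "alice")
table.put("row2", "cf:name", "bob")
table.put("row3", "cf:name", "carol")
table.delete("row2", "cf:name")

print(table.get("row1"))
print([key for key, _ in table.scan()])
```

Get addresses exactly one row by its key, while Scan walks a sorted range of keys, which is why row key design matters so much in HBase.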
Q7). How is an HBase cluster backup performed?
An HBase cluster can be backed up in the following two ways:
- Through live cluster backup
- Through full shutdown backup
Live Cluster Backup: In the live backup strategy, the CopyTable utility is used to copy data from one table to another in the same cluster or in another cluster. To dump the contents of a table, the Export utility is used, and the data is dumped onto the HDFS of the same cluster.
Full Shutdown Backup: In this approach the HBase cluster is shut down periodically. Because the master and region servers are down, there is no chance of missing any in-flight changes to StoreFiles or metadata. This approach suits clusters used as back-end analytic capacity, but it cannot be used for applications that serve front-end web pages.
Q8). Does HBase support SQL-like syntax?
HBase itself does not provide SQL-like support. Apache Phoenix can be used on top of HBase to retrieve data with SQL queries.
Q9). Can we iterate HBase database rows in reverse order?
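In older HBase releases, tables could not be iterated in reverse order. The reason lies in how values are stored on disk: the length of each value is written first, followed by the actual bytes of the value. To iterate backwards over these values, the lengths would have to be stored twice. Since HBase 0.98, however, reverse scans are supported: in the Java API a scan can be reversed by calling setReversed(true) on the Scan object.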
HBase Questions for Experienced Candidates
Q10). List the key components of HBase.
Following are the key components of HBase:
- HMaster: It manages and coordinates the Region Servers, much as the NameNode manages DataNodes in HDFS.
- ZooKeeper: It works as a coordinator inside the HBase distributed environment. It maintains the state of the servers in the cluster by communicating through sessions.
- Region Server: HBase tables are divided into several regions. A group of regions is served by a Region Server.
Q11). What is the difference between HBase and Hive?
Apache Hive is a data warehousing infrastructure built on top of Hadoop. Data stored in HDFS is queried with HQL (Hive Query Language), a SQL-like language whose queries Hive translates into MapReduce jobs, so Hive performs batch processing on the Hadoop file system. Apache HBase also runs on top of HDFS, but it is a NoSQL key/value store. HBase operations run in real time against the database rather than as MapReduce jobs. HBase tables are partitioned, and their columns are further grouped into column families.
Q12). Explain HBase data model.
The HBase data model consists of the following components:
- A set of tables
- Each table has column families and rows
- The row key acts as the primary key of an HBase table
- Any access to an HBase table uses this primary key
Q13). Define column families.
A collection of columns is called a column family, and a row in turn is a collection of column families.
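The data model and column families described above can be pictured as a nested map. The following is a minimal sketch in plain Python, not HBase code: the table contents, the family names `personal` and `work`, and the `latest` helper are hypothetical. Integer timestamps stand in for the cell versions that HBase keeps.

```python
# row key -> column family -> column qualifier -> {timestamp: value}
table = {
    "row1": {
        "personal": {"name": {1: "Ana", 2: "Ana M."}, "city": {1: "Lima"}},
        "work": {"title": {1: "Engineer"}},
    },
    "row2": {
        # Sparse row: no "city" column and no "work" family at all.
        "personal": {"name": {1: "Raj"}},
    },
}


def latest(table, row, family, qualifier):
    """Return the value with the highest timestamp, or None if the cell is absent."""
    versions = table.get(row, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    return versions[max(versions)]


print(latest(table, "row1", "personal", "name"))
print(latest(table, "row2", "work", "title"))
```

Absent cells simply do not exist in the map, which is what makes the model a good fit for sparse data sets.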
Q14). What is standalone mode in HBase?
Standalone is the default mode of HBase. In this mode HBase does not use HDFS: it runs on the local filesystem, and all HBase daemons and a local ZooKeeper run in the same JVM process.
Q15). List some data manipulation commands of HBase.
Following are a few data manipulation commands of HBase:
- put: Used to put a value into a specific column of a specific row in a specific table
- get: Used to fetch the contents of a specific row or cell
- delete: Used to delete a particular cell value from the table
- deleteall: Used to delete all cells of the given row
- scan: Used to scan the table and return its data
- count: Used to count and return the total number of rows in the table
- truncate: Used to disable, drop and recreate a specified table
Q16). Explain delete operation of HBase and mention three types of tombstone markers of HBase.
Whenever a cell is deleted in HBase, it is not actually removed right away. Instead, a tombstone marker is set internally, which makes the deleted cell invisible. The data is physically removed during major compaction. The three types of tombstone markers are:
- Column delete marker: It marks all versions of a column for deletion
- Version delete marker: It marks a single version of a column for deletion
- Family delete marker: It marks all columns of a column family for deletion
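The delete behavior described above can be simulated in a few lines. This is a toy sketch in plain Python, not HBase internals: `ToyStore` and its methods are hypothetical, and only the column delete marker is modeled, on the assumption that it hides every version with a timestamp at or before the marker's timestamp.

```python
class ToyStore:
    """Toy model of stored cells plus delete markers (not HBase code)."""

    def __init__(self):
        self.cells = []       # list of (row, column, timestamp, value)
        self.tombstones = []  # list of (row, column, timestamp)

    def put(self, row, column, timestamp, value):
        self.cells.append((row, column, timestamp, value))

    def delete_column(self, row, column, timestamp):
        """Column delete marker: hides all versions at or before timestamp."""
        self.tombstones.append((row, column, timestamp))

    def _is_deleted(self, row, column, timestamp):
        return any(
            r == row and c == column and timestamp <= ts
            for r, c, ts in self.tombstones
        )

    def get(self, row, column):
        """Return the newest visible value, or None if every version is hidden."""
        visible = [
            (ts, value)
            for r, c, ts, value in self.cells
            if r == row and c == column and not self._is_deleted(r, c, ts)
        ]
        return max(visible)[1] if visible else None

    def major_compact(self):
        """Physically drop the hidden cells and the markers themselves."""
        self.cells = [
            cell for cell in self.cells
            if not self._is_deleted(cell[0], cell[1], cell[2])
        ]
        self.tombstones = []


store = ToyStore()
store.put("row1", "cf:a", 1, "old")
store.put("row1", "cf:a", 2, "new")
store.delete_column("row1", "cf:a", 2)

value_after_delete = store.get("row1", "cf:a")  # cell is invisible to reads
cells_before = len(store.cells)                 # but still physically stored
store.major_compact()
cells_after = len(store.cells)                  # removed only by compaction
print(value_after_delete, cells_before, cells_after)
```

The delete itself only appends a marker, and the stored cells disappear only when compaction rewrites the data.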
Conclusion
Any candidate preparing for an HBase interview can refer to the questions listed above. Beyond these, there are many more concepts and questions you should know to crack the interview. Along with basic knowledge of the tool, you need practical experience to prove yourself.