

Hadoop Hive Modules & Data Type with Examples

Hadoop is an open source framework from Apache, used to store and process huge volumes of data. It is written in Java and is designed for batch/offline processing rather than online analytical processing. Hadoop is widely adopted and is used by companies such as Facebook, Yahoo, Twitter, and LinkedIn, and it can be scaled up simply by adding nodes to the cluster.

Modules of Hadoop

  1. Hadoop Distributed File System: HDFS is modeled after the Google File System (GFS). In HDFS, files are broken into blocks and stored on nodes across the distributed architecture.
  2. Yarn: Yet Another Resource Negotiator (YARN) is used to manage the cluster and also performs job scheduling.
  3. Map Reduce: MapReduce is the framework that enables parallel computation over the data using key-value pairs. The Map task takes the input data and converts it into a data set that can be computed as key-value pairs. The output of the Map task is consumed by the Reduce task, and the output of the reducer gives the desired result.
  4. Hadoop Common: Hadoop Common is the set of Java libraries used to start Hadoop and shared by the other Hadoop modules.

Advantages of Hadoop

  • Speedy: In HDFS the data is distributed over the cluster and mapped, which helps in faster retrieval of the data. The tools to process the data are often on the same servers, which further reduces processing time.
  • Scalable: Hadoop cluster can be extended by just adding nodes in the cluster.
  • Cost Effective: Hadoop is open source and stores data on commodity hardware, which makes it far more cost-effective than a traditional relational database management system.
  • Resistance to failure: HDFS can replicate data over the network, so when a network failure occurs or a node goes down, Hadoop simply uses another copy of the data. By default the data is replicated three times, but the replication factor is configurable.


Data Types in Hive

In Hive tables, data types are used to specify the type of each column/field. All the data types in Hive are classified into the following categories:

1). Primitive Data Types: Primitive data types are further divided into four types, as follows:

A). Numeric Data Types: The Hive numeric data types are classified into two subtypes, integral and floating:

i). Integral Data Types: The Hive integral data types are as follows:
  • TINYINT (1-byte (8-bit) signed integer, from -128 to 127)
  • SMALLINT (2-byte (16-bit) signed integer, from -32,768 to 32,767)
  • INT (4-byte (32-bit) signed integer, from -2,147,483,648 to 2,147,483,647)
  • BIGINT (8-byte (64-bit) signed integer, from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)

ii). Floating Data Types: The Hive floating data types are as follows:
  • FLOAT (4-byte (32-bit) single-precision floating-point number)
  • DOUBLE (8-byte (64-bit) double-precision floating-point number)
  • DECIMAL (arbitrary-precision signed decimal number)
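
As a quick illustration, here is a minimal, hypothetical table definition (the table and column names are made up for this example) that uses the numeric types listed above:

CREATE TABLE numeric_demo (
  tiny_col   TINYINT,        -- 1-byte signed integer
  small_col  SMALLINT,       -- 2-byte signed integer
  int_col    INT,            -- 4-byte signed integer
  big_col    BIGINT,         -- 8-byte signed integer
  float_col  FLOAT,          -- single-precision floating point
  double_col DOUBLE,         -- double-precision floating point
  price      DECIMAL(10,2)   -- decimal with explicit precision and scale
);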

B). Date/Time Data Types: The second category of Apache Hive primitive data types is the date/time data types. The following data types come into this category (a short usage sketch follows the list):

  • TIMESTAMP (Timestamp with nanosecond precision)
  • DATE (calendar date in year/month/day form, without a time of day)
  • INTERVAL (a period of time, such as days or months)
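
A minimal sketch of how these types might be used; the table and column names are illustrative, and the INTERVAL expression assumes a Hive version that supports intervals (1.2.0 or later):

CREATE TABLE event_demo (
  event_ts   TIMESTAMP,   -- e.g. '2019-11-24 10:15:30.123456789'
  event_date DATE         -- e.g. DATE '2019-11-24'
);
-- Add one day to the timestamp using an INTERVAL expression
SELECT event_ts + INTERVAL '1' DAY FROM event_demo;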

C). String Data Types: String data types are the third category of Hive data types (a combined sketch follows the miscellaneous types below). They are as follows:
  • STRING (unbounded variable-length character string)
  • VARCHAR (variable-length character string with a specified maximum length)
  • CHAR (fixed-length character string)

D). Miscellaneous Data Types: Two data types fall under the Hive miscellaneous category:
  • BOOLEAN (true/false value)
  • BINARY (byte array)
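
A small hypothetical table combining the string and miscellaneous types above (the names and lengths are illustrative):

CREATE TABLE profile_demo (
  name     STRING,       -- unbounded variable-length string
  country  VARCHAR(50),  -- variable-length string, up to 50 characters
  code     CHAR(2),      -- fixed-length string of 2 characters
  active   BOOLEAN,      -- true/false flag
  avatar   BINARY        -- raw byte array
);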

2). Complex Data Types: The following are the complex data types:

  • ARRAY

An Array is an ordered collection of fields. All the fields must be of the same type. Syntax: ARRAY<data_type> E.g. array(1, 2)
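
A brief sketch of an ARRAY column in practice (the table and column names are hypothetical):

CREATE TABLE array_demo (scores ARRAY<INT>);
-- Elements are accessed by zero-based index; size() returns the element count
SELECT scores[0], size(scores) FROM array_demo;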

  • MAP

A Map is an unordered collection of key-value pairs. Keys must be primitive types, while values can be of any type. Syntax: MAP<primitive_type, data_type> E.g. map('a', 1, 'b', 2).
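
A brief sketch of a MAP column (names are hypothetical):

CREATE TABLE map_demo (props MAP<STRING, INT>);
-- Values are looked up by key; map_keys() returns all keys as an array
SELECT props['a'], map_keys(props) FROM map_demo;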

  • STRUCT

A Struct is a collection of named fields. The fields may be of different types. Syntax: STRUCT<col_name : data_type [COMMENT col_comment], ...> E.g. struct('a', 1, 1.0), named_struct('col1', 'a', 'col2', 1, 'col3', 1.0)
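
A brief sketch of a STRUCT column (names are hypothetical):

CREATE TABLE struct_demo (address STRUCT<city:STRING, zip:INT>);
-- Fields are accessed with dot notation
SELECT address.city, address.zip FROM struct_demo;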

  • UNION

A union is a value that may be one of a number of defined data types. The value is tagged with an integer (zero-indexed) representing its data type in the union. Syntax: UNIONTYPE<data_type, data_type, ...> E.g. create_union(1, 'a', 63)


3). Column Types

  • Integral Type

The following are the four integral types, along with their literal postfixes:
  • TINYINT, Ex. 100Y
  • SMALLINT, Ex. 100S
  • INT/INTEGER (no postfix; plain integer literals default to INT)
  • BIGINT, Ex. 100L
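
For example, the postfixes can be used directly in a query (this sketch assumes a Hive version that allows SELECT without a FROM clause, 0.13 or later):

SELECT 100Y,   -- TINYINT literal
       100S,   -- SMALLINT literal
       100,    -- plain integer literals default to INT
       100L;   -- BIGINT literal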

  • Strings

Strings can be represented with either single quotes (') or double quotes ("). Hive uses C-style escaping within the strings.
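
For example (the literal values are illustrative, and the standalone SELECT assumes Hive 0.13 or later):

SELECT 'it\'s a test',         -- escaped single quote inside single quotes
       "she said \"hello\"",   -- escaped double quotes inside double quotes
       'line1\nline2';         -- \n is a C-style escape for a newline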

  • Time stamp

Hive supports the traditional UNIX timestamp with optional nanosecond precision. Timestamps in text files must use the format "yyyy-mm-dd hh:mm:ss[.fffffffff]", i.e. up to nine decimal digits of fractional seconds.

  • Dates DATE values describe a particular year/month/day in YYYY-MM-DD format. E.g. DATE '2017-01-01'.
  • Decimals

Hive's DECIMAL type is similar to Java's BigDecimal and represents arbitrary-precision decimal numbers. In Apache Hive 0.11 and 0.12 the precision of the DECIMAL type is fixed and limited to 38 digits. From Apache Hive 0.13 onward, users can specify the scale and precision when creating tables with the DECIMAL data type using the DECIMAL(precision, scale) syntax. If the scale is not specified, it defaults to 0 (no fractional digits); if no precision is specified, it defaults to 10. For example:

CREATE TABLE foo (
  a DECIMAL,      -- defaults to DECIMAL(10,0)
  b DECIMAL(9, 7)
);

  • Union Types

A UNIONTYPE is a collection of heterogeneous data types; an instance can be created with create_union. The syntax and an example (with sample query output) are below:

CREATE TABLE union_test(foo UNIONTYPE<int, double, array<string>, struct<a:int,b:string>>);
SELECT foo FROM union_test;

{0:1}
{1:2.0}
{2:["three","four"]}
{3:{"a":5,"b":"five"}}
{2:["six","seven"]}
{3:{"a":8,"b":"eight"}}
{0:9}
{1:10.0}

4). Literals: In Hive the following literals are used:

  • Floating Point Types

These are nothing but numbers with decimal points. Floating point literals are assumed to be of the DOUBLE data type.

  • Decimal Type

This type represents floating point values with a higher range than the DOUBLE data type; the decimal type range is approximately -10^308 to 10^308.
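
As a hedged illustration of the two literal forms (the BD postfix for decimal literals and the standalone SELECT assume Hive 0.13 or later):

SELECT 3.14,                       -- floating point literal, treated as DOUBLE
       3.14BD,                     -- decimal literal
       CAST(3.14 AS DECIMAL(5,3)); -- explicit cast to a DECIMAL type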

5). Null Value In Hive, missing values are represented by the special value NULL.
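
A minimal sketch of checking for and substituting NULLs (the table and column names reuse the hypothetical profile_demo example from above):

-- Rows where code is missing; COALESCE substitutes a default for a NULL country
SELECT name,
       COALESCE(country, 'unknown')
FROM   profile_demo
WHERE  code IS NULL;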

Conclusion

In this blog on Hive data types, we have discussed all the data types in detail with examples. It should give you a deeper understanding and help you work with Hive data types easily.

