How to Install Apache Pig on Linux?

If you want to install Apache Pig on Linux, you have come to the right place: this guide walks through the installation process step by step. Before jumping to the installation itself, let us start with a brief introduction to Apache Pig.

Apache Pig is a platform for creating and executing MapReduce programs that run on Hadoop. It is used to analyze large data sets. Broadly speaking, Apache Pig is an abstraction over MapReduce.

The language used in Apache Pig is Pig Latin. It lets programmers write and run MapReduce programs without writing Java, which makes it especially useful for novice programmers and for those who are not comfortable with Java.

Pig Latin is a high-level language for the Apache Pig platform. Programs written in Pig Latin can run on a single machine or over the distributed environment of the Hadoop Distributed File System (HDFS).

Apache Pig scripts are written in Pig Latin and are later converted into MapReduce jobs. Pig provides various operators that can be used to read, write, and process data. These are called the Apache Pig relational operators.
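As a rough illustration of these operators, the shell sketch below writes a small Pig Latin word-count script into a temporary directory. The names wordcount.pig, input.txt, and wordcount_out are placeholders chosen for this example, not anything prescribed by Pig. LOAD reads the data, FOREACH and GROUP process it, and STORE writes the result.

```shell
# Write an illustrative Pig Latin script; all file names are placeholders.
workdir=$(mktemp -d)
cat > "$workdir/wordcount.pig" <<'EOF'
-- LOAD reads the input, one line of text per record
lines   = LOAD 'input.txt' AS (line:chararray);
-- FOREACH ... GENERATE processes each record; TOKENIZE splits a line into words
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
-- GROUP collects identical words together
grouped = GROUP words BY word;
-- COUNT tallies the records in each group
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS total;
-- STORE writes the result to an output directory
STORE counts INTO 'wordcount_out';
EOF
echo "Script written to $workdir/wordcount.pig"
```

Once Pig is installed, a script like this can be passed to the pig command to run it.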

Apache Pig Installation on Linux Platform

Apache Pig is one of the most popular tools among Hadoop developers. To install it on Linux successfully, follow the steps in this guide in order.

Are there any Pre-Requisites?

Apache Pig has two main prerequisites on Linux: Java and Hadoop must already be installed on your machine. Each has its own installation procedure for Linux-based machines, so follow a proper installation guide for each tool.

Download and install Java and Hadoop first, and prepare the environment so that Apache Pig can be installed properly on your machine. Complete installation guides for both Java and Hadoop are easy to find online.
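One way to confirm the prerequisites before going further is a quick check that both commands are on the PATH. This is only a sketch: check_tool is a helper name invented for this example, and it assumes that java and hadoop are the command names used on your system.

```shell
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper written for this example.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
    return 1
  fi
}

# Pig needs both of these before it can be installed and used.
for tool in java hadoop; do
  check_tool "$tool" || echo "Install $tool before continuing."
done
```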

How to download Apache Pig online?

To download Apache Pig, choose the version that fits your requirements, or download the latest available release. Information about the available versions and their compatibility with other platforms is available online. Download the package appropriate to your needs and install it later. The MapReduce Accelerator support information is listed below:

The Apache Pig versions supported by MapReduce Accelerators are:

Apache Pig 0.8.1 (compatible with Apache Hadoop 0.20.x)
Apache Pig 0.9.2 (compatible with Apache Hadoop 1.0.0, 1.0.1 and 1.1)

Make a note of the Apache Hadoop version that is compatible with the Pig release you use for your Pig scripts. Apache Pig releases do not support the default Apache Hadoop version, 0.21.0, so you must download a version supported by the MapReduce Accelerator. The compatible Hadoop version is the value to supply as HADOOP_VERSION when installing on Linux.
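The compatibility list above can be expressed as a small lookup. The sketch below encodes only the pairings stated in this article; pig_for_hadoop is a name made up for the example, and treating 1.1.x patch releases as "1.1" is an assumption.

```shell
# Map a Hadoop version to the Pig release listed above as compatible.
# pig_for_hadoop is a hypothetical helper; only the article's pairings are encoded.
pig_for_hadoop() {
  case "$1" in
    0.20.*)                echo "0.8.1" ;;
    # Assumption: 1.1.x patch releases are treated the same as 1.1.
    1.0.0|1.0.1|1.1|1.1.*) echo "0.9.2" ;;
    *)                     echo "unsupported"; return 1 ;;
  esac
}

pig_for_hadoop 0.20.2
pig_for_hadoop 1.0.1
pig_for_hadoop 0.21.0 || true   # 0.21.0 is not supported
```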

Steps to be followed for installation of Apache Pig on Linux

The procedure below installs Apache Pig on Ubuntu 16.04:

Step 1: Download Apache Pig 0.16.0, the release used in this guide, from http://www-us.apache.org/dist/pig/pig-0.16.0/pig-0.16.0.tar.gz. The release is packaged as a gzip-compressed tar archive (.tar.gz), which you can download with the following command:

Command: wget http://www-us.apache.org/dist/pig/pig-0.16.0/pig-0.16.0.tar.gz
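If you expect to install a different release, the URL can be built from the version number. This sketch assumes other releases follow the same directory layout as the 0.16.0 link above; PIG_MIRROR and pig_tarball_url are names introduced for this example, and the mirror can be overridden if that host is unavailable.

```shell
# Build the download URL for a given Pig release.
# PIG_MIRROR defaults to the mirror named in this article (override as needed).
PIG_MIRROR="${PIG_MIRROR:-http://www-us.apache.org/dist/pig}"

# pig_tarball_url is a hypothetical helper: version in, URL out.
pig_tarball_url() {
  echo "$PIG_MIRROR/pig-$1/pig-$1.tar.gz"
}

pig_tarball_url 0.16.0
# To download: wget "$(pig_tarball_url 0.16.0)"
```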

Step 2: Extract the compressed archive with the tar command. The commands below extract the archive and then list the contents of the directory:

Command: tar -xzf pig-0.16.0.tar.gz

Command: ls
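The two commands can also be wrapped so that the archive is unpacked into a directory of your choice. A minimal sketch, assuming a tar that supports the -C option (GNU and BSD tar both do); extract_pig is a helper name invented here.

```shell
# Extract a Pig tarball into a target directory, then list that directory.
# extract_pig is a hypothetical helper; the target defaults to the current directory.
extract_pig() {
  tarball=$1
  dest=${2:-.}
  mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest" && ls "$dest"
}

# Example, after the download in Step 1:
#   extract_pig pig-0.16.0.tar.gz "$HOME"
```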

Step 3: Set the Apache Pig environment variables in the “.bashrc” file. Setting them there makes the pig command available from any directory, so you do not have to change into the Pig directory each time you want to run it.

It also lets other applications find the Apache Pig installation path when they need it. Open the file for editing with the following command:

Command: gedit ~/.bashrc

Add the export lines for the Pig environment variables at the end of the file.

Before updating the environment variables for Pig, make sure that the Hadoop path is also set. After saving the file, run source ~/.bashrc or open a new terminal so that the changes take effect.
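The exact lines are not reproduced in the text above, so the following is only a sketch of what such entries typically look like. It assumes Pig was extracted to $HOME/pig-0.16.0 and that your Hadoop setup defines HADOOP_CONF_DIR; adjust both to match your machine.

```shell
# Illustrative ~/.bashrc additions; adjust the path to where you extracted Pig.
export PIG_HOME="$HOME/pig-0.16.0"      # assumed extraction location
export PATH="$PATH:$PIG_HOME/bin"       # lets you run pig from any directory
# Point Pig at the Hadoop configuration directory (used in MapReduce mode).
# Assumption: HADOOP_CONF_DIR is set by your Hadoop setup; empty if it is not.
export PIG_CLASSPATH="${HADOOP_CONF_DIR:-}"
```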

Step 4: Check the Pig version to confirm that Apache Pig has been installed properly. If the version is not displayed, repeat the steps above.

Command: pig -version
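If you want to use the version number in a script, it can be pulled out of the command's output. This sketch assumes the output contains a line beginning with "Apache Pig version" followed by the number; parse_pig_version is a hypothetical helper, and the sample line below is made up to exercise it.

```shell
# Extract the version number from `pig -version` output.
# Assumption: the relevant line starts with "Apache Pig version <number>".
parse_pig_version() {
  sed -n 's/^Apache Pig version \([0-9][0-9.]*\).*/\1/p'
}

# Usage once Pig is installed:
#   pig -version 2>&1 | parse_pig_version
# Demonstration with a made-up sample line:
echo "Apache Pig version 0.16.0 (sample line)" | parse_pig_version
```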

Step 5: Use the Pig help option to list the available commands and options:

Command: pig -help

Step 6: Run Pig to start the Grunt shell, the interactive shell through which Pig Latin statements and scripts are run.

Command: pig

Congratulations! Apache Pig is now installed on your Linux OS. Pig can run in two execution modes: MapReduce mode and local mode.

Let us look at each mode briefly.

Apache Pig Execution Modes

The following two modes are available for running Apache Pig on Linux:

MapReduce Mode: MapReduce is the default execution mode of Apache Pig. It requires access to a Hadoop cluster and an HDFS installation. Because it is the default, the user does not need to specify the -x flag. In this mode, input and output reside on HDFS.
Local Mode: Pig runs on a single machine, and all files are installed and run using the local host and its file system. Local mode is selected with the -x flag. Both input and output reside on the local machine. Command: pig -x local
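The difference between the two modes comes down to the value passed to -x. The sketch below builds the command line for either mode and prints it rather than running it, so it can be tried on a machine without Pig. pig_command is a helper invented for this example, and wordcount.pig is a placeholder script name.

```shell
# Build the pig command line for a given execution mode and script.
# Prints the command instead of running it, so it is safe to try anywhere.
pig_command() {
  mode=$1
  script=$2
  case "$mode" in
    local)     echo "pig -x local $script" ;;
    mapreduce) echo "pig -x mapreduce $script" ;;
    *)         echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

pig_command local wordcount.pig
pig_command mapreduce wordcount.pig
```

Since MapReduce is the default, leaving out -x mapreduce has the same effect as passing it.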

Apache Pig Components

There are various components of Apache Pig framework. The major components of Pig are listed below:

Parser: Pig scripts are first handled by the parser, which checks the syntax, performs type checking, and runs various other checks. The output of the parser is a DAG (directed acyclic graph) that represents the Pig Latin statements and logical operators. In the DAG, the operators are the nodes and the data flows are the edges.
Optimizer: The logical plan is passed to the logical optimizer, which carries out logical optimizations such as projection and pushdown.
Compiler: The compiler turns the optimized logical plan into a series of MapReduce jobs.
Execution Engine: The MapReduce jobs are submitted to Hadoop in sorted order, and Hadoop executes them to produce the desired result.
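Pig Latin's EXPLAIN operator offers a way to look at the plans these components produce. The sketch below only writes a short script that uses it and prints the command to run; the file names explain_demo.pig and input.txt are placeholders for this example.

```shell
# Write a short Pig Latin script that asks Pig to print its execution plans.
workdir=$(mktemp -d)
cat > "$workdir/explain_demo.pig" <<'EOF'
lines  = LOAD 'input.txt' AS (line:chararray);
longer = FILTER lines BY SIZE(line) > 10;
-- EXPLAIN prints the logical, physical, and MapReduce plans for an alias
EXPLAIN longer;
EOF
echo "Run with: pig -x local $workdir/explain_demo.pig"
```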

Final Words:

Hadoop is a popular framework among data professionals, and it comes with a number of tools and technologies. Because of its popularity, many job opportunities are available in the market, and many developers use the platform on Linux or another operating system.

Apache Pig is a useful tool for working with Hadoop on Linux. Installing it is a step-by-step procedure that is not difficult, but each step must be completed correctly for Pig to work properly.

The installation guide above gives a detailed procedure that you can follow step by step if you want to use Apache Pig, or its Pig Latin scripting language, to leverage the Hadoop platform. The Apache website has the link to download the framework, and a help guide for the platform is also available there.

To learn more about Hadoop and Apache Pig, you can start with the Training and Certification program at JanBask. The right learning not only boosts your knowledge but also gives your career the new direction it needs for you to become successful and established.

JanBask Training Team

The JanBask Training Team includes certified professionals and expert writers dedicated to helping learners navigate their career journeys in QA, Cybersecurity, Salesforce, and more. Each article is carefully researched and reviewed to ensure quality and relevance.
