HADOOP 3.1.1 INSTALLATION

Hadoop(3.1.1) Installation in Ubuntu Environment

Author : Bottu Gurunadha Rao

Created: 18-Dec-2018
Updated: 20-Feb-2021
Release: 1.0.1
Purpose: Hadoop (3.1.1) pseudo-distributed (single-node) installation steps on Ubuntu
History: N/A

In this blog, I will guide you through setting up the Hadoop environment on the Ubuntu operating system.

Do the following as a user with sudo (admin) privileges.
1)Update the OS package lists
command: sudo apt update
When prompted for the admin password, enter it.

2)Install SSH
command: sudo apt install ssh

ssh-keygen is a standard OpenSSH tool used to generate the key pairs that SSH uses to establish secure shell sessions between remote computers over insecure networks.
3)Generate an SSH key pair
command:  ssh-keygen

4)Append the generated public key to authorized_keys (use cat)
command : cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
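
As an optional shortcut, steps 3 and 4 can be combined without any prompts. The snippet below is a minimal sketch that assumes ~/.ssh/id_rsa does not already exist and that a passphrase-less RSA key is acceptable (fine for a local test setup, not for production):

# generate a passphrase-less RSA key at the default path
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# authorize the key for logins to this machine and tighten its permissions
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys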

5)Modify "/etc/ssh/ssh_config"
command : sudo gedit /etc/ssh/ssh_config

6)In the above config file, look for the line "#   StrictHostKeyChecking ask" (the value may be "ask" or "yes")
    and change the value to "no"
ex : StrictHostKeyChecking no
Note : remove the "#" at the beginning of the "StrictHostKeyChecking" line
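
If you prefer not to edit the system-wide file, a per-user alternative (not part of the original steps, just an option) is to disable the check only for localhost in ~/.ssh/config:

# ~/.ssh/config - applies only to the current user
Host localhost
    StrictHostKeyChecking no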

7)Check ssh login
 command:  ssh localhost

8)Logout from ssh
command:  logout

9)Install Java 8
command: sudo apt install openjdk-8-jdk
Note : JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
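
To verify the installation and confirm the JAVA_HOME path on your machine (the exact path can vary by distribution and architecture), a quick check looks like this:

java -version                  # should report an openjdk 1.8.0_xxx build
readlink -f $(which java)      # typically resolves under /usr/lib/jvm/java-8-openjdk-amd64/...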

10)Download the Hadoop software (hadoop-3.1.1.tar.gz) from "https://hadoop.apache.org" using any browser;
it will be saved to Ubuntu's "Downloads" folder.

11)Go to the Downloads folder and extract the archive using the tar command
command:  cd ~/Downloads
command:  tar -zxvf hadoop-3.1.1.tar.gz
Note: After the above command runs, a new folder "hadoop-3.1.1" will be created in the Downloads folder.
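
As an alternative to the browser download in step 10, the tarball can be fetched from the command line. The URL below points at the Apache release archive and is my assumption of where hadoop-3.1.1.tar.gz is still hosted; adjust it if the archive layout has changed:

cd ~/Downloads
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.1/hadoop-3.1.1.tar.gz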

12)Create the Hadoop home directory (it is not mandatory to use hadoop3; any name will do)
command : sudo mkdir /usr/lib/hadoop3

13)Change ownership of the Hadoop home directory
command : sudo chown <username> /usr/lib/hadoop3
Note1: Replace <username> with the current Ubuntu username
Note2: To find the current Ubuntu username, type "whoami" in the terminal
example : sudo chown gvp-87 /usr/lib/hadoop3

14)Move the contents of the extracted directory (Downloads/hadoop-3.1.1)
      to the Hadoop home directory (/usr/lib/hadoop3); run this from the Downloads folder
command :  sudo mv hadoop-3.1.1/* /usr/lib/hadoop3
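
A quick way to confirm the move worked is to list the Hadoop home directory; you should see the standard release layout:

ls /usr/lib/hadoop3
# expected (roughly): bin  etc  include  lib  libexec  sbin  share ...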

15)Change directory to your Ubuntu home directory
     Typing cd with no arguments and pressing Enter takes you to your home directory.
command: cd

16)From your home directory, set up the Hadoop environment variables
     in the configuration file ".bashrc"
Note : Add the following lines at the end of ".bashrc" (after the last fi keyword of the file)
command : gedit ~/.bashrc

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_PREFIX=/usr/lib/hadoop3
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export PATH=$PATH:$JAVA_HOME/bin

17)Now restart the shell
command: exec bash

18)Reload the bash profile
command : source ~/.bashrc
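
At this point the environment variables should be active in the new shell; a small sanity check (reads only, no changes) looks like this:

echo $JAVA_HOME        # /usr/lib/jvm/java-8-openjdk-amd64
echo $HADOOP_PREFIX    # /usr/lib/hadoop3
hadoop version         # should print "Hadoop 3.1.1" plus build details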

19)Go to the Hadoop configuration directory [/usr/lib/hadoop3/etc/hadoop]
command : cd /usr/lib/hadoop3/etc/hadoop

20)Modify hadoop-env.sh in the directory "/usr/lib/hadoop3/etc/hadoop"
command : sudo gedit hadoop-env.sh
Note : set export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
           and remove the "#" symbol at the beginning of that line.

21)Modify the XML files in the directory "/usr/lib/hadoop3/etc/hadoop"
Note : While copying the contents of the <name></name> and <value></value> tags
           for each property, watch out for unwanted whitespace; if any extra spaces
           get copied by accident, remove them.

sudo gedit core-site.xml
=================
<configuration>
   <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
   </property>
</configuration>

sudo gedit hdfs-site.xml
=================
<configuration>
   <property>
      <name>dfs.replication</name>
      <value>1</value>
   </property>
</configuration>

sudo gedit mapred-site.xml
===================
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

sudo gedit yarn-site.xml (take care while copying the 2nd <property>'s <value> content; remove any stray spaces)
=================
<configuration>
  <property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
  </property>
<property>
<name>yarn.nodemanager.env-whitelist</name><value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
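
Optionally, HDFS can be told to keep its metadata and block data under the Hadoop home directory instead of the default temporary location (which may be cleared on reboot). The property names below are standard Hadoop properties, but the /usr/lib/hadoop3/data paths are an illustrative choice of mine; these additions go inside the <configuration> block of hdfs-site.xml.

sudo gedit hdfs-site.xml (optional additions)
=================
   <!-- keep NameNode metadata and DataNode blocks under the Hadoop home -->
   <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:///usr/lib/hadoop3/data/namenode</value>
   </property>
   <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:///usr/lib/hadoop3/data/datanode</value>
   </property>

If you add these, create the directories first (mkdir -p /usr/lib/hadoop3/data/namenode /usr/lib/hadoop3/data/datanode), make sure your user owns them, and do this before formatting the NameNode in step 22.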

22)Format the NameNode (run this once, before starting any services)
command : hadoop namenode -format
Note : In Hadoop 3 this form is reported as deprecated; "hdfs namenode -format" is the equivalent newer command.

23)Start the DFS services from the sbin directory of the Hadoop home (/usr/lib/hadoop3/sbin)
Note : This starts the NameNode, DataNode and SecondaryNameNode services
command : ./start-dfs.sh

24)Start the YARN services from the sbin directory of the Hadoop home (/usr/lib/hadoop3/sbin)
Note : This starts the ResourceManager and NodeManager
command : ./start-yarn.sh

25)Instead of steps 23 and 24, all services can be started with a single command.
command : ./start-all.sh [this script is deprecated and not recommended]

26)To check whether all the services are up and running, type jps in the terminal.
command: jps
output:
8161 Jps
6482 DataNode
6754 SecondaryNameNode
6308 NameNode
7029 ResourceManager
7211 NodeManager

Note: If the five Hadoop services above are all listed (Jps is the jps tool itself, not a service), the
           Hadoop installation was successful. If any of them is missing from the output, the
           installation did not complete correctly - go through all the steps again.
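
Once the processes appear in jps, a quick smoke test confirms that HDFS and the web UIs are reachable (port 9870 is the Hadoop 3 NameNode UI, 8088 the ResourceManager UI):

hdfs dfs -mkdir -p /user/$(whoami)     # create your HDFS home directory
hdfs dfs -ls /                         # list the HDFS root
# Web UIs (open in a browser):
#   NameNode        : http://localhost:9870
#   ResourceManager : http://localhost:8088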

27)To safely stop all Hadoop services, use the following command (from /usr/lib/hadoop3/sbin).
command : ./stop-all.sh
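
Since stop-all.sh is flagged as deprecated in Hadoop 3, the equivalent two-script form is:

./stop-yarn.sh     # stops ResourceManager and NodeManager
./stop-dfs.sh      # stops NameNode, DataNode and SecondaryNameNode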
