Friday, 4 March 2016

NameNode High Availability for Hadoop Cluster


  1. What and Why High Availability Cluster?

  2. HA Cluster Architecture

  3. Components of HA Cluster

  4. Installation and configuration

  5. Install and Configure JAVA

  6. OS Configuration and Optimization

  7. Disable yum's fastest mirror plugin

  8. Disable Firewall (iptables and selinux)

  9. Install the necessary packages for hadoop

  10. Download and Add the Cloudera repository key

  11. Check the available hadoop packages

  12. Installing HDFS package

  13. Hadoop Configuration files

  14. NameNode HA Configuration

  15. JAVA configuration for hadoop cluster

  16. Initializing the Journal Nodes and Formatting the HA Cluster NameNodes

  17. Initializing ZKFC by formatting Zookeeper

  18. Activating the passive NameNode – Bootstrap Process

  19. Checking the HA_Cluster



What and Why High Availability Cluster?

Before Hadoop 2, the NameNode was a single point of failure: if it went down, the entire HDFS cluster was unavailable until it was brought back. A High Availability (HA) cluster runs two NameNodes, one Active and one Standby, that share the edit log through a set of JournalNodes, so when the Active NameNode fails the Standby can take over automatically and the cluster keeps serving clients.

NameNode HA Architecture

 Components of HA Cluster 

  1. Active NameNode
  2. Standby NameNode
  3. Zookeeper
  4. ZKFC - Zookeeper Fail-over Controller
  5. Journal Nodes

Installation and configuration


Install and Configure JAVA

Install JAVA. Here /usr/local has been selected for the JDK installation: copy the installer bin file to /usr/local and execute the commands below. An RPM-based installation can be used instead.

#chmod 755 jdk-6u45-linux-x64-rpm.bin

#./jdk-6u45-linux-x64-rpm.bin
Update the /etc/profile file (or the user's ~/.bashrc) with JAVA_HOME and add the JDK bin directory to PATH

#vim /etc/profile

#cd ~

#vim .bashrc

export JAVA_HOME={PATH TO JAVA_HOME}

export PATH=$PATH:{PATH TO JAVA_HOME}/bin

On a 32-bit architecture a reboot is needed for this configuration to take effect. In any case it is better to log off and log back in once, then check the Java installation:

#echo $JAVA_HOME

#echo $PATH

#which java

OS Configuration and Optimization

Edit the /etc/hosts file and add entries for all the hosts in the cluster

# vim /etc/hosts

192.168.0.100 nn1.hadoop.com nn1

192.168.0.101 nn2.hadoop.com nn2

192.168.0.102 dn1.hadoop.com dn1

192.168.0.103 dn2.hadoop.com dn2

192.168.0.104 dn3.hadoop.com dn3
 
Update the /etc/sysconfig/network file with the correct hostname on each node


NETWORKING=yes

HOSTNAME=hostname.hadoop.com

GATEWAY=192.168.0.1
 

Disable yum's fastest mirror plugin


# vim /etc/yum/pluginconf.d/fastestmirror.conf

enabled=0
 

Disable Firewall (iptables and selinux)

iptables(IPV4)

#service iptables stop

#chkconfig iptables off


ip6tables(IPV6)

#service ip6tables stop

#chkconfig ip6tables off

SELinux
#vim /etc/selinux/config

SELINUX=disabled

Reboot the node after disabling the firewall and SELinux.
To check the status of the firewall and SELinux:

#service iptables status; service ip6tables status

# sestatus

Install the necessary packages for hadoop

#yum install perl openssh-clients

Download and Add the Cloudera repository key

Download the Cloudera repository file and save it in the /etc/yum.repos.d/ directory

# wget -P /etc/yum.repos.d/ http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/cloudera-cdh5.repo

Add the repository key
# rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
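
To confirm that the repository is now visible to yum (a quick optional check):

# yum repolist | grep -i cloudera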

Check the available hadoop packages

#yum search hadoop

Installing HDFS package

Here we are taking three nodes: nn1, nn2 and dn1. We will start by installing the packages required by the NameNode on nn1, nn2 and dn1. We install the HDFS packages on dn1, which is not a NameNode server, because a JournalNode will also run there.
Install the hadoop-hdfs-namenode package on all three nodes
# yum install hadoop-hdfs-namenode

Install the JournalNode package on all three nodes
# yum install hadoop-hdfs-journalnode

Install the ZooKeeper server package on all three nodes
# yum install zookeeper-server

Install the failover controller package on the two NameNodes only, nn1 and nn2
# yum install hadoop-hdfs-zkfc

Before configuring the NameNodes we need to make sure that the ZooKeeper cluster is up and running. Here we have three ZooKeeper servers: nn1, nn2 and dn1. This information needs to go into the ZooKeeper configuration file, /etc/zookeeper/conf/zoo.cfg. Add the following entries at the end of the configuration file.
General syntax:
server.id=host:port:port

# vim /etc/zookeeper/conf/zoo.cfg

server.1=nn1.hadoop.com:2888:3888

server.2=nn2.hadoop.com:2888:3888

server.3=dn1.hadoop.com:2888:3888
Update the zoo.cfg file on all the ZooKeeper nodes.
If you are deploying multiple ZooKeeper servers after a fresh install, you need to create a myid file in the data directory. You can do this by means of an init command option:
Server1
# service zookeeper-server init --myid=1

Server2
# service zookeeper-server init --myid=2

Server3
# service zookeeper-server init --myid=3
Note:
The myid file of server 1 therefore contains the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255.
Start the zookeeper server on all the nodes
# service zookeeper-server start

# chkconfig zookeeper-server on
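
Before moving on it is worth verifying each server, for example as below (a quick sanity check; /var/lib/zookeeper is the default data directory in the CDH packages and may differ in your setup, and the ruok probe needs nc installed):

# cat /var/lib/zookeeper/myid

# service zookeeper-server status

# echo ruok | nc localhost 2181

The first command prints the server id, and a healthy server answers "imok" to the ruok probe.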

Hadoop Configuration files

There are three main configuration files for the Hadoop components: core-site.xml, hdfs-site.xml and mapred-site.xml. The core-site.xml file contains configuration options that are common to all the servers in the cluster, while hdfs-site.xml and mapred-site.xml provide the configuration options for the HDFS and MapReduce components respectively. The configuration files are located in /etc/hadoop/conf, and some of the environment variables have been moved to files under /etc/default.
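
Assuming the stock CDH packaging, which registers /etc/hadoop/conf under the Linux alternatives name hadoop-conf, you can confirm which configuration directory is actually in use:

# alternatives --display hadoop-conf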

NameNode HA Configuration

Hadoop XML file general property syntax:

<property>
<name>  </name>
<value>  </value>
<description>  </description>
<final>  </final>
</property>

Open the core-site.xml file and enter the following details.

<property>
<name>fs.default.name</name>
<value>hdfs://sample-cluster/</value>
<description>NameNode Cluster Name</description>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>nn1.hadoop.com:2181,nn2.hadoop.com:2181,dn1.hadoop.com:2181</value>
<description>Specifies the location and port of the ZooKeeper cluster</description>
</property>

Open the hdfs-site.xml file and enter the following details.
<property>
<name>dfs.name.dir</name>
<value>/dfs/nn/</value>
</property>

<property>
<name>dfs.nameservices</name>
<value>sample-cluster</value>
<description>Logical name of the NameNode cluster</description>
</property>

<property>
<name>dfs.ha.namenodes.sample-cluster</name>
<value>nn1,nn2</value>
<description>NameNodes that make up the HA cluster</description>
</property>

<!-- rpc-address properties for the HA_NN_Cluster -->

<property>
<name>dfs.namenode.rpc-address.sample-cluster.nn1</name>
<value>nn1.hadoop.com:8020</value>
<description>nn1 rpc-address</description>
</property>

<property>
<name>dfs.namenode.rpc-address.sample-cluster.nn2</name>
<value>nn2.hadoop.com:8020</value>
<description>nn2 rpc-address</description>
</property>

<!-- http-address properties for the HA_NN_Cluster -->

<property>
<name>dfs.namenode.http-address.sample-cluster.nn1</name>
<value>nn1.hadoop.com:50070</value>
<description>nn1 http-address</description>
</property>

<property>
<name>dfs.namenode.http-address.sample-cluster.nn2</name>
<value>nn2.hadoop.com:50070</value>
<description>nn2 http-address</description>
</property>


Note:
The Standby NameNode uses HTTP calls to periodically copy the fsimage file from the active NameNode, perform the checkpoint operation, and ship the result back.

<!--quorum journal properties for the HA_NN_Cluster-->

<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://nn1.hadoop.com:8485;nn2.hadoop.com:8485;dn1.hadoop.com:8485/sample-cluster</value>
<description>Specifies the setup of JournalNode Cluster</description>
</property>


Note:
This property specifies the JournalNode cluster setup. Both the Active and the Standby NameNode use it to identify the hosts they should contact to send or receive new edit log changes.

<property>
<name>dfs.journalnode.edits.dir</name>
<value>/dfs/journal</value>
<description>Location on the local file system where editlog changes will be stored</description>
</property>

<!--failover properties for the HA_NN_Cluster-->

<property>
<name>dfs.client.failover.proxy.provider.sample-cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>The Java class that HDFS clients use to contact the Active NameNode</description>
</property>

<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>Indicates if the NameNode cluster will use manual or automatic failover</description>
</property>

<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
<description>Fencing method used during failover; sshfence logs in to the previously active NameNode over SSH and kills its process</description>
</property>

<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
<description>Private key the sshfence method uses to log in to the other NameNode</description>
</property>
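
Once core-site.xml and hdfs-site.xml are filled in, you can sanity check the values Hadoop actually sees on a node (hdfs getconf only reads the local configuration, so no services need to be running yet):

# hdfs getconf -confKey dfs.nameservices

# hdfs getconf -confKey dfs.ha.namenodes.sample-cluster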



Set up passwordless SSH authentication for the password-less hdfs user between the two NameNodes of the Cloudera Hadoop cluster, so that the sshfence method configured above can work.
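
A minimal sketch of that key setup, assuming the hdfs user's home directory is /var/lib/hadoop-hdfs (the CDH default) and that root SSH access is available for copying the key across:

On nn1, create the .ssh directory and generate a key pair for the hdfs user (no passphrase):

# sudo -u hdfs mkdir -p /var/lib/hadoop-hdfs/.ssh
# sudo -u hdfs ssh-keygen -t rsa -N "" -f /var/lib/hadoop-hdfs/.ssh/id_rsa

Copy the public key to nn2 and install it for the hdfs user there:

# scp /var/lib/hadoop-hdfs/.ssh/id_rsa.pub root@nn2.hadoop.com:/tmp/nn1_hdfs.pub
# ssh root@nn2.hadoop.com 'mkdir -p /var/lib/hadoop-hdfs/.ssh; cat /tmp/nn1_hdfs.pub >> /var/lib/hadoop-hdfs/.ssh/authorized_keys; chown -R hdfs:hdfs /var/lib/hadoop-hdfs/.ssh; chmod 700 /var/lib/hadoop-hdfs/.ssh; chmod 600 /var/lib/hadoop-hdfs/.ssh/authorized_keys'

Repeat in the opposite direction (nn2 to nn1), then verify that the key works:

# sudo -u hdfs ssh nn2.hadoop.com hostname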


Then synchronize this configuration to the other nodes in the cluster; the rsync command can be used for this.
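
For example, a simple loop run from nn1 (this assumes root SSH access to the other nodes; adjust the host list to match your cluster):

# for host in nn2.hadoop.com dn1.hadoop.com dn2.hadoop.com dn3.hadoop.com; do rsync -av /etc/hadoop/conf/ root@${host}:/etc/hadoop/conf/; done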


JAVA Configuration for Hadoop cluster


JAVA_HOME must be configured in the /etc/default/bigtop-utils file on each node

export JAVA_HOME=/usr/local/java


Initializing the Journal Nodes and Formatting the HA Cluster NameNodes


Now we can start the JournalNode service on all three nodes.

# service hadoop-hdfs-journalnode start

Next we need to format HDFS. To do this, run the following command on the first NameNode, nn1.
# sudo -u hdfs hdfs namenode -format


Initializing ZKFC by formatting Zookeeper


The next step is to create an entry for the HA cluster in ZooKeeper, then start the NameNode and ZKFC services. Run the following on one of the NameNodes (here nn1):

# sudo -u hdfs hdfs zkfc -formatZK

# service hadoop-hdfs-namenode start

# service hadoop-hdfs-zkfc start


Activating the passive NameNode – Bootstrap Process


To activate the passive NameNode, an operation called bootstrapping needs to be performed. Execute the following command on nn2:

# sudo -u hdfs hdfs namenode -bootstrapStandby
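
Once the bootstrap completes, the NameNode and ZKFC services still need to be started on nn2 as well, mirroring what was done on nn1:

# service hadoop-hdfs-namenode start

# service hadoop-hdfs-zkfc start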


Checking the HA_Cluster


To check which NameNode is currently active and which is standby



# sudo -u hdfs hdfs haadmin -getServiceState nn1

# sudo -u hdfs hdfs haadmin -getServiceState nn2
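
One of the two commands should report "active" and the other "standby". As an optional test on a non-production cluster, you can simulate a failure of the active NameNode (assumed here to be nn1) and watch the standby take over:

# service hadoop-hdfs-namenode stop

Wait a few seconds for the ZKFC to promote the other node, then check:

# sudo -u hdfs hdfs haadmin -getServiceState nn2

Restart the stopped NameNode on nn1; it will rejoin the cluster as the standby:

# service hadoop-hdfs-namenode start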




To get the web UI of the active NN

http://nn1.hadoop.com:50070
 



To get the web UI of the standby NN

http://nn2.hadoop.com:50070




Happy Hadooping!......