Setting Up a Hadoop Cluster (Multi-Node Setup)

by Balasundaram

The following changes need to be made to the configuration files inside the Hadoop folder.

1)core-site.xml

For MASTER:

changes :

The IP address of the master is given in place of localhost.

<property>

<name>fs.default.name</name>

<value>hdfs://10.229.152.18:10011</value>

</property>

SLAVE:

changes :

Same as on the master: replace localhost with the master's IP address, since every node uses fs.default.name to locate the NameNode.
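The core-site.xml fragment above omits the enclosing <configuration> element that a complete file needs. As a minimal sketch, the whole file could be generated from a shell function. The helper name write_core_site and its arguments are assumptions made for illustration and are not part of Hadoop; the address and port are the ones used in this article.

```shell
#!/bin/sh
# write_core_site is a hypothetical helper, not a Hadoop command.
# It writes a complete core-site.xml containing only fs.default.name.
write_core_site() {
    conf_dir=$1   # directory that holds the config files, e.g. "$HADOOP_HOME/conf"
    ip=$2         # IP address to use in place of localhost
    port=$3       # NameNode port (10011 in this article)
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$ip:$port</value>
  </property>
</configuration>
EOF
}
```

For example, write_core_site "$HADOOP_HOME/conf" 10.229.152.18 10011 would produce the master's file shown above.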

3)hdfs-site.xml

MASTER:

<!-- The replication value is the number of slaves plus the master -->

<configuration>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<!-- NameNode directory path -->

<property>

<name>dfs.name.dir</name>

<value>/home/user1/asl-hadoop-0.20.2+228/filesystem/name</value>

</property>

<!-- DataNode directory path -->

<property>

<name>dfs.data.dir</name>

<value>/home/user1/asl-hadoop-0.20.2+228/filesystem/data</value>

</property>

<!-- Temporary directory path -->

<property>

<name>dfs.temp.dir</name>

<value>/home/user1/asl-hadoop-0.20.2+228/filesystem/temp</value>

</property>

</configuration>

SLAVES:

No changes; keep the default values.

4)mapred-site.xml

MASTER:

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>10.229.152.18:10012</value>

</property>

<!-- Local path for MapReduce data; the local directory is created automatically -->

<property>

<name>mapred.local.dir</name>

<value>/home/user1/asl-hadoop-0.20.2+228/local</value>

</property>

<!-- As a rule of thumb, the number of map tasks can be set to (number of slaves * 10) -->

<property>

<name>mapred.map.tasks</name>

<value>30</value>

</property>

<!-- As a rule of thumb, the number of reduce tasks can be set to (number of slaves * 3) -->

<property>

<name>mapred.reduce.tasks</name>

<value>6</value>

</property>

</configuration>

SLAVES:

No changes; keep the default values.
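The two rules of thumb in the mapred-site.xml comments above are simple multiplications, which can be sketched in shell as follows. The function name suggest_tasks is hypothetical. The sample configuration uses 30 and 6, which do not both come from one slave count under these formulas, so treat the results as starting points only.

```shell
#!/bin/sh
# suggest_tasks is a hypothetical helper that applies the article's
# rules of thumb: map tasks = slaves * 10, reduce tasks = slaves * 3.
suggest_tasks() {
    slaves=$1   # number of slave nodes in the cluster
    echo "mapred.map.tasks=$((slaves * 10))"
    echo "mapred.reduce.tasks=$((slaves * 3))"
}
```

Calling suggest_tasks 2 prints mapred.map.tasks=20 and mapred.reduce.tasks=6.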

5)CONF / MASTERS AND SLAVES

MASTER:

conf/masters

master_ip

conf/slaves

master_ip

slave_ip

SLAVE:

conf/masters

localhost

conf/slaves

localhost
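The master's conf/masters and conf/slaves files listed above hold one address per line, so they can be written with a few lines of shell. This is only a sketch: write_node_lists is a hypothetical helper, and the slave address 10.229.152.19 in the usage line is a made-up example, not taken from the article.

```shell
#!/bin/sh
# write_node_lists is a hypothetical helper for the master node.
# conf/masters gets the master IP; conf/slaves gets the master IP
# followed by every slave IP, one per line.
write_node_lists() {
    conf_dir=$1    # e.g. "$HADOOP_HOME/conf"
    master_ip=$2
    shift 2        # the remaining arguments are the slave IPs
    printf '%s\n' "$master_ip" > "$conf_dir/masters"
    printf '%s\n' "$master_ip" "$@" > "$conf_dir/slaves"
}
```

Usage, with an assumed slave address: write_node_lists "$HADOOP_HOME/conf" 10.229.152.18 10.229.152.19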

Start the cluster by running the following on the master:

${HADOOP_HOME}/bin/start-all.sh

After start-all.sh finishes, running jps on each node should show the following processes.

On the master:

23763 TaskTracker

23186 NameNode

23603 JobTracker

23359 DataNode

On the slave:

3232 DataNode

6772 TaskTracker
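Comparing the jps listings by eye is error-prone, so the check can be sketched as a small shell function that reads jps output on standard input and reports any daemon missing for the given role. The name check_daemons and the role labels are assumptions made for this example; the expected daemon names are the ones listed above.

```shell
#!/bin/sh
# check_daemons is a hypothetical helper. It reads `jps` output on
# stdin and prints "missing: NAME" for each expected daemon not found.
check_daemons() {
    role=$1
    case $role in
        master) expected="NameNode JobTracker DataNode TaskTracker" ;;
        slave)  expected="DataNode TaskTracker" ;;
        *) echo "unknown role: $role" >&2; return 2 ;;
    esac
    out=$(cat)
    status=0
    for d in $expected; do
        # jps prints "<pid> <name>"; compare the second column exactly
        if ! printf '%s\n' "$out" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $d"
            status=1
        fi
    done
    return $status
}
```

On a running node it would be used as jps | check_daemons master or jps | check_daemons slave; a zero exit status means every expected daemon was listed.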

SUCCESSFULLY COMPLETED HADOOP CLUSTER…
