There are many ways to install Hadoop, and Cloudera is one of the easiest. However, if you just want to try Hadoop on a virtual machine, running Cloudera on top will eat a lot of your memory. In that case it is much better to install vanilla Hadoop from its official website.

This post focuses on getting Hadoop running on CentOS without any third-party software.

Configure the “hadoop” user

The following steps assume that you are logged in as the “root” user.

[root@hadoop-spark-vm root]$ adduser hadoop

This should add a new user with the username “hadoop” and a group called “hadoop”.

Set the “hadoop” user’s password:

[root@hadoop-spark-vm root]$ passwd hadoop

Give the “hadoop” user “sudo” access: open ‘/etc/sudoers’ (preferably with visudo, which checks the syntax before saving) and add the following line:

hadoop ALL=(ALL) ALL

And now log in as the “hadoop” user:

[root@hadoop-spark-vm root]$ su - hadoop

Set up the “passwordless” SSH

Run the following commands:

[hadoop@hadoop-spark-vm hadoop]$ ssh-keygen -t rsa -P ''
[hadoop@hadoop-spark-vm hadoop]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
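
With its default StrictModes setting, sshd can refuse key-based logins when ~/.ssh or authorized_keys is writable by other users. The following is a minimal sketch of the same step with the permissions tightened and the key appended only once. The function name install_pubkey is hypothetical, not part of OpenSSH.

```shell
# install_pubkey is a hypothetical helper name, not part of OpenSSH.
# Usage: install_pubkey <public-key-file> [ssh-dir]
install_pubkey() {
  local pubkey_file=$1
  local ssh_dir=${2:-$HOME/.ssh}
  local auth_file=$ssh_dir/authorized_keys

  mkdir -p "$ssh_dir"
  touch "$auth_file"
  # Append the key only if this exact line is not already present.
  if ! grep -qxFf "$pubkey_file" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
  # Keep the directory and the key list private to the owner.
  chmod 700 "$ssh_dir"
  chmod 600 "$auth_file"
}

# Example call for the key generated above (uncomment to run):
# install_pubkey ~/.ssh/id_rsa.pub
# ssh localhost 'echo passwordless login works'
```

If the final ssh command prints its message without asking for a password, the start scripts used later should be able to log in to localhost the same way.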

Install JAVA

[hadoop@hadoop-spark-vm hadoop]$ sudo yum install java-1.7.0-openjdk*

After the installation, verify JAVA:

[hadoop@hadoop-spark-vm hadoop]$ java -version
java version "1.7.0_55"
OpenJDK Runtime Environment (rhel-2.4.7.1.el6_5-x86_64 u55-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

The folder /etc/alternatives contains a link to the JAVA installation:

[hadoop@hadoop-spark-vm hadoop]$ ll /etc/alternatives/java
lrwxrwxrwx 1 root root 46 May 21 12:06 /etc/alternatives/java -> /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java

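Set the JAVA_HOME environment variable in ~/.bashrc. It should point to the installation directory rather than to the java binary itself, so that $JAVA_HOME/bin can be added to PATH:

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64
export PATH=$PATH:$JAVA_HOME/bin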
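
JAVA_HOME is conventionally the directory that contains bin/java, so it can be derived from the path of the java binary. This is a small sketch of that derivation; java_home_from_binary is a hypothetical helper name, and the fully resolved path on your machine may differ from the one shown here.

```shell
# java_home_from_binary is a hypothetical helper: it strips the trailing
# /bin/java from a java binary path to get the installation directory.
java_home_from_binary() {
  local java_bin=$1
  printf '%s\n' "${java_bin%/bin/java}"
}

# On a real system the path can come from the alternatives link:
#   java_home_from_binary "$(readlink -f /etc/alternatives/java)"
java_home_from_binary /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java
```

The printed directory is the value to export as JAVA_HOME.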

Now we are ready to install Hadoop

Download Hadoop’s binary package:

[hadoop@hadoop-spark-vm hadoop]$ cd ~
[hadoop@hadoop-spark-vm hadoop]$ wget http://apache.mirror.uber.com.au/hadoop/common/current/hadoop-2.4.0.tar.gz # download
[hadoop@hadoop-spark-vm hadoop]$ tar xzvf hadoop-2.4.0.tar.gz # un-tar the archive
[hadoop@hadoop-spark-vm hadoop]$ sudo mv hadoop-2.4.0 /usr/local/hadoop # move un-tar-ed directory to /usr/local/hadoop
[hadoop@hadoop-spark-vm hadoop]$ sudo chown -R hadoop:hadoop /usr/local/hadoop # change the ownership of /usr/local/hadoop to "hadoop" user and group

Next, create the namenode and datanode folders:

[hadoop@hadoop-spark-vm hadoop]$ mkdir -p ~/hadoopspace/hdfs/namenode
[hadoop@hadoop-spark-vm hadoop]$ mkdir -p ~/hadoopspace/hdfs/datanode

Configuring Hadoop

Add the following lines to ~/.bashrc to set up the environment variables for Hadoop:

export HADOOP_INSTALL=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export PATH=$PATH:$HADOOP_INSTALL/sbin
export PATH=$PATH:$HADOOP_INSTALL/bin

Update the environment variables immediately:

[hadoop@hadoop-spark-vm hadoop]$ source ~/.bashrc
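
After sourcing ~/.bashrc it can be worth confirming that both Hadoop directories really ended up on PATH. The sketch below does that check; path_has_dir is a hypothetical helper name, and HADOOP_INSTALL falls back to /usr/local/hadoop if it is not set.

```shell
# path_has_dir is a hypothetical helper that reports whether a directory
# is listed in a colon-separated, PATH-style string (default: $PATH).
path_has_dir() {
  local dir=$1
  local path_list=${2:-$PATH}
  case ":$path_list:" in
    *":$dir:"*) return 0 ;;
    *) return 1 ;;
  esac
}

HADOOP_INSTALL=${HADOOP_INSTALL:-/usr/local/hadoop}
for sub in bin sbin; do
  if path_has_dir "$HADOOP_INSTALL/$sub"; then
    echo "ok: $HADOOP_INSTALL/$sub is on PATH"
  else
    echo "missing: $HADOOP_INSTALL/$sub is not on PATH"
  fi
done
```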

Go to “/usr/local/hadoop/etc/hadoop/” directory and update the XML files one by one:

mapred-site.xml

[hadoop@hadoop-spark-vm hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop-spark-vm hadoop]$ vi mapred-site.xml

And add the following between the configuration tags:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
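
For reference, here is a sketch of what the complete file looks like once the property sits inside the configuration element (minus the license comment that ships in the template). CONF_DIR is an assumption made for this example and defaults to a scratch directory, so running the snippet does not touch a real installation.

```shell
# CONF_DIR is an assumption for this sketch; it defaults to a scratch
# directory so the example never overwrites a real configuration.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/mapred-site.xml"
```

The other *-site.xml files follow the same shape: one configuration element wrapping one property element per setting.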

yarn-site.xml

Add the following between the configuration tags:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

core-site.xml

Add the following between the configuration tags:

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

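hdfs-site.xml

Add the following between the configuration tags. The paths must match the namenode and datanode folders created earlier under the “hadoop” user’s home directory:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>file:///home/hadoop/hadoopspace/hdfs/namenode</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>file:///home/hadoop/hadoopspace/hdfs/datanode</value>
</property>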
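
The values of dfs.name.dir and dfs.data.dir need to point at the directories created with mkdir earlier, which live under the home directory of whichever user ran those commands. The sketch below prints the matching file:// URIs for the current user; hdfs_dir_uri is a hypothetical helper name.

```shell
# hdfs_dir_uri is a hypothetical helper: it prints the file:// URI for one
# of the HDFS folders created earlier under the current user's home.
hdfs_dir_uri() {
  local role=$1   # "namenode" or "datanode"
  printf 'file://%s/hadoopspace/hdfs/%s\n' "$HOME" "$role"
}

hdfs_dir_uri namenode
hdfs_dir_uri datanode
```

Run it as the user that owns the folders and compare the two printed lines with the values in hdfs-site.xml.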

hadoop-env.sh

Add an entry for JAVA_HOME:

export JAVA_HOME=/usr/lib/jvm/jre-1.7.0-openjdk.x86_64/

Next, run the following commands one by one:

hdfs namenode -format
start-dfs.sh
start-yarn.sh

You will see output similar to the following:


[hadoop@hadoop-spark-vm hadoop]$ hdfs namenode -format
14/05/21 12:33:50 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop-spark-vm/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.4.0
STARTUP_MSG: classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/zookeeper-3.4.5.jar:/usr/local/hadoop/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/usr/local/hadoop/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-auth-2.4.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/share/hadoop/common/lib/jsr305-1.3.9.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/usr/local/hadoop/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/usr/local/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/local/hadoop/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop/share/hado
op/common/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/usr/local/hadoop/share/hadoop/common/lib/activation-1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-el-1.0.jar:/usr/local/hadoop/share/hadoop/common/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/local/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.4.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/local/hadoop/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/local/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/common/lib/junit-4.8.2.jar:/usr/local/hadoop/share/hadoop/common/lib/xz-1.0.jar:/usr/local/hadoop/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/usr/local/hadoop/share/hadoop/common/hadoop-common-2.4.0.jar:/usr/local/hadoop/share/hadoop/common/hadoop-common-2.4.0-tests.jar:/usr/local/hadoop/share/hadoop/common/hadoop-nfs-2.4.0.jar:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jasper-r
untime-5.5.23.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/commons-el-1.0.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.4.0.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.4.0-tests.jar:/usr/local/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/zookeeper-3.4.5.jar:/usr/local/hadoop/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jettison-1.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-httpclient-3.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-jaxrs-1.8.8.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/local/hadoop/share/h
adoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jsr305-1.3.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/aopalliance-1.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jline-0.9.94.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/activation-1.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/guice-3.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/local/hadoop/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/local/hadoop/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/local/hadoop/share/hadoop/yarn/lib/xz-1.0.jar:/usr/local/hadoop/share/hadoop/yarn/lib/jackson-xc-1.8.8.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-tests-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn
-server-resourcemanager-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-common-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-common-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-client-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-api-2.4.0.jar:/usr/local/hadoop/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/junit-4.10.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/hadoop-annotations-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/local/hadoop/share/hadoop/mapreduce/lib/xz-1.0.jar:/u
sr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.4.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.4.0-tests.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.4.0.jar:/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common -r 1583262; compiled by 'jenkins' on 2014-03-31T08:29Z
STARTUP_MSG: java = 1.7.0_55
************************************************************/
14/05/21 12:33:50 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
14/05/21 12:33:50 INFO namenode.NameNode: createNameNode [-format]
14/05/21 12:33:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-fcb97223-8452-4534-bf0d-23a9c0ce42df
14/05/21 12:33:51 INFO namenode.FSNamesystem: fsLock is fair:true
14/05/21 12:33:51 INFO namenode.HostFileManager: read includes:
HostSet(
)
14/05/21 12:33:51 INFO namenode.HostFileManager: read excludes:
HostSet(
)
14/05/21 12:33:51 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
14/05/21 12:33:51 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
14/05/21 12:33:51 INFO util.GSet: Computing capacity for map BlocksMap
14/05/21 12:33:51 INFO util.GSet: VM type = 64-bit
14/05/21 12:33:51 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
14/05/21 12:33:51 INFO util.GSet: capacity = 2^21 = 2097152 entries
14/05/21 12:33:51 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
14/05/21 12:33:51 INFO blockmanagement.BlockManager: defaultReplication = 1
14/05/21 12:33:51 INFO blockmanagement.BlockManager: maxReplication = 512
14/05/21 12:33:51 INFO blockmanagement.BlockManager: minReplication = 1
14/05/21 12:33:51 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
14/05/21 12:33:51 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
14/05/21 12:33:51 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
14/05/21 12:33:51 INFO blockmanagement.BlockManager: encryptDataTransfer = false
14/05/21 12:33:51 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
14/05/21 12:33:51 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
14/05/21 12:33:51 INFO namenode.FSNamesystem: supergroup = supergroup
14/05/21 12:33:51 INFO namenode.FSNamesystem: isPermissionEnabled = true
14/05/21 12:33:51 INFO namenode.FSNamesystem: HA Enabled: false
14/05/21 12:33:51 INFO namenode.FSNamesystem: Append Enabled: true
14/05/21 12:33:51 INFO util.GSet: Computing capacity for map INodeMap
14/05/21 12:33:51 INFO util.GSet: VM type = 64-bit
14/05/21 12:33:51 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
14/05/21 12:33:51 INFO util.GSet: capacity = 2^20 = 1048576 entries
14/05/21 12:33:51 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/05/21 12:33:51 INFO util.GSet: Computing capacity for map cachedBlocks
14/05/21 12:33:51 INFO util.GSet: VM type = 64-bit
14/05/21 12:33:51 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
14/05/21 12:33:51 INFO util.GSet: capacity = 2^18 = 262144 entries
14/05/21 12:33:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
14/05/21 12:33:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
14/05/21 12:33:51 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
14/05/21 12:33:51 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
14/05/21 12:33:51 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
14/05/21 12:33:51 INFO util.GSet: Computing capacity for map NameNodeRetryCache
14/05/21 12:33:51 INFO util.GSet: VM type = 64-bit
14/05/21 12:33:51 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
14/05/21 12:33:51 INFO util.GSet: capacity = 2^15 = 32768 entries
14/05/21 12:33:51 INFO namenode.AclConfigFlag: ACLs enabled? false
Re-format filesystem in Storage Directory /home/hadoop/hadoopspace/hdfs/namenode ? (Y or N) Y
14/05/21 12:34:10 INFO namenode.FSImage: Allocated new BlockPoolId: BP-614723417-127.0.0.1-1400639649869
14/05/21 12:34:10 INFO common.Storage: Storage directory /home/hadoop/hadoopspace/hdfs/namenode has been successfully formatted.
14/05/21 12:34:10 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/05/21 12:34:10 INFO util.ExitUtil: Exiting with status 0
14/05/21 12:34:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop-spark-vm/127.0.0.1
************************************************************/

[hadoop@hadoop-spark-vm hadoop]$ start-dfs.sh
14/05/21 12:35:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-namenode-hadoop-spark-vm.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-hadoop-spark-vm.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 6b:bc:6d:99:e5:88:52:29:07:cc:e3:bb:b9:dd:af:0b.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-hadoop-spark-vm.out
14/05/21 12:35:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop-spark-vm hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-hadoop-spark-vm.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-hadoop-spark-vm.out

Issue the jps command and verify that the following processes are running:

[hadoop@hadoop-spark-vm hadoop]$ jps
2795 NodeManager
2326 NameNode
2576 SecondaryNameNode
2434 DataNode
2710 ResourceManager
3091 Jps
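
The same check can be scripted. This is a minimal sketch, assuming the five daemons listed above are the ones expected on a single-node setup; missing_daemons is a hypothetical helper name.

```shell
# missing_daemons is a hypothetical helper: it reads `jps` output on stdin
# and prints the name of every expected Hadoop daemon that is absent.
missing_daemons() {
  local jps_output
  local daemon
  jps_output=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; compare the second field exactly so that
    # SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      printf '%s\n' "$daemon"
    fi
  done
}

# On the VM: jps | missing_daemons
```

No output means every expected daemon was found. If a name is printed, the matching log file under /usr/local/hadoop/logs is the first place to look.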

Now you can try creating a directory in HDFS by running:

[hadoop@hadoop-spark-vm hadoop]$ hadoop fs -mkdir /test

And verify that it was created by running:

[hadoop@hadoop-spark-vm hadoop]$ hadoop fs -ls /
drwxr-xr-x - hadoop supergroup 0 2014-05-21 14:21 /test
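
To go one step further than listing the directory, a small file can be written to HDFS and read back. The sketch below does this with hadoop fs -put, -cat and -rm; hdfs_round_trip is a hypothetical helper name, hello.txt is an arbitrary file name, and the script skips the check when hadoop is not on PATH.

```shell
# hdfs_round_trip is a hypothetical helper: it uploads a small local file
# into an HDFS directory, reads it back, and compares the contents.
hdfs_round_trip() {
  local hdfs_dir=${1:-/test}
  local local_file
  local rc
  local_file=$(mktemp)
  echo "hello hdfs" > "$local_file"

  hadoop fs -put "$local_file" "$hdfs_dir/hello.txt" || return 1
  [ "$(hadoop fs -cat "$hdfs_dir/hello.txt")" = "hello hdfs" ]
  rc=$?
  # Clean up both copies before reporting the result.
  hadoop fs -rm "$hdfs_dir/hello.txt" >/dev/null
  rm -f "$local_file"
  return $rc
}

if command -v hadoop >/dev/null 2>&1; then
  hdfs_round_trip /test && echo "HDFS round trip ok"
else
  echo "hadoop is not on PATH; skipping"
fi
```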

This tutorial is based on one from AJ’S DATA STORAGE TUTORIALS, which I followed and modified to suit my own environment.

7 Comments

  1. Ayache Khettar

    Hi Eric

    Thank you for sharing this with us. I followed your steps to install Hadoop 2.4 on a CentOS virtual machine. I have HBase running on my Mac, and I get a connection refused error when HBase tries to connect.
    Here is my setting in core-site.xml

    fs.default.name
    hdfs://localhost:54310

    I can telnet to the port from the VM but not from a remote machine, i.e. my local MacBook, which is running the virtual machine. It looks like the port is closed to remote clients. I have disabled the firewall but it didn’t help.

    Any idea?

    Regards,

    Ayache

    1. Eric Lin

      Hi Ayache,

      Thanks for your comment.

      How do you telnet to the VM from your local machine? What address or IP do you use? It might be easier to run HBase in the VM as well, but it should not be a problem if it runs on a different machine. You might need to find out the correct IP address for your VM.

      Regards

    1. Eric Lin

      Sorry Heera, I was overseas for a while and did not have time to check my blog comments. To answer your question, “hadoopspace” is just a directory where you keep your HDFS folders and files, you can call it whatever you want, if it makes sense.

      Cheers.
