The JAR files in Hadoop 2.x have moved from their Hadoop 1.x locations. I found that the following command
javac -classpath $HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar -d wordcount_classes myWordCount.java
will compile the standard wordcount example code. You can see that the common files are in /share/hadoop/common/, the mapreduce files are in /share/hadoop/mapreduce/ and the common lib files are in /share/hadoop/common/lib/ (all relative to $HADOOP_HOME).
This post is in answer to this stackoverflow question:
http://stackoverflow.com/questions/19488894/compile-hadoop-2-2-0-job
(or set your classpath like this
export CLASSPATH=$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
and compile like this:
javac -classpath $CLASSPATH -d myWordCountClasses myWordCount.java
)
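The version number appears twice in that classpath, so it is easy to miss one when moving to a newer Hadoop. Here is a minimal sketch that builds the same string from a single version argument. The function name hadoop_compile_classpath is my own invention, not part of Hadoop, and the commons-cli version is still hard-coded to 1.2, which may differ in other releases.

```shell
# Sketch: build the compile classpath from an install directory and a version,
# so the Hadoop version is written in one place only.
# hadoop_compile_classpath is a hypothetical helper, not a Hadoop command.
hadoop_compile_classpath() {
    local home="$1"       # Hadoop installation directory, e.g. $HADOOP_HOME
    local version="$2"    # Hadoop version, e.g. 2.2.0
    local share="$home/share/hadoop"
    printf '%s:%s:%s\n' \
        "$share/common/hadoop-common-$version.jar" \
        "$share/mapreduce/hadoop-mapreduce-client-core-$version.jar" \
        "$share/common/lib/commons-cli-1.2.jar"
}

# Example use (older javac versions need the -d directory to exist already):
#   mkdir -p wordcount_classes
#   javac -classpath "$(hadoop_compile_classpath "$HADOOP_HOME" 2.2.0)" \
#         -d wordcount_classes myWordCount.java
```

If hadoop is already on your PATH, the built-in hadoop classpath command prints the full runtime classpath and may be simpler than listing the jars by hand.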
A blog about software development, mostly in Java and an emphasis on Big Data, noSql (Cassandra) and webapps. Many posts relate to the ac32007 (now ac32009) (Internet Programming) and ac41011 (Big Data) modules I teach at the University of Dundee
Saturday, November 2, 2013
Wednesday, October 30, 2013
Hadoop 2 on Ubuntu on Azure.
This is to be read in conjunction with http://ac31004.blogspot.co.uk/2013/10/installing-hadoop-2-on-mac_29.html
Fire up an Azure Ubuntu server and ssh to it.
Install a Java JDK (prefix the command with sudo unless you are logged in as root):
apt-get install default-jdk
On your home machine, download a copy of Hadoop and secure copy it to the Azure machine (your username and machine name will be different):
scp hadoop-2.2.0.tar.gz user@Hadoopmachine.cloudapp.net:
Unzip it and untar it
gunzip hadoop-2.2.0.tar.gz
tar xvf hadoop-2.2.0.tar
You'll still need to set up the env variables
export JAVA_HOME=/usr/lib/jvm/default-java
export HADOOP_INSTALL=/home/user/hadoop-2.2.0
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
Also add JAVA_HOME and HADOOP_INSTALL and change PATH in /etc/environment; see http://trentrichardson.com/2010/02/10/how-to-set-java_home-in-ubuntu/ for details.
After setting up core-site.xml and hdfs-site.xml you'll need to make the namenode and datanode directories:
mkdir -p /home/hadoop/yarn/namenode
mkdir /home/hadoop/yarn/datanode
Everything else should be the same.
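One thing to watch with the /etc/environment step above: that file holds plain KEY="value" lines and does not expand variables such as $PATH or $HADOOP_INSTALL, so the values have to be written out in full. The sketch below prints the three lines with everything expanded. environment_lines is a hypothetical helper name, and the paths in the example are the ones used in this post, so adjust them for your machine.

```shell
# Sketch: print fully expanded lines suitable for /etc/environment.
# environment_lines is a hypothetical helper; it only prints, it changes nothing.
environment_lines() {
    local java_home="$1"        # e.g. /usr/lib/jvm/default-java
    local hadoop_install="$2"   # e.g. /home/user/hadoop-2.2.0
    local base_path="$3"        # the PATH value already in /etc/environment
    printf 'JAVA_HOME="%s"\n' "$java_home"
    printf 'HADOOP_INSTALL="%s"\n' "$hadoop_install"
    printf 'PATH="%s:%s/bin:%s/sbin"\n' "$base_path" "$hadoop_install" "$hadoop_install"
}

# Example use:
#   environment_lines /usr/lib/jvm/default-java /home/user/hadoop-2.2.0 "$PATH"
```

Review the output and merge it into /etc/environment by hand. The file normally already has a PATH line, so replace that line rather than adding a second one.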
Tuesday, October 29, 2013
Installing Hadoop 2 on a Mac
I've had a lot of trouble getting Hadoop 2 and YARN running on my Mac. There are some tutorials out there but they are often for
beta and alpha versions of the Hadoop 2.0 family. These are the steps I used to get Hadoop 2.2.0 working on my Mac running OS X 10.9.
Note: watch for version differences in this blog. It was written for Hadoop 2.2.0; we are currently on 2.6.2, so the version number will need to be changed throughout.
Get Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/common/
Make sure JAVA_HOME is set (this example is for Java 6 on your machine):
export JAVA_HOME=`/usr/libexec/java_home -v 1.6`
(Note: your Java version should be 1.7 or 1.8, so change the version after -v to match.)
Point HADOOP_INSTALL to the Hadoop installation directory:
export HADOOP_INSTALL=/Applications/hadoop-2.2.0
And set the path
export PATH=$PATH:$HADOOP_INSTALL/bin:$HADOOP_INSTALL/sbin
You can test that hadoop is found with
hadoop version
Make sure ssh is set up on your machine:
System Preferences -> Sharing -> Remote Login is ticked
Try:
ssh <username>@localhost
where <username> is the name you used to log on.
In $HADOOP_INSTALL/etc/hadoop these are the conf files I changed.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/Users/Administrator/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/Users/Administrator/hadoop/datanode</value>
</property>
</configuration>
Make the directories for the namenode and datanode data. Note that the file above and the mkdir commands below will need to reflect where you want to store the files; I've stored mine in the home directory of the Administrator user on my Mac.
mkdir -p /Users/Administrator/hadoop/namenode
mkdir -p /Users/Administrator/hadoop/datanode
Then format the namenode:
hadoop namenode -format
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
start-dfs.sh
start-yarn.sh
jps
should give something like this (the process IDs will differ):
9430 ResourceManager
9325 SecondaryNameNode
9513 NodeManager
9225 DataNode
9916 Jps
9140 NameNode
If not, check the log files. If the datanode has not started and you get an incompatible IDs error, stop everything, delete the datanode directory and recreate it.
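Comparing the jps listing against the expected daemons by eye gets tedious after a few restarts. Here is a rough sketch that reads jps output and prints whichever of the five daemons is missing. missing_daemons is a hypothetical helper name, and the list of daemons is simply the one shown in the listing above.

```shell
# Sketch: read `jps` output on stdin and print each expected Hadoop daemon
# that is not running. missing_daemons is a hypothetical helper.
missing_daemons() {
    local output daemon
    output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>"; compare the whole second field so that
        # SecondaryNameNode is not mistaken for NameNode.
        if ! printf '%s\n' "$output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Example use:
#   jps | missing_daemons
```

No output means all five daemons were found. Anything printed is a daemon whose log file is worth reading first.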
Try an ls:
hadoop fs -ls
if you get
ls: `.': No such file or directory
then there is no home directory in the Hadoop file system. So create one:
hadoop fs -mkdir /user
hadoop fs -mkdir /user/<username>
where <username> is the name you are logged onto the machine with.
Now change to the $HADOOP_INSTALL directory and upload a file:
hadoop fs -put LICENSE.txt
Finally, try a mapreduce job:
cd share/hadoop/mapreduce
hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar wordcount LICENSE.txt out
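Once the job finishes, its results sit in the out directory in HDFS. Here is a minimal sketch for peeking at them, assuming the job wrote a single reducer file named out/part-r-00000 (the usual name, but check with hadoop fs -ls out) and that each line is a word, a tab and a count. top_words is a hypothetical helper, not a Hadoop command.

```shell
# Sketch: show the N most frequent words from wordcount output on stdin.
# Assumes tab-separated "word<TAB>count" lines. top_words is a hypothetical helper.
top_words() {
    local n="${1:-10}"      # how many lines to show, default 10
    local tab
    tab="$(printf '\t')"
    # Sort by count (field 2) descending, then by word to break ties.
    sort -t "$tab" -k2,2nr -k1,1 | head -n "$n"
}

# Example use:
#   hadoop fs -cat out/part-r-00000 | top_words 10
```

Note that the job will refuse to run a second time while out still exists, so remove the directory or pick a new output name before rerunning.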