Mahbubul Majumder, PhD
Dec 2, 2014
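Hadoop file system commands take the form
hadoop fs -<linux commands>
where the Linux command can be ls, mkdir, rm, and so on. See our lecture listing some Linux commands.
To list the files in HDFS use
hadoop fs -ls
To create a directory called myHdfs use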
hadoop fs -mkdir myHdfs
The syntax for copying files to/from Hadoop File System (HDFS) is
hadoop fs -put <linux source> <hdfs destination>
Example: to copy a file from the Linux system to the newly created HDFS directory use
hadoop fs -put myLinuxFile.txt myHdfs/fileName.txt
To copy the file back from HDFS to the Linux system use
hadoop fs -get myHdfs/fileName.txt myLinuxFile.txt
We have high performance computing facilities at PKI. More details are in the HCC documentation listed at the end of these notes.
The clusters are crane, tusker, sandhill and red.
crane has 452 nodes (64GB RAM per node).
For our class we have a small 10-node cluster set up at PKI.
To log in, first use
ssh <username>@crane.unl.edu
which requires two-factor authentication, and then
ssh <username>@10.138.11.29
We want to count how many times each word occurs in a file or in a number of files.
We will use the prebuilt virtual machine provided by Cloudera.
Use user: cloudera and password: cloudera for all applications.
Run your virtual machine using VMware Player. You are now ready to test your MapReduce code.
hadoop fs -mkdir wordcount
hadoop fs -mkdir wordcount/input
echo "Data Science Class and Data Science HW" > file0
echo "Nice Class Nice HW Nice Project and Nice program" > file1
hadoop fs -put file* wordcount/input
Here file* matches both files, file0 and file1. Make sure the files are copied properly. To check, use
hadoop fs -ls wordcount/input
mkdir wordcount_classes
Make sure WordCount.java is in your current directory. The Hadoop classpath can be printed with
hadoop classpath
The classpath is passed to the javac command through its -cp option:
javac -cp /usr/lib/hadoop/client-0.20/\* -d wordcount_classes WordCount.java
jar -cvf wordcount.jar -C wordcount_classes/ .
hadoop jar wordcount.jar org.myorg.WordCount wordcount/input wordcount/output
hadoop fs -cat wordcount/output/part-00000
Class 2
Data 2
HW 2
Nice 4
Project 1
Science 2
and 2
program 1
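The counts above can be checked without a cluster. The following is a minimal plain-Java sketch of the same map and reduce logic, not the WordCount.java used in class: the class name WordCountSketch and its methods map and countWords are hypothetical, and no Hadoop classes are involved. It assumes words are split on whitespace and compared case-sensitively, which is consistent with the output listed above.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch of the word count logic; no Hadoop classes are used.
// All names here are illustrative and not taken from the class materials.
class WordCountSketch {

    // "Map" step: split one line on whitespace and emit a (word, 1) pair per token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum the ones.
    // A TreeMap keeps keys in String order, so capitalized words come first.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same contents as file0 and file1 created earlier with echo.
        List<String> lines = Arrays.asList(
            "Data Science Class and Data Science HW",
            "Nice Class Nice HW Nice Project and Nice program");
        countWords(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Saving this as WordCountSketch.java, then running javac WordCountSketch.java followed by java WordCountSketch, prints one word and its count per line. In the real job Hadoop runs the map step on each input split in parallel and groups the pairs by key before the reduce step; here a single loop stands in for both.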
hadoop fs -rm -r wordcount
rm -r -f wordcount_classes
rm -f file* wordcount.jar
Apache Hadoop web site
http://hadoop.apache.org/
Apache Hadoop shell command guide
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html
More about Cloudera QuickStart Virtual Machine
http://www.cloudera.com/content/cloudera/en/documentation/DemoVMs/Cloudera-QuickStart-VM/cloudera_quickstart_vm.html
Documentation of Holland Computing Center (HCC) at PKI
https://hcc-docs.unl.edu/display/HCCDOC/HCC+Documentation