Organizations today handle large volumes of data, and Hadoop is one of the most popular frameworks for storing and managing that data efficiently. This blog walks through the most frequently used Hadoop commands. Hadoop training can give you in-depth knowledge of Hadoop and its concepts and help you clear the Hadoop certification exam. Let's get started!
Hadoop commands are used to perform operations on the cluster and its file system. HDFS, the Hadoop Distributed File System, provides redundant storage for files of different sizes and is one of the key components for storing large volumes of structured and unstructured data. The commands are run through the shell scripts in Hadoop's bin directory. Let us look at some of the most commonly used Hadoop commands in day-to-day work.
Hadoop can store very large volumes of data, up to petabytes, in HDFS, which holds both structured and unstructured data. A set of commands is available for performing file operations as required. Let us have a look at some of the commonly used ones.
Name of the command: version
Usage of the command: version
Representation with an example: hadoop version
Explanation: The version command displays the version of Hadoop that is installed on the system.
Name of the command: mkdir
Usage of the command: mkdir
Representation with an example:
hdfs dfs -mkdir /ram/mrv1
Explanation: The mkdir command creates a directory at the path given as its argument.
Name of the command: ls
Usage of the command: ls
Representation with an example: hdfs dfs -ls /user
Explanation: The ls command displays the contents of the directory specified by the path argument.
Name of the command: put
Usage of the command: put
Representation with an example: hdfs dfs -put /home/sample.txt /user/dir1
Explanation: The put command copies a file from the local filesystem to a destination in HDFS.
Name of the command: copyFromLocal
Usage of the command: copyFromLocal
Representation with an example:
hdfs dfs -copyFromLocal /home/sample /user/dir1
Explanation: The copyFromLocal command is similar to the put command, except that the source is restricted to a local file reference.
Name of the command: get
Usage of the command: get
Representation with an example: hdfs dfs -get /user/dir1 /home
Explanation: The get command copies the file or directory at the given HDFS path to the local filesystem.
Name of the command: copyToLocal
Usage of the command: copyToLocal
Representation with an example: hdfs dfs -copyToLocal /user/dir1 /home
Explanation: The copyToLocal command is similar to the get command, except that the destination is restricted to a local file reference.
Name of the command: cat
Usage of the command: cat
Representation with an example: hdfs dfs -cat /user/cloudera/dezyre1/sample1.txt
Explanation: The cat command displays the contents of a file on the console.
Name of the command: mv
Usage of the command: mv
Representation with an example: hdfs dfs -mv /user/dir1/sample.txt /user/dir2
Explanation: The mv command moves a file from a source location to a destination location within the Hadoop Distributed File System.
Name of the command: cp
Usage of the command: cp
Representation with an example: hdfs dfs -cp /user/dir2/sample1.txt /user/dir1
Explanation: The cp command copies a file or a directory from a source location to a destination location within the Hadoop Distributed File System.
Name of the command: touchz
Usage of the command: bin/hdfs dfs -touchz
Representation with an example: bin/hdfs dfs -touchz /geeks/myfile.txt
Explanation: The touchz command creates an empty (zero-length) file at the given path.
Name of the command: moveFromLocal
Usage of the command: bin/hdfs dfs -moveFromLocal
Representation with an example:
bin/hdfs dfs -moveFromLocal ../Desktop/cutAndPaste.txt /geeks
Explanation: The moveFromLocal command moves a file from the local filesystem to the Hadoop Distributed File System; the local copy is removed once the move completes.
Name of the command: du
Usage of the command: bin/hdfs dfs -du
Representation with an example: bin/hdfs dfs -du /geeks
Explanation: The du command displays the size of each file and directory contained in the given directory.
Name of the command: dus
Usage of the command: bin/hdfs dfs -dus
Representation with an example: bin/hdfs dfs -dus /geeks
Explanation: The dus command displays the total size of a directory or a file. It is deprecated in recent Hadoop releases in favor of -du -s.
Name of the command: stat
Usage of the command: bin/hdfs dfs -stat
Representation with an example: bin/hdfs dfs -stat /geeks
Explanation: The stat command displays statistics about a path; by default it prints the last modified date of the file or directory.
Name of the command: setrep
Usage of the command: setrep [-R] [-w] rep
Representation with an example: bin/hdfs dfs -setrep -R -w 6 geeks.txt
Explanation: The setrep command changes the replication factor of a file, or of every file under a directory, in the Hadoop Distributed File System. The -w flag makes the command wait until the replication completes.
Name of the command: test
Usage of the command: hadoop fs -test -[defswrz] URI
Representation with an example: hadoop fs -test -e filename
Explanation: The test command in Hadoop performs file test operations and reports the result through its exit status.
With -e it returns 0 if the path exists, with -d it returns 0 if the path is a directory, and with -z it returns 0 if the file has zero length; otherwise it returns a non-zero status.
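Because the test command reports its result through the exit status rather than printed output, it is usually placed inside a shell conditional. The following is a minimal sketch, assuming the hadoop client is on the PATH; the helper name hdfs_path_exists is hypothetical, and the path is only an illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) when the HDFS path exists.
hdfs_path_exists() {
    hadoop fs -test -e "$1"
}

# Only contact the cluster when the hadoop client is actually installed.
if command -v hadoop >/dev/null 2>&1; then
    if hdfs_path_exists /user/test/test.txt; then
        echo "path exists"
    else
        echo "path is missing"
    fi
fi
```

Swapping -e for -d or -z in the helper gives the directory and zero-length checks in the same way.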
Name of the command: text
Usage of the command: hadoop fs -text
Representation with an example:
Explanation: The text command takes a source file as input and outputs it in text format. It detects the file's encoding and decodes it into plain text.
Name of the command: find
Usage of the command: hadoop fs -find
Representation with an example:
Explanation: The find command in Hadoop finds the files that match a given expression.
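As a hedged illustration of find, the sketch below searches a directory for files whose names match a pattern, using the -name and -print expressions. The helper name hdfs_find_by_name, the /user directory, and the *.txt pattern are assumptions chosen for the example.

```shell
# Hypothetical helper: list paths under directory $1 whose names match pattern $2.
hdfs_find_by_name() {
    # The pattern stays quoted so the local shell does not expand it first.
    hadoop fs -find "$1" -name "$2" -print
}

# Run against the cluster only when the hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
    hdfs_find_by_name /user '*.txt'
fi
```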
Name of the command: getmerge
Usage of the command: hdfs dfs -getmerge [-nl]
Representation with an example: hdfs dfs -getmerge /user/dataflair/dir1/sample.txt /user/dataflair/dir2/sample2.txt /home/sample1.txt
Explanation: The getmerge command concatenates the contents of the source files in HDFS into a single destination file on the local filesystem. The optional -nl flag adds a newline at the end of each source file.
Name of the command: count
Usage of the command: hadoop fs -count [options]
Representation with an example:
Explanation: The count command in Hadoop displays the number of directories, the number of files, and the number of bytes under the specified path.
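The count output is a single line per path with four columns: directory count, file count, content size in bytes, and path name. A small sketch of labelling those columns follows; the helper name summarize_count and the /user path are hypothetical, and the awk parsing assumes the path contains no spaces.

```shell
# Hypothetical helper: label the four columns printed by `hadoop fs -count`.
summarize_count() {
    awk '{ printf "dirs=%s files=%s bytes=%s path=%s\n", $1, $2, $3, $4 }'
}

# Run against the cluster only when the hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -count /user | summarize_count
fi
```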
Name of the command: appendToFile
Usage of the command: hadoop fs -appendToFile
Representation with an example:
Explanation: The appendToFile command appends the contents of one or more local source files to a destination file in the Hadoop Distributed File System.
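A minimal sketch of appending two local files to one HDFS file is shown below. The wrapper name append_to_hdfs, the temporary file names, and the destination /user/dir1/sample.txt are assumptions made for illustration; the command itself takes the local sources first and the HDFS destination last.

```shell
# Hypothetical local fragments to append; the names are for illustration only.
tmpdir=$(mktemp -d)
printf 'first line\n'  > "$tmpdir/part1.txt"
printf 'second line\n' > "$tmpdir/part2.txt"

# Hypothetical wrapper: first argument is the HDFS destination,
# the remaining arguments are the local source files.
append_to_hdfs() {
    dest=$1
    shift
    hadoop fs -appendToFile "$@" "$dest"
}

# Run against the cluster only when the hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
    append_to_hdfs /user/dir1/sample.txt "$tmpdir/part1.txt" "$tmpdir/part2.txt"
fi
rm -rf "$tmpdir"
```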
Name of the command: chmod
Usage of the command: hadoop fs -chmod [-R]
Representation with an example:
Explanation: The chmod command in Hadoop changes the permissions of a file or directory. With -R, the change is applied recursively through the directory structure.
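For illustration, a recursive permission change could look like the sketch below. The helper name hdfs_chmod_recursive, the mode 750, and the directory /user/dir1 are assumptions, not values taken from a real cluster.

```shell
# Hypothetical helper: apply an octal mode recursively to an HDFS directory.
hdfs_chmod_recursive() {
    mode=$1
    dir=$2
    hadoop fs -chmod -R "$mode" "$dir"
}

# Run against the cluster only when the hadoop client is installed.
if command -v hadoop >/dev/null 2>&1; then
    # 750: owner full access, group read and execute, others none.
    hdfs_chmod_recursive 750 /user/dir1
fi
```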
Name of the command: rm -r
Usage of the command: hadoop fs -rm -r
Representation with an example: hadoop fs -rm -r hdfs test
Explanation: The rm -r command removes a directory and its contents recursively.
Name of the command: expunge
Usage of the command: $ hadoop fs -expunge
Representation with an example: user@ubuntu1:~$ hadoop fs -expunge
Explanation: The expunge command in Hadoop empties the trash in the Hadoop Distributed File System.
Name of the command: chown
Usage of the command: hadoop fs -chown [-R] [owner][:[group]]
Representation with an example:
Explanation: The chown command in Hadoop changes the owner, and optionally the group, of a file or directory.
Name of the command: dfsadmin
Usage of the command: $ hadoop [--config confdir] [Command] [Generic_Options] [Command_Options]
Representation with an example:
Explanation: The dfsadmin command is used to perform HDFS administration activities.
Name of the command: tasktracker
Usage of the command: hadoop tasktracker
Explanation: The tasktracker command runs a MapReduce TaskTracker node.
Name of the command: jobtracker
Usage of the command: hadoop jobtracker
Explanation: The jobtracker command runs a MapReduce JobTracker node.
Name of the command: rmr
Usage of the command: bin/hdfs dfs -rmr <filename/directoryName>
Example: bin/hdfs dfs -rmr /geeks_copied
Name of the command: test
Usage of the command: $ hadoop fs -test
Example: $ hadoop fs -test -[defsz] /user/test/test.txt
Name of the command: balancer
Usage of the command: hadoop balancer [-threshold <threshold>]
Example: hadoop balancer -threshold 20
Name of the command: Secondary NameNode
Usage of the command: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]
Example: hadoop secondarynamenode -geteditsize
Name of the command: Datanode
Usage of the command: hadoop datanode [-rollback]
Example: hadoop datanode -rollback
Conclusion:
I hope you have now gained an understanding of the Hadoop commands and their usage. The commands listed above are some of the most frequently used when working with Hadoop. To build further knowledge and skills in Hadoop, you can go through Hadoop training and master the skills required to become a professional.
Hadoop commands can be run within a pseudo-distributed cluster or on a virtual machine such as those provided by Cloudera or Hortonworks.
Below is the command used for listing the files in HDFS.
hdfs dfs -ls
start-all.sh is the script used for starting Hadoop from the terminal.