DistCp (distributed copy) is a tool used for copying data between clusters. It uses MapReduce to effect its distribution, error handling and recovery, and reporting. It expands a list of files and directories into input to map tasks, each of which will copy a partition of the files specified in the source list.
Usage: $ hadoop distcp <src> <dst>
example: $ hadoop distcp hdfs://nn1:8020/file1 hdfs://nn2:8020/file2
file1
from nn1
is copied to nn2
with filename file2
Distcp is the best tool as of now. Sqoop is used to copy data from relational database to HDFS and vice versa, but not between HDFS to HDFS.
More info:
There are two versions available - runtime performance in distcp2
is more compared to distcp