Try dtIngest, it's developed on top of Apache Apex platform. This tool copies data from different sources like HDFS, shared drive, NFS, FTP, Kafka to different destinations. Copying data from remote HDFS cluster to local HDFS cluster is supported by dtIngest. dtIngest runs yarn jobs to copy data in parallel fashion, so it's very fast. It takes care of failure handling, recovery etc. and supports polling directories periodically to do continious copy.
Usage: dtingest [OPTION]... SOURCEURL... DESTINATIONURL example: dtingest hdfs://nn1:8020/source hdfs://nn2:8020/dest