[hadoop] How to check an HDFS directory's size?

I know du -sh from common Linux filesystems. But how do I do that on HDFS?

This question is related to: hadoop, command-line, directory, hdfs

The answers are:


With the old -dus form (deprecated since Hadoop 2.x in favor of -du -s):

hadoop fs -dus /path/to/dir | awk '{print $2/1024**3 " G"}'
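On current releases the equivalent uses -du -s, whose first column is the size in bytes, so the same conversion would look like:

hdfs dfs -du -s /path/to/dir | awk '{print $1/1024**3 " G"}'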



To get the size of a directory, hdfs dfs -du -s -h /$yourDirectoryName can be used. hdfs dfsadmin -report gives a quick cluster-level storage report.
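For illustration, the report's header looks roughly like this (abridged, with made-up numbers):

$ hdfs dfsadmin -report
Configured Capacity: 52844687360 (49.22 GB)
Present Capacity: 48507510784 (45.18 GB)
DFS Remaining: 48507109376 (45.18 GB)
DFS Used: 401408 (392 KB)
DFS Used%: 0.00%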


When trying to total a particular group of files within a directory, the -s option does not aggregate across a glob (at least in Hadoop 2.7.1). For example:

Directory structure:

some_dir
+abc.txt    
+count1.txt 
+count2.txt 
+def.txt    

Assume each file is 1 KB in size. You can summarize the entire directory with:

hdfs dfs -du -s some_dir
4096 some_dir

However, if I want the sum of all files whose names contain "count", the command falls short:

hdfs dfs -du -s some_dir/count*
1024 some_dir/count1.txt
1024 some_dir/count2.txt

To get around this I usually pass the output through awk.

hdfs dfs -du some_dir/count* | awk '{ total+=$1 } END { print total }'
2048 
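If you do this often, a small wrapper keeps it readable. hdfs_total is a hypothetical helper name; a minimal sketch:

# sum the first du column (bytes) across all given paths/globs
hdfs_total() {
  hdfs dfs -du "$@" | awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# usage
hdfs_total some_dir/count*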

hadoop fs -du -s -h /path/to/dir displays a directory's size in human-readable form.


Percentage of used space on the Hadoop cluster:
sudo -u hdfs hadoop fs -df

Capacity under a specific folder:
sudo -u hdfs hadoop fs -du -h /user
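-df also accepts -h for readable units; the output looks something like this (made-up numbers):

$ hdfs dfs -df -h /
Filesystem            Size   Used  Available  Use%
hdfs://nn1:8020     98.3 T 67.1 T     31.2 T   68%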


hdfs dfs -count <dir>

Info from the man page:

-count [-q] [-h] [-v] [-t [<storage type>]] [-u] <path> ... :
  Count the number of directories, files and bytes under the paths
  that match the specified file pattern.  The output columns are:
  DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
  or, with the -q option:
  QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA
        DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
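For example, with -h for readable sizes (made-up numbers):

$ hdfs dfs -count -h /user/hadoop
          12          345              1.2 G /user/hadoop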

The command should be hadoop fs -du -s -h /dirPath

  • -du [-s] [-h] ... : Show the amount of space, in bytes, used by the files that match the specified file pattern.

  • -s : Rather than showing the size of each individual file that matches the
    pattern, shows the total (summary) size.

  • -h : Formats the sizes of files in a human-readable fashion rather than as a number of bytes (e.g. MB/GB/TB).

    Note that, even without the -s option, this only shows size summaries one level deep into a directory (see the example after this list).

    The output is of the form: size name(full path)
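To make the one-level behavior concrete, an illustrative run (hypothetical paths and sizes): without -s, each immediate child is listed, and subdirectories appear as a single summarized line rather than being expanded.

$ hadoop fs -du -h /data
4.5 M  /data/config.xml
1.2 G  /data/logs

$ hadoop fs -du -s -h /data
1.2 G  /data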


Extending Matt D's and other answers: as of Apache Hadoop 3.0.0, the command is

hadoop fs -du [-s] [-h] [-v] [-x] URI [URI ...]

It displays the sizes of files and directories contained in the given directory, or the length of the file in case it's just a file.

Options:

  • The -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files. Without the -s option, the calculation is done by going 1-level deep from the given path.
  • The -h option will format file sizes in a human-readable fashion (e.g. 64.0m instead of 67108864).
  • The -v option will display the names of columns as a header line.
  • The -x option will exclude snapshots from the result calculation. Without the -x option (default), the result is always calculated from all INodes, including all snapshots under the given path.

du returns three columns with the following format:

 +-------------------------------------------------------------------+ 
 | size  |  disk_space_consumed_with_all_replicas  |  full_path_name | 
 +-------------------------------------------------------------------+ 
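Since the second column is the space consumed with all replicas, dividing it by the first gives the effective replication factor per path. A minimal awk sketch (skipping zero-byte entries to avoid dividing by zero):

hadoop fs -du /user/hadoop | awk '$1 > 0 { printf "%.1fx  %s\n", $2/$1, $3 }'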

Example command:

hadoop fs -du /user/hadoop/dir1 \
    /user/hadoop/file1 \
    hdfs://nn.example.com/user/hadoop/dir1 

Exit Code: Returns 0 on success and -1 on error.

source: Apache doc


With this you will get the size in GB:

hdfs dfs -du PATHTODIRECTORY | awk '/^[0-9]+/ { print int($1/(1024**3)) " [GB]\t" $2 }'
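A related trick when hunting for what is eating space: sort the per-child sizes numerically and keep the largest entries, e.g.:

hdfs dfs -du /path/to/dir | sort -rn | head -n 10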
