How to fix corrupt HDFS Files

Question

How does someone fix an HDFS that's corrupt? I looked on the Apache Hadoop website and it pointed to its fsck command, which doesn't fix it. Hopefully someone who has run into this problem before can tell me how to fix this.

"Unlike a traditional fsck utility for native file systems, this command does not correct the errors it detects. Normally NameNode automatically corrects most of the recoverable failures."

When I ran bin/hadoop fsck / -delete, it listed the files that were corrupt or missing blocks. How do I make it not corrupt? This is on a practice machine so I COULD blow everything away, but when we go live I won't be able to "fix" it by blowing everything away, so I'm trying to figure it out now.

User · Answer

If you just want to get your HDFS back to a normal state and are not too worried about the data, then:

This will list the corrupt HDFS blocks:

hdfs fsck -list-corruptfileblocks

This will delete the files that contain corrupt blocks:

hdfs fsck / -delete

Note that you might have to prefix these commands with sudo -u hdfs if you are not running as the HDFS superuser (assuming "hdfs" is the name of that user).
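Since -delete removes the affected files, it may be worth saving the list of paths first so you know what to re-ingest afterwards. A minimal sketch, assuming the corrupt entries printed by -list-corruptfileblocks look like "blk_<id><TAB><path>" (check this against your Hadoop version); the function name corrupt_paths and the sample lines are made up for illustration:

```shell
# Extract the unique file paths from `hdfs fsck / -list-corruptfileblocks` output.
# Assumption: each corrupt entry is "blk_<id><TAB><path>"; other lines are ignored.
corrupt_paths() {
    awk -F'\t' '/^blk_/ { print $2 }' | sort -u
}

# Illustrative sample input, not real cluster output.
printf 'blk_1073741825\t/data/a.csv\nblk_1073741826\t/data/a.csv\nblk_1073741830\t/logs/b.log\n' \
    | corrupt_paths
# prints:
# /data/a.csv
# /logs/b.log
```

On a cluster this would be something like hdfs fsck / -list-corruptfileblocks | corrupt_paths > /tmp/corrupt_files, run before hdfs fsck / -delete.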

User · Answer

The solution here worked for me: https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html

su - <hdfs_user>

hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files

for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :"; hadoop fs -setrep 3 $hdfsfile; done
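The same idea can be written so that paths containing spaces survive the loop. This is only a sketch: the function names under_replicated_paths and fix_replication are made up, it assumes fsck reports under-replicated files as "<path>: Under replicated ...", and like the original awk split it will mangle paths that contain a colon.

```shell
# Pull the file path (text before the first ':') from each "Under replicated" line.
under_replicated_paths() {
    grep 'Under replicated' | awk -F':' '{ print $1 }' | sort -u
}

# Read paths on stdin and set their replication factor to $1.
fix_replication() {
    while IFS= read -r path; do
        echo "Fixing $path"
        # </dev/null keeps the command from consuming the list of paths on stdin
        hadoop fs -setrep "$1" "$path" </dev/null
    done
}

# Parsing demo on an illustrative sample line (not real cluster output).
printf '/data/part-0000:  Under replicated BP-1:blk_1_1. Target Replicas is 3 but found 1 replica(s).\n' \
    | under_replicated_paths
```

On a cluster the two pieces would be chained as hdfs fsck / | under_replicated_paths | fix_replication 3, with 3 replaced by whatever replication factor you actually want.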

User · Answer

You can use

hdfs fsck /

to determine which files are having problems. Look through the output for missing or corrupt blocks (ignore under-replicated blocks for now). This command is really verbose, especially on a large HDFS filesystem, so I normally get down to the meaningful output with

hdfs fsck / | egrep -v '^\.+$' | grep -v eplica

which ignores lines with nothing but dots and lines talking about replication.

Once you find a file that is corrupt:

hdfs fsck /path/to/corrupt/file -locations -blocks -files

Use that output to determine where blocks might live. If the file is larger than your block size it might have multiple blocks.

You can use the reported block numbers to search the datanode and namenode logs for the machine or machines on which the blocks lived. Try looking for filesystem errors on those machines: missing mount points, datanode not running, file system reformatted or reprovisioned. If you can find a problem that way and bring the block back online, that file will be healthy again.

Lather, rinse and repeat until all files are healthy or you exhaust all alternatives looking for the blocks.

Once you determine what happened and you cannot recover any more blocks, just use the

hdfs dfs -rm /path/to/file/with/permanently/missing/blocks

command to get your HDFS filesystem back to healthy so you can start tracking new errors as they occur.
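To see what the filter keeps and drops, it can be run against a few lines of made-up fsck output. This is a sketch only: the sample lines are illustrative rather than copied from a real cluster, fsck_problems is a hypothetical helper name, and grep -E is used in place of egrep, which newer grep releases mark as obsolescent.

```shell
# Drop progress lines made only of dots and any line mentioning replicas/replication.
fsck_problems() {
    grep -E -v '^\.+$' | grep -v 'eplica'
}

# Illustrative sample (not real cluster output): only the two /data/a.csv lines survive.
fsck_problems <<'EOF'
....................
/data/a.csv: CORRUPT blockpool BP-1 block blk_1073741825
/data/b.csv:  Under replicated BP-1:blk_1073741826_1002. Target Replicas is 3 but found 1 replica(s).
/data/a.csv: MISSING 1 blocks of total size 1048576 B.
....
EOF
```

On a cluster the equivalent would be hdfs fsck / | fsck_problems, and each surviving path can then be inspected with the -locations -blocks -files form of the command.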

User · Answer

Start all daemons and run the command

hadoop namenode -recover -force

Then stop the daemons and start them again. Wait some time for the data to recover.
