[eclipse] java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. spark Eclipse on windows 7

I'm not able to run a simple Spark job in Scala IDE (a Maven Spark project) installed on Windows 7.

The Spark core dependency has been added.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
val sc = new SparkContext(conf)
val logData = sc.textFile("File.txt")
logData.count()

Error:

16/02/26 18:29:33 INFO SparkContext: Created broadcast 0 from textFile at FrameDemo.scala:13
16/02/26 18:29:34 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
    at com.org.SparkDF.FrameDemo$.main(FrameDemo.scala:14)
    at com.org.SparkDF.FrameDemo.main(FrameDemo.scala)

This question is related to eclipse, scala and apache-spark.

Answers:


Follow these steps:

  1. Create a bin folder in any directory (to be used in step 3).

  2. Download winutils.exe and place it in that bin folder.

  3. Now add System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR"); in your code, where PATH/TO/THE/DIR is the folder that contains bin (see the sketch below).
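
A minimal sketch of where that call goes, assuming the bin folder was created under C:\hadoop (an assumed location; adjust it to wherever you put winutils.exe):

    import org.apache.spark.{SparkConf, SparkContext}

    object DemoDF {
      def main(args: Array[String]): Unit = {
        // Must run before anything touches Hadoop classes.
        // C:\hadoop is an assumed location; it must contain bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop")

        val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
        val sc = new SparkContext(conf)
        val logData = sc.textFile("File.txt")
        println(logData.count())
        sc.stop()
      }
    }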


That's a tricky one... your drive letter must be a capital letter. For example "C:\...".


  1. Download winutils.exe.
  2. Create a folder, say C:\winutils\bin.
  3. Copy winutils.exe into C:\winutils\bin.
  4. Set the environment variable HADOOP_HOME to C:\winutils (the folder above bin, not bin itself); a quick sanity check follows below.
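
If you go this route, you can verify from inside the JVM that the variable is actually visible. A minimal check (not part of the original answer; the object name is made up):

    object CheckHadoopHome {
      def main(args: Array[String]): Unit = {
        // Restart Eclipse (or your shell) after setting HADOOP_HOME, then run this.
        println(sys.env.getOrElse("HADOOP_HOME", "HADOOP_HOME is not set"))
      }
    }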

1) Download winutils.exe from https://github.com/steveloughran/winutils.
2) Create a directory in Windows: "C:\winutils\bin".
3) Copy winutils.exe into the above bin folder.
4) Set the property in your code:
   System.setProperty("hadoop.home.dir", "C:\\winutils");
5) Create a folder "C:\temp" and give it 777 permissions (for example with C:\winutils\bin\winutils.exe chmod 777 C:\temp).
6) Add the config property to your SparkSession: .config("spark.sql.warehouse.dir", "file:///C:/temp") (see the builder sketch below).
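
Putting steps 4 and 6 together, a minimal sketch of the session setup under the assumptions above (the object name and the range(5) action are illustrative only):

    import org.apache.spark.sql.SparkSession

    object SessionDemo {
      def main(args: Array[String]): Unit = {
        // Step 4: point Hadoop at the folder that contains bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\winutils")

        val spark = SparkSession.builder()
          .appName("SessionDemo")
          .master("local[*]")
          .config("spark.sql.warehouse.dir", "file:///C:/temp") // step 6
          .getOrCreate()

        spark.range(5).show() // trivial action to confirm the session starts cleanly
        spark.stop()
      }
    }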

If you see the issue below:

ERROR Shell: Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

then follow these steps:

  1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
  2. Keep it under the bin folder of a directory you create for it, e.g. C:\Hadoop\bin.
  3. In your program, add the following line before creating the SparkContext or SparkConf: System.setProperty("hadoop.home.dir", "C:\\Hadoop");

On Windows 10 you need to add two different entries:

(1) Add a new variable named HADOOP_HOME with the path as its value (e.g. C:\Hadoop) under System Variables.

(2) Add/append a new entry "C:\Hadoop\bin" to the "Path" variable.

The above worked for me.


On top of setting the HADOOP_HOME environment variable in Windows to C:\winutils, you also need to make sure you are an administrator of the machine. If you are not, and adding environment variables prompts you for admin credentials (even under USER variables), then these variables will only take effect once you start your command prompt as administrator.


You can alternatively download winutils.exe from GitHub:

https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin

Replace hadoop-2.7.1 with the version you want, and place the file in D:\hadoop\bin.

If you do not have access rights to the environment variable settings on your machine, simply add the following line to your code:

System.setProperty("hadoop.home.dir", "D:\\hadoop");

I got the same problem while running unit tests. The following workaround lets you get rid of the message:

    import java.io.File;

    // Point hadoop.home.dir at the current working directory ...
    File workaround = new File(".");
    System.getProperties().put("hadoop.home.dir", workaround.getAbsolutePath());
    // ... and create an empty bin/winutils.exe there so Hadoop's lookup succeeds.
    new File("./bin").mkdirs();
    new File("./bin/winutils.exe").createNewFile(); // throws IOException; declare or catch it

from: https://issues.cloudera.org/browse/DISTRO-544


Setting the HADOOP_HOME environment variable in system properties didn't work for me. But this did:

  • Set HADOOP_HOME in the Environment tab of your Eclipse Run Configuration.
  • Follow the 'Windows Environment Setup' from here

I also faced a similar problem, with the following setup: Java 1.8.0_121, Spark spark-1.6.1-bin-hadoop2.6, Windows 10 and Eclipse Oxygen. When I ran my WordCount.java in Eclipse using HADOOP_HOME as a system variable, as mentioned in the previous post, it did not work. What worked for me is:

System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR");

where PATH/TO/THE/DIR contains bin\winutils.exe. This works whether you run within Eclipse as a Java application or via spark-submit from cmd using:

spark-submit --class <groupid.artifactid.classname> --master local[2] <path to the jar file created using maven> <path to a demo test file> <path to output directory>

Example: go to the bin location of your Spark installation and execute spark-submit as shown:

D:\BigData\spark-2.3.0-bin-hadoop2.7\bin>spark-submit --class com.bigdata.abdus.sparkdemo.WordCount --master local[1] D:\BigData\spark-quickstart\target\spark-quickstart-0.0.1-SNAPSHOT.jar D:\BigData\spark-quickstart\wordcount.txt

