I am trying to run a Spark program that needs multiple jar files; with only one jar I am not able to run it. I want to add both jar files, which are in the same location. I have tried the command below, but it shows a dependency error:
spark-submit \
--class "max" maxjar.jar Book1.csv test \
--driver-class-path /usr/lib/spark/assembly/lib/hive-common-0.13.1-cdh5.3.0.jar
How can I add another jar file that is in the same directory? I want to add /usr/lib/spark/assembly/lib/hive-serde.jar.
Specifying the full path for all additional jars works:
./bin/spark-submit --class "SparkTest" --master local[*] --jars /fullpath/first.jar,/fullpath/second.jar /fullpath/your-program.jar
Or add the jars in conf/spark-defaults.conf by adding lines like:
spark.driver.extraClassPath /fullpath/first.jar:/fullpath/second.jar
spark.executor.extraClassPath /fullpath/first.jar:/fullpath/second.jar
For the --driver-class-path option you can use : as the delimiter to pass multiple jars. Below is an example with the spark-shell command, but I guess the same should work with spark-submit as well:
spark-shell --driver-class-path /path/to/example.jar:/path/to/another.jar
Spark version: 2.2.0
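As a minimal sketch of the same approach with spark-submit, reusing the class and jar names from the question (the two /path/to/*.jar entries are placeholders):
spark-submit --class "max" \
  --driver-class-path /path/to/example.jar:/path/to/another.jar \
  maxjar.jar Book1.csv test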
Pass --jars with the paths of the jar files, separated by , , to spark-submit.
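For example, applied to the two jars from the question (assuming both really live in /usr/lib/spark/assembly/lib), the call could look like:
spark-submit \
  --class "max" \
  --jars /usr/lib/spark/assembly/lib/hive-common-0.13.1-cdh5.3.0.jar,/usr/lib/spark/assembly/lib/hive-serde.jar \
  maxjar.jar Book1.csv test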
For reference:
--driver-class-path is used to mention "extra" jars to add to the classpath of the "driver" of the Spark job.
--driver-library-path is used to "change" the default library path for the jars needed by the Spark driver.
--driver-class-path will only push the jars to the driver machine. If you want to send the jars to the "executors", you need to use --jars. A sketch combining both flags follows below.
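Here all jar names are placeholders, not from the question: a driver-only jar goes on --driver-class-path, while jars needed by the executors go in --jars.
spark-submit --class "SparkTest" --master local[*] \
  --driver-class-path /fullpath/driver-only.jar \
  --jars /fullpath/first.jar,/fullpath/second.jar \
  /fullpath/your-program.jar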
And to set the jars programmatically, set the spark.yarn.dist.jars config with a comma-separated list of jars (note this config applies when running on YARN). E.g.:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Spark config example") \
    .config("spark.yarn.dist.jars", "<path-to-jar/test1.jar>,<path-to-jar/test2.jar>") \
    .getOrCreate()
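The same setting can also be passed on the command line via spark-submit's --conf flag; a sketch with placeholder paths and application name:
spark-submit --conf "spark.yarn.dist.jars=<path-to-jar/test1.jar>,<path-to-jar/test2.jar>" your-app.py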
If you are using a properties file, you can add the following line there:
spark.jars=jars/your_jar1.jar,...
assuming that your directory layout looks like:
<your root from where you run spark-submit>
|
|-jars
  |-your_jar1.jar
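A sketch of launching with an explicit properties file via the --properties-file flag (the file name, class, and application jar are placeholders):
spark-submit --properties-file ./my-spark.properties --class YourClass your-app.jar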
You can use --jars $(echo /Path/To/Your/Jars/*.jar | tr ' ' ',') to include an entire folder of jars. For example:
spark-submit --class com.yourClass \
  --jars $(echo /Path/To/Your/Jars/*.jar | tr ' ' ',') \
  ...
Just use the --jars parameter. Spark will share those jars (comma-separated) with the executors.
You can use * to import all jars in a folder when adding them in conf/spark-defaults.conf:
spark.driver.extraClassPath /fullpath/*
spark.executor.extraClassPath /fullpath/*
In Spark 2.3 you just need to set the --jars option. The file path should be prepended with the scheme though, i.e. file:///<absolute path to the jars>.
E.g.: file:////home/hadoop/spark/externaljars/* or file:////home/hadoop/spark/externaljars/abc.jar,file:////home/hadoop/spark/externaljars/def.jar