Spark java.lang.OutOfMemoryError: Java heap space

Question

My cluster: 1 master, 11 slaves, each node has 6 GB memory.

My settings:

    spark.executor.memory=4g, -Dspark.akka.frameSize=512

Here is the problem:

First, I read some data (2.19 GB) from HDFS to an RDD:

    val imageBundleRDD = sc.newAPIHadoopFile(...)

Second, I do something on this RDD:

    val res = imageBundleRDD.map(data => {
      val desPoints = threeDReconstruction(data._2, bg)
      (data._1, desPoints)
    })

Last, I output to HDFS:

    res.saveAsNewAPIHadoopFile(...)

When I run my program it shows:

    .....
    14/01/15 21:42:27 INFO cluster.ClusterTaskSetManager: Starting task 1.0:24 as TID 33 on executor 9: Salve7.Hadoop (NODE_LOCAL)
    14/01/15 21:42:27 INFO cluster.ClusterTaskSetManager: Serialized task 1.0:24 as 30618515 bytes in 210 ms
    14/01/15 21:42:27 INFO cluster.ClusterTaskSetManager: Starting task 1.0:36 as TID 34 on executor 2: Salve11.Hadoop (NODE_LOCAL)
    14/01/15 21:42:28 INFO cluster.ClusterTaskSetManager: Serialized task 1.0:36 as 30618515 bytes in 449 ms
    14/01/15 21:42:28 INFO cluster.ClusterTaskSetManager: Starting task 1.0:32 as TID 35 on executor 7: Salve4.Hadoop (NODE_LOCAL)
    Uncaught error from thread [spark-akka.actor.default-dispatcher-3] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[spark]
    java.lang.OutOfMemoryError: Java heap space

Are there too many tasks?

PS: Everything is OK when the input data is about 225 MB.

How can I solve this problem?

User · Answer

Setting these exact configurations helped resolve the issue:

    spark-submit --conf spark.yarn.maxAppAttempts=2 --executor-memory 10g --num-executors 50 --driver-memory 12g

User · Answer

To add a use case that is often not discussed, I will pose a solution for submitting a Spark application via spark-submit in local mode.

According to the gitbook Mastering Apache Spark by Jacek Laskowski:

    You can run Spark in local mode. In this non-distributed single-JVM deployment mode, Spark spawns all the execution components - driver, executor, backend, and master - in the same JVM. This is the only mode where a driver is used for execution.

Thus, if you are experiencing OOM errors with the heap, it suffices to adjust the driver-memory rather than the executor-memory.

Here is an example:

    spark-1.6.1/bin/spark-submit \
      --class "MyClass" \
      --driver-memory 12g \
      --master local[*] \
      target/scala-2.10/simple-project_2.10-1.0.jar

User · Answer

Did you dump your master GC log? I met a similar issue and found that SPARK_DRIVER_MEMORY only sets the Xmx heap. The initial heap size remained 1G and the heap never scaled up to the Xmx size.

Passing --conf "spark.driver.extraJavaOptions=-Xms20g" resolved my issue.

Run ps aux | grep java and you'll see output like the following:

    24501 30.7 1.7 41782944 2318184 pts/0 Sl+ 18:49 0:33 /usr/java/latest/bin/java -cp /opt/spark/conf/:/opt/spark/jars/* -Xmx30g -Xms20g

User · Answer

I have a few suggestions:

- If your nodes are configured to have 6g maximum for Spark (and are leaving a little for other processes), then use 6g rather than 4g: spark.executor.memory=6g. Make sure you're using as much memory as possible by checking the UI (it will say how much mem you're using).

- Try using more partitions; you should have 2 - 4 per CPU. IME increasing the number of partitions is often the easiest way to make a program more stable (and often faster). For huge amounts of data you may need way more than 4 per CPU; I've had to use 8000 partitions in some cases!

- Decrease the fraction of memory reserved for caching, using spark.storage.memoryFraction. If you don't use cache() or persist in your code, this might as well be 0. Its default is 0.6, which means you only get 0.4 * 4g memory for your heap. IME reducing the mem frac often makes OOMs go away. UPDATE: From Spark 1.6 apparently we will no longer need to play with these values; Spark will determine them automatically.

- Similar to above but for the shuffle memory fraction. If your job doesn't need much shuffle memory then set it to a lower value (this might cause your shuffles to spill to disk, which can have a catastrophic impact on speed). Sometimes when it's a shuffle operation that's OOMing you need to do the opposite, i.e. set it to something large, like 0.8, or make sure you allow your shuffles to spill to disk (it's the default since 1.0.0).

- Watch out for memory leaks; these are often caused by accidentally closing over objects you don't need in your lambdas. The way to diagnose is to look out for the "task serialized as XXX bytes" in the logs; if XXX is larger than a few k or more than an MB, you may have a memory leak. See https://stackoverflow.com/a/25270600/1586965

- Related to above: use broadcast variables if you really do need large objects.

- If you are caching large RDDs and can sacrifice some access time, consider serialising the RDD: http://spark.apache.org/docs/latest/tuning.html#serialized-rdd-storage. Or even cache them on disk (which sometimes isn't that bad if using SSDs).

- (Advanced) Related to above, avoid String and heavily nested structures (like Map and nested case classes). If possible, try to only use primitive types and index all non-primitives, especially if you expect a lot of duplicates. Choose WrappedArray over nested structures whenever possible. Or even roll out your own serialisation - YOU will have the most information regarding how to efficiently pack your data into bytes, USE IT!

- (bit hacky) Again when caching, consider using a Dataset to cache your structure as it will use more efficient serialisation. This should be regarded as a hack when compared to the previous bullet point. Building your domain knowledge into your algo/serialisation can minimise memory/cache-space by 100x or 1000x, whereas all a Dataset will likely give is 2x - 5x in memory and 10x compressed (parquet) on disk.

http://spark.apache.org/docs/1.2.1/configuration.html

EDIT: (So I can google myself easier) The following is also indicative of this problem:

    java.lang.OutOfMemoryError : GC overhead limit exceeded
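
The point about closing over large objects can be illustrated without a cluster. The following is a minimal sketch, not Spark's actual task serializer: it Java-serializes two Scala closures and compares their sizes, which is roughly the effect behind a large "Serialized task ... as N bytes" log line. The names ClosureSizeDemo, serializedSize, sizes and background are hypothetical, and background merely stands in for a large driver-side object such as bg in the question.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object ClosureSizeDemo {
  // Java-serializes an object and returns the number of bytes produced.
  // Assumption: this only approximates what Spark does when shipping a task.
  def serializedSize(obj: AnyRef): Int = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.size()
  }

  // Returns (size of closure capturing the big array, size of closure capturing an Int).
  def sizes(): (Int, Int) = {
    // Hypothetical stand-in for a large object created on the driver.
    val background = new Array[Byte](8 * 1024 * 1024)
    // Extract the small value that is actually needed before building the closure.
    val length = background.length

    val heavy: Int => Int = x => x + background.length // drags the whole array along
    val light: Int => Int = x => x + length            // carries a single Int

    (serializedSize(heavy), serializedSize(light))
  }
}
```

Calling ClosureSizeDemo.sizes() should show the first closure weighing at least as much as the array it captured, while the second stays small. In a Spark job the usual remedies are the ones listed above: capture only what the function needs, or wrap a genuinely needed large object in a broadcast variable.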

User · Answer

You should configure offHeap memory settings as shown below:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession
      .builder()
      .master("local[*]")
      .config("spark.executor.memory", "70g")
      .config("spark.driver.memory", "50g")
      .config("spark.memory.offHeap.enabled", true)
      .config("spark.memory.offHeap.size", "16g")
      .appName("sampleCodeForReference")
      .getOrCreate()

Set the driver memory and executor memory according to your machine's available RAM. You can increase the offHeap size if you are still facing the OutOfMemory issue.

User · Answer

From my understanding of the code provided above, it loads the file, does a map operation and saves the result back. There is no operation that requires a shuffle. Also, there is no operation that requires data to be brought to the driver, hence tuning anything related to shuffle or the driver may have no impact. The driver does have issues when there are too many tasks, but this was only the case up to Spark 2.0.2. There can be two things going wrong:

1. There are only one or a few executors. Increase the number of executors so that they can be allocated to different slaves. If you are using YARN, you need to change the num-executors config; if you are using Spark standalone, you need to tune the number of cores per executor and the Spark max cores conf. In standalone, num executors = max cores / cores per executor.

2. The number of partitions is very low, or maybe only one. If it is low, even multiple cores and multiple executors will not be of much help, since parallelization depends on the number of partitions. So increase the partitions by doing imageBundleRDD.repartition(11).
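
The standalone sizing rule above is plain integer arithmetic. As a minimal sketch (StandaloneSizing and executorCount are hypothetical names, and the mapping to the spark.cores.max and spark.executor.cores properties is my assumption about which settings the answer means):

```scala
object StandaloneSizing {
  // num executors = max cores / cores per executor, as stated in the answer.
  // Assumed mapping: maxCores ~ spark.cores.max, coresPerExecutor ~ spark.executor.cores.
  def executorCount(maxCores: Int, coresPerExecutor: Int): Int = {
    require(maxCores >= 0, "maxCores must not be negative")
    require(coresPerExecutor > 0, "coresPerExecutor must be positive")
    maxCores / coresPerExecutor // integer division: leftover cores stay unused
  }
}
```

For example, StandaloneSizing.executorCount(44, 4) gives 11, one executor per slave on the cluster in the question if each slave offers 4 cores (the core count is an illustrative assumption; the question does not state it).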

User · Answer

I suffered from this issue a lot when using dynamic resource allocation. I had thought it would utilize my cluster resources to best fit the application. But the truth is that dynamic resource allocation doesn't set the driver memory and keeps it at its default value, which is 1G.

I resolved this issue by setting spark.driver.memory to a number that suits my driver's memory (for 32GB RAM I set it to 18G). You can set it using the spark-submit command as follows:

    spark-submit --conf spark.driver.memory=18g

Very important note: this property will not be taken into consideration if you set it from code, according to Spark Documentation - Dynamically Loading Spark Properties:

    Spark properties mainly can be divided into two kinds: one is related to deploy, like "spark.driver.memory", "spark.executor.instances", this kind of properties may not be affected when setting programmatically through SparkConf in runtime, or the behavior is depending on which cluster manager and deploy mode you choose, so it would be suggested to set through configuration file or spark-submit command line options; another is mainly related to Spark runtime control, like "spark.task.maxFailures", this kind of properties can be set in either way.
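
One way to confirm whether a driver memory setting actually took effect is to ask the JVM itself. This is a minimal sketch using only the standard java.lang.Runtime API; HeapCheck and its method names are hypothetical, and the code would have to run inside the driver process (for instance at the start of the application's main method) to say anything about the driver.

```scala
object HeapCheck {
  private val MiB: Long = 1024L * 1024L

  // Upper bound the JVM will try to use for the heap (roughly -Xmx).
  def maxHeapMiB(): Long = Runtime.getRuntime.maxMemory() / MiB

  // Heap currently committed by the JVM; starts near -Xms and may grow.
  def committedHeapMiB(): Long = Runtime.getRuntime.totalMemory() / MiB

  def main(args: Array[String]): Unit =
    println(s"committed=${committedHeapMiB()} MiB, max=${maxHeapMiB()} MiB")
}
```

If the reported maximum is still close to 1G after setting spark.driver.memory in code, that is consistent with the behaviour described above, and the value should be passed through spark-submit or the configuration file instead.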

User · Answer

I have a few suggestions for the above-mentioned error:

- Check the executor memory assigned, as an executor might have to deal with partitions requiring more memory than what is assigned.

- Try to see if more shuffles are live, as shuffles are expensive operations since they involve disk I/O, data serialization, and network I/O.

- Use broadcast joins.

- Avoid using groupByKey and try to replace it with reduceByKey.

- Avoid using huge Java objects wherever shuffling happens.

User · Answer

You should increase the driver memory. In your $SPARK_HOME/conf folder you should find the file spark-defaults.conf; edit it and set spark.driver.memory 4000m, depending on the memory on your master, I think. This is what fixed the issue for me, and everything runs smoothly.

User · Answer

Broadly speaking, Spark executor JVM memory can be divided into two parts: Spark memory and user memory. This is controlled by the property spark.memory.fraction, whose value is between 0 and 1. When working with images or doing memory-intensive processing in Spark applications, consider decreasing spark.memory.fraction. This will make more memory available to your application work. Spark can spill, so it will still work with a smaller memory share.

The second part of the problem is division of work. If possible, partition your data into smaller chunks; smaller data possibly needs less memory. If that is not possible, you are sacrificing compute for memory. Typically a single executor will be running multiple cores. The total memory of an executor must be enough to handle the memory requirements of all concurrent tasks. If increasing executor memory is not an option, you can decrease the cores per executor so that each task gets more memory to work with. Test with 1-core executors that have the largest possible memory you can give, and then keep increasing cores until you find the best core count.
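
The trade-off between spark.memory.fraction and cores per executor can be made concrete with some rough arithmetic. The sketch below assumes the unified memory manager introduced in Spark 1.6, which, as far as I know, sets aside about 300 MiB of heap before applying the fraction; treat that constant and the even per-task split as simplifying assumptions. ExecutorMemorySketch, Split and split are hypothetical names.

```scala
object ExecutorMemorySketch {
  // Assumption: about 300 MiB of the executor heap is reserved before the split.
  val ReservedMiB: Double = 300.0

  final case class Split(sparkMiB: Double, userMiB: Double, userPerTaskMiB: Double)

  // Splits an executor heap into Spark memory and user memory, then divides the
  // user memory evenly across concurrently running tasks (one task per core).
  def split(heapMiB: Double, memoryFraction: Double, coresPerExecutor: Int): Split = {
    require(memoryFraction >= 0.0 && memoryFraction <= 1.0, "fraction must be in [0, 1]")
    require(coresPerExecutor > 0, "need at least one core per executor")
    val usable = math.max(heapMiB - ReservedMiB, 0.0)
    val spark = usable * memoryFraction
    val user = usable - spark
    Split(spark, user, user / coresPerExecutor)
  }
}
```

Comparing ExecutorMemorySketch.split(4096, 0.6, 4) with split(4096, 0.6, 1) or split(4096, 0.3, 4) shows the two levers described above: fewer cores or a lower fraction both leave more user memory for each task.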

User · Answer

Have a look at the start-up scripts; a Java heap size is set there. It looks like you're not setting this before running the Spark worker.

    # Set SPARK_MEM if it isn't already set since we also use it for this process
    SPARK_MEM=${SPARK_MEM:-512m}
    export SPARK_MEM

    # Set JAVA_OPTS to be able to load native libraries and to set heap size
    JAVA_OPTS="$OUR_JAVA_OPTS"
    JAVA_OPTS="$JAVA_OPTS -Djava.library.path=$SPARK_LIBRARY_PATH"
    JAVA_OPTS="$JAVA_OPTS -Xms$SPARK_MEM -Xmx$SPARK_MEM"

You can find the documentation to deploy scripts here.

User · Answer

Heap space errors generally occur due to bringing too much data back to either the driver or the executor. In your code it does not seem like you are bringing anything back to the driver; instead you may be overloading the executors that are mapping an input record/row to another using the threeDReconstruction() method. I am not sure what is in the method definition, but that is definitely causing this overloading of the executor. Now you have 2 options:

1. Edit your code to do the 3-D reconstruction in a more efficient manner.

2. Do not edit the code, but give more memory to your executors, as well as more memory overhead (spark.executor.memory or spark.driver.memoryOverhead).

I would advise being careful with the increase and using only as much as you need. Each job is unique in terms of its memory requirements, so I would advise empirically trying different values, increasing every time by a power of 2 (256M, 512M, 1G and so on). You will arrive at a value for the executor memory that works. Try re-running the job with this value 3 or 5 times before settling on this configuration.

User · Answer

The location to set the memory heap size (at least in spark-1.0.0) is in conf/spark-env. The relevant variables are SPARK_EXECUTOR_MEMORY & SPARK_DRIVER_MEMORY. More docs are in the deployment guide.

Also, don't forget to copy the configuration file to all the slave nodes.
