How to light your 'Spark on a stick'

spark-defaults.conf

You do not need to make any changes to the spark-defaults.conf file! This section is just FYI.

The spark-defaults.conf file holds default values for Spark properties, which control most Spark parameters.

The spark-defaults.conf file is located in the following directory: USB:\spark\conf\

Note: the file that ships in that directory is named spark-defaults.conf.template. To set configurations, first copy it to a new file named spark-defaults.conf, then make your changes in the new file.
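
The copy step can be sketched in Python as follows. This is only an illustration: the helper name create_defaults_from_template is made up for this example, and the demo runs in a temporary directory that stands in for the conf directory on the USB stick. To use it for real, pass the actual path of your conf directory.

```python
import shutil
import tempfile
from pathlib import Path


def create_defaults_from_template(conf_dir):
    """Copy spark-defaults.conf.template to spark-defaults.conf if it is missing.

    Hypothetical helper for illustration; it is not part of Spark.
    """
    conf_dir = Path(conf_dir)
    template = conf_dir / "spark-defaults.conf.template"
    target = conf_dir / "spark-defaults.conf"
    if not target.exists():
        # Leave an existing spark-defaults.conf untouched.
        shutil.copyfile(template, target)
    return target


# Demo in a throwaway directory standing in for the conf directory on the stick.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "spark-defaults.conf.template").write_text(
    "# spark.master  spark://master:7077\n"
)
created = create_defaults_from_template(demo_dir)
print(created.name)
```

The template itself is left in place, so you can always go back to the shipped defaults by deleting the copy.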

Example file:

spark.master                     spark://5.6.7.8:7077
spark.executor.memory            512m
spark.eventLog.enabled           true
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.driver.memory              5g
spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
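
To make the file format concrete: each line holds a property name, then whitespace, then the value, and the value itself may contain spaces. The Python sketch below reads the example into a dictionary. It is a simplified illustration and not Spark's own loader; the function name parse_spark_defaults is made up for this example, and treating lines that start with # as comments is an assumption based on the usual properties-file convention.

```python
def parse_spark_defaults(text):
    """Parse 'key<whitespace>value' lines into a dict (simplified sketch)."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Split only on the first run of whitespace so values keep their spaces.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


EXAMPLE = '''
spark.master            spark://5.6.7.8:7077
spark.executor.memory   512m
spark.eventLog.enabled  true
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.driver.memory              5g
spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
'''

props = parse_spark_defaults(EXAMPLE)
for name, value in props.items():
    print(name, "=>", value)
```

Note that the whole of the last line after the property name, including the quoted "one two three", ends up as a single value.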


Instead of setting Spark properties in this file, you can also set them at run time, either on a SparkConf object in your program or through options to the spark-submit command. For details, see: https://spark.apache.org/docs/latest/configuration.html

Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file.
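
The precedence order can be modelled with three dictionaries merged from lowest to highest priority. This is a toy model for illustration only, not how Spark is implemented; the function name resolve_properties and the sample values for the spark-submit flags and the SparkConf are assumptions chosen for the example.

```python
def resolve_properties(defaults_file, submit_flags, spark_conf):
    """Merge three property sources; later updates win (toy model)."""
    resolved = dict(defaults_file)   # lowest priority: spark-defaults.conf
    resolved.update(submit_flags)    # next: flags passed to spark-submit or spark-shell
    resolved.update(spark_conf)      # highest: set directly on the SparkConf
    return resolved


# Values from the defaults file (taken from the example file above).
defaults_file = {
    "spark.master": "spark://5.6.7.8:7077",
    "spark.executor.memory": "512m",
}
# Hypothetical command-line setting, e.g. --conf spark.executor.memory=1g
submit_flags = {"spark.executor.memory": "1g"}
# Hypothetical value set in the program on the SparkConf.
spark_conf = {"spark.master": "local[2]"}

effective = resolve_properties(defaults_file, submit_flags, spark_conf)
print(effective)
```

In this model the master comes from the SparkConf and the executor memory from the command line, so neither value from the defaults file is used.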