How to light your 'Spark on a stick'

Locate the Spark JVM

You can use the Sysinternals tool Process Explorer to locate the Spark JVM process and see exactly which parameters the JVM was started with.

You do not need to run Process Explorer in this lab, but if you are curious, the following screenshot shows the java.exe process hosting the Spark JVM and the command used to launch it:
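If you prefer not to install Process Explorer, plain Java (9 or later) can list running processes and their command lines itself via the ProcessHandle API. The sketch below is illustrative only, not part of the lab; the class name is ours, and on some systems the command line of another user's process is not accessible.

```java
// Sketch: list processes whose executable path mentions "java" and print
// each one's full command line where the OS allows us to read it.
public class FindSparkJvm {
    public static void main(String[] args) {
        ProcessHandle.allProcesses()
            .filter(p -> p.info().command()
                          .map(c -> c.contains("java"))
                          .orElse(false))
            .forEach(p -> System.out.println(
                p.pid() + " -> "
                + p.info().commandLine().orElse("(command line not accessible)")));
    }
}
```

Running this while the Spark shell is up should show the same java.exe invocation that Process Explorer displays.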

The full command line used to launch the Spark JVM is below:

"java" -cp ";;F:\spark\bin..\conf;F:\spark\bin..\assembly\target\scala-2.10\spark-assembly-1.1.0-hadoop1.0.4.jar;;F:\spark\bin..\lib_managed\jars\datanucleus-api-jdo-3.2.1.jar;F:\spark\bin..\lib_managed\jars\datanucleus-core-3.2.2.jar;F:\spark\bin..\lib_managed\jars\datanucleus-rdbms-3.2.1.jar;" -XX:MaxPermSize=128m -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main spark-shell

  • Notice above that the Spark JVM starts with an initial heap of 512 MB (the -Xms flag) and can grow to a maximum heap size of 512 MB (the -Xmx flag)
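A running JVM can also report its own startup flags and effective heap limit from the inside, which is a quick way to confirm what -Xms and -Xmx were in effect. This is a standalone sketch (the class name is ours, not part of Spark); getInputArguments() returns the flags the JVM was launched with, and maxMemory() reports the effective heap ceiling.

```java
import java.lang.management.ManagementFactory;

// Sketch: print the JVM's own launch flags and its effective max heap.
public class HeapSettings {
    public static void main(String[] args) {
        // -Xms / -Xmx (if passed) appear among the input arguments
        System.out.println("JVM args: "
            + ManagementFactory.getRuntimeMXBean().getInputArguments());
        // Effective maximum heap in MB (roughly the -Xmx value)
        System.out.println("Max heap (MB): "
            + Runtime.getRuntime().maxMemory() / (1024 * 1024));
    }
}
```

Pasting the body of main into the Spark shell would print the same 512 MB limit shown in the command line above.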