Your USB 'Spark on a stick' contains a copy of the open source, Apache 2.0-licensed Spark 1.1 project, including higher-level libraries such as MLlib, Spark SQL, Spark Streaming, Tachyon, BlinkDB, and SparkR.
This gitbook will get you started learning Spark from a DevOps perspective: a little bit of development and a little bit of operations.
You'll see how to:
The accompanying slides for the 1-day "Intro to Apache Spark" workshop can be found here: http://training.databricks.com/workshop/itas_workshop.pdf
There is also a 4-part recording of the "Intro to Apache Spark" training workshop from the San Francisco Spark Summit 2014 available here: https://www.youtube.com/watch?v=VWeWViFCzzg&list=PLTPXxbhUt-YWSgAUhrnkyphnh0oKIT8-j
Note: this is the Windows version of the lab document. Comments are included for running on OS X or Linux, but you may need to adjust the instructions for other operating systems.
This document is licensed under Creative Commons, so feel free to print it, share it and especially add to it. This section covers how to make GitHub pull requests to help grow this document.