Installing Spark & Livy with Ansible

Deploying your own Spark Standalone cluster with Livy is time-consuming because the deployment has many moving parts. This project should also be useful to anyone who just wants to experiment with Spark and Livy on a local machine or in virtual machines.

In a recent project we are building a SaaS platform whose dev and stage environments must be deployed to different cloud providers, so I used Docker and Ansible for the deployments. Spark can run on top of the Hadoop ecosystem or in standalone mode. I went with standalone mode and built two Docker images: one with Spark Standalone 2.4.7 and one that adds Livy 0.7.0 (Incubating), both the latest versions at the time of writing.

  1. Spark Standalone Docker Image
  2. Spark with Livy Docker Image

The starter kit repository linked below contains a simple Ansible playbook that installs Docker and deploys a Spark Standalone cluster and Livy in containers. The playbook currently targets only the local IP, so everything is installed on the local machine. Based on your needs, you can add more environments and automate your big data dev, stage, and prod environments.

The repository README file contains more details about setting up the environments.
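Once the playbook has brought the containers up, a quick way to check that Livy is reachable is to ask its REST API for a new session. The snippet below is only a minimal sketch and is not part of the starter kit: it assumes Livy is listening on its default port 8998 on the local machine, and the names `LIVY_URL`, `session_request`, `statement_request`, and `send` are made up for illustration. Adjust the host and port to whatever your deployment exposes.

```python
import json
import urllib.request

# Assumption: Livy is exposed on its default port on the local host.
LIVY_URL = "http://localhost:8998"


def session_request(base_url, kind="pyspark"):
    """Build the POST /sessions request that asks Livy for an interactive session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def statement_request(base_url, session_id, code):
    """Build the POST /sessions/{id}/statements request that runs a code snippet."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and decode Livy's JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a running Livy, `send(session_request(LIVY_URL))` should return the new session as JSON, including its `id` and `state`. Statements can only be submitted once the session has left the starting state and become idle, so a real smoke test would poll the session before calling `send(statement_request(...))`.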

Thanks to Ansible, we can scale the platform with a single click.


Starter Kit: Ansible Spark Livy