spark_gce's Introduction

Spark GCE

Spark GCE is like Spark EC2, but for those who run their clusters on Google Cloud.

  • Requires gcutils to be installed and authenticated on the machine where you run this script.
  • Launches a Spark cluster on Google Cloud.
  • Attaches an empty 500 GB disk to every node in the cluster.
  • Installs and configures everything automatically.
  • Starts the Shark server automatically.

Spark GCE is a Python script that launches a Spark cluster on Google Cloud, much as the spark_ec2 script does for AWS.

Usage

spark_gce.py project-name number-of-slaves slave-type master-type identity-file zone cluster-name

  • project-name: ID of the Google Cloud project in which the Spark cluster will be launched.

  • number-of-slaves: Number of slave nodes to launch.

  • slave-type: Instance type for the slave machines.

  • master-type: Instance type for the master node.

  • identity-file: Identity file used to authenticate with your GCE instances. It usually resides at ~/.ssh/google_compute_engine once you have authenticated using gcutils.

  • zone: Zone in which to launch the cluster.

  • cluster-name: Name of the cluster to launch.
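
As a minimal sketch of the launch form above, the seven positional arguments can be assembled in Python before being handed to the script. The helper name `build_launch_command`, the integer check on the slave count, and every sample value (project ID, machine types, zone, cluster name) are assumptions for illustration; none of them come from spark_gce itself.

```python
import os


def build_launch_command(project, num_slaves, slave_type, master_type,
                         identity_file, zone, cluster_name):
    """Return the argument list for a launch, in the documented order.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    # Assumed sanity check: the slave count should be a positive integer.
    if int(num_slaves) < 1:
        raise ValueError("number-of-slaves must be at least 1")
    return [
        "python", "spark_gce.py",
        project,
        str(num_slaves),
        slave_type,
        master_type,
        os.path.expanduser(identity_file),  # resolve a leading ~
        zone,
        cluster_name,
    ]


# Example values only; substitute your own project, machine types, and zone.
cmd = build_launch_command(
    "my-project", 2, "n1-standard-1", "n1-standard-1",
    "~/.ssh/google_compute_engine", "us-central1-a", "demo-cluster",
)
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` from inside the cloned spark_gce directory would then start the launch; building the list first keeps the argument order easy to inspect before anything is created in the cloud.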

spark_gce.py project-name cluster-name destroy

  • project-name: ID of the project that contains the Spark cluster.
  • cluster-name: Name of the cluster to destroy.
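
The launch and destroy forms differ in argument count, so a wrapper could tell them apart roughly as sketched here. This is a hypothetical illustration of the documented usage, not the script's actual parsing code; `parse_args` and the dictionary keys it returns are invented names.

```python
def parse_args(argv):
    """Classify spark_gce.py-style positional arguments (hypothetical sketch).

    argv excludes the script name itself.
    """
    # Destroy form: project-name cluster-name destroy
    if len(argv) == 3 and argv[2] == "destroy":
        project, cluster_name, _ = argv
        return {"action": "destroy", "project": project,
                "cluster_name": cluster_name}
    # Launch form: seven positional arguments in the documented order
    if len(argv) == 7:
        keys = ["project", "num_slaves", "slave_type", "master_type",
                "identity_file", "zone", "cluster_name"]
        parsed = dict(zip(keys, argv))
        parsed["num_slaves"] = int(parsed["num_slaves"])
        parsed["action"] = "launch"
        return parsed
    raise SystemExit("usage: see the launch and destroy forms above")


print(parse_args(["my-project", "demo-cluster", "destroy"])["action"])  # prints "destroy"
```

Checking for the literal word `destroy` in the third position before counting to seven keeps the two forms unambiguous, since a launch always carries seven arguments.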

Installation

git clone https://github.com/sigmoidanalytics/spark_gce.git
cd spark_gce
python spark_gce.py

Need Help?

spark_gce's People

Contributors

joemathai, sigmoidanalytics

spark_gce's Issues

Hadoop Cluster Not Working

We are able to use the Spark cluster and run our jobs, but the Hadoop cluster is not working: we are unable to access HDFS from the master node.
