
mlads2017s-mrsdeploy

Operationalization using Microsoft R Server on single node machines and Spark clusters

Prerequisites

  • Please bring a wireless-enabled laptop.
  • Download the free Postman app for API development.
  • Make sure your machine has an ssh client with port-forwarding capability. On Mac or Linux, simply run the ssh command in a terminal window. On Windows, download plink.exe (part of the PuTTY tool suite) from here. Alternatively, see this page for details on the Windows shell options.
  • Provision a Linux CentOS Data Science VM (DSVM) in the Azure portal following these instructions.
    • Choose the Standard DS12_V2 VM size.
    • IMPORTANT: Use remoteuser as the VM user name. The connection commands in this guide log in as that user.

Connecting to the Data Science Virtual Machine on Microsoft Azure

We will provide Azure Data Science Virtual Machines (running Spark 2.0.2) for attendees to use during the tutorial. You will use your laptop to connect to your allocated virtual machine.

  1. Connect to your DSVM

    • Linux, Mac, or Windows Linux shell: connect using ssh. Replace XXX with the public IP address of your Data Science Virtual Machine.
    ssh -L localhost:8787:localhost:8787 remoteuser@XXX
    • Windows: connect using plink.exe. Run the following commands in a Windows command prompt window, replacing XXX with the public IP address of your Data Science Virtual Machine.
    cd directory-containing-plink.exe
    .\plink.exe -L localhost:8787:localhost:8787 remoteuser@XXX

    See this page for details on the Windows shell options. The -L option creates an SSH tunnel that forwards port 8787 on your laptop to localhost:8787 on the VM, which is the port RStudio Server listens on.

  2. Once you are connected, become the root user on the VM. In the SSH session, run the following command.

    sudo su -
  3. Download the course material from the git repository using the following command.

    git clone https://github.com/vapaunic/mlads2017s-mrsdeploy.git
  4. Make the customization script executable, convert its line endings with dos2unix, and run it. Use the following commands.

    cd mlads2017s-mrsdeploy
    chmod +x DSVM_Customization_Script.sh
    dos2unix ./DSVM_Customization_Script.sh
    
    ./DSVM_Customization_Script.sh
  5. With the SSH tunnel from step 1 still open, access RStudio Server by opening the following URL in a web browser on your laptop. You will be prompted to sign in with your credentials.

    http://localhost:8787/ 

    RStudio Server
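The connection steps above can be wrapped in a small helper script. The sketch below is not part of the course material. The function names tunnel_command and rstudio_reachable are hypothetical, 203.0.113.10 is a placeholder documentation address standing in for your VM's public IP, and the reachability check assumes curl is installed on your laptop. It only prints the ssh command from step 1 and tests whether anything answers HTTP on the forwarded port.

```shell
# Values taken from the tutorial: RStudio Server port and required VM user name.
RSTUDIO_PORT=8787
VM_USER=remoteuser

# Print the ssh command that forwards local port 8787 to port 8787 on the VM.
# Hypothetical helper; $1 is the VM's public IP address.
tunnel_command() {
  if [ -z "$1" ]; then
    echo "usage: tunnel_command <vm-public-ip>" >&2
    return 2
  fi
  printf 'ssh -L localhost:%s:localhost:%s %s@%s\n' \
    "$RSTUDIO_PORT" "$RSTUDIO_PORT" "$VM_USER" "$1"
}

# Return 0 if an HTTP server responds on the forwarded port, non-zero otherwise.
# Hypothetical helper; assumes curl is available locally.
rstudio_reachable() {
  curl --silent --output /dev/null --max-time 5 "http://localhost:${RSTUDIO_PORT}/"
}

# Placeholder address; substitute your VM's public IP.
tunnel_command 203.0.113.10
# prints: ssh -L localhost:8787:localhost:8787 remoteuser@203.0.113.10

if rstudio_reachable; then
  echo "RStudio Server answered on http://localhost:${RSTUDIO_PORT}/"
else
  echo "Nothing answered on port ${RSTUDIO_PORT}; check that the SSH tunnel is still open."
fi
```

If the check fails while the ssh session is still running, the most likely cause under these assumptions is that the session was started without the -L option.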

Suggested Reading prior to tutorial date

Microsoft R Server:

Microsoft R Server general information: https://msdn.microsoft.com/en-us/microsoft-r/rserver. Microsoft R Server is installed on both Azure Linux DSVMs and HDInsight clusters (see below), and will be used to run R code in the tutorial.

R-Server Operationalization service:

Microsoft R Server operationalization service general information: https://msdn.microsoft.com/en-us/microsoft-r/operationalize/about

Configuring operationalization: https://msdn.microsoft.com/en-us/microsoft-r/operationalize/configuration-initial

SparkR (Spark 2.0.2):

SparkR general information: http://spark.apache.org/docs/latest/sparkr.html

SparkR 2.0.2 functions: https://spark.apache.org/docs/2.0.2/api/R/index.html

RevoScaleR:

RevoScaleR functions: https://msdn.microsoft.com/en-us/microsoft-r/scaler/scaler

Platforms & services for hands-on exercises or demos

Azure Linux DSVM (Data Science Virtual Machine)

Information on Linux DSVM: https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-ads.linux-data-science-vm

The Linux DSVM has Spark (2.0.2) installed, along with YARN for job management and HDFS. You can therefore use the DSVM to run regular R code as well as code that runs on Spark (e.g. using the SparkR package). You will use the DSVM as a single-node Spark machine for the hands-on exercises. We will provision these machines and assign them to you at the beginning of the tutorial.
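As a quick sanity check after logging in, you may want to confirm that the tools mentioned above can be found. The following is a minimal sketch, not part of the course material: check_tools is a hypothetical helper, and the command names Rscript, spark-submit, yarn, and hdfs are assumptions about what is on the PATH of the DSVM. A tool reported as missing may simply be installed in a directory that is not on the PATH.

```shell
# Report which of the given commands are on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed command names for R, Spark, YARN, and HDFS; adjust to the VM's actual layout.
check_tools Rscript spark-submit yarn hdfs ||
  echo "Some tools are not on PATH; they may be installed elsewhere on the VM."
```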
