Code Monkey home page Code Monkey logo

emr-oozie-sample's Introduction

EMR-OOZIE-SAMPLE
----------------

simple example of Elastic Map Reduce bootstrap actions for configuring apache oozie.

To use
======

see the makefile.  pretty straightforward, there is a boostrap action bash
script that will install and configure the oozie software on the emr cluster.  
the script is in the config subdirectory, and the makefile shows how to use it
when launching a emr cluster.

Prerequisites
=============

you need the following:
  - make
  - emr commandline tools (seem below for html link)
  - s3cmd (see below)


to run
======

% make create

will create a new emr cluster the oozie installed using the bootstrap action
in config/config-oozie.sh

% make destroy

will terminate the emr cluster 

% make ssh

this will ssh into the head node of the cluster.

to use oozie
============

the oozie software is installed in /opt/oozie...

the oozie-server, a web-based console, is running on port 11000.  once you create the cluster, you can set up a
ssh tunnel (using % make sshproxy) then open a web browser to http://localhost:11000.  

% make sshproxy

(runs the ssh command that sets a local tunnel)

% make socksproxy

(runns an ssh command with a socks proxy.  then configure your local browser to use a v4 socks proxy as localhost port 8888)


To submit a job, you can use the commandline oozie tool in /opt/oozie.../bin/ozzie

for example (once logged into the head node)::

# change user to user oozie
% sudo bash
% su - oozie

# untar the examples and copy them to hdfs
% cd /opt/oozie-3.1.3-incubating/
% tar -zxvf oozie-examples.tar.gz
% . /home/hadoop/.bashrc
% hadoop fs -mkdir /user/oozie/
% hadoop fs -put examples /user/oozie/

# edit the job.properties file for sample application
% vi examples/apps/map-reduce/job.properties

Here is what I used, for name node, look in /home/hadoop/conf/core-site.xml, 
and for the jobTracker look in /home/hadoop/conf/mapred-site.xml

nameNode=hdfs://10.245.53.236:9000
jobTracker=10.245.53.236:9001
queueName=default
examplesRoot=examples


# now run the job
% ./bin/oozie job  -oozie http://localhost:11000/oozie -config
./examples/apps/map-reduce/job.properties -run
job: 0000001-120927170547709-oozie-oozi-W

Now you can use the oozie web console, or the hadoop job tracker console to track the job.

from the commandline:
oozie@ip-10-245-53-236:/opt/oozie-3.1.3-incubating$ hadoop fs -get /user/oozie/examples/output-data/map-reduce/part-00000 .
oozie@ip-10-245-53-236:/opt/oozie-3.1.3-incubating$ more part-00000 
0	To be or not to be, that is the question;
42	Whether 'tis nobler in the mind to suffer
84	The slings and arrows of outrageous fortune,
129	Or to take arms against a sea of troubles,
172	And by opposing, end them. To die, to sleep;
217	No more; and by a sleep to say we end
...





External Links that are useful:
===============================

http://jayatiatblogs.blogspot.com/2011/05/oozie-installation.html - troubleshooting/tips/tricks with oozie installation

http://incubator.apache.org/oozie/docs/3.1.3/docs/DG_Examples.html - example oozie runs

http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/Bootstrap.html#PredefinedBootstrapActions_RunIf 
docs for bootstrap actions

http://incubator.apache.org/oozie/docs/3.1.3/docs/DG_QuickStart.html  - oozie documentation for install

http://aws.amazon.com/developertools/2264  - emr commandline tools


http://s3tools.org/download - s3cmd commandline tools 


emr-oozie-sample's People

Contributors

davideanastasia avatar lila avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.