
Ex-05-Pseudo-Node-Configuration-for-Hadoop-on-Ubuntu

AIM

To implement a pseudo-node (single-node) configuration for Hadoop on Ubuntu

Pre-requisites

a) JDK 1.8 (Java Development Kit)

Single-Node Configuration

  1. Create a dedicated user account for Hadoop:

    $ sudo addgroup hadoop
    $ sudo adduser --ingroup hadoop hduser
    $ sudo usermod -a -G sudo hduser
    $ su - hduser
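
As a quick sanity check (not part of the original hand-out), the id command confirms the new account and its group memberships; the numeric uid/gid values will vary, but the hadoop and sudo groups should both be listed:

    $ id hduser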
  2. Install Java 1.8 in the folder /usr/local:

    $ sudo chmod 777 /usr/local
    $ cd /usr/local
    $ sudo tar xvzf $HOME/Downloads/jdk1.8.tar.gz
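
Optionally, the new JDK can be registered with update-alternatives so that java resolves even before the PATH edits below; this assumes the archive unpacks to jdk1.8.0_31, the directory name used in the later steps:

    $ sudo update-alternatives --install /usr/bin/java java /usr/local/jdk1.8.0_31/bin/java 1
    $ sudo update-alternatives --install /usr/bin/javac javac /usr/local/jdk1.8.0_31/bin/javac 1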
  3. Install Hadoop:

    $ cd /usr/local
    $ tar xvzf $HOME/Downloads/hadoop-2.5.1.tar.gz
    $ sudo chmod 777 hadoop-2.5.1
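
Where the lab setup permits, giving hduser ownership of the tree is a tighter alternative to the blanket chmod 777 (an optional substitution, not from the original steps):

    $ sudo chown -R hduser:hadoop /usr/local/hadoop-2.5.1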
  4. Set the Hadoop environment variables. Include the following lines in the $HOME/.bashrc file:

    # Set Hadoop-related environment variables
    export HADOOP_HOME=/usr/local/hadoop-2.5.1

    # Set JAVA home directory
    export JAVA_HOME=/usr/local/jdk1.8.0_31

    # Add Hadoop bin/ directory to PATH
    export PATH=$PATH:$HADOOP_HOME/bin
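
One way to append these lines without opening an editor is a heredoc (a minimal sketch; editing .bashrc by hand works just as well):

    $ cat >> $HOME/.bashrc <<'EOF'
    export HADOOP_HOME=/usr/local/hadoop-2.5.1
    export JAVA_HOME=/usr/local/jdk1.8.0_31
    export PATH=$PATH:$HADOOP_HOME/bin
    EOF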
  5. Set the Hadoop environment variables system-wide. Include the following lines in the /etc/profile file:

    # --insert JAVA_HOME
    JAVA_HOME=/usr/local/jdk1.8.0_31

    # --insert HADOOP_PREFIX
    HADOOP_PREFIX=/usr/local/hadoop-2.5.1

    # --in the PATH variable, just append at the end of the line
    PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX/bin

    # --append JAVA_HOME and HADOOP_PREFIX at the end of the export statement
    export PATH JAVA_HOME HADOOP_PREFIX
  6. Source the .bashrc and /etc/profile files from the $ prompt to apply the changes:

    $ source $HOME/.bashrc
    $ source /etc/profile

Verify the Java and Hadoop installations using:

    $ java -version
    $ echo $HADOOP_PREFIX
    $ cd $HADOOP_PREFIX
    $ bin/hadoop version
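
If either verification fails, a quick test that both variables point at real directories shows which file still needs fixing (an illustrative check, not part of the original exercise):

    $ [ -d "$JAVA_HOME" ] && [ -d "$HADOOP_PREFIX" ] && echo OK || echo "re-check .bashrc and /etc/profile"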
  7. Configure the Hadoop files hadoop-env.sh, core-site.xml, mapred-site.xml, hdfs-site.xml and yarn-site.xml, located at:

    /usr/local/hadoop-2.5.1/etc/hadoop

a) hadoop-env.sh: Include the following lines in the hadoop-env.sh file:

    export JAVA_HOME=/usr/local/jdk1.8.0_31
    export HADOOP_PREFIX=/usr/local/hadoop-2.5.1

b) core-site.xml: Configure the directory where Hadoop stores its data files, the network port it listens on, etc. This setup uses the Hadoop Distributed File System (HDFS) on a single local machine. First create the temporary directory:

    $ sudo mkdir -p /app/hadoop/tmp
    $ sudo chown hduser:hadoop /app/hadoop/tmp

Include the following lines in the core-site.xml file, between the `<configuration>` and `</configuration>` tags:

    <property>
            <name>hadoop.tmp.dir</name>
            <value>/app/hadoop/tmp</value>
    </property>
    <property>
            <name>fs.default.name</name>
            <value>hdfs://localhost:9000</value>
    </property>
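
For reference, the complete core-site.xml then looks roughly as follows (a sketch; the XML declaration lines come from the file shipped with the distribution and may differ):

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/app/hadoop/tmp</value>
        </property>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://localhost:9000</value>
        </property>
    </configuration>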

c) mapred-site.xml: Create it from the template shipped with the distribution:

     $ sudo cp mapred-site.xml.template mapred-site.xml

Include the following lines in the mapred-site.xml file:

    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>

d) hdfs-site.xml: Include the following lines in the hdfs-site.xml file:

    <property>
            <name>dfs.replication</name>
            <value>1</value>
    </property>

e) yarn-site.xml: Include the following lines in the yarn-site.xml file:

    <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>
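
A quick way to confirm that Hadoop is reading these files is hdfs getconf, which prints a key from the effective configuration; the command below should echo the URI set in core-site.xml:

    $ cd $HADOOP_PREFIX
    $ bin/hdfs getconf -confKey fs.default.name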
  8. Format the Hadoop file system (implemented on top of the local file system) using:

    $ /usr/local/hadoop-2.5.1/bin/hadoop namenode -format
  9. Start Hadoop (from the $HADOOP_PREFIX directory) using:

    $ sbin/start-all.sh

Explore Hadoop using http://localhost:50070/ from the browser.
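
The jps tool shipped with the JDK lists the running Java processes; after a successful start the five daemons NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager should all appear (process IDs will differ):

    $ jps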

  10. The commonly used HDFS commands are as follows:

### mkdir
##### Creates a directory at the given path.
    syntax: bin/hdfs dfs -mkdir <paths>

### ls
##### Lists the files in a given path.
    syntax: bin/hdfs dfs -ls <args>

### cp
##### Copies files from source to destination. Multiple sources are also allowed, in which case the destination must be a directory.
    syntax: bin/hdfs dfs -cp <source> <dest>
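
A short session tying the three commands together (illustrative only; run from $HADOOP_PREFIX, and the path names are placeholders that assume a file sample.txt has already been placed in /demo):

    $ bin/hdfs dfs -mkdir /demo /backup
    $ bin/hdfs dfs -ls /
    $ bin/hdfs dfs -cp /demo/sample.txt /backup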
  11. Create a directory '/input' in HDFS:

    $ bin/hdfs dfs -mkdir /input
  12. Copy the input files into the distributed file system:

    $ cd $HOME/Downloads/
    $ tar xvzf $HOME/Downloads/mrsampledata.tar.gz
    $ cd $HADOOP_PREFIX
    $ bin/hdfs dfs -put $HOME/Downloads/mrsampledata/* /input
  13. Run one of the examples provided:

    $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep /input /output '(CSE)'
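
The grep example scans every file under /input for matches of the regular expression '(CSE)' and writes the match counts to /output. The same jar contains other examples useful for exercising the cluster, e.g. wordcount (the output directory name below is arbitrary, but it must not exist yet):

    $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /input /output2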
  14. Examine the output files.

Copy the output files from the distributed file system to the local file system and examine them:

    $ bin/hdfs dfs -get /output output
    
    $ cat output/*

Or view the output files directly on the distributed file system:

    $ bin/hdfs dfs -cat /output/*

Result:

Thus, the pseudo-node configuration for Hadoop on Ubuntu was implemented and executed successfully.
