
Hadoop MapReduce Demo

Versions:

  • Hadoop 3.1.1
  • Java 10

Set the following environment variables:

  • JAVA_HOME
  • HADOOP_HOME
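
On Linux/macOS, these can be set in your shell profile. The install paths below are placeholders, not part of this project — point them at wherever you actually extracted Java and Hadoop:

```shell
# Example locations only -- adjust to your actual install paths
export JAVA_HOME=/usr/lib/jvm/java-10-openjdk
export HADOOP_HOME=/opt/hadoop-3.1.1

# Put the hadoop/hdfs commands and the start/stop scripts on the PATH
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```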

For Windows:

Download the Hadoop 3.1.1 binaries for Windows at https://github.com/s911415/apache-Hadoop-3.1.0-winutils. Extract them into HADOOP_HOME\bin and make sure to overwrite the existing files.

For Ubuntu:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

1.) Create the following folders:

  • HADOOP_HOME/tmp
  • HADOOP_HOME/tmp/dfs/data
  • HADOOP_HOME/tmp/dfs/name
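
On Linux, the three folders can be created in one go (assuming HADOOP_HOME is already set from the step above):

```shell
# -p creates HADOOP_HOME/tmp and HADOOP_HOME/tmp/dfs along the way
mkdir -p "$HADOOP_HOME/tmp/dfs/name"
mkdir -p "$HADOOP_HOME/tmp/dfs/data"
```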

2.) Set the following properties in core-site.xml and hdfs-site.xml.

core-site.xml

<property>
	<name>fs.defaultFS</name>
	<value>hdfs://localhost:9001</value>
</property>
<property>
	<name>hadoop.tmp.dir</name>
	<value>HADOOP_HOME/tmp</value>
</property>

hdfs-site.xml

<property>
	<name>dfs.namenode.name.dir</name>
	<value>file:///HADOOP_HOME/tmp/dfs/name</value>
</property>
<property>
	<name>dfs.datanode.data.dir</name>
	<value>file:///HADOOP_HOME/tmp/dfs/data</value>
</property>
<!-- Allows us to upload or create files later in HDFS -->
<property>
	<name>dfs.permissions</name>
	<value>false</value>
</property>

3.) Run hadoop namenode -format. Don't forget the file:/// prefix in hdfs-site.xml on Windows; otherwise, the format will fail.

4.) Run HADOOP_HOME/sbin/start-dfs.sh (start-dfs.cmd on Windows).

5.) If all goes well, you can find the web UI port in the console log. In my case it's http://localhost:9870.

6.) You can now upload any file through the URL from step #5.

Now let's create a project to test our Hadoop setup, or download an existing one. For example, this project: https://www.guru99.com/create-your-first-Hadoop-program.html. It comes with a nice explanation, so let's try it. I've repackaged it into a pom project and uploaded it to GitHub at https://github.com/czetsuya/Hadoop-MapReduce.

1.) Clone the repository.

2.) Open the HDFS URL from step #5 above and create an input and an output folder.
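
If you prefer the command line to the web UI, the same folders can be created with the HDFS shell client (this assumes the daemons from step #4 are running):

```shell
# Create the input and output folders at the HDFS root
hdfs dfs -mkdir /input
hdfs dfs -mkdir /output
```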

3.) In the input folder, upload the SalesJan2009 file from the project's root folder.
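
Again, this can also be done from the command line; the local path below assumes you are in the cloned project's root folder:

```shell
# Copy the sample data from the local filesystem into HDFS
hdfs dfs -put SalesJan2009 /input

# Verify the upload
hdfs dfs -ls /input
```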

4.) Run hadoop jar hadoop-mapreduce-0.0.1-SNAPSHOT.jar /input /output.

5.) Check the output folder from the web UI and download the resulting file.
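
The result can also be inspected without downloading it, assuming the job wrote its reducer output to /output:

```shell
# A successful job writes one part-* file per reducer, plus an empty _SUCCESS marker
hdfs dfs -ls /output
hdfs dfs -cat /output/part-*
```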

Here's a more complicated example: https://github.com/rathboma/Hadoop-framework-examples/tree/master/java-mapreduce.

Common causes of problems:

  • Improperly configured core-site.xml or hdfs-site.xml (data node and name node directories)
  • File/folder permissions
