MrJobDemoLab2

python MrJob_CommonFriends.py ../Friends.csv

This demo shows how to find common friends between two individuals using Map-Reduce and MRJob. Data is given in the Friends.csv file, and is in the format:

Individual   Friends
A            B D E
B            A D
D            A B
E            A
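
As a minimal sketch of reading one record, assume each line holds an individual followed by that individual's friends, separated by whitespace as in the table above. The actual delimiter in Friends.csv may differ (for example, commas), and a header row, if present, would need to be skipped. `parse_line` is a hypothetical helper and is not taken from the repository.

```python
def parse_line(line):
    """Split one record into (individual, list_of_friends).

    Assumption: fields are separated by whitespace, as in the table above.
    """
    fields = line.split()
    if not fields:
        return None  # blank line: nothing to parse
    return fields[0], fields[1:]


print(parse_line("A B D E"))  # ('A', ['B', 'D', 'E'])
```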

To find common friends, the mapper should emit keys that pair an individual with one of their friends; the value for each key is that individual's full friend list. Each key should be a tuple in sorted alphabetical order: (A,B) and (B,A) would otherwise be treated as different keys, and the reducer could not group the two friend lists that belong to the same pair. For the example above, the mapper generates the following key-value pairs:

Key      Value
(A,B)    [B D E]
(A,D)    [B D E]
(A,E)    [B D E]
-------  ---------
(A,B)    [A D]
(B,D)    [A D]
-------  ---------
(A,D)    [A B]
(B,D)    [A B]
-------  ---------
(A,E)    [A]
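
The map step described above could be sketched in plain Python as follows. The name `map_friends` is hypothetical and this is not the code from MrJob_CommonFriends.py. In MRJob the same logic would sit in the `mapper` method of an `MRJob` subclass, which also yields key-value pairs.

```python
def map_friends(individual, friends):
    """Yield one (pair, friends) record for each friend of `individual`.

    The pair is sorted so that (A, B) and (B, A) become the same key.
    """
    for friend in friends:
        key = tuple(sorted((individual, friend)))
        yield key, friends


for key, value in map_friends("A", ["B", "D", "E"]):
    print(key, value)
# ('A', 'B') ['B', 'D', 'E']
# ('A', 'D') ['B', 'D', 'E']
# ('A', 'E') ['B', 'D', 'E']
```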

Once all the keys have been emitted, the reducer receives every value associated with a key and takes the intersection of the two sets. For example, key (A,B) has the values [[B, D, E], [A, D]]. Intersecting these two sets gives the common friends of A and B. The following table shows the results of running the map-reduce job:

Resulting Key   Common Friends
(A,B)           [D]
(A,D)           [B]
(A,E)           []
(B,D)           [A]
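
The reduce step could be sketched like this. The name `reduce_common` is hypothetical, and the sketch assumes every friendship is listed in both directions, so that each key receives exactly two friend lists. If a pair appeared in only one direction, the single list would be returned unchanged.

```python
def reduce_common(pair, friend_lists):
    """Intersect all friend lists collected for `pair`."""
    sets = [set(friends) for friends in friend_lists]
    common = set.intersection(*sets) if sets else set()
    return pair, sorted(common)


print(reduce_common(("A", "B"), [["B", "D", "E"], ["A", "D"]]))
# (('A', 'B'), ['D'])
```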

Looking back at the original table, individuals A and B have only friend D in common, so this is correct. To run the job locally, go to the /src directory and run: python MrJob_CommonFriends.py ../Friends.csv
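
To check the result table without installing MRJob, the whole pipeline can be imitated in plain Python. This is only a sketch: `RECORDS` and `run_common_friends` are hypothetical names, and the dictionary stands in for the grouping by key that the framework performs between the map and reduce phases.

```python
from collections import defaultdict

# The example data from the input table above.
RECORDS = {
    "A": ["B", "D", "E"],
    "B": ["A", "D"],
    "D": ["A", "B"],
    "E": ["A"],
}


def run_common_friends(records):
    """Imitate map, group-by-key, and reduce in a single process."""
    grouped = defaultdict(list)

    # Map: emit (sorted pair, friend set); grouping by key happens here too.
    for individual, friends in records.items():
        for friend in friends:
            pair = tuple(sorted((individual, friend)))
            grouped[pair].append(set(friends))

    # Reduce: intersect the friend sets collected for each pair.
    return {
        pair: sorted(set.intersection(*grouped[pair]))
        for pair in sorted(grouped)
    }


for pair, common in run_common_friends(RECORDS).items():
    print(pair, common)
# ('A', 'B') ['D']
# ('A', 'D') ['B']
# ('A', 'E') []
# ('B', 'D') ['A']
```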

Contributors

abhon, nancywen25
