Coursework for MIDS Scaling Up! Really Big Data

This is an index of coursework for the MIDS class "Scaling Up! Really Big Data". Please submit corrections if you find problems in the assignments. Submissions should be well-formed git pull requests.

Week 1: Introduction

Labs

  1. Provisioning in SoftLayer

Week 2: Cloud Computing 101

Homework

  1. Working with Cloud Resources

Labs

  1. Salt States and Docker deployment of the ELK stack

Week 3: OpenStack Introduction

Labs

  1. Hadoop over OpenStack DevStack using Sahara

Week 4: Distributed Filesystems

Homework

This is a graded homework.

  1. Part 1: GPFS setup
  2. Part 2: The Mumbler

Labs

There is no in-class lab for this week.

Week 5: MapReduce and Hadoop

Homework

  1. Hadoop Distributed Sort with YARN and HDFS

Labs

(Complete the following in order)

  1. Load Google 2-gram dataset into HDFS
  2. Preprocess 2-gram data for Mumbler

Week 6: Apache Spark

Homework

  1. Apache Spark Introduction

Labs

  1. Machine Learning with Spark and MLlib

Week 7: Object Storage

Homework

  1. Object Storage

Labs

(Complete the following in order)

  1. Data Transfer Performance
  2. Rsync Investigation

Week 8: NoSQL

Homework

  1. NoSQL

Week 9: Spark Streaming

Homework

  1. Streaming Tweet Processing

Labs

  1. Spark Streaming and Cassandra

Week 10: Scaling Up

Homework

  1. Orchestrate with Brooklyn

Labs

  1. Brooklyn labs

Week 11: Spark ML Round 2

(Homework-free week!)

Labs

  1. Streaming Analytics with AlchemyAPI

Week 12: Search

Labs

  1. Crawling the Web with Nutch, Indexing with Solr

Homework

  1. Elasticsearch

Week 13: Genomics

Homework

  1. Genomics with ADAM

Contributors

anewm, bmwshop, dyejon, ericwhyne, jredmann, michaeldye, rboberg, rbraddes, rdejana, tkunicki
