ANALYSIS

This repo is the data science (DS) application demo of the "DaaS (Data as a Service)" repo. It mainly uses Jupyter notebooks to show step-by-step analysis and ML/DL approaches on various data science subjects. The idea is to demonstrate how a data scientist deals with a new dataset: pre-process the data, do exploratory data analysis (EDA), then run a suitable model and offer suggestions that are feasible for the business and carry acceptable statistical error (i.e. DS workflow: business understanding -> data preprocessing -> EDA -> data understanding -> analysis/modeling).

Main focus of this project: 1) Statistics/ML analysis 2) ML theory/algorithm explanation 3) Spark op/ML demo
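The workflow above can be sketched end to end in a few lines. This is a minimal illustration only, not code from the repo: the toy dataset, the column meanings (`ad_spend`, `sales`), and the helper names `preprocess`, `explore`, `fit_line`, and `rmse` are all hypothetical, and a plain least-squares line stands in for whatever model a real notebook would use.

```python
import math
import statistics

# Hypothetical toy dataset of (ad_spend, sales) pairs; None marks a missing value.
raw_rows = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2),
            (4.0, 7.8), (5.0, None), (6.0, 12.1)]


def preprocess(rows):
    """Data preprocessing: drop rows that contain a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def explore(rows):
    """EDA: basic summary statistics for both columns."""
    xs, ys = zip(*rows)
    return {
        "n": len(rows),
        "x_mean": statistics.mean(xs),
        "y_mean": statistics.mean(ys),
        "y_stdev": statistics.stdev(ys),
    }


def fit_line(rows):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    xs, ys = zip(*rows)
    x_mean, y_mean = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in rows)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def rmse(rows, slope, intercept):
    """Evaluation: root-mean-square error of the fitted line on the given rows."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in rows]
    return math.sqrt(statistics.mean(squared_errors))


clean_rows = preprocess(raw_rows)          # data preprocessing
summary = explore(clean_rows)              # EDA
slope, intercept = fit_line(clean_rows)    # analysis/modeling
error = rmse(clean_rows, slope, intercept) # statistical error to report

print(summary)
print(f"slope={slope:.3f} intercept={intercept:.3f} rmse={error:.3f}")
```

The fitted slope and the RMSE are the kind of numbers a suggestion to the business would rest on: the slope is the estimated effect, and the RMSE indicates whether the error is acceptable.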

Quick Start

Quick_start.md

File Structure

├── DE_course       : Code for the Udacity data engineer course
├── DL_             : Deep learning related projects
├── DS_algorithms   : Data science models built from scratch
├── GPU             : GPU related code
├── ML_             : Machine learning related projects
├── README.md
├── R_              : R programming language related projects
├── SPARK_          : PySpark basics/op/ML/ETL notebook demo projects
├── Statistics_     : Statistics related projects
├── archived        : Archived code/projects
├── doc             : Docs for quick start, theory papers, pictures, and so on
├── ml_demo.py
├── notebook        : Jupyter notebook related projects (nb server/magic..)
├── project         : Archived projects
├── pytorch_        : PyTorch related projects
├── tensorflow_     : TensorFlow related projects
└── utility         : Utility scripts for ML/DL model tuning, DS plots...

Main Projects

Machine Learning

Tensorflow Demo

Statistics

Spark

spark op intro

  • Pyspark Basic 1 - Basic Spark ops (transformations & actions): RDD, map, flatMap, reduce, filter, distinct, intersection
  • Pyspark Basic 2 - Basic Spark ops: load CSV, DataFrame, SparkSQL, transformations in [RDD, DataFrame, SparkSQL]
  • Pyspark Basic 3 - Basic Spark ops: Spark DataFrame ops

spark ML intro

spark APP

Development

  • dev

Contributors

yennanliu, yennanliuj
