machine-learning-tutorial-notebooks's Introduction

Truly understanding Machine Learning

During my time studying Mechanical Engineering, Computer Engineering, and Software Development, and finally breaking into the field of Data Science and Machine Learning, I have learned a lot about the way that I learn best. It is clear at this point that for a beginner, jumping right into a textbook is rarely the best route to follow. The concepts, and more importantly the language itself, will seem very difficult to comprehend, and will most likely leave you discouraged, feeling that the material is simply outside of your grasp.

This is especially true in the field of Data Science and Machine Learning, where several other technical disciplines intertwine:

  • Statistics
  • Probability
  • Computer Science
  • Calculus
  • Linear Algebra

If you jump into formula-heavy textbooks and skip the real-world applications of Machine Learning (which, let's be honest, are pretty awesome), there is little incentive to continue and push forward. This is where the top-down approach comes in. School systems generally teach via a bottom-up approach - giving students the small building blocks that can be combined in the end to create a grand system. Again, this leaves the learner wanting more, and often struggling to connect the dots: why is this small building block useful, and why should I care?

The top-down approach throws you right into the deep end, allowing you to work with premade algorithms and libraries without fully understanding the math and intuitions behind the overall system. This, in my opinion, is a much better approach, but it still leaves a bit to be desired. As with anything, I feel that balance is the key here. The goal should be to use real-world examples to teach the mechanics of what is going on under the hood. Too often, concepts are taught as black boxes, and rote memorization is used to get through. This worked for a time, but in the field of Data Science and Machine Learning it will not. There is no one-size-fits-all solution - the work is messy, chaotic, and unclear. And that is our job as Data Scientists - to bring clarity to a problem and help find a resolution.

With that said, when learning anything (especially since the advent of the internet), I find that a person ends up using many resources to reach a point of mastery. There are open courses, blogs, textbooks, academic papers, YouTube videos, tutorials, and so on. This is wonderful, but I find that they end up spread out all over the place and become very difficult to keep track of. For example, say that while learning about Linear Regression you realize you don't fully understand the linear algebra behind it. You then search around, do a bit of Googling, and find the following:

  • a YouTube video with a nice animation of what it actually represents
  • a blog post going through a relevant example
  • a set of images that walk through the mechanics of matrix multiplication

At that point, you have a solid intuitive grasp of what is going on. Two weeks later, however, it has slightly faded and you don't remember where you found those resources. Solving that problem is part of the goal of this repo. When learning anything, there are blocking points that we all hit - I wanted to detail mine at the exact moments they occurred, and then pull in the resources, links, images, summaries, etc., from all the different avenues that I found helpful.

The goal at the end of the day is to build intuitions. Anyone can follow a basic process of predetermined steps and arrive at a solution. But we want to create intuitions about what is actually going on, so that if a situation were broken from its cookie-cutter form, we would be able to take that in stride and still understand what is going on. So with that said...

These notebooks are designed with two main purposes in mind:

  1. Highlight my exact journey to teach myself machine learning and data science
  2. Develop key intuitions about what is really happening.

With that said, here is one final quote to always remember (and one that I remind myself of as I put together each one of these notebooks). It recounts Richard Feynman's attempt to explain Fermi-Dirac statistics.

Feynman was a truly great teacher. He prided himself on being able to devise ways to explain even the most profound ideas to beginning students. Once, I said to him, “Dick, explain to me, so that I can understand it, why spin one-half particles obey Fermi-Dirac statistics.” Sizing up his audience perfectly, Feynman said, “I’ll prepare a freshman lecture on it.” But he came back a few days later to say, “I couldn’t do it. I couldn’t reduce it to the freshman level. That means we don’t really understand it.”

How this repo is set up

This repo consists of two main directories:

  1. A machine learning perspective
  2. A statistical learning perspective

Machine Learning Perspective

This directory is mainly based on Andrew Ng's free Machine Learning course. Link: https://www.coursera.org/learn/machine-learning

Statistical Learning Perspective

This directory is mainly based on the textbook An Introduction to Statistical Learning with Applications in R. Link: http://www-bcf.usc.edu/~gareth/ISL/index.html

To gain maximal insight from the notebooks, I recommend following along with the resources associated with each one (I will detail my recommendations at the start of each notebook).

Setup Instructions

  1. Navigate to the directory where you would like to store this repo
  2. Run git clone https://github.com/NathanielDake/machine-learning-tutorial-notebooks.git
  3. Change directories into the cloned repo, i.e. cd machine-learning-tutorial-notebooks
  4. Run jupyter notebook, which will spin up the notebook frontend as well as the kernel
  5. Stay up to date with the most recent changes by running git pull
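
The steps above can also be collected into a small shell sketch. This is only one possible way to wrap them, assuming git and Jupyter are already installed and on your PATH; the helper names fetch_notebooks and start_notebooks are hypothetical and are not part of the repo.

```shell
#!/bin/sh
# Repository URL taken from the setup steps above.
REPO_URL="https://github.com/NathanielDake/machine-learning-tutorial-notebooks.git"
# Directory name git creates by default: the URL's last component without ".git".
REPO_DIR="$(basename "$REPO_URL" .git)"

# Hypothetical helper (not part of the repo): clone on first use, pull afterwards.
fetch_notebooks() {
    if [ -d "$REPO_DIR/.git" ]; then
        # The repo already exists here, so just fetch the most recent changes.
        git -C "$REPO_DIR" pull
    else
        git clone "$REPO_URL"
    fi
}

# Hypothetical helper: fetch the repo, then start Jupyter from inside it.
start_notebooks() {
    fetch_notebooks && cd "$REPO_DIR" && jupyter notebook
}
```

To use it, source the file from the directory where you would like to store the repo and run start_notebooks. Nothing is cloned or launched until one of the functions is called.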

Good Luck!

I wish anyone following along with these tutorials the best of luck on the journey to understanding Machine Learning.
