aws-machine-learning-university-accelerated-tab's Introduction

Machine Learning University: Accelerated Tabular Data Class

This repository contains slides, notebooks, and datasets for the Machine Learning University (MLU) Accelerated Tabular Data class. Our mission is to make Machine Learning accessible to everyone. We have courses available across many topics of machine learning and believe knowledge of ML can be a key enabler for success. This class is designed to help you get started with tabular data (spreadsheet-like tables), learn about widely used Machine Learning techniques for tabular data, and apply them to real-world problems.

YouTube

Watch all Tabular Data class video recordings in this YouTube playlist from our YouTube channel.

Course Overview

There are three lectures and one final project for this class.

Lecture 1

Title                        Studio Lab
Introduction to ML           -
Sample ML Model              -
Model Evaluation             Open In Studio Lab
Exploratory Data Analysis    Open In Studio Lab
K Nearest Neighbors (KNN)    Open In Studio Lab
Final Project                Open In Studio Lab

Lecture 2

Title                        Studio Lab
Feature Engineering          Open In Studio Lab
Tree-based Models            Open In Studio Lab
Bagging                      -
Hyperparameter Tuning        -
AWS AI/ML Services           Open In Studio Lab

Lecture 3

Title                        Studio Lab
Optimization                 -
Regression Models            -
Boosting                     -
Neural Networks NN           Open In Studio Lab
MXNet                        Open In Studio Lab
AutoML                       Open In Studio Lab

Final Project: Practice working with a "real-world" tabular dataset. The final project dataset is in the data/final_project folder. For more details, check out this notebook.

Interactives/Visuals

Interested in visual, interactive explanations of core machine learning concepts? Check out our MLU-Explain articles to learn at your own pace!

Contribute

If you would like to contribute to the project, see CONTRIBUTING for more information.

License

The license for this repository depends on the section. The data set for the course is provided to you by permission of Amazon and is subject to the terms of the Amazon License and Access. You are expressly prohibited from copying, modifying, selling, exporting, or using this data set in any way other than for the purpose of completing this course. The lecture slides are released under the CC-BY-SA-4.0 License. The code examples are released under the MIT-0 License. See each section's LICENSE file for details.

aws-machine-learning-university-accelerated-tab's Issues

Testing the model on data used to train it?

"One thing we can do is to test the model with the data we used to train it, and use sklearn's metrics functions to examine the performance of the classifier."

Isn't this wrong? Shouldn't the data set be divided into training and validation sets?
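To illustrate the concern raised here, the following is a minimal sketch of a hold-out split. It assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in (it is not the course dataset). The 1-nearest-neighbor classifier is chosen only because it memorizes its training set, which makes the gap between training and validation scores easy to see.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption for illustration, not the course data).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% of the rows; the model never sees them during fitting.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# With k=1, each training point is its own nearest neighbor.
model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
val_acc = accuracy_score(y_val, model.predict(X_val))

print(f"Training accuracy:   {train_acc:.3f}")
print(f"Validation accuracy: {val_acc:.3f}")
```

The training accuracy is perfect by construction, so it says little about how the model generalizes. The validation accuracy, computed on rows the model has not seen, is the more informative number.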

Problem with final task dataset

After loading the final project data:

import pandas as pd
import numpy as np

import warnings
warnings.filterwarnings("ignore")
  
training_data = pd.read_csv('../data/final_project/training.csv')
test_data = pd.read_csv('../data/final_project/test_features.csv')
y_test = pd.read_csv('../data/final_project/y_test.csv')

print('The shape of the training dataset is:', training_data.shape)
print('The shape of the test dataset is:', test_data.shape)
print('The shape of the y_test is:', y_test.shape)

The shape of the training dataset is: (71538, 13)
The shape of the test dataset is: (23846, 12)
The shape of the y_test is: (23845, 1)

The number of samples in the test features (23846) differs from the number in y_test (23845).
Is this correct?
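A difference of exactly one row is sometimes a header-handling artifact, although nothing in this report confirms that this is the cause here. The sketch below uses small made-up in-memory CSVs (hypothetical stand-ins, not the real files) to show how `pd.read_csv` consumes the first line of a headerless file as column names, and how `header=None` keeps that line as data.

```python
import io

import pandas as pd

# Hypothetical feature file WITH a header row and four data rows.
features_csv = "f1,f2\n0.1,a\n0.2,b\n0.3,c\n0.4,d\n"
# Hypothetical label file WITHOUT a header row: four labels.
labels_csv = "1\n0\n1\n0\n"

features = pd.read_csv(io.StringIO(features_csv))

# Default behavior: the first line ("1") is treated as the column name.
labels_default = pd.read_csv(io.StringIO(labels_csv))

# header=None keeps the first line as a data row.
labels_no_header = pd.read_csv(io.StringIO(labels_csv), header=None)

print("features:        ", features.shape)
print("labels (default):", labels_default.shape)
print("labels (no hdr): ", labels_no_header.shape)
```

If the real label file turns out to have a proper header, the missing row would have to be explained some other way, for example by a row dropped when the file was produced.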

YouTube closed captions in Dutch

Hi,
The YouTube videos for this course have the automatic closed-caption language set to Dutch instead of English. As someone learning English as a second language, I find closed captions valuable; this misconfiguration also impairs Google's automatic translation into other languages.
Congratulations on the course content, it's great!

NB: I apologize for posting this issue here, but I couldn't find the correct channel to report it on YouTube.
