Unsupervised-Learning

Part 1

DOMAIN:

Automobile

CONTEXT:

The data concerns city-cycle fuel consumption in miles per gallon, to be predicted in terms of 3 multi-valued discrete and 5 continuous attributes.

DATA DESCRIPTION:

The data concerns city-cycle fuel consumption in miles per gallon.

Attribute Information:

  1. mpg: continuous
  2. cylinders(cyl): multi-valued discrete
  3. displacement(disp): continuous
  4. horsepower(hp): continuous
  5. weight(wt): continuous
  6. acceleration(acc): continuous
  7. model year(yr): multi-valued discrete
  8. origin: multi-valued discrete
  9. car name: string (unique for each instance)

PROJECT OBJECTIVE:

The goal is to cluster the data, treat each cluster as an individual dataset, and train a regression model on each one to predict ‘mpg’.
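The cluster-then-regress idea can be sketched as follows. This is a minimal illustration, not the code used in this repository: it assumes k-means for the clustering step and ordinary least squares for the per-cluster models, uses only NumPy, and runs on made-up `wt`, `hp` and `mpg` values. The helper names `nearest`, `kmeans` and `fit_per_cluster` are hypothetical.

```python
import numpy as np

def nearest(X, centroids):
    """Index of the closest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(X, centroids)
        # Move each centroid to the mean of its rows (keep it if the cluster is empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return nearest(X, centroids), centroids

def fit_per_cluster(X, y, labels):
    """Fit least squares with an intercept separately inside each cluster."""
    models = {}
    for j in np.unique(labels):
        mask = labels == j
        A = np.column_stack([np.ones(mask.sum()), X[mask]])
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        models[int(j)] = coef  # [intercept, slope per feature]
    return models

# Made-up data standing in for the real attributes (assumption, not the dataset).
rng = np.random.default_rng(42)
wt = np.concatenate([rng.normal(2000, 100, 50), rng.normal(4000, 100, 50)])
hp = np.concatenate([rng.normal(70, 5, 50), rng.normal(150, 5, 50)])
mpg = np.where(wt < 3000, 50 - 0.01 * wt, 30 - 0.004 * wt) + rng.normal(0, 0.5, 100)

X = np.column_stack([wt, hp])
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)  # scale before distance-based clustering
labels, _ = kmeans(X_scaled, k=2)
models = fit_per_cluster(X_scaled, mpg, labels)
for cluster, coef in models.items():
    print(cluster, np.round(coef, 3))
```

On the real data, the number of clusters would have to be chosen (for example with an elbow plot), and the discrete attributes would need encoding before scaling.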

Part 2

DOMAIN:

Manufacturing

CONTEXT:

Company X curates and packages wine across various vineyards spread throughout the country.

DATA DESCRIPTION:

The data concerns the chemical composition of the wine and its respective quality.

Attribute Information:

  1. A, B, C, D: specific chemical composition measure of the wine
  2. Quality: quality of the wine [Low or High]

PROJECT OBJECTIVE:

The goal is to build a synthetic data generation model using the existing data provided by the company.
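One simple way to generate synthetic rows is sketched below. It is an assumption for illustration only, since the README does not say which generator the project uses: a multivariate Gaussian is fitted to the A, B, C, D measures of each quality label, and new rows are sampled from it. The function names `fit_gaussians` and `sample`, and all data values, are hypothetical.

```python
import numpy as np

def fit_gaussians(X, quality):
    """Estimate a mean vector and covariance matrix for each quality label."""
    params = {}
    for label in np.unique(quality):
        rows = X[quality == label]
        params[str(label)] = (rows.mean(axis=0), np.cov(rows, rowvar=False))
    return params

def sample(params, label, n, seed=0):
    """Draw n synthetic rows for one quality label from its fitted Gaussian."""
    mean, cov = params[label]
    return np.random.default_rng(seed).multivariate_normal(mean, cov, size=n)

# Made-up stand-in for the A, B, C, D columns (assumption, not the company data).
rng = np.random.default_rng(0)
X = rng.normal(loc=[5.0, 3.0, 1.0, 10.0], scale=1.0, size=(40, 4))
quality = np.array(["Low"] * 20 + ["High"] * 20)
X[quality == "High"] += 2.0  # shift the "High" rows so the two groups differ

params = fit_gaussians(X, quality)
synthetic_high = sample(params, "High", n=5)
print(synthetic_high.round(2))
```

A Gaussian only preserves means and pairwise covariances; if the real measures are skewed or multi-modal, a mixture model or another generator would likely fit better.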

Part 3

DOMAIN:

Automobile

CONTEXT:

The purpose is to classify a given silhouette as one of three types of vehicle, using a set of features extracted from the silhouette. The vehicle may be viewed from one of many different angles.

DATA DESCRIPTION:

The data contains features extracted from the silhouettes of vehicles viewed from different angles. Four "Corgi" model vehicles were used for the experiment: a double-decker bus, a Chevrolet van, a Saab 9000 and an Opel Manta 400. This particular combination of vehicles was chosen with the expectation that the bus, the van and either one of the cars would be readily distinguishable, but that it would be more difficult to distinguish between the two cars.
All the features are numeric, i.e. geometric features extracted from the silhouette.

PROJECT OBJECTIVE:

Apply a dimensionality reduction technique (PCA) and train a model on the principal components instead of on the raw data alone.
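The PCA-then-classify workflow can be sketched as below. This is a hedged illustration on made-up data, not the repository's code: PCA is computed with NumPy's SVD, enough components are kept to explain 95% of the variance (an assumed threshold), and a nearest-centroid classifier stands in for whatever model the project actually trains. The names `pca_fit` and `pca_transform` are hypothetical.

```python
import numpy as np

def pca_fit(X, var_threshold=0.95):
    """Standardize X and return (mean, std, components) covering var_threshold."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    Z = (X - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches the threshold.
    k = int(np.searchsorted(np.cumsum(explained), var_threshold) + 1)
    return mean, std, Vt[:k]

def pca_transform(X, mean, std, components):
    """Project rows of X onto the principal components."""
    return ((X - mean) / std) @ components.T

# Made-up data: 3 classes, 6 correlated numeric features driven by 2 latent factors.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
y = np.repeat(np.arange(3), 60)
latent = centers[y] + rng.normal(size=(180, 2))
W = np.array([[1.0, 0.5, 0.0, 1.0, -1.0, 0.3],
              [0.0, 0.5, 1.0, -1.0, 0.2, 1.0]])
X = latent @ W + rng.normal(scale=0.1, size=(180, 6))

# Hold out a third of the rows; fit PCA on the training rows only.
idx = rng.permutation(180)
train, test = idx[:120], idx[120:]
mean, std, components = pca_fit(X[train])
P_train = pca_transform(X[train], mean, std, components)
P_test = pca_transform(X[test], mean, std, components)

# Nearest-centroid classifier in the reduced space.
centroids = np.array([P_train[y[train] == c].mean(axis=0) for c in range(3)])
pred = np.linalg.norm(P_test[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
accuracy = float((pred == y[test]).mean())
n_components = components.shape[0]
print(n_components, "components, test accuracy", round(accuracy, 3))
```

Fitting the scaler and the components on the training rows only keeps information from the held-out rows out of the model.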

Part 4

DOMAIN:

Sports management

CONTEXT:

Company X is a sports management company for international cricket.

DATA DESCRIPTION:

The data was collected from batsmen who played in the IPL series conducted so far.

Attribute Information:

  1. Runs: runs scored by the batsman
  2. Ave: average runs scored by the batsman per match
  3. SR: strike rate of the batsman
  4. Fours: number of fours (boundaries) scored
  5. Six: number of sixes scored
  6. HF: number of half-centuries scored so far

PROJECT OBJECTIVE:

The goal is to build a data-driven batsman ranking model that the sports management company can use to make business decisions.
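A ranking can be derived from the six attributes in many ways. The sketch below shows one simple option, assumed here for illustration: each attribute is standardized to a z-score and the scores are averaged with equal weights. The player names and numbers are placeholders, not real IPL figures.

```python
import numpy as np

# Placeholder players and statistics (made up for illustration).
players = ["Player A", "Player B", "Player C", "Player D"]
stats = np.array([
    # Runs, Ave,  SR,    Fours, Six,  HF
    [600.0, 45.0, 150.0, 60.0, 30.0, 6.0],
    [300.0, 28.0, 125.0, 30.0, 10.0, 2.0],
    [450.0, 35.0, 140.0, 45.0, 20.0, 4.0],
    [150.0, 18.0, 110.0, 12.0, 4.0, 0.0],
])

# Put every attribute on the same scale, then average with equal weights.
z = (stats - stats.mean(axis=0)) / stats.std(axis=0)
score = z.mean(axis=1)

# Highest composite score first.
order = np.argsort(-score)
ranking = [players[i] for i in order]
for rank, i in enumerate(order, start=1):
    print(rank, players[i], round(float(score[i]), 2))
```

Equal weights are a design choice; a clustering of the standardized attributes or the first principal component could serve as the score instead.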

Part 5

Questions:

  1. List all the dimensionality reduction techniques that can be implemented in Python.
  2. Illustrate dimensionality reduction on text data.
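For the second question, one common illustration is latent semantic analysis: build a document-term count matrix and keep only the leading singular vectors. The sketch below assumes that technique and a tiny made-up corpus; it is not taken from the repository.

```python
import numpy as np

# Tiny made-up corpus (assumption for illustration).
docs = [
    "the car has a powerful engine",
    "the engine of the car is loud",
    "red wine has a fruity taste",
    "the wine has a dry taste",
]

# Document-term count matrix: one row per document, one column per distinct word.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(docs), len(vocab)))
for row, doc in enumerate(docs):
    for word in doc.split():
        counts[row, index[word]] += 1

# Truncated SVD: keep the k leading singular directions.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
docs_2d = U[:, :k] * s[:k]  # each document as a k-dimensional vector

print(counts.shape, "->", docs_2d.shape)
```

With a real corpus, the counts would usually be TF-IDF weighted first, and a sparse solver would replace the dense SVD.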
