illustration-dim-red

Simple illustration of dimensionality reduction

The goal of this project is to illustrate dimensionality reduction in data analysis. It is a skeleton, not a general-purpose tool: to apply it to different data, the user must modify the scripts.

This simple project creates a plot from insideairbnb.com data for the city of Paris. The quantities observed are Price, Square Feet and Year.

Requirements

csvkit

python > 3.0

numpy > 1.16

gnuplot

Collecting the data

The scripts in this project were written for a very specific dataset, namely insideairbnb data for Paris for the months of May from 2015 and 2018. The user can collect the data herself, either by visiting insideairbnb.com or by running the script collect.sh.

Building the figures

The script paris-data-analysis.sh extracts the columns of interest from each year's dataset, computes the first two principal components, and creates two plots: one with the fitted plane and one without.

Usage:

sh paris-data-analysis.sh [Paris-Neighborhood]

Example:

sh paris-data-analysis.sh Élysée
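The principal-component step performed by the script can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `first_two_components` is hypothetical, the input is synthetic data standing in for the (price, square feet, year) columns, and the columns are not rescaled (whether the project's scripts rescale them is not stated here).

```python
import numpy as np

def first_two_components(data):
    """Return the column means, the first two principal directions,
    and the coordinates of each observation along those directions."""
    X = np.asarray(data, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:2]                       # shape (2, 3): spans the fitted plane
    scores = np.dot(centered, basis.T)   # shape (n, 2): coordinates on the plane
    return mean, basis, scores

# Synthetic stand-in for the three observed quantities (NOT real Airbnb data).
rng = np.random.RandomState(0)
n = 100
square_feet = rng.uniform(200, 1200, size=n)
year = rng.choice([2015, 2018], size=n).astype(float)
price = 0.1 * square_feet + 5.0 * (year - 2015) + rng.normal(scale=5.0, size=n)
data = np.column_stack([price, square_feet, year])

mean, basis, scores = first_two_components(data)
print("plane basis:\n", basis)
print("first observation on the plane:", scores[0])
```

The two rows of `basis` span the best-fitting plane through the centered data in the least-squares sense, which is the plane a "plane fitting" plot would draw.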

List of Neighborhoods:

Batignolles-Monceau

Bourse

Buttes-Chaumont

Buttes-Montmartre

Entrepôt

Élysée

Gobelins

Hôtel-de-Ville

Louvre

Luxembourg

Ménilmontant

Observatoire

Opéra

Palais-Bourbon

Panthéon

Passy

Popincourt

Reuilly

Temple

Vaugirard

Discussion

In this example, the plane can be used to approximate (predict) observations while storing less information than the original data: instead of 3n values, we store only 2n coordinates on the plane, plus a small fixed number of values describing the plane itself. The compression ratio becomes more significant as the dimension of the data grows, the common example being images. A classical application is face recognition.
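The storage argument can be checked with a short sketch. It assumes synthetic points scattered near a plane (not the Paris data), and the helper names `compress` and `reconstruct` are hypothetical. Each point is replaced by two coordinates on the fitted plane, and the original is approximated from those coordinates, the plane basis, and the mean.

```python
import numpy as np

def compress(data):
    """Project 3-D observations onto the plane of the first two principal components."""
    X = np.asarray(data, dtype=float)
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:2]                        # two orthonormal directions spanning the plane
    coords = np.dot(X - mean, basis.T)    # 2 numbers per observation
    return mean, basis, coords

def reconstruct(mean, basis, coords):
    """Approximate the original observations from their plane coordinates."""
    return mean + np.dot(coords, basis)

# Synthetic points close to a plane, with a little noise (illustrative only).
rng = np.random.RandomState(1)
n = 200
uv = rng.normal(size=(n, 2))
plane = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, -0.25]])
points = np.dot(uv, plane) + 0.01 * rng.normal(size=(n, 3))

mean, basis, coords = compress(points)
approx = reconstruct(mean, basis, coords)

stored = coords.size + basis.size + mean.size   # 2n + 6 + 3 values
rms_error = np.sqrt(np.mean((points - approx) ** 2))
print("values stored:", stored, "instead of", points.size)
print("RMS reconstruction error:", rms_error)
```

The fixed overhead (nine values here) is negligible once n is large, so the stored size approaches two thirds of the original.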

Contributors

danoan