ofxTSNE

ofxTSNE is an addon for openFrameworks which wraps the t-SNE (t-Distributed Stochastic Neighbor Embedding) algorithm by Laurens van der Maaten.

t-SNE is a technique for reducing the dimensionality of large, high-dimensional datasets, typically to 2 or 3 dimensions. It serves a similar purpose to Principal Component Analysis (see ofxPCA), which reduces a dataset's dimensionality by reorienting it along its principal axes, but differs in that it tends to better preserve local neighborhood structure (points that are close in the original space stay close in the embedding), making it more suitable for visualizing high-dimensional data.

ofxTSNE is very simple to run, containing only one function. The harder part is getting data.

Examples

Basic example

t-SNE toy data

The example app demonstrates how to use ofxTSNE by constructing a toy 100-dimensional dataset. It contains comments explaining what the parameters do and how to set them.
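As a rough sketch of what such a toy dataset might look like, the snippet below draws points around a handful of random cluster centers. It is not the code from the example itself: makeToyData and all of its parameters are hypothetical names, and it only assumes that points are stored as a vector of float vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ofxTSNE): builds numPoints vectors of
// dimension dims, each drawn near one of numCenters random cluster centers.
std::vector<std::vector<float>> makeToyData(std::size_t numPoints,
                                            std::size_t dims,
                                            std::size_t numCenters,
                                            float spread,
                                            unsigned seed) {
    assert(numCenters > 0);
    assert(spread > 0.0f);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    std::normal_distribution<float> noise(0.0f, spread);

    // Pick the cluster centers uniformly inside the unit hypercube.
    std::vector<std::vector<float>> centers(numCenters,
                                            std::vector<float>(dims));
    for (auto& center : centers)
        for (auto& value : center) value = uniform(rng);

    // Each point is its center plus Gaussian noise in every dimension.
    std::vector<std::vector<float>> data;
    data.reserve(numPoints);
    for (std::size_t i = 0; i < numPoints; ++i) {
        const auto& center = centers[i % numCenters];  // round-robin clusters
        std::vector<float> point(dims);
        for (std::size_t d = 0; d < dims; ++d)
            point[d] = center[d] + noise(rng);
        data.push_back(std::move(point));
    }
    return data;
}
```

For instance, makeToyData(200, 100, 10, 0.05f, 42u) would give 200 points in 100 dimensions spread over 10 clusters, which is the kind of input the example then reduces to 2 dimensions.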

Clever hack: try setting D=3 and, instead of clustering the points around 10 centers, generate random 3D points and map each point's color linearly from its 3D position.
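A minimal sketch of that idea, assuming each coordinate lies in the unit interval and is mapped straight onto one 8-bit color channel (x to red, y to green, z to blue). ColoredPoint and makeColoredPoints are hypothetical names used only for illustration:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical record pairing a random 3D position with its derived color.
struct ColoredPoint {
    std::array<float, 3> position;      // x, y, z in the unit cube
    std::array<std::uint8_t, 3> color;  // r, g, b in [0, 255]
};

// Generates numPoints uniformly random 3D points and colors each one
// linearly from its position.
std::vector<ColoredPoint> makeColoredPoints(std::size_t numPoints,
                                            unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
    std::vector<ColoredPoint> points(numPoints);
    for (auto& p : points) {
        for (std::size_t axis = 0; axis < 3; ++axis) {
            p.position[axis] = uniform(rng);
            // Linear map: coordinate 0 -> channel 0, coordinate 1 -> 255.
            p.color[axis] =
                static_cast<std::uint8_t>(p.position[axis] * 255.0f);
        }
    }
    return points;
}
```

Because the colors encode the original 3D positions, drawing the 2D embedding with these colors gives a quick visual check of how neighboring regions of the cube end up next to each other.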

Clustering images

t-SNE images from Caltech-256

example-images applies t-SNE to a directory of images. It uses ofxCcv to encode each image as a compact (4096-dim) feature vector derived from a trained convolutional neural network. The resulting representation captures high-level similarities among images, so ofxTSNE groups them mostly by content (e.g. images of cats get clustered together) and stays relatively invariant to changes in color, lighting, position, etc.

To run this example, you need to take a few extra steps.

  1. Get ofxCcv

  2. Run the setup_ccv script to download the trained convnet.

    sh setup_ccv.sh

  3. Populate a folder called 'images' inside your data folder. Use small images, because the entire directory will be loaded into memory. I've provided a script which downloads 20 images each from 31 categories in Caltech-256. To download those, run:

    python download_images.py

Or if you want to download a set of animals from the same source, open download_images.py and change the line categories = categories_random to categories = categories_animals.

Gridding clusters

You can arrange the embedded points on a regular 2D grid that preserves your t-SNE clusters by using a 2D assignment algorithm. @Quasimondo's RasterFairy or Kyle McDonald's ofxAssignment will work, or something similar. Applying RasterFairy to a t-SNE of the animals subset from Caltech-256 (see download_images.py) gives a result that looks like the following:

t-SNE animals grid
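To illustrate the general idea of grid assignment, here is a sketch that matches each embedded point to a distinct grid cell so that the total squared distance between points and cell centers is minimal. It uses a textbook Hungarian algorithm and is neither RasterFairy's nor ofxAssignment's implementation. The names solveAssignment and assignToGrid are hypothetical, and the points are assumed to be normalized to the unit square.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Point2 = std::pair<float, float>;  // (x, y), assumed in [0, 1]

// Hungarian algorithm, O(n^3), on a square cost matrix.
// Returns assignment[row] = column with minimal total cost.
std::vector<int> solveAssignment(const std::vector<std::vector<double>>& cost) {
    const int n = static_cast<int>(cost.size());
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> u(n + 1, 0.0), v(n + 1, 0.0);  // row/column potentials
    std::vector<int> p(n + 1, 0), way(n + 1, 0);       // p[col] = matched row
    for (int i = 1; i <= n; ++i) {
        p[0] = i;
        int j0 = 0;
        std::vector<double> minv(n + 1, INF);
        std::vector<char> used(n + 1, 0);
        do {  // grow an augmenting path starting from row i
            used[j0] = 1;
            const int i0 = p[j0];
            int j1 = 0;
            double delta = INF;
            for (int j = 1; j <= n; ++j) {
                if (used[j]) continue;
                const double cur = cost[i0 - 1][j - 1] - u[i0] - v[j];
                if (cur < minv[j]) { minv[j] = cur; way[j] = j0; }
                if (minv[j] < delta) { delta = minv[j]; j1 = j; }
            }
            for (int j = 0; j <= n; ++j) {
                if (used[j]) { u[p[j]] += delta; v[j] -= delta; }
                else minv[j] -= delta;
            }
            j0 = j1;
        } while (p[j0] != 0);
        do {  // flip the matching along the path that was found
            const int j1 = way[j0];
            p[j0] = p[j1];
            j0 = j1;
        } while (j0 != 0);
    }
    std::vector<int> assignment(n);
    for (int j = 1; j <= n; ++j) assignment[p[j] - 1] = j - 1;
    return assignment;
}

// Assigns every point to its own cell of a cols x rows grid.
// Cell index = row * cols + col; requires points.size() == cols * rows.
std::vector<int> assignToGrid(const std::vector<Point2>& points,
                              int cols, int rows) {
    assert(cols > 0 && rows > 0);
    assert(static_cast<int>(points.size()) == cols * rows);
    const int n = cols * rows;
    std::vector<std::vector<double>> cost(n, std::vector<double>(n));
    for (int i = 0; i < n; ++i) {
        for (int c = 0; c < n; ++c) {
            // Center of cell c in unit-square coordinates.
            const double cx = (c % cols + 0.5) / cols;
            const double cy = (c / cols + 0.5) / rows;
            const double dx = points[i].first - cx;
            const double dy = points[i].second - cy;
            cost[i][c] = dx * dx + dy * dy;  // squared Euclidean distance
        }
    }
    return solveAssignment(cost);
}
```

The cubic running time makes this sketch practical only for modest numbers of points, which is one reason to reach for a dedicated library on larger collections.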

Contributors

genekogan